Session-replay telemetry: what behavioral vendors record and how it's scored

Open the network tab on a checkout page that runs a session-replay script and you will see it: a small request firing every few seconds, payload compressed, going to a path like /rec/bundle. That request is not a page view or an analytics ping in the ordinary sense. It is a serialized slice of everything that happened in the DOM since the last one fired. Every node that mutated, every place the cursor went, every field that changed, every scroll offset, timestamped to the millisecond. Played back later, it reconstructs the visit as a movie the visitor never knew was being filmed.

The interesting question is not whether this is creepy. It plainly is, and regulators have said so. The interesting question is mechanical: what exactly gets recorded, how is it encoded on the wire, how does a server turn that stream into a verdict, and why does the same telemetry that powers a product-analytics dashboard also sit underneath bank-grade fraud detection? The answer runs through one open-source library, two waves of commercial vendors with very different motives, and a 2017 research report that named names.

This post traces the pipeline end to end. First the recording model, the full DOM snapshot plus incremental mutations that nearly every vendor now uses. Then the event taxonomy, what a mouse-move record or an input record actually contains. Then masking, the client-side redaction that is supposed to keep passwords out of the stream and frequently does not. Then the split: the FullStory and Hotjar lineage that records to help you fix your funnel, versus the BioCatch-style fraud stack that records to decide whether you are you. And finally the leaks and the law, because this is the rare fingerprinting topic where the courtroom has moved faster than the countermeasure.

The recording model: one snapshot, then deltas

A session replay is not a video. It would be far too large if it were. The dominant approach instead serializes the live DOM into a JSON tree once, at the start of the session, then records every subsequent change as a small diff. Play the snapshot, apply the diffs in timestamp order against a sandboxed iframe, and you have a pixel-faithful reconstruction at a tiny fraction of a video’s size. A half-hour session that would be hundreds of megabytes as screen capture lands around one to five megabytes gzipped.

The reference implementation of this idea is rrweb, an open-source library that records and replays the web. It matters out of proportion to its star count because the commercial market standardized on its model. The same two-phase design, full snapshot plus incremental snapshots, sits underneath FullStory, Microsoft Clarity, LogRocket, PostHog’s replay feature, Sentry’s, and most self-hosted setups. Vendors who predate rrweb or who built their own recorder still converged on the same shape, because there is essentially one efficient way to do this and rrweb found it first in public.

The mechanism rrweb uses to keep snapshot and diff in sync is a mirror. At snapshot time every DOM node is assigned a stable integer id and the serializer walks the tree depth-first, emitting node type, tag, attributes, and children. The recorder keeps a map from live node to id; the replayer keeps the inverse. When a mutation fires later, it does not ship the node, it ships the id plus the change. Node 4187 gained an attribute. Node 4191’s text became this. A child was inserted under node 88 before node 90. The replayer looks each id up in its mirror and applies the edit. This is why the format is compact and also why it is fragile to anything that desynchronizes the two trees.

*The session-replay encoding: one full DOM serialization at the start, then a timestamped stream of typed deltas keyed by node id. The enum values shown are rrweb's.*
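
The id-keyed mirror is easy to see in miniature. What follows is a minimal Python sketch of the idea, not rrweb's implementation: the dict-shaped nodes, the function names, and the choice to start ids at 1 are all assumptions made for illustration.

```python
import copy
from itertools import count

def serialize(node, ids):
    """Recorder side: walk a toy DOM depth-first, giving each node an integer id."""
    node_id = next(ids)
    return {
        "id": node_id,
        "tag": node["tag"],
        "attributes": dict(node.get("attributes", {})),
        "childNodes": [serialize(child, ids) for child in node.get("children", [])],
    }

def build_mirror(serialized, mirror=None):
    """Replayer side: index every rebuilt node by the id the recorder assigned."""
    mirror = {} if mirror is None else mirror
    mirror[serialized["id"]] = serialized
    for child in serialized["childNodes"]:
        build_mirror(child, mirror)
    return mirror

def apply_attribute_delta(mirror, delta):
    """Apply a delta that names its target by id rather than shipping the node."""
    if delta["id"] not in mirror:
        raise KeyError(f"unknown node id {delta['id']}: the two trees are out of sync")
    mirror[delta["id"]]["attributes"].update(delta["attributes"])

# Hypothetical page: a form with a password input and a button.
live_dom = {"tag": "form", "children": [
    {"tag": "input", "attributes": {"type": "password"}},
    {"tag": "button"},
]}

snapshot = serialize(live_dom, count(1))   # full snapshot, sent once
replay_tree = copy.deepcopy(snapshot)      # what the replayer rebuilds from it
mirror = build_mirror(replay_tree)

# Later delta: "node 2's type attribute became text". Only the id and the change travel.
apply_attribute_delta(mirror, {"id": 2, "attributes": {"type": "text"}})
```

The `KeyError` branch is the fragility in code form: a delta that names an id the replayer never saw cannot be applied.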

The wire format is a sequence of events, each with a type and a timestamp. In rrweb’s vocabulary the EventType enum is a plain numeric enum: DomContentLoaded is 0, Load is 1, FullSnapshot is 2, IncrementalSnapshot is 3, Meta is 4, Custom is 5, with Plugin and Asset added later as 6 and 7. The interesting traffic is almost all type 3. Each incremental event carries a data.source field, an IncrementalSource, that says what kind of change it is.

The event taxonomy: what a delta actually contains

The IncrementalSource enum is the real catalogue of what session replay watches. In rrweb’s type definitions it runs: Mutation (0), MouseMove (1), MouseInteraction (2), Scroll (3), ViewportResize (4), Input (5), TouchMove (6), MediaInteraction (7), StyleSheetRule (8), CanvasMutation (9), Font (10), and then a tail of later additions: Log, Drag, StyleDeclaration, Selection, AdoptedStyleSheet, CustomElement. Commercial recorders carry their own enums with different numbers, but the categories are the same because the browser only exposes so many ways to observe a page. The exact field layout that FullStory or Clarity puts on the wire is not public; what follows is the rrweb structure, which the commercial formats resemble closely enough that researchers reverse-engineering them describe near-identical shapes.
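
To make the two-level typing concrete, here is a small sketch that tallies a recorded stream by incremental source. The enum values are the ones quoted above; the sample events and the `tally_sources` helper are hypothetical. Sources from the later tail are left out of the enum and reported by number, since their values are not listed here.

```python
from collections import Counter
from enum import IntEnum

class EventType(IntEnum):
    DomContentLoaded = 0
    Load = 1
    FullSnapshot = 2
    IncrementalSnapshot = 3
    Meta = 4
    Custom = 5
    Plugin = 6
    Asset = 7

class IncrementalSource(IntEnum):
    Mutation = 0
    MouseMove = 1
    MouseInteraction = 2
    Scroll = 3
    ViewportResize = 4
    Input = 5
    TouchMove = 6
    MediaInteraction = 7
    StyleSheetRule = 8
    CanvasMutation = 9
    Font = 10

def tally_sources(events):
    """Count incremental events per source; unknown sources are kept by number."""
    counts = Counter()
    for event in events:
        if event["type"] != EventType.IncrementalSnapshot:
            continue
        source = event["data"]["source"]
        try:
            counts[IncrementalSource(source).name] += 1
        except ValueError:
            counts[f"source_{source}"] += 1
    return counts

# Hypothetical stream: meta, full snapshot, then deltas.
events = [
    {"type": 4, "timestamp": 0, "data": {"href": "https://example.test/checkout"}},
    {"type": 2, "timestamp": 5, "data": {"node": {}}},
    {"type": 3, "timestamp": 120, "data": {"source": 1, "positions": []}},
    {"type": 3, "timestamp": 640, "data": {"source": 1, "positions": []}},
    {"type": 3, "timestamp": 700, "data": {"source": 5, "id": 7, "text": "***"}},
    {"type": 3, "timestamp": 900, "data": {"source": 14}},
]
counts = tally_sources(events)
```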

Take mouse movement, the signal people picture first. A MouseMove record does not ship one coordinate per event. That would flood the stream. Instead it carries a positions array, where each entry is { x, y, id, timeOffset }. The id is the mirror id of the element under the pointer; timeOffset is a negative millisecond delta back from the event’s own timestamp. So a single record encodes a short burst of the cursor’s path with sub-event timing, and the replayer interpolates between the points to draw a smooth track. rrweb throttles this in two layers: by default it samples a coordinate at most once every 50 milliseconds, and flushes the accumulated batch at most once every 500 milliseconds. Those two numbers set the temporal resolution of the recorded mouse path, and they are tunable. A fraud vendor that wants finer kinematics samples more often; a product-analytics vendor that wants smaller bundles samples less often.
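
The two-layer throttle and the negative timeOffset are easier to follow in code. This is a simplified Python sketch under stated assumptions, not rrweb's recorder: the `MouseMoveBatcher` class is hypothetical, the caller supplies the clock so the example is deterministic, and the interval values passed in the demo are chosen purely for illustration.

```python
class MouseMoveBatcher:
    """Sample pointer positions at one rate, flush them as a batch at a slower rate."""

    def __init__(self, sample_ms, flush_ms):
        self.sample_ms = sample_ms
        self.flush_ms = flush_ms
        self._points = []          # (x, y, node_id, absolute_ms)
        self._last_sample = None
        self._batch_start = None

    def on_move(self, x, y, node_id, now_ms):
        """Handle one raw mousemove; return a MouseMove event when a batch flushes."""
        if self._last_sample is not None and now_ms - self._last_sample < self.sample_ms:
            return None            # dropped by the sampling throttle
        self._last_sample = now_ms
        if self._batch_start is None:
            self._batch_start = now_ms
        self._points.append((x, y, node_id, now_ms))
        if now_ms - self._batch_start >= self.flush_ms:
            return self._flush(now_ms)
        return None

    def _flush(self, now_ms):
        positions = [
            # timeOffset counts backwards from the event timestamp, so it is <= 0.
            {"x": x, "y": y, "id": node_id, "timeOffset": t - now_ms}
            for x, y, node_id, t in self._points
        ]
        self._points, self._batch_start = [], None
        return {"type": 3, "timestamp": now_ms,
                "data": {"source": 1, "positions": positions}}

def absolute_points(event):
    """Recover (absolute_ms, x, y) for every point in a MouseMove event."""
    return [(event["timestamp"] + p["timeOffset"], p["x"], p["y"])
            for p in event["data"]["positions"]]

# Simulate a raw mousemove every 10 ms for half a second along y = 0.
batcher = MouseMoveBatcher(sample_ms=50, flush_ms=500)
emitted = []
for t in range(0, 501, 10):
    event = batcher.on_move(t, 0, 42, t)
    if event is not None:
        emitted.append(event)
```

In this run, 51 raw moves collapse into a single event carrying 11 positions, each still recoverable to its own millisecond via `absolute_points`.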

Clicks and the like are a separate source, MouseInteraction, with its own sub-type enum. rrweb’s MouseInteractions runs MouseUp (0), MouseDown (1), Click (2), ContextMenu (3), DblClick (4), Focus (5), Blur (6), TouchStart (7), and the touch tail. Focus and blur living in the same enum as click is worth pausing on. The recorder knows not only where you clicked but the order in which fields took and lost focus, which is the skeleton of how someone moves through a form. Combine focus and blur timestamps with Input events and you can reconstruct dwell time per field without any explicit keystroke capture.
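
Per-field dwell falls out of the focus and blur records with very little work. A minimal sketch, assuming events shaped like the rrweb records described above; the `field_dwell` helper and the sample node ids are hypothetical.

```python
INCREMENTAL_SNAPSHOT = 3   # EventType value
MOUSE_INTERACTION = 2      # IncrementalSource value
FOCUS, BLUR = 5, 6         # MouseInteractions values quoted above

def field_dwell(events):
    """Return (order in which fields were first focused, total ms focused per field)."""
    focused_at, dwell, order = {}, {}, []
    for event in sorted(events, key=lambda e: e["timestamp"]):
        data = event.get("data", {})
        if event["type"] != INCREMENTAL_SNAPSHOT or data.get("source") != MOUSE_INTERACTION:
            continue
        node_id = data["id"]
        if data["type"] == FOCUS:
            focused_at[node_id] = event["timestamp"]
            if node_id not in order:
                order.append(node_id)
        elif data["type"] == BLUR and node_id in focused_at:
            elapsed = event["timestamp"] - focused_at.pop(node_id)
            dwell[node_id] = dwell.get(node_id, 0) + elapsed
    return order, dwell

def interaction(kind, node_id, timestamp):
    """Build a hypothetical MouseInteraction event for the demo."""
    return {"type": 3, "timestamp": timestamp,
            "data": {"source": MOUSE_INTERACTION, "type": kind, "id": node_id}}

# A visitor fills field 10, moves to field 11, then returns to correct field 10.
events = [
    interaction(FOCUS, 10, 1000), interaction(BLUR, 10, 3500),
    interaction(FOCUS, 11, 3600), interaction(BLUR, 11, 4000),
    interaction(FOCUS, 10, 4100), interaction(BLUR, 10, 4600),
]
order, dwell = field_dwell(events)
```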

That Input source is the one that does the privacy damage, and its behavior is the single most misunderstood thing about session replay. A naive reading is that these scripts log keystrokes. Most do not, at least not as keydown/keyup. FullStory’s own documentation is explicit that it does not capture keystrokes in the keyboard-event sense but rather input change events that show the resulting text in a field. The distinction sounds reassuring and mostly is not, because the recorded text value is the sensitive part. Whether the stream contains the event pressed S, pressed E, pressed C or the value secret makes no difference to a server that wanted the password. And because the recorder reads the field’s value rather than the keyboard, masking has to happen on the value, which is where it gets hard.

*A single mouse-move record carries a short path, not one point. The per-point timeOffset is what makes velocity and acceleration recoverable downstream.*

A few more sources are worth naming because they are where the format leaks more than people expect. StyleSheetRule and AdoptedStyleSheet record CSS changes, which matters because CSS can encode state: a :hover rule or a class toggle can reveal interaction the DOM alone would not. CanvasMutation records draw calls on <canvas>. It is off by default in most deployments because it is expensive, but when on it captures the actual rendered content of charts, signatures, and anything else painted to a canvas. The Font source captures custom font loads, which is minor on its own but contributes to the device fingerprint that the same telemetry can double as.

Bundling, streaming, and the shape on the wire

Events do not leave the browser one at a time. The recorder queues them and flushes on an interval. FullStory’s client bundles events and posts them to /rec/bundle every five seconds, after compressing the payload by upward of sixty percent. The script that does this, fs.js, is roughly 60 KB, loads asynchronously so it does not block render, fetches per-org settings on startup, and sets a first-party fs_uid cookie to stitch a visitor’s sessions together. The bootstrap takes under 300 milliseconds, which is the point: the whole apparatus is designed to be imperceptible.

The five-second cadence is a deliberate tradeoff. Buffer too long and you lose the tail of a session when the user closes the tab before a flush. Flush too often and you pay in request overhead and battery. Most replay vendors land in the two-to-ten-second range and add a flush on visibilitychange or pagehide to catch the exit. The first-party cookie matters for a different reason: because the bundle posts to the site’s own domain (or a CNAME’d subdomain) rather than a third-party host, it survives third-party-cookie blocking and reads as first-party traffic to the browser. That is a feature for the vendor and a problem for anyone hoping their tracker blocker will catch it.
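
The queue-and-flush pattern can be sketched as follows. This is an illustration under stated assumptions, not any vendor's client: the `BundleQueue` class is hypothetical, the flush check runs when an event arrives rather than on a real timer so the example stays deterministic, and the `send` callback stands in for the network POST.

```python
import gzip
import json

class BundleQueue:
    """Queue events, flush them compressed on an interval and on page exit."""

    def __init__(self, send, flush_interval_ms=5000):
        self.send = send                      # callable taking the compressed bytes
        self.flush_interval_ms = flush_interval_ms
        self._pending = []
        self._last_flush = 0

    def record(self, event, now_ms):
        self._pending.append(event)
        if now_ms - self._last_flush >= self.flush_interval_ms:
            self.flush(now_ms)

    def flush(self, now_ms):
        if not self._pending:
            return
        raw = json.dumps(self._pending, separators=(",", ":")).encode("utf-8")
        self.send(gzip.compress(raw))
        self._pending = []
        self._last_flush = now_ms

    def on_page_hide(self, now_ms):
        """Exit hook: send whatever is buffered so the session tail is not lost."""
        self.flush(now_ms)

sent = []                                     # stand-in for the network
queue = BundleQueue(sent.append, flush_interval_ms=5000)
for t in range(0, 6001, 1000):
    queue.record({"type": 3, "timestamp": t, "data": {"source": 3, "y": t}}, t)
queue.on_page_hide(6500)

bundle_sizes = [len(json.loads(gzip.decompress(body))) for body in sent]
```

Without the `on_page_hide` call the final event would sit in the buffer and never be sent, which is the lost-tail case described above.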

If you want the broader picture of how a client payload like this is assembled and what a server does with it on arrival, the DataDome JS tag and Akamai sensor_data writeups cover the anti-bot cousins of this same collect-bundle-post pattern, which share a lot of the DNA even though their goal is detection rather than playback.

Masking: the redaction that runs in the browser, mostly

Because the recorder reads field values, the only way to keep a secret out of the stream is to strip it in the browser, before the bundle is sent. Every serious vendor does client-side masking, and the honest ones say so plainly. Sentry’s replay SDK redacts all HTML text nodes and all images before they leave the browser by default, replaces every keypress-derived value with asterisks, and records mouse movement only as endpoints rather than full paths unless you opt in. You unmask deliberately, element by element, by tagging known-safe nodes; the safe default is paranoid.

rrweb exposes the same controls as configuration. maskAllInputs blanks every form field. maskTextClass masks the text of any element carrying a named class, blockClass removes matching elements from the recording entirely, and ignoreClass stops tracking input on them. Each option has a default class name, so a page author can annotate the markup directly: an element with class rr-block is dropped, rr-ignore is not tracked, rr-mask has its text masked. Sentry’s flavor of the same idea uses the attributes data-sentry-mask, data-sentry-block, and data-sentry-unmask. The model across vendors is identical. Masking is opt-out per element from a masked-by-default baseline, or opt-in per element from an unmasked baseline, and which baseline a given deployment runs is a config decision the visitor cannot see.
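
The decision a recorder makes per element can be sketched as a small policy function. This is a simplified illustration, not rrweb's code: it looks only at the element's own classes (a real recorder also has to consider ancestors), the dict-shaped element is hypothetical, and the precedence order is an assumption.

```python
BLOCK, IGNORE, MASK, RECORD = "block", "ignore", "mask", "record"
INPUT_TAGS = {"input", "textarea", "select"}

def resolve_policy(element, mask_all_inputs=True, block_class="rr-block",
                   ignore_class="rr-ignore", mask_text_class="rr-mask"):
    """Decide what the recorder does with one element before anything is sent."""
    classes = set(element.get("class", "").split())
    if block_class in classes:
        return BLOCK      # element is left out of the recording entirely
    if ignore_class in classes:
        return IGNORE     # element is recorded, input on it is not tracked
    if mask_text_class in classes:
        return MASK       # text is replaced before it leaves the browser
    if mask_all_inputs and element.get("tag") in INPUT_TAGS:
        return MASK
    return RECORD

def mask_value(value):
    """Replace every character so only the length survives."""
    return "*" * len(value)

examples = [
    {"tag": "div", "class": "card rr-block"},
    {"tag": "input", "class": "rr-ignore"},
    {"tag": "p", "class": "rr-mask"},
    {"tag": "input"},
    {"tag": "p"},
]
policies = [resolve_policy(e) for e in examples]
```

Note that the last element, an unannotated paragraph, is recorded in the clear. Anything the page author did not tag falls through to whatever the baseline is.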

*Masking is a client-side heuristic. When the heuristic keys off input type or field name, anything that changes those, like a show-password toggle, defeats it.*

The trouble is that heuristic masking has edge cases, and the edge cases are exactly the sensitive fields. This is not theoretical. It is the finding of the most cited research in the area.

The 2017 no-boundaries report and what it proved

In November 2017 a Princeton group published No boundaries: Exfiltration of personal data by session-replay scripts. They measured seven of the largest replay vendors of the era: Yandex, FullStory, Hotjar, UserReplay, Smartlook, Clicktale, and SessionCam. The headline that traveled was that scripts on hundreds of top sites record every keystroke and mouse move. The detail that mattered to engineers was the catalogue of how the masking failed.

FullStory at the time redacted credit-card fields by the autocomplete="cc-number" attribute, and collected card numbers from any field that lacked it. Yandex Metrica used a -metrika-nokeys class to exclude fields. Across vendors the masking keyed off input type, field name, or an explicit annotation, and any field the site author forgot to annotate flowed through in the clear. On Walgreens.com, which did use FullStory’s redaction features, the researchers found medical conditions, prescriptions, and real names still reaching the recorder. On Bonobos.com, full card details. The recordings could not reasonably be expected to stay anonymous, because they contained the exact data a checkout form collects.

The follow-up three months later was worse, because it hit passwords. In No boundaries for credentials, the same authors showed passwords leaking to Mixpanel, FullStory, SessionCam, and UserReplay through mechanisms that no amount of field annotation cleanly fixes. Mobile-friendly show-password toggles flip an input from type=password to type=text, sliding straight past a filter that only excludes password-type fields. Browser extensions that unmask passwords do the same to the DOM. And they found that roughly fifteen percent of the 36,972 password fields they surveyed would not match a substring filter looking for the word “pass,” because the fields were named something else. Their conclusion was blunt: there is no foolproof way for these third-party scripts to prevent password collection given what they are built to do. The functionality and the leak are the same feature.
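
Both failure modes fit in a few lines. The sketch below is a toy model of a type-based filter and a name-substring filter, not any vendor's actual masking logic; the field names and helper functions are hypothetical.

```python
def masked_by_type(field):
    """Filter 1: exclude anything that is currently a password-type input."""
    return field.get("type") == "password"

def masked_by_name(field, needle="pass"):
    """Filter 2: exclude anything whose name contains the substring."""
    return needle in field.get("name", "").lower()

def is_masked(field):
    return masked_by_type(field) or masked_by_name(field)

def toggle_show_password(field):
    """What a show-password control does to the DOM: flip type to text."""
    return {**field, "type": "text"}

fields = [
    {"name": "password", "type": "password"},
    {"name": "pwd", "type": "password"},
    {"name": "login_secret", "type": "password"},
]

masked_before = [is_masked(f) for f in fields]
# After the toggle, only the name filter is left to catch the field.
leaks = [f["name"] for f in map(toggle_show_password, fields) if not is_masked(f)]
```

Every field is masked until the toggle fires. Afterwards, the two fields whose names lack "pass" reach the recorder with their values intact.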

That sentence is the whole privacy problem in one line. A recorder that faithfully captures form state will capture sensitive form state, and the only defenses are heuristics that, by construction, have gaps. Server-side scrubbing after the fact does not help, because the secret has already crossed the network and been written to the vendor’s storage. The cold-start and entropy tradeoffs that show up across behavioral biometrics apply here too: the more faithfully you capture behavior, the more identifying and the more sensitive the capture becomes, and you cannot have the fidelity without the exposure.

Two lineages, same telemetry

Here is the fork that makes this topic interesting. The exact same stream of mouse paths, focus timings, and input events feeds two industries with opposite relationships to the user.

The product-analytics lineage is the one most people mean by session replay. It descends through Clicktale, founded in 2006, which pioneered recording visits to study usability, then Hotjar, FullStory, and the rest. The motive is to watch where users rage-click, where they abandon a funnel, where a layout bug eats conversions. Consolidation has been heavy. Contentsquare acquired Clicktale in 2019 and Hotjar on September 1, 2021, building a single experience-analytics conglomerate out of what had been competitors. Microsoft entered the bottom of the market with Clarity, free and built on the same snapshot-plus-deltas model, which made replay ubiquitous on small sites that would never have paid for it. The scoring this lineage does is aggregate: heatmaps, funnel drop-off rates, and frustration signals like rapid back-and-forth clicking. It scores the page, not the person.

The fraud lineage uses the same kinematics to score the person. Behavioral-biometrics vendors, BioCatch the best known, treat the mouse path and the typing rhythm as a biometric, a signature of how a specific human moves, and ask on every session whether the human at the keyboard is the account’s real owner. BioCatch says it collects and analyzes over 2,000 behavioral parameters: mouse dynamics, keystroke rhythm, navigational patterns, and what it calls cognitive traits. The verdict comes back as a risk score and reason codes in under 300 milliseconds, fast enough for a bank to step up authentication or block a transfer mid-session. This is continuous authentication, identity checked from login to logout rather than once at the door.

*One collection model, two motives. The analytics branch asks what the page is doing wrong; the fraud branch asks whether the person is who the account says they are.*

The scoring techniques on the fraud side go beyond passive observation. BioCatch’s patented invisible-challenge approach injects subtle, unannounced perturbations into a session, a small change the user reacts to subconsciously without noticing anything, and reads the reaction. A legitimate account holder and an account-takeover bot respond differently to the same nudge. The user feels nothing; the system gets a labeled probe. This is the same instinct behind the timing and input traps that anti-bot honeypots and keystroke-dynamics systems use, applied to live banking sessions rather than bot screening. The fidelity the fraud branch needs is higher than analytics needs, which is why these vendors sample mouse kinematics more aggressively and capture pointer pressure and device-orientation signals that a Hotjar deployment would never bother with.

How a server turns the stream into a verdict

On the analytics side the scoring is mostly aggregation and pattern detection over reconstructed sessions: count rage-clicks, measure time-to-first-interaction, cluster sessions by path, flag dead clicks on non-interactive elements. None of it needs to identify the individual; it needs the population. The replay itself is the deliverable, watched by a human or summarized by a model.
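
A rage-click detector is a representative example of this kind of pattern detection. The sketch below is one plausible formulation, not a vendor's definition: the thresholds (three clicks, one second, thirty pixels) and the `find_rage_clicks` name are assumptions.

```python
from math import hypot

def find_rage_clicks(clicks, min_clicks=3, window_ms=1000, radius_px=30):
    """Find bursts of clicks close together in time and space.

    clicks: iterable of (timestamp_ms, x, y).
    Returns a list of (start_ms, end_ms, click_count) bursts.
    """
    clicks = sorted(clicks)
    bursts, i = [], 0
    while i < len(clicks):
        t0, x0, y0 = clicks[i]
        j = i + 1
        # Extend the burst while clicks stay inside the time window and radius.
        while (j < len(clicks)
               and clicks[j][0] - t0 <= window_ms
               and hypot(clicks[j][1] - x0, clicks[j][2] - y0) <= radius_px):
            j += 1
        if j - i >= min_clicks:
            bursts.append((t0, clicks[j - 1][0], j - i))
            i = j
        else:
            i += 1
    return bursts

# Four fast clicks on one spot, then two unrelated clicks much later.
clicks = [(0, 100, 100), (200, 102, 101), (400, 99, 100), (650, 101, 98),
          (5000, 400, 300), (9000, 100, 100)]
bursts = find_rage_clicks(clicks)
```

Nothing in the function needs to know who clicked. Summed over many sessions, the bursts point at the element that is frustrating people.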

On the fraud side the pipeline is a classifier. Raw events become features. From the positions arrays you derive velocity, acceleration, jerk, curvature, the straightness of paths, pause distributions, and the characteristic overshoot-and-correct of a real hand approaching a target, the Fitts’s-law signature that is genuinely hard to synthesize and that we cover in why a real mouse path is hard to fake. From focus and blur timings you derive per-field dwell and the order of traversal. From input timing you derive typing cadence even without keystroke events, because the value updates carry their own timestamps. Those features feed a model that outputs a score, and the score gates an action. The exact model architecture and feature weights are trade secrets at every vendor; the feature families are well documented because they are the standard mouse-and-keystroke-dynamics literature applied at scale.
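
The first step, turning a positions array into kinematic features, can be sketched with finite differences. This is a minimal illustration of the feature family, not a vendor's feature set: the `path_features` helper and the three features it returns are chosen for the example, and the input is assumed to be (absolute_ms, x, y) tuples already decoded from timestamp plus timeOffset.

```python
from math import hypot

def path_features(points):
    """Derive simple kinematic features from time-ordered (t_ms, x, y) points."""
    speeds = []          # (t_ms at segment end, speed in px/s)
    path_length = 0.0
    for (t0, x0, y0), (t1, x1, y1) in zip(points, points[1:]):
        dt = (t1 - t0) / 1000.0
        if dt <= 0:
            continue     # skip duplicate or out-of-order timestamps
        distance = hypot(x1 - x0, y1 - y0)
        path_length += distance
        speeds.append((t1, distance / dt))

    # Acceleration is the change in speed between consecutive segments.
    accelerations = [(v1 - v0) / ((t1 - t0) / 1000.0)
                     for (t0, v0), (t1, v1) in zip(speeds, speeds[1:])]

    chord = hypot(points[-1][1] - points[0][1], points[-1][2] - points[0][2])
    return {
        "mean_speed_px_s": sum(v for _, v in speeds) / len(speeds) if speeds else 0.0,
        "peak_accel_px_s2": max((abs(a) for a in accelerations), default=0.0),
        # 1.0 means a perfectly straight path; lower means more wandering.
        "straightness": chord / path_length if path_length else 1.0,
    }

# A perfectly straight, constant-speed path: the kind a naive script produces.
robotic = path_features([(0, 0, 0), (50, 10, 0), (100, 20, 0), (150, 30, 0)])
# A path that overshoots the target and corrects back.
human_like = path_features([(0, 0, 0), (50, 8, 3), (100, 26, 5), (150, 36, 2), (200, 30, 0)])
```

The straight path has zero acceleration and a straightness of 1.0, while the overshooting path scores below 1.0 with nonzero acceleration. Jerk and curvature extend the same differencing one more step.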

The detail that separates a fraud-grade pipeline from an analytics one is the reference. Analytics scores a session against a population baseline. Fraud scores a session against the account’s own history, a per-user behavioral profile built from prior legitimate sessions, then flags deviation. That is what makes it authentication rather than analytics, and it is also what makes the cold-start problem acute. A brand-new account has no profile to deviate from, which is why these systems lean hardest on population-level bot-versus-human signals for first sessions and only get personal once they have history.
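
The choice of reference, including the cold-start fallback, can be sketched as follows. This is a deliberately simple stand-in, a mean absolute z-score, and not how any vendor's model works; the feature names, the `min_history` threshold, and the sample data are all hypothetical.

```python
from statistics import mean, pstdev

def score_session(session, user_history, population, min_history=5):
    """Score a session's deviation from a reference set of prior sessions.

    Uses the account's own history when there is enough of it, and falls
    back to the population baseline otherwise (the cold-start case).
    Returns (basis, mean absolute z-score across features).
    """
    if len(user_history) >= min_history:
        reference, basis = user_history, "user"
    else:
        reference, basis = population, "population"

    z_scores = []
    for name, value in session.items():
        values = [s[name] for s in reference]
        spread = pstdev(values)
        if spread == 0:
            continue     # feature carries no information in this reference set
        z_scores.append(abs(value - mean(values)) / spread)
    return basis, (sum(z_scores) / len(z_scores) if z_scores else 0.0)

# Hypothetical features: mean pointer speed and mean gap between input updates.
history = [{"speed": s, "gap_ms": g} for s, g in
           [(200, 180), (210, 175), (190, 185), (205, 170), (195, 190)]]
population = [{"speed": s, "gap_ms": g} for s, g in
              [(150, 250), (400, 120), (300, 200), (220, 160), (500, 90), (180, 300)]]

typical = score_session({"speed": 200, "gap_ms": 180}, history, population)
takeover = score_session({"speed": 900, "gap_ms": 40}, history, population)
new_account = score_session({"speed": 200, "gap_ms": 180}, [], population)
```

The same session features yield different scores depending on which reference they are measured against.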

The law caught up before the bypass did

Session replay is unusual among fingerprinting topics because the most active countermeasure is not technical, it is litigation. Since 2022 plaintiffs’ firms in California have filed an enormous volume of claims, by some estimates 50,000 to 100,000 or more, under the California Invasion of Privacy Act, Penal Code sections 631 and 632.7, arguing that recording a visitor’s interactions is wiretapping a communication without consent. CIPA offers statutory damages of $5,000 per violation with no need to prove actual harm, which is what makes the math work for mass filing.

The courts have been uneven and are trending skeptical. In Love v. Ladder Financial the plaintiffs sued both the website operator and FullStory directly; both motions to dismiss were granted on January 11, 2024. In Torres v. Prudential Financial, a Northern District of California court held at summary judgment in April 2025 that CIPA liability requires evidence that a party actually read or tried to read the contents of a communication while it was in transit, and that mere capture during a session does not clear that bar. The through-line of the 2024 to 2025 decisions is that recording is not automatically interception, and that the vendor acting as a tool of the site operator is often not a third-party eavesdropper in the statutory sense. The theory is not dead, but it is narrower than the early filings assumed.

In Europe the analysis runs through consent rather than wiretapping. Session replay that records personal data needs a lawful basis, and regulators have leaned toward requiring explicit opt-in before recording starts rather than accepting a legitimate-interest claim for what is, functionally, comprehensive surveillance of a visit. The cookie-consent enforcement wave gives the shape of it: large fines for non-compliant consent mechanisms have made clear that pre-ticked boxes and buried opt-outs do not count, and a replay script that fires before consent is captured is collecting on the wrong side of that line.

What it adds up to

Strip away the two industries and session replay is one modest technical idea pushed to its logical end. Serialize the DOM once, ship the diffs, and you can reconstruct anything that happened on a page. The idea is genuinely elegant. The same elegance is the problem, because a recorder that captures the page faithfully captures the secrets typed into the page faithfully, and the only thing standing between a password and a vendor’s storage is a client-side masking heuristic that a show-password toggle can defeat. The Princeton researchers said in 2018 that there is no foolproof fix given what these scripts are for, and nothing in the eight years since has falsified them. The masking got better. The gaps are still gaps.

What changed is who is watching and why. The analytics lineage wants your aggregate behavior and mostly does not care which person you are. The fraud lineage cares about exactly that: the signature of your specific hand on a specific mouse, checked continuously against your own past so a bank can tell you from someone who stole your password. The telemetry is identical down to the positions array. The difference is the reference it gets scored against, a population on one side and a single person on the other, and that difference is the whole distance between a heatmap and a biometric. When you find a /rec/bundle request on a banking login page rather than a marketing page, you have learned which one you are dealing with.

Sources & further reading

Englehardt, Acar, Narayanan (2017), No boundaries: Exfiltration of personal data by session-replay scripts — the foundational measurement study naming seven vendors and the masking failures on Walgreens, Bonobos, and others.
Englehardt, Acar, Narayanan (2018), No boundaries for credentials: Password leaks to Mixpanel and session replay companies — the follow-up documenting password leakage through show-password toggles and non-standard field names.
rrweb (2024), rrweb-io/rrweb on GitHub — the reference open-source recorder; the snapshot-plus-incremental model the commercial market standardized on.
rrweb (2024), types/src/index.ts — exact EventType, IncrementalSource, and MouseInteractions enum values and the mousePosition and mouseInteractionData type definitions.
rrweb (2024), Dive into events — the data shapes for each incremental source, including the positions array with timeOffset.
FullStory (2024), How does FullStory capture data to recreate my users’ experience? — the /rec/bundle five-second flush, ~60 KB fs.js, fs_uid cookie, and the input-change vs keystroke distinction.
Sentry (2024), Protecting user privacy in session replay — masked-by-default client-side redaction, asterisk replacement of keypresses, and the data-sentry-mask / -block / -unmask attributes.
Hotjar (2024), How do recordings work: advanced explanation — MutationObserver-based capture and the vendor’s stated automatic anonymization of fields like credit cards.
Contentsquare (2021), Contentsquare acquires Hotjar — the September 1, 2021 acquisition that, with the 2019 Clicktale deal, consolidated the analytics lineage.
BioCatch (2024), Behavioral biometrics: a primer on dynamic fraud detection — the ~2,000-parameter model, sub-300ms risk scoring, and the invisible-challenge approach to continuous authentication.
Decter / Frankfurt Kurnit (2024), California federal court holds session replay software does not violate CIPA — the Love v. Ladder Financial dismissal and the narrowing of the wiretap theory.
Inside Class Actions (2026), Website wiretapping roundup: 2025 decisions and developments — the Torres v. Prudential summary-judgment standard and the 2025 trend in CIPA replay litigation.