
Synthesizing human-like input events without tripping behavioral detectors

Copyright: MIT

Dispatch a click event from JavaScript and the button visibly does nothing useful. The handler runs, but the form behind it refuses to submit, the typeahead stays closed, the modal will not open. The page is not broken. It is reading one boolean on the event object, isTrusted, and your fabricated event carries false. So you reach for the Chrome DevTools Protocol, inject the click one layer deeper in the browser, and now the same event reads isTrusted: true. The form submits. You have the flag. And the behavioral detector still flags you within a handful of interactions.

That gap is the subject of this post. The isTrusted boolean is the first and crudest line a site can draw between a person and a script, and it is genuinely hard to forge from page JavaScript. But it is one bit. A bit is cheap to satisfy and tells the defender almost nothing once you can satisfy it. The interesting engineering is everything the bit does not cover: the order browsers fire pointer and mouse events in, the telemetry fields a real input device populates that a fabricated one leaves empty or zero, and the timing distribution that no amount of correct sequencing fixes. This is a defensive walkthrough of how that machinery works and why fabricated input keeps getting caught. There is no working evasion toolkit here, by design.

The sections run in order. First, what isTrusted actually is, where it lives on the object, and why the [Unforgeable] decision from 2013 still matters. Then the event sequence a real click produces, because order is a signal in its own right. Then the layer where CDP injects events and why that flips the flag to true. Then the part detectors actually lean on now: the fields that come out zeroed or absent, and the timing that gives the game away. A closing section covers where this leaves the contest in 2026.

The one bit: what isTrusted is and where it lives

The DOM standard defines isTrusted in one sentence. It is “a boolean value that is true when the event was generated by the user agent (including via user actions and programmatic methods such as HTMLElement.focus()), and false when the event was dispatched via EventTarget.dispatchEvent().” That parenthetical matters more than it looks. A handful of programmatic calls produce trusted events, focus() among them, because the user agent itself generates them. The general case does not. Construct a MouseEvent and feed it to dispatchEvent, and the event you get back is marked untrusted.

There is a specific trap inside this. HTMLElement.click() looks like it should be the honest way to click an element from script, and it does fire a real click event with the activation behavior attached. But the spec is explicit that “the click event fired through HTMLElement.click() sets the isTrusted property to false.” So the convenience method that exists precisely to click things programmatically still brands its output as synthetic. There is no page-level API that produces a trusted click. That is the design, not an oversight.

For most of the web this boolean does nothing. The original Chromium discussion about implementing it, back in June 2015, said as much. Ken Buchanan described it as “primarily intended to be used by extensions,” with Google’s own Password Alert wanting to know whether an event “originated from a script running in the main world, or if it came from the user.” The expectation was “limited usefulness on the open web, which is why it had not been implemented to date, because scripts should be able to achieve the same result with using custom attributes.” Other vendors had it, so Chrome added it for parity. Bot detection was not the motivating use case. It became one anyway, because the property turned out to be the single cheapest server-trustable tell that an interaction was scripted from inside the page.

The reason it is trustable at all comes down to a decision made in early 2013. The first draft put isTrusted on the prototype as an ordinary readonly accessor, which sounds safe until you remember that a readonly accessor on a prototype is still configurable. Bug 21068, filed against the DOM spec on 21 February 2013, pointed out the obvious consequence: a configurable property could be redefined with Object.defineProperty to report true on a forged event, “pretty useless in code like a popup blocker.” The fix was to mark the attribute [Unforgeable]. In WebIDL terms that moves it off the prototype and onto each instance as a non-configurable own property. You cannot redefine it on the instance, you cannot delete it, and there is no prototype getter to override that would change what event.isTrusted returns for a real event. Boris Zbarsky put the resulting guarantee plainly in the Chromium thread: “the isTrusted property is specified to be a non-configurable own property on Event objects.”

Figure: the 2013 draft versus the post-fix design. In the 2013 draft, isTrusted was a configurable prototype getter (Event.prototype.isTrusted), so Object.defineProperty(ev, 'isTrusted', {value:true}) overrode the inherited getter: forgeable. Post-fix, [Unforgeable] makes ev.isTrusted a sealed own property; defineProperty throws or is silently ignored, and with no prototype getter the flag is unforgeable for real events. The fix did not make events trusted. It made the flag impossible to repaint on a given event.

*Why isTrusted is server-trustable: the property moved off the prototype, so there is no inherited getter to override and no instance slot to redefine.*

One caveat the same discussion raised, and it is worth keeping in view because it bounds how much the flag can ever prove. Marking the attribute unforgeable stops you redefining it on a real event object. It does not stop you handing a handler a different object entirely. As Buchanan noted, “scripts could still do the same with a differently named attribute,” and more to the point a hostile script can swap out addEventListener and feed listeners fake objects whose isTrusted getter always returns true. So the flag is trustworthy from the perspective of code that receives genuine Event instances through an uncompromised path. It is not a cryptographic guarantee about the whole page. For a third-party detection script running in the same main world as the automation, that distinction is the whole game, which is one reason serious anti-bot vendors do not rest the verdict on this bit at all. The flag’s relationship to where stealth patches reach is the same tension covered in the chromium-source-patch-vs-runtime-injection piece.

Order is a signal: the event sequence a real click produces

Suppose you have solved the flag. You can now produce a click that reads isTrusted: true. The next thing a defender looks at is whether that click arrived with the cortege of events that always accompanies a real one, and in the right order.

A genuine mouse click is not one event. It is a sequence, and the sequence is specified. The Pointer Events recommendation lays out the pointer-and-mouse ordering, and for a plain mouse press and release on an element the prescribed run is roughly: pointerover, pointerenter, pointermove, pointerdown, then the compatibility mousedown, then pointerup, the compatibility mouseup, and finally click. A second press inside the double-click window adds another mousedown/mouseup/click and then a dblclick. The UI Events spec describes the mouse half of that as its own ordered list: mousedown, optional mousemoves, mouseup, click, and so on. Real input also rarely fires pointerdown from nowhere. The pointer was somewhere a moment ago, which means there are pointermove and mousemove events leading into the press, with coordinates that trace an actual path across the element.

A naive automation that calls element.click() produces exactly one event, the click, and nothing around it. No pointerdown. No mousedown. No movement leading in. That alone is a sharper tell than the trust flag, because it does not depend on isTrusted at all; a handler can simply count what it received. So the better automation frameworks fire the whole sequence. And here the order becomes the trap, because the sequence has rules that are easy to get subtly wrong. pointerdown precedes mousedown. mouseup precedes click. A click with no preceding mousedown on the same target, or a mousedown that never had a pointerdown, or a press with no movement anywhere before it, are each inconsistencies a behavioral script can check cheaply.
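The inconsistencies named above are cheap to check once the events are written down in arrival order. What follows is a minimal sketch in Python of the kind of check a scoring script might run over a recorded log; the log format (dicts with `type` and `target` keys) and the function name are assumptions made for illustration, not any vendor's schema. It deliberately ignores keyboard activation and touch, which legitimately produce different runs.

```python
from typing import Dict, List, Set

# (event, predecessor that must already have fired on the same target).
# These pairs mirror the ordering rules described in the text above.
PREDECESSOR_RULES = [
    ("mousedown", "pointerdown"),
    ("click", "mousedown"),
    ("click", "mouseup"),
]
MOVE_TYPES = {"pointermove", "mousemove"}


def sequence_anomalies(events: List[Dict[str, str]]) -> List[str]:
    """Return human-readable ordering inconsistencies in an event log."""
    anomalies: List[str] = []
    seen: Dict[str, Set[str]] = {}  # target -> event types seen in this run
    moved = False                   # has any movement been observed yet?
    for ev in events:
        etype, target = ev["type"], ev["target"]
        seen_here = seen.setdefault(target, set())
        if etype in MOVE_TYPES:
            moved = True
        if etype == "pointerdown" and not moved:
            anomalies.append(f"pointerdown on {target} with no movement before it")
        for event_type, required in PREDECESSOR_RULES:
            if etype == event_type and required not in seen_here:
                anomalies.append(f"{etype} on {target} without preceding {required}")
        seen_here.add(etype)
        if etype == "click":
            seen_here.clear()  # judge the next click on its own run
    return anomalies


real = [{"type": t, "target": "btn"} for t in
        ["pointermove", "pointerdown", "mousedown", "pointerup", "mouseup", "click"]]
naive = [{"type": "click", "target": "btn"}]

print(sequence_anomalies(real))   # no anomalies for the full ordered run
print(sequence_anomalies(naive))  # the bare click is missing both predecessors
```

Note that nothing in the check reads isTrusted. It only counts what arrived and in what order.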

Figure: a real click arrives as an ordered cortege (pointermove, pointerdown, mousedown, pointerup, mouseup, click); a naive synthetic click arrives alone, with no pointerdown, no mousedown, and no lead-in movement, which is countable without reading isTrusted.

*The order is specified, which makes it checkable. A click with no mousedown ancestor is an inconsistency a handler can detect without trusting any single flag.*

The same spec text that defines the order also defines what isTrusted controls beyond the flag itself, and that is where the order question and the trust question meet. From Chrome 53, shipped after a June 2016 intent led by Dave Tapuska, Chromium stopped performing default actions for untrusted events, bringing it in line with other engines. The motivation listed in that intent was straightforward interoperability plus closing some focus-stealing tricks, and the compatibility data was reassuring enough to ship: default actions fired from untrusted events on only “0.0000786182249882 %” of page visits over a seven-day dev-channel sample. The legacy exception was click, kept firing its activation behavior even when untrusted so that existing element.click() code did not break. So the practical situation since 2016 is that a synthetic, untrusted event will run your handlers but will not, for example, submit a form by itself or follow a link, except for the grandfathered click activation. That is exactly why scripted automation that needs the side effect reaches past the page into CDP.

The layer where CDP makes the flag true

The reason CDP-injected input reads as trusted is that it does not go through dispatchEvent at all. It enters the browser as a synthesized hardware-level input event, upstream of the point where the renderer decides trust.

Playwright, Puppeteer, and Selenium’s Chromium driver all drive input through the DevTools Protocol’s Input domain, with Input.dispatchMouseEvent and Input.dispatchKeyEvent as the workhorses. These do not construct a DOM MouseEvent and call dispatchEvent. They hand the browser a WebInputEvent, the same internal representation a real mouse or keyboard produces after the OS hands input to Chromium. From there the event flows through the normal renderer pipeline: hit testing, pointer-event manager, compatibility mouse events, the lot. Because it entered as a genuine WebInputEvent rather than a script-dispatched DOM event, the renderer marks the resulting DOM events trusted. That is the whole trick, and it is also why operating-system-level tools like AutoHotkey or AppleScript produce trusted events with no special browser support at all: they inject below the browser, so by the time Chromium sees the input there is nothing synthetic about it from the browser’s point of view.

This is the layer where the trust question genuinely closes. A CDP-driven click can fire the full ordered sequence, with isTrusted: true on each event, with default actions intact. If isTrusted and event order were the whole defense, CDP automation would be invisible. They are not the whole defense, and the reason is that injecting at the WebInputEvent layer reproduces the shape of input without reproducing all of its content. The renderer trusts the event. The telemetry attached to the event still has to be filled in by whatever generated it, and that is where the fabrication shows.

It is worth being precise about what is and is not public here. The general fact that CDP Input commands produce trusted events is documented behavior and easily observed. The exact internal struct layout of WebInputEvent and the precise fields Chromium populates for an OS mouse move versus a CDP-synthesized one are implementation details that shift between versions; what follows about specific zeroed fields is grounded in Chromium’s own tracker and the public W3C issues, not in a leaked spec. Where a detail is inferred from observed behavior rather than documented, it is flagged as such. The mechanics of CDP as a detection surface in general are covered in the CDP detection vector post; this section is only about the input path.

What comes out zeroed: telemetry that input carries and fabrication forgets

Here is the most concrete, least speculative tell, and it comes straight from Chromium’s own issue tracker. Real PointerEvents on Chrome and Edge populate movementX and movementY with the delta from the previous pointer position. CDP-synthesized pointer events do not. A long-standing conformance issue, w3c/pointerevents #131, opened on 5 August 2016, records that in Chrome and Edge “they are always 0 in those implementations for PointerEvents as oppose to compat MouseEvents.” That is a quirk for ordinary developers and a gift for detectors. A stream of pointermove events whose movementX/movementY are uniformly zero, while the clientX/clientY coordinates change between them, is a contradiction a real device never produces. The pointer moved, by the page coordinates, yet reported no movement delta. Synthetic.
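The contradiction is simple enough to state as code. This is a hedged sketch, assuming a recorded list of pointermove samples stored as dicts with `clientX`, `clientY`, `movementX`, and `movementY` keys; the function name is hypothetical. Because the behavior is version-dependent, a count like this is better treated as one input to a score than as a verdict.

```python
from typing import Dict, List


def zero_delta_contradictions(moves: List[Dict[str, float]]) -> int:
    """Count samples whose position changed while the reported delta stayed 0."""
    count = 0
    for prev, cur in zip(moves, moves[1:]):
        position_changed = (
            (cur["clientX"], cur["clientY"]) != (prev["clientX"], prev["clientY"])
        )
        reported_no_delta = cur["movementX"] == 0 and cur["movementY"] == 0
        if position_changed and reported_no_delta:
            count += 1
    return count


injected = [
    {"clientX": 10, "clientY": 10, "movementX": 0, "movementY": 0},
    {"clientX": 20, "clientY": 12, "movementX": 0, "movementY": 0},
    {"clientX": 35, "clientY": 15, "movementX": 0, "movementY": 0},
]
device = [
    {"clientX": 10, "clientY": 10, "movementX": 0, "movementY": 0},
    {"clientX": 20, "clientY": 12, "movementX": 10, "movementY": 2},
    {"clientX": 35, "clientY": 15, "movementX": 15, "movementY": 3},
]

print(zero_delta_contradictions(injected))  # 2
print(zero_delta_contradictions(device))    # 0
```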

The same shape repeats in coalesced events. Modern browsers batch high-frequency pointer moves and expose the intermediate samples through getCoalescedEvents(), so a single dispatched pointermove can carry a fan of sub-samples that a 120 Hz mouse generated between two animation frames. A real mouse drag produces these constantly. CDP’s Input.dispatchMouseEvent does not synthesize coalesced samples at the Chromium level; an event injected that way arrives with an empty or trivial coalesced list. Calling getCoalescedEvents() on it returns just the one event back. The public discussion around this is candid that the only complete fix is a binary patch to Chromium’s pointer-event manager, because page-level JavaScript that fabricates fake coalesced points can itself be caught by comparing the function’s toString() output, by inspecting the event from inside a cross-origin iframe, or by validating in a worker, all places a runtime patch tends not to reach. That last category is its own topic, covered in the iframe-and-worker context fingerprinting post.
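On the collecting side, the coalesced-sample tell reduces to a count per event. A minimal sketch, assuming the page-side script has already recorded the length of each pointermove's getCoalescedEvents() result into a list; the function name is an assumption. A single trivial list proves nothing, since slow real movement can also produce one sample per frame, so the useful quantity is the fraction across a whole drag.

```python
from typing import List


def trivial_coalesced_fraction(coalesced_counts: List[int]) -> float:
    """Fraction of pointermove events whose coalesced list held at most one sample."""
    if not coalesced_counts:
        return 0.0
    trivial = sum(1 for n in coalesced_counts if n <= 1)
    return trivial / len(coalesced_counts)


print(trivial_coalesced_fraction([1, 1, 1, 1]))  # 1.0: every event carried only itself
print(trivial_coalesced_fraction([3, 1, 4, 2]))  # 0.25: mostly real sub-samples
```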

| pointermove field | real device | CDP-injected |
| --- | --- | --- |
| clientX / clientY | changing | changing |
| movementX / Y | non-zero deltas | always 0 |
| getCoalesced...() | fan of samples | [self] only |
| pressure | 0.5 (mouse down) | default unless set |

A position that changes while movementX stays 0 is a contradiction no real pointer produces. (Field availability shifts between Chromium versions; treat exact values as version-dependent.)

*The trust flag can read true while the telemetry beneath it reads as fabricated. movementX-always-zero is the cleanest documented example.*

Pointer geometry and pressure tell a quieter version of the same story. A real mouse pointerdown reports a pressure of 0.5 while a button is held (the value the Pointer Events spec assigns to hardware that cannot sense pressure, with 0 otherwise) and a contact geometry of roughly a single pixel; a pen reports variable pressure and tilt; touch reports a contact patch with real width and height. An injected event takes whatever defaults the injecting code bothered to set, which for a tool aiming at “mouse” is often a static value repeated across every event. For a pen or a finger, a constant is not how a real sensor behaves, even a coarse one; for a mouse, the value has to track the button state. None of these single fields is decisive on its own, and a determined fabricator can set any one of them. The defensive point is that there are many of them, they have to agree with each other and with the coordinate stream, and getting all of them mutually consistent on every event in a long session is harder than getting any one of them right. This is the same telemetry stream the big behavioral vendors collect; Akamai’s sensor_data payload, for instance, carries mouse movement samples, click coordinates and timing, and keystroke cadence, the fields detailed in the Akamai sensor_data post.
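One of these agreements is pinned down by the Pointer Events spec itself: for hardware that cannot sense pressure, such as a mouse, pressure must be 0.5 while a button is held and 0 otherwise. A hedged sketch of that single cross-field check follows, assuming events logged as dicts with `pointerType`, `buttons`, and `pressure` keys; the function name is hypothetical, and pen and touch are skipped because their pressure is genuinely variable.

```python
from typing import Dict, List, Union


def mouse_pressure_mismatches(events: List[Dict[str, Union[str, int, float]]]) -> int:
    """Count mouse events whose pressure disagrees with their button state."""
    count = 0
    for ev in events:
        if ev["pointerType"] != "mouse":
            continue  # pen and touch report real, variable pressure
        expected = 0.5 if ev["buttons"] != 0 else 0.0
        if ev["pressure"] != expected:
            count += 1
    return count


consistent = [
    {"pointerType": "mouse", "buttons": 0, "pressure": 0.0},  # hover
    {"pointerType": "mouse", "buttons": 1, "pressure": 0.5},  # button held
    {"pointerType": "mouse", "buttons": 0, "pressure": 0.0},  # released
]
static_value = [
    {"pointerType": "mouse", "buttons": 0, "pressure": 0.5},  # 0.5 with no button
    {"pointerType": "mouse", "buttons": 1, "pressure": 0.5},
    {"pointerType": "mouse", "buttons": 0, "pressure": 0.5},  # 0.5 after release
]

print(mouse_pressure_mismatches(consistent))    # 0
print(mouse_pressure_mismatches(static_value))  # 2
```

The check is one of many, which is the point: each is trivial, and the cost to the fabricator is in satisfying all of them at once.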

There is a keyboard-side analogue worth naming. CDP’s Input.insertText drops text into a field without firing keydown or keyup at all, so a form that watches for per-character key events sees text appear with no keystrokes behind it. Input.dispatchKeyEvent does fire the key events but can get characters wrong across keyboard layouts, which is why automation often wraps it with extra synthetic KeyboardEvents to fix the event log, and those extra events are dispatched, untrusted, sitting in the same stream as the trusted ones. A handler that checks isTrusted per event, rather than once, sees a mix. Mixed trust across a single logical keystroke is not something a real keyboard produces.
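Checking trust per event rather than once can be sketched as a grouping problem. The following is a minimal illustration, assuming a key-event log of dicts with `type`, `key`, and `isTrusted` fields and treating everything from a key's first event up to its keyup as one logical keystroke. Both the log format and the grouping rule are assumptions; a production collector would also need to handle strokes that never close.

```python
from typing import Dict, List, Set, Union


def mixed_trust_keystrokes(events: List[Dict[str, Union[str, bool]]]) -> List[str]:
    """Return the keys whose keystroke contained both trusted and untrusted events."""
    open_strokes: Dict[str, Set[bool]] = {}  # key -> isTrusted values seen so far
    mixed: List[str] = []
    for ev in events:
        key = ev["key"]
        flags = open_strokes.setdefault(key, set())
        flags.add(ev["isTrusted"])
        if ev["type"] == "keyup":
            if len(flags) > 1:
                mixed.append(key)
            del open_strokes[key]  # the keystroke is complete
    return mixed


keyboard = [
    {"type": "keydown", "key": "a", "isTrusted": True},
    {"type": "keyup", "key": "a", "isTrusted": True},
]
patched = [
    {"type": "keydown", "key": "a", "isTrusted": True},
    {"type": "keydown", "key": "a", "isTrusted": False},  # extra dispatched event
    {"type": "keyup", "key": "a", "isTrusted": True},
]

print(mixed_trust_keystrokes(keyboard))  # []
print(mixed_trust_keystrokes(patched))   # ['a']
```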

Timing is the part you cannot zero-fill

Everything above is about shape and content of individual events. Timing is about the distribution across many of them, and it is the signal that survives even a perfect reproduction of fields and order. A separate Crawlex post, detecting automation via timing, goes deep on the latency side; the point here is narrower and about input specifically.

Real human input has structure that comes from the body producing it. The interval between keydown and keyup on a single key, the hold time, clusters in a range and varies with the key and the finger. The gap between successive keystrokes, the flight time, depends on which two keys and trends with typing skill. Akamai’s behavioral telemetry, as the public reversing of its payload describes, tracks exactly this: “the time between specific key presses, how long each key is held down, and how often the backspace key is used.” On the mouse side it records movement speed, the pauses, where the cursor hovered before it committed to a click. These are not arbitrary. They are the fingerprint of a neuromotor system, and the academic literature has spent fifteen years modeling it. A 2020 study reported text-CAPTCHA-style classifiers separating human from bot keystroke and mouse streams at 99.98 and 99.72 percent, which tells you how much separable structure is in the raw timing alone.
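Hold time and flight time fall straight out of a timestamped key log. This is a minimal sketch, assuming events arrive as `(type, key, t_ms)` tuples sorted by time. Flight time is taken here as keydown-to-keydown, which is one common convention and an assumption on my part, since the gap can also be measured from release to next press. Auto-repeat keydowns from a held key are not filtered out.

```python
from typing import Dict, List, Optional, Tuple

KeyEvent = Tuple[str, str, float]  # (event type, key, timestamp in ms)


def hold_and_flight_times(events: List[KeyEvent]) -> Tuple[List[float], List[float]]:
    """Return (hold times, flight times) in ms from a time-sorted key log."""
    down_at: Dict[str, float] = {}      # key -> time of its pending keydown
    last_down: Optional[float] = None   # time of the previous keydown, any key
    holds: List[float] = []
    flights: List[float] = []
    for etype, key, t in events:
        if etype == "keydown":
            if last_down is not None:
                flights.append(t - last_down)
            last_down = t
            down_at[key] = t
        elif etype == "keyup" and key in down_at:
            holds.append(t - down_at.pop(key))
    return holds, flights


log = [
    ("keydown", "h", 0), ("keyup", "h", 90),
    ("keydown", "i", 150), ("keyup", "i", 230),
]
print(hold_and_flight_times(log))  # ([90, 80], [150])
```

These two lists are the raw material; everything a classifier does with keystroke timing starts from distributions over them.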

Fabricated timing tends to fail in one of two ways. Either it is too regular, with keystroke intervals or mouse-sample spacing that cluster too tightly because they came from a fixed delay or a uniform sampler, or it is regular in the wrong way, smooth where a human is jerky. The hard case the literature keeps circling back to is a synthesizer good enough to reproduce the irregularity. The most direct recent example is a diffusion-model trajectory generator described in an October 2024 paper, which deliberately reproduces “slow initiation and directional force differences” in mouse paths and reports cutting bot-detection accuracy by 4.75 to 9.73 percent against the models it was tested on. Read that number the right way. It is a real, measurable reduction from a serious modeling effort, and it still leaves the detector working well above chance. State-of-the-art synthesis moves the needle by single-digit percentages against a single detector. It does not make the stream indistinguishable, and it certainly does not generalize across the many detectors a real session passes through. The trajectory side of this, why a real mouse path resists faking at the level of jitter and Fitts’s-law timing, is the companion to this post: why a real mouse path is hard to fake.
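The first failure mode, intervals that cluster too tightly, is the easy one to measure. A hedged sketch using the coefficient of variation (standard deviation over mean) of a list of intervals: the 0.1 threshold and the five-sample minimum are illustrative assumptions, not published values, and a real classifier would use far richer features than this single statistic.

```python
from statistics import mean, pstdev
from typing import List


def coefficient_of_variation(intervals: List[float]) -> float:
    """Population standard deviation divided by the mean (0.0 if the mean is 0)."""
    m = mean(intervals)
    return pstdev(intervals) / m if m else 0.0


def looks_too_regular(intervals: List[float], min_cv: float = 0.1) -> bool:
    """Flag interval lists whose spread is implausibly small.

    min_cv is a hypothetical threshold chosen for illustration only.
    """
    return len(intervals) >= 5 and coefficient_of_variation(intervals) < min_cv


fixed_delay = [100.0] * 10
varied = [120.0, 85.0, 210.0, 95.0, 160.0, 140.0]

print(looks_too_regular(fixed_delay))  # True: zero spread
print(looks_too_regular(varied))       # False: spread is about 31% of the mean
```

A sampler that adds uniform noise clears this bar easily, which is why the second failure mode and the synthesizer case need models rather than thresholds.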

The deeper reason timing is the durable signal is that it is the one part of the input that is not stored in a field for the injector to set. Order can be replayed from a spec. Field values can be looked up and filled in. The trust flag can be acquired by injecting at the right layer. But the timing distribution of a real session is generated continuously, by a person, across hundreds of interactions that all have to remain mutually plausible. You are not filling in a struct. You are simulating a motor-control system in real time and betting it holds up against a classifier trained on millions of real ones. That bet is winnable for a few interactions and gets harder the longer the session runs, which is the opposite of how a field-spoof scales.

Where this leaves the contest in 2026

The shape of the whole problem is a ladder, and each rung is cheaper for the defender than the rung above. The bottom rung is isTrusted. Forging it from page script is hard by design, but acquiring it by injecting below the renderer is routine, so on its own it filters out only the laziest automation. The next rung is event sequence and the presence of the full cortege, which catches the element.click()-style scripts that fire a bare click, and which a competent framework satisfies by replaying the specified order. Above that is telemetry consistency, the movementX deltas and coalesced samples and pressure values that have to agree with each other and with the coordinate stream, where fabrication starts leaving fingerprints that need binary patches rather than page-level tricks to erase. At the top is timing, which no amount of correct fields fixes because it is generated, not stored.

What is genuinely different in 2026 is that the top two rungs are where the contest has moved, and the defenders are scoring the whole ladder at once rather than gating on any single rung. DataDome’s own description of its 2025 direction is intent-based detection that will flag a session “even with perfect browser fingerprints if navigation patterns suggest automated data collection.” That is a defender saying out loud that it has stopped caring whether you can satisfy the individual bits and started scoring the behavior they were proxies for. The trust flag was always a proxy for “a human did this.” Once an attacker can set the proxy, the defender goes back to measuring the thing the proxy stood in for.
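The difference between gating on one rung and scoring the ladder can be made concrete with a toy example. This is only a sketch: the signal names and weights below are hypothetical, chosen so that higher rungs count for more, and they do not describe DataDome's or any other vendor's model.

```python
from typing import Dict

# Hypothetical weights, one per rung of the ladder; not any vendor's values.
RUNG_WEIGHTS: Dict[str, float] = {
    "untrusted_event": 1.0,
    "sequence_anomaly": 2.0,
    "telemetry_contradiction": 3.0,
    "timing_too_regular": 4.0,
}


def gate_on_trust(signals: Dict[str, bool]) -> bool:
    """Old-style gate: flag the session only if an untrusted event was seen."""
    return signals.get("untrusted_event", False)


def ladder_score(signals: Dict[str, bool]) -> float:
    """Weighted share of rungs that fired, from 0.0 (none) to 1.0 (all)."""
    total = sum(RUNG_WEIGHTS.values())
    hit = sum(w for name, w in RUNG_WEIGHTS.items() if signals.get(name, False))
    return hit / total


# A CDP-driven session: trusted events in the right order, but zeroed
# deltas and fixed-delay timing.
cdp_session = {
    "untrusted_event": False,
    "sequence_anomaly": False,
    "telemetry_contradiction": True,
    "timing_too_regular": True,
}

print(gate_on_trust(cdp_session))  # False: the single-bit gate lets it through
print(ladder_score(cdp_session))   # 0.7: the upper rungs still fired
```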

So the honest read on synthesizing input is this. The isTrusted boolean is solved in the narrow sense that you can make events carry it, and unsolved in the sense that making it carry true buys you almost nothing against a detector that was never really counting on that one bit. The fabrication that remains visible is not in any field you forgot to set. It is in the relationships between fields, and most of all in the timing, because timing is the only part of human input that has to be produced rather than copied. A struct can be filled in. A nervous system has to be simulated, and the best published simulators in 2026 still move the detection rate by single digits.

