Device-orientation and accelerometer signals in mobile bot detection
A phone in a human hand is never still. Hold it up to read, and the accelerometer logs a slow drift on every axis: the micro-tremor of the wrist, the sag when an arm tires, the sharp spike when a thumb taps the screen and the whole chassis recoils a fraction of a millimetre. A phone sitting in a server rack, pretending to be that handheld device, reads gravity on one axis and nothing on the other two. Forever. That gap is the whole story of motion-sensor bot detection, and it is a surprisingly hard gap to close.
The question this post answers is narrow and specific. When an anti-bot script can read DeviceMotion and DeviceOrientation, what exactly does it learn, and how does that signal tell a real device from a headless or emulated one? The answer pulls in MEMS hardware behaviour, a privacy permission model that changed three times between 2018 and 2020, a calibration side-channel that survives a factory reset, and the awkward fact that the most useful version of this signal is also the easiest one to fake badly.
Here is the route. First, what the two event interfaces actually carry and in what units. Then the permission model on iOS and Android, because no detector reads a single byte of motion data without first clearing that gate. Then the core detection logic: the statistical shape of human motion versus the flat, looped, or absent stream an emulator produces. Then calibration fingerprinting, a separate and stranger signal that identifies a specific handset rather than just proving it is real. We close on where this fits in a stack that already has TLS, canvas, and HTTP/2 to lean on, and why motion is a corroborating signal rather than a verdict on its own.
What the events carry
Two DOM event interfaces expose motion to a web page, and they predate the cleaner Generic Sensor API that came later. DeviceMotionEvent fires on window and carries four things. There is acceleration, the linear acceleration along the device’s X, Y, and Z axes in metres per second squared, with gravity already subtracted. There is accelerationIncludingGravity, the same three axes with gravity left in, which is what you get on hardware that cannot separate the two. There is rotationRate, the angular velocity around three axes in degrees per second, sourced from the gyroscope. And there is interval, the time in milliseconds between successive samples, which on most phones lands somewhere around 16 ms, close to a 60 Hz sampling rate.
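To make the units concrete, here is a minimal sketch of how a script might model one sample and turn `interval` into a rate. The field names mirror DeviceMotionEvent; the `MotionSample` container and both helper functions are hypothetical names chosen for illustration.

```typescript
// Field names mirror DeviceMotionEvent. The MotionSample type itself is a
// hypothetical container for illustration, not part of any browser API.
interface Vec3 { x: number | null; y: number | null; z: number | null; }
interface Rotation { alpha: number | null; beta: number | null; gamma: number | null; }

interface MotionSample {
  acceleration: Vec3 | null;                 // m/s^2, gravity removed
  accelerationIncludingGravity: Vec3 | null; // m/s^2, gravity left in
  rotationRate: Rotation | null;             // degrees per second
  interval: number;                          // milliseconds between samples
}

// Convert the reported interval (ms) into a sampling rate in hertz.
function samplingRateHz(intervalMs: number): number {
  if (!(intervalMs > 0)) throw new RangeError("interval must be positive");
  return 1000 / intervalMs;
}

// Magnitude of a three-axis reading. Any null axis (sensor missing or
// withheld) makes the whole magnitude null rather than silently zero.
function magnitude(v: Vec3 | null): number | null {
  if (v === null || v.x === null || v.y === null || v.z === null) return null;
  return Math.hypot(v.x, v.y, v.z);
}
```

An interval of 16 ms works out to 62.5 Hz, while a true 60 Hz stream reports roughly 16.67 ms, so the exact value a device reports is itself a small piece of evidence about the pipeline behind it.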
DeviceOrientationEvent is the other half. It reports the device’s attitude as three Euler angles: alpha, rotation about the Z axis in the range [0, 360); beta, rotation about the X axis in [-180, 180); and gamma, rotation about the Y axis in [-90, 90). These are computed values, fused from the accelerometer and gyroscope and, for absolute orientation, the magnetometer as well. A flat phone face-up on a desk reads beta and gamma near zero, with alpha pointing wherever the compass says north is.
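Those half-open ranges give a consumer of the data a cheap sanity check: a stream that ever reports gamma at 120 degrees did not come from a conforming implementation. The sketch below encodes the three ranges; the `Orientation` type and function names are illustrative, not browser APIs.

```typescript
// Angles in degrees, as reported by DeviceOrientationEvent.
interface Orientation { alpha: number; beta: number; gamma: number; }

// True only when all three angles fall inside the ranges given above:
// alpha in [0, 360), beta in [-180, 180), gamma in [-90, 90).
// NaN fails every comparison, so it is rejected as well.
function isValidOrientation(o: Orientation): boolean {
  return (
    o.alpha >= 0 && o.alpha < 360 &&
    o.beta >= -180 && o.beta < 180 &&
    o.gamma >= -90 && o.gamma < 90
  );
}

// Wrap an arbitrary heading into [0, 360), the range alpha uses.
function wrapAlpha(deg: number): number {
  return ((deg % 360) + 360) % 360;
}
```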
The W3C Device Orientation and Motion specification, which reached Candidate Recommendation Draft status in February 2025, pins down those units and ranges. It also bakes in a privacy decision worth noting up front: precision is deliberately capped. Orientation angles are limited to 0.1 degrees of resolution and acceleration to 0.1 m/s², specifically to blunt the fingerprinting risk that high-precision readings carry. That cap matters later, because the calibration attack we will get to depends on reading sensor output finely enough to recover factory-set constants, and the rounding is the spec’s answer to it.
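The effect of the cap is easy to see in a few lines. This is a sketch of coarse rounding in general, assuming a simple round-to-nearest rule; it is not taken from any browser's implementation, and `roundToStep` is a hypothetical name.

```typescript
// Round a reading to the nearest multiple of `step` (0.1 by default).
// Multiplying by the inverse avoids dividing by a non-exact 0.1.
function roundToStep(value: number, step: number = 0.1): number {
  const inv = 1 / step;
  return Math.round(value * inv) / inv;
}

// Two raw readings that differ in the second decimal place become
// indistinguishable once rounded, which is the point of the cap.
const rawA = 9.8123;
const rawB = 9.8391;
const collapsed = roundToStep(rawA) === roundToStep(rawB);
```

Anything that distinguishes two devices only below the 0.1 step is erased before a script sees it.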
*The two legacy motion event interfaces and their fields. The accent marks `interval`, the field that quietly leaks the sampling rate a detector uses to spot replayed or interpolated streams.*
The newer Generic Sensor API exposes the same hardware through cleaner classes: Accelerometer, Gyroscope, LinearAccelerationSensor, GravitySensor, AbsoluteOrientationSensor, and RelativeOrientationSensor, with Magnetometer and AmbientLightSensor still behind flags in Chromium. These let a script set a sampling frequency and read structured values rather than catching events. For detection purposes the data is the same; the choice between the two APIs mostly tells you which era the script was written in. Most anti-bot tags in the wild still listen for devicemotion because it works everywhere a phone browser does, including iOS Safari, where the Generic Sensor API does not ship.
The permission gate
No motion data reaches a script until two conditions hold: a secure context, and on the platforms that require it, an explicit user grant. Both conditions are detection signals in their own right, because the way a client navigates the permission gate is itself observable.
Secure context came first. Chrome restricted both DeviceOrientationEvent and DeviceMotionEvent to HTTPS in M74, which reached stable in April 2019. The interfaces stay exposed on window and you can still attach event listeners over plain HTTP, but the events never fire. At the time the change shipped, Chrome telemetry showed roughly 0.82% of page loads using DeviceOrientationEvent over an insecure connection, low enough that the breakage was tolerable. The deprecation warnings had been running since 2015, so this was a long-telegraphed move rather than a surprise.
Then Apple gated the data behind user consent. Since iOS 12.2, released in March 2019, Safari ships with a “Motion & Orientation Access” toggle under Settings, Safari, Privacy and Security, and it defaults to off. With the toggle off, motion and orientation events simply do not fire. This broke a swathe of WebVR and 360-photo sites overnight, and it broke them silently, because the events just stopped arriving with no error. iOS 13 replaced the blunt toggle with a proper permission prompt. A page calls DeviceMotionEvent.requestPermission() or DeviceOrientationEvent.requestPermission(), which returns a promise resolving to the string "granted" or "denied".
The detail that matters for detection is the gesture requirement. On iOS, requestPermission() can only be called from inside a user-gesture handler, a real tap or click. Call it on page load and the promise rejects with a NotAllowedError whose message says that requesting device orientation or motion access requires a user gesture. This single constraint does a lot of quiet work. A headless automation harness that fires events programmatically rather than dispatching genuine input often cannot satisfy the gesture check, so the permission never resolves to granted and the motion stream never starts. The detector does not even need to inspect the data; the absence of a successful grant on a device whose user agent claims to be iOS Safari is already odd.
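A minimal sketch of navigating that gate looks like the following. The constructor is passed in rather than read from `window` so the logic can be exercised outside a browser; `GateOutcome`, `MotionEventCtor`, `needsExplicitGrant`, and `requestMotionAccess` are hypothetical names, and only `requestPermission()` and the NotAllowedError name come from the platform behaviour described above.

```typescript
type GateOutcome = "granted" | "denied" | "not-required" | "unsupported" | "needs-gesture" | "error";

// Minimal shape of the event constructor. The static requestPermission()
// exists on iOS Safari 13 and later and is absent on other browsers.
interface MotionEventCtor {
  requestPermission?: () => Promise<"granted" | "denied">;
}

// True when the platform demands an explicit, gesture-bound grant.
function needsExplicitGrant(ctor: MotionEventCtor | undefined): boolean {
  return typeof ctor?.requestPermission === "function";
}

// On iOS this must run inside a click or touchend handler.
async function requestMotionAccess(ctor: MotionEventCtor | undefined): Promise<GateOutcome> {
  if (ctor === undefined) return "unsupported";
  if (typeof ctor.requestPermission !== "function") return "not-required";
  try {
    return await ctor.requestPermission();
  } catch (err) {
    // Read the error name defensively; the rejection is a DOMException.
    const name =
      typeof err === "object" && err !== null && "name" in err
        ? String((err as { name: unknown }).name)
        : "";
    return name === "NotAllowedError" ? "needs-gesture" : "error";
  }
}

// Browser usage (not executed here; `button` and `onMotion` are placeholders):
// button.addEventListener("click", async () => {
//   const outcome = await requestMotionAccess((window as any).DeviceMotionEvent);
//   if (outcome === "granted" || outcome === "not-required") {
//     window.addEventListener("devicemotion", onMotion);
//   }
// });
```

The interesting outcome for a detector is `needs-gesture` or a grant that never arrives: both say something about how the session produced its input before a single sample has been read.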
Android Chrome takes a softer line for the legacy events. It does not throw a permission prompt for devicemotion and deviceorientation the way iOS does; on a visible HTTPS page the stream just starts. The Generic Sensor API classes are governed instead by Permissions Policy, with directive names accelerometer, gyroscope, and magnetometer. By default only the top-level document and same-origin subframes can read sensors; a cross-origin iframe needs an explicit allow="accelerometer" attribute from its parent. This is why an anti-bot vendor that wants motion data from inside its own iframe has to be granted the policy by the host page, and why a stray third-party script cannot silently slurp sensor data from a frame it does not control.
There is one more gating behaviour that detectors lean on hard: visibility. Both the legacy events and the Generic Sensor classes only deliver readings while the document is visible and, in Chromium, focused. Background a tab and the stream stops. A scraping setup that runs many headless contexts in parallel, none of them actually foregrounded, can find that the motion stream never delivers a single sample, not because the device is fake but because no context is ever visible. The detector sees a phone user agent with zero motion events and scores it accordingly.
The shape of human motion
Once the gate is cleared, the interesting work begins. The detector now has a time series of accelerometer and gyroscope readings, and the question is whether that series came from a hand or from a machine pretending to be one. The distinction is statistical, and it was studied long before anyone called it bot detection.
The foundational observation comes from malware analysis, not web scraping. In 2014, work presented at ASIA CCS by Vidas and Christin showed that accelerometer output from a real Android handset is statistically distinguishable from what an emulator produces. The reasoning was the same then as now. Malware that wanted to dodge automated analysis sandboxes could check whether the device ever moved; a sandbox running in a data centre does not, so a flat or absent accelerometer reading became a reliable “I am being analysed” signal. Anti-bot vendors inverted the same logic. If a flat accelerometer means an emulator to a piece of malware, it means an emulator to a fraud detector too.
What does the human signal actually look like? Even a phone held as still as a person can manage carries a noise floor of physiological tremor, roughly in the 8 to 12 Hz band, on top of slow postural drift. Tap the screen and the chassis jolts: a real touch produces a sharp, brief acceleration transient that correlates in time with the touch event, because the finger that delivered the tap also delivered a tiny mechanical impulse to the case. Walk while scrolling and the accelerometer paints a clean gait signature, a periodic waveform around 2 Hz that no emulator generates by accident. Set the phone down and the entire character of the signal changes: the tremor vanishes, the axes settle, and gravity redistributes as the device finds a flat resting attitude. That transition, motion to stillness to motion, is hard to script convincingly because it has to stay consistent with everything else the page observes, including touch timing and scroll velocity.
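One way to put a number on those bands is to measure spectral power around them. The sketch below uses the Goertzel recurrence, a standard single-frequency DFT technique, swapped in here for illustration; nothing in the public record says any vendor computes the feature this way. It assumes a uniformly sampled single-axis trace, and the band edges are simply the figures quoted above.

```typescript
// Power of one frequency component via the Goertzel recurrence.
function goertzelPower(samples: readonly number[], freqHz: number, sampleRateHz: number): number {
  const omega = (2 * Math.PI * freqHz) / sampleRateHz;
  const coeff = 2 * Math.cos(omega);
  let s1 = 0;
  let s2 = 0;
  for (const x of samples) {
    const s0 = x + coeff * s1 - s2;
    s2 = s1;
    s1 = s0;
  }
  return s1 * s1 + s2 * s2 - coeff * s1 * s2;
}

// Subtract the mean so the constant gravity component does not leak
// into every frequency bin.
function removeMean(samples: readonly number[]): number[] {
  if (samples.length === 0) return [];
  const mean = samples.reduce((a, b) => a + b, 0) / samples.length;
  return samples.map((x) => x - mean);
}

// Sum of Goertzel power at stepHz spacing across [lowHz, highHz].
function bandPower(
  samples: readonly number[], lowHz: number, highHz: number,
  sampleRateHz: number, stepHz: number = 1,
): number {
  let total = 0;
  for (let f = lowHz; f <= highHz; f += stepHz) {
    total += goertzelPower(samples, f, sampleRateHz);
  }
  return total;
}

// Example: two seconds at 60 Hz of gravity plus a small 10 Hz wobble.
const rateHz = 60;
const trace = Array.from({ length: 120 }, (_, n) =>
  9.8 + 0.05 * Math.sin((2 * Math.PI * 10 * n) / rateHz));
const centred = removeMean(trace);
const tremorBand = bandPower(centred, 8, 12, rateHz); // tremor band
const gaitBand = bandPower(centred, 1, 3, rateHz);    // around the 2 Hz gait rate
```

At 60 Hz the Nyquist limit is 30 Hz, so both bands are observable; a stream sampled much more slowly could not carry the tremor band at all.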
*Top: a handheld trace, with tremor, drift, and two tap impulses that line up with touch events. Bottom: a stock emulator's idea of "at rest", a dead-flat line. The variance alone separates them; the timing correlation with touch is the harder thing to forge.*
The failure modes of a fake stream fall into three buckets. The first is absence: no motion events at all, which on a device claiming to be a modern phone is already suspicious. The second is flatness: events arrive, but every sample is identical, or the only non-zero value is a constant gravity component on one axis. Early Android emulators returned exactly this, often a clean [0, 9.8, 0] and nothing else. The third, and the one detectors care most about now, is the looped or synthetic stream. A more sophisticated harness injects pre-recorded sensor values or generates plausible-looking noise. This defeats the naive variance check, so the detector moves to harder questions.
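The three buckets map onto a very small classifier. This is a sketch under stated assumptions: a single-axis series, a variance threshold picked arbitrarily for illustration, and loop detection that only catches exact replays. All names are hypothetical.

```typescript
type StreamVerdict = "absent" | "flat" | "looped" | "varied";

// Population variance of a series (0 for an empty one).
function variance(xs: readonly number[]): number {
  if (xs.length === 0) return 0;
  const mean = xs.reduce((a, b) => a + b, 0) / xs.length;
  return xs.reduce((a, b) => a + (b - mean) ** 2, 0) / xs.length;
}

// Smallest period p (in samples) such that xs[i] === xs[i - p] for every
// i >= p, requiring at least `minRepeats` full cycles. Returns 0 if none.
function exactPeriod(xs: readonly number[], minRepeats: number = 2): number {
  for (let p = 1; p * minRepeats <= xs.length; p++) {
    let repeats = true;
    for (let i = p; i < xs.length; i++) {
      if (xs[i] !== xs[i - p]) { repeats = false; break; }
    }
    if (repeats) return p;
  }
  return 0;
}

// flatVariance is an illustrative threshold, not a calibrated one.
function classifyStream(xs: readonly number[], flatVariance: number = 1e-6): StreamVerdict {
  if (xs.length === 0) return "absent";
  if (variance(xs) <= flatVariance) return "flat";
  if (exactPeriod(xs) > 0) return "looped";
  return "varied";
}
```

Two caveats follow from the text. A real phone lying on a desk also classifies as flat, so the verdict only means something next to the claimed context. And a replay with even slight noise added defeats the exact-match loop check, which is where the cross-signal questions take over.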
There is a subtler tell in how the numbers themselves are distributed. A real MEMS sensor quantises its output at a fixed bit depth, so successive readings fall onto a discrete lattice with a characteristic step size, and the spec’s 0.1-unit rounding sits on top of that. A naive synthetic stream drawn from a continuous random generator does not land on the same lattice; its values are too smooth, or quantised at the wrong granularity, in a way that a histogram of the least significant digits exposes. Detectors that bother to look at the digit distribution rather than just the variance pick up generators that never thought about quantisation at all. It is the kind of check that costs nothing to run and is annoying to defeat, because getting it right means modelling the exact ADC behaviour of the specific sensor the user agent claims to carry.
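A rough version of the lattice check can be sketched as follows. It assumes the only lattice in play is the 0.1 rounding step; modelling a particular sensor's own quantisation would need that sensor's step size, which is not assumed here. Function names are illustrative.

```typescript
// Fraction of readings that sit on a lattice of the given step size,
// within a small tolerance for floating-point representation error.
function onLatticeFraction(xs: readonly number[], step: number, tol: number = 1e-6): number {
  if (xs.length === 0) return 0;
  let hits = 0;
  for (const x of xs) {
    const q = x / step;
    if (Math.abs(q - Math.round(q)) <= tol) hits++;
  }
  return hits / xs.length;
}

// Histogram of the least significant digit at the given resolution.
// For step = 0.1 this counts the first decimal digit of each reading.
function lastDigitHistogram(xs: readonly number[], step: number): number[] {
  const counts = new Array<number>(10).fill(0);
  for (const x of xs) {
    const digit = Math.abs(Math.round(x / step)) % 10;
    counts[digit] = (counts[digit] ?? 0) + 1;
  }
  return counts;
}
```

A stream that has passed through 0.1 rounding should score close to 1 on `onLatticeFraction(xs, 0.1)`; a generator emitting unrounded floats scores close to 0, and one that rounds at the wrong granularity shows up as gaps or spikes in the digit histogram.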
Those harder questions are where a synthetic stream tends to fall apart. Does the sampling interval match what the claimed hardware reports, and is it stable, or does it carry the jitter a real sensor pipeline introduces? Does the gyroscope’s rotationRate stay physically consistent with the integral of orientation change, since a real gyro and a real accelerometer are measuring the same rigid body and cannot disagree about which way it turned? Above all, does the motion correlate with the input events the page already sees? A tap with no accompanying chassis jolt, a scroll with no hand tremor, a “walking” gait signature on a device whose touch timing says it is sitting on a desk: each is a contradiction between two signals that a real device keeps in sync for free and a fake one has to coordinate deliberately. Cross-signal consistency is the expensive part to fake, which is exactly why detectors weight it.
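Two of those questions reduce to short functions. The sketch below checks whether touches are accompanied by an acceleration transient and how regular the sample timing is. The window and threshold values are placeholders chosen for illustration, not figures from any vendor, and the `AccelPoint` shape is hypothetical.

```typescript
// One timestamped reading: t in milliseconds, magnitude in m/s^2 of
// linear acceleration (gravity already removed).
interface AccelPoint { t: number; magnitude: number; }

// Fraction of touches that have at least one reading at or above
// `threshold` within `windowMs` of the touch timestamp.
function tapJoltRatio(
  touchTimes: readonly number[], accel: readonly AccelPoint[],
  windowMs: number = 50, threshold: number = 0.3,
): number {
  if (touchTimes.length === 0) return 0;
  let matched = 0;
  for (const t of touchTimes) {
    const jolted = accel.some((p) => Math.abs(p.t - t) <= windowMs && p.magnitude >= threshold);
    if (jolted) matched++;
  }
  return matched / touchTimes.length;
}

// Coefficient of variation (std / mean) of the gaps between samples.
// Exactly 0 means perfectly metronomic timing.
function intervalJitter(accel: readonly AccelPoint[]): number {
  const gaps: number[] = [];
  let prev: number | undefined;
  for (const p of accel) {
    if (prev !== undefined) gaps.push(p.t - prev);
    prev = p.t;
  }
  if (gaps.length === 0) return 0;
  const mean = gaps.reduce((a, b) => a + b, 0) / gaps.length;
  if (mean === 0) return 0;
  const varSum = gaps.reduce((a, b) => a + (b - mean) ** 2, 0) / gaps.length;
  return Math.sqrt(varSum) / mean;
}
```

A session with many taps and a ratio near zero is the "tap with no accompanying chassis jolt" contradiction in numeric form. Neither number is a verdict alone; each is one more input that has to agree with the rest.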
A note on honesty here. The precise feature set any given commercial vendor extracts from a motion stream, the exact statistical tests, the thresholds, the model weights, are not public. What is documented is the raw signal, the academic basis for distinguishing real from emulated motion, and the observable behaviour of these scripts in the wild. The specifics of, say, how Akamai or HUMAN weights a motion feature against a TLS feature live in closed source and private model files. Where this post describes detection logic, it is describing the mechanism the public record supports, not a leaked rule table.
What the wild actually does
The clearest public measurement of motion sensors used for detection comes from a 2018 study presented at ACM CCS, which crawled the Alexa top 100,000 sites and recorded every script that touched a sensor API. Motion sensors were accessed on 2,653 of those sites, by scripts from 384 distinct domains. The breakdown of what those scripts were doing is the useful part. About 36.8% of them performed tracking, analytics, fingerprinting, or audience recognition, and a clear slice, 17.7%, were differentiating bots from real devices. Motion access correlated heavily with other fingerprinting: 62.7% of scripts that read motion sensors also did browser fingerprinting, more than half of them canvas fingerprinting specifically.
The study named names in a way that maps directly onto the anti-bot industry. Scripts from perimeterx.net were observed reading motion data and sending encoded sensor information to remote servers across dozens of sites. PerimeterX is now HUMAN, and its sensor collection is documented in our writeup of PerimeterX’s VID, sensor payload, and the bello challenge. The same pattern, collect motion in the client, encode it, ship it to a scoring backend, is exactly how the bigger vendors structure their pipelines. Akamai’s client tag packs telemetry into its sensor_data payload, and on a mobile session that payload has room for accelerometer and orientation samples alongside touch and timing data. The motion stream is one field among many in a blob whose whole purpose is to be replayed server-side and scored.
That architecture matters because it tells you how the signal is used. The client does not decide bot-or-not from motion alone. It collects the raw stream, or a summary of it, and hands the decision to a server-side scoring pipeline that has the TLS fingerprint, the HTTP/2 frame ordering, the IP reputation, and a dozen other inputs sitting next to the motion features. Motion is a column in a feature vector. A flat accelerometer on its own might cost a session a few points; a flat accelerometer on a device whose user agent says iPhone, whose TLS handshake says a Python client, and whose IP belongs to a cloud ASN is what actually trips a block.
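The "column in a feature vector" idea can be sketched as a toy additive score. Every weight, the threshold, and the feature names below are invented for illustration; real pipelines are closed, and there is no reason to think they are simple weighted sums.

```typescript
// Hypothetical per-session features a scoring backend might hold.
interface SessionFeatures {
  motionFlat: boolean;         // accelerometer stream flat or absent
  uaClaimsIphone: boolean;     // user agent says iPhone
  tlsLooksLikeScript: boolean; // TLS fingerprint matches a scripting client
  ipInCloudAsn: boolean;       // source IP belongs to a cloud provider
}

// Illustrative weights only.
const WEIGHT_MOTION_FLAT = 2;
const WEIGHT_UA_TLS_MISMATCH = 5;
const WEIGHT_CLOUD_ASN = 3;
const BLOCK_THRESHOLD = 8;

function riskScore(f: SessionFeatures): number {
  let score = 0;
  if (f.motionFlat) score += WEIGHT_MOTION_FLAT;
  // A phone user agent riding on a scripting-client handshake is a mismatch.
  if (f.uaClaimsIphone && f.tlsLooksLikeScript) score += WEIGHT_UA_TLS_MISMATCH;
  if (f.ipInCloudAsn) score += WEIGHT_CLOUD_ASN;
  return score;
}

function shouldBlock(f: SessionFeatures): boolean {
  return riskScore(f) >= BLOCK_THRESHOLD;
}
```

With these made-up numbers, a flat accelerometer alone scores 2 and passes, while the same flat stream combined with the handshake mismatch and a cloud address scores 10 and is blocked, which is the shape of the argument above.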
This is also why motion is a particularly awkward signal to fake well. To make a flat stream look real you have to generate plausible motion. To make plausible motion survive scrutiny you have to keep it consistent with the touch events, the scroll timeline, and the device the user agent claims to be. And the moment you are generating coordinated, physically consistent multi-sensor telemetry that agrees with synthesised input events, you are most of the way to the problem described in synthesizing human-like input events: the hard part is never one signal, it is the joint distribution across all of them.
Calibration fingerprinting: a different signal entirely
There is a second, stranger thing a motion sensor leaks, and it is worth separating cleanly from the real-versus-fake question because it answers a different one. Calibration fingerprinting does not ask “is this a real device.” It asks “is this the same specific device I saw before,” and it can answer with startling precision.
The mechanism, demonstrated in the SensorID work presented at the 2019 IEEE Symposium on Security and Privacy by researchers from the University of Cambridge, exploits a manufacturing reality. MEMS accelerometers and gyroscopes come off the line with small per-unit errors, so the factory writes per-device calibration constants into firmware to correct them. Those constants are baked into the sensor output. By collecting raw readings and analysing them carefully, you can recover the calibration values, and because every handset’s constants are slightly different, the recovered values form a fingerprint. The attack needs fewer than 100 samples and under one second of data. It requires no permission, because at the time the sensors were freely readable. And the fingerprint survives a factory reset, because the calibration data lives below anything a reset touches.
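The intuition can be shown with a one-axis toy model. It assumes each reading is `gain * adcInteger + offset`, so the difference between any two readings is an integer multiple of the per-device gain, and it assumes a nominal gain is known in advance, for example from a datasheet. The published work deals with multi-axis calibration and is considerably more careful; this sketch and its names are illustrative only.

```typescript
// Estimate a per-device gain from consecutive readings on one axis.
// Toy model: reading = gain * adcInteger + offset. The offset cancels in
// differences, and each difference is (integer step) * gain.
function estimateGain(readings: readonly number[], nominalGain: number): number {
  let num = 0;
  let den = 0;
  let prev: number | undefined;
  for (const r of readings) {
    if (prev !== undefined) {
      const d = r - prev;
      // Integer ADC step implied by the nominal gain.
      const k = Math.round(d / nominalGain);
      // Least-squares fit of d = gain * k across all differences.
      num += d * k;
      den += k * k;
    }
    prev = r;
  }
  if (den === 0) throw new Error("readings carry no usable variation");
  return num / den;
}
```

The estimate only works while the readings are fine enough for the differences to land on the gain lattice. Round them to 0.1 and the integer steps can no longer be recovered, which is the defence described next.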
The platform split is specific. On iPhones the recoverable fingerprint came from the gyroscope and magnetometer calibration; the iPhone 6S was estimated at around 67 bits of entropy, which is effectively a globally unique identifier. On the Google Pixel 2 and Pixel 3, the accelerometer was the leaky sensor. The asymmetry comes down to which sensors each vendor factory-calibrates and how.
*Calibration fingerprinting answers "same device?" not "real device?". The same MEMS hardware that proves a phone is real can also pin it to one specific unit, which is why the spec now rounds the readings.*
Apple patched this in iOS 12.2 in March 2019, both by adding noise to the sensor output and by gating Safari’s motion access behind the toggle described earlier, which is the same change that broke all those WebVR sites. The two privacy fixes, consent gating and the resolution cap, were aimed partly at this exact attack. The W3C spec’s 0.1-unit rounding is the standards-track version of the same defence: round coarsely enough and the calibration constants blur below recoverability.
For a detection stack, calibration fingerprinting is double-edged. As a defender, a stable per-device ID that survives reset is a gift for catching an attacker who rotates cookies and IPs but keeps reusing the same physical phone farm. As a privacy matter, it is exactly the persistent cross-origin identifier the spec authors spent years trying to kill, which is why post-mitigation it is mostly a historical capability on patched devices rather than a live signal you can count on today. It sits alongside the other hardware-entropy tricks we have covered, from AudioContext fingerprinting to the broader question of how much unique signal a detector can safely spend, the entropy budget every system balances.
Where motion sits in the stack
Motion sensors are a corroborating signal, never a verdict. That framing is the single most important thing to keep straight, because it explains both why detectors bother collecting motion and why they do not block on it alone.
The signal is strong in one direction and weak in the other. A flat, absent, or physically impossible motion stream on a device claiming to be a modern phone is good evidence of automation, especially when it lines up with other tells. But a rich, realistic motion stream is not proof of a human, because it can be replayed from a recording, and because plenty of legitimate automation runs on real phones in real device farms that move. The asymmetry is the point. Motion is most useful for catching the lazy fake, the cloud-hosted emulator that never bothered to generate sensor data, and least useful against an attacker who has invested in real hardware. That is the same shape as most fingerprinting signals, and it is why no serious stack relies on any one of them.
What makes motion harder to fake than it first looks is consistency, not the raw values. Generating a plausible accelerometer trace is easy. Generating one that agrees with the touch events, the scroll velocity, the gyroscope’s own integral, the claimed sampling rate, and the device model in the user agent, all at once and over a whole session, is a coordination problem that grows with every additional signal the page collects. A detector does not need to prove the motion is fake. It needs to find one contradiction, and a session that synthesises five signals independently tends to leave one. The cost curve favours the defender here in a way it does not for, say, a single static fingerprint value that can be copied wholesale.
The honest closing note is about reach. Motion only exists where there is a phone with sensors and a user willing to grant access, and the permission models of the last few years have narrowed that window deliberately. On iOS the data does not flow without a real tap, which means a detector reading motion has already confirmed a genuine gesture happened, arguably a more valuable fact than anything in the trace itself. On desktop there are no motion sensors to read at all. So this signal is sharp but local: it cuts cleanly through the cheapest mobile emulation, it corroborates a human gesture on iOS, and it stays silent everywhere else. A scraper that runs real phones, foregrounds them, and lets the sensors run will sail straight past it, which is precisely why the detector keeps the TLS handshake, the HTTP/2 fingerprint, and the IP reputation in the same feature vector. Motion is one good column. The verdict is the whole table.
Sources & further reading
- W3C Devices and Sensors Working Group (2025), Device Orientation and Motion — Candidate Recommendation Draft defining the event interfaces, units, permission integration, and the 0.1-unit precision cap.
- MDN (2024), DeviceMotionEvent — field reference for acceleration, accelerationIncludingGravity, rotationRate, interval, and the requestPermission method.
- Chrome for Developers (2023), Sensors for the web — the Generic Sensor API classes, secure-context rule, Permissions Policy directives, and visibility throttling.
- Chromium blink-dev (2019), Intent to Remove: Insecure usage of DeviceOrientationEvent and DeviceMotionEvent — the M74 decision to restrict motion events to HTTPS, with the 0.82% insecure-usage figure.
- Zhang, Beresford, Sheret (2019), SensorID: Sensor Calibration Fingerprinting for Smartphones — IEEE S&P paper recovering per-device calibration constants from gyroscope, magnetometer, and accelerometer output.
- Das, Acar, Borisov, Pradeep (2018), The Web’s Sixth Sense: A Study of Scripts Accessing Smartphone Sensors — ACM CCS crawl of 100K sites measuring which scripts read motion sensors and why, including the 17.7% bot-differentiation share.
- Vidas, Christin (2014), Evading Android Runtime Analysis via Sandbox Detection — ASIA CCS paper showing accelerometer output statistically separates real Android devices from emulators.
- W3C accelerometer issue #54 (2019), device calibration of accelerometers may reveal precise hardware fingerprint — the standards-track discussion that led to the 0.1 m/s² rounding mitigation.
- MacRumors (2019), Apple to Limit Accelerometer and Gyroscope Access in Safari on iOS 12.2 — coverage of the Motion & Orientation Access toggle defaulting to off.
- 9to5Google (2019), Sensor calibration attack can track Android devices — summary of the SensorID disclosure, entropy figures, and Apple’s iOS 12.2 fix.
- Android Open Source Project, Sensor types — reference for accelerometer, gyroscope, and reporting modes underlying the web events.