Touchscreen biometrics: pressure, swipe velocity, and gesture signatures

Hand someone else your unlocked phone and ask them to scroll through your photos. Watch the screen. They scroll wrong. The strokes land in different places, the flicks carry a different speed, the finger lifts at a different angle. Nothing about it is illegal or even suspicious, but it does not look like you. A classifier can be trained to notice the same thing, and that is the whole premise of touch-dynamics biometrics: the way a finger moves across glass is consistent enough within one person, and different enough between people, to function as an identity signal without anyone typing a password.

The interesting part is what the phone actually measures. A touchscreen is not just an X/Y digitizer. It reports how hard you pressed, how much of the pad of your finger made contact, how fast the contact point moved, and how often it sampled all of that. Each of those raw channels carries a little entropy. Stacked together over a few strokes they carry enough to separate people, and enough to separate a human finger from a synthetic touch event injected by automation. This post walks the signal from the hardware up to the classifier, then turns it around to look at the same data from the bot-detection side.

The road map: first the raw channels a mobile OS and a browser expose, with the exact API field names and their ranges. Then the feature set that the canonical academic work built on top of those channels, and the error rates it reported. Then the harder, less-discussed questions of how stable the signal is across sessions, devices, and time, where the published evaluations cheated themselves, and finally how the auth machinery gets repurposed to tell a person from a script.

The raw channels: what the screen actually reports

A capacitive touchscreen senses a finger as a disturbance in a grid of capacitance measurements. The controller resolves that disturbance into a contact: a centroid position, an estimate of how large the contact patch is, and on some hardware a pressure estimate. Those three are the primitives. Everything downstream is derived from them plus timestamps.

On the web, two specifications expose this. The older Touch Events model, now a Community Group report that its own editors mark as legacy with no further work intended, gives each Touch object a small set of attributes beyond the coordinates. force is a relative pressure value in the range 0 to 1, where 1 is defined as “the highest level of pressure the touch device is capable of sensing,” and it returns 0 if no value is known. The contact patch is described as an ellipse through radiusX, radiusY, and rotationAngle, the last being the clockwise rotation of that ellipse in degrees, at least 0 and less than 90. When the device cannot measure radius, both radii read 0. That zero-when-unknown default matters later, because a great deal of hardware reports exactly zero.

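The newer and now-preferred model is Pointer Events, currently a Level 4 Working Draft edited by Patrick Lauke and Robert Flack. It unifies mouse, pen, and touch behind one PointerEvent interface and exposes a richer set of fields. The relevant ones for a finger are pressure, the contact size reported as width and height, and the event timestamp; the tilt, twist, and angle fields exist mainly for pens.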

*The PointerEvent surface a finger fills in. Pressure, contact size, and timing carry the signal; the pen-oriented angle fields stay at their spec defaults for a bare finger.*

The default values are a quiet trap and worth dwelling on. pressure is normalized 0 to 1, but for hardware that cannot detect pressure the spec mandates 0.5 while a button is active and 0 otherwise. So a pressure of exactly 0.5, never varying, is not a featureless flat press. It is the hardware telling you it has no pressure sensor at all. Capacitive phones without a force layer fall into this bucket. A detector that sees pressure glued to 0.5 across an entire session learns something real about the device, just not about the finger.
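The default-value rule can be turned into a small check. The sketch below is illustrative only: the `PressureSample` shape and the `classifyPressureChannel` helper are hypothetical names, and the interface simply mirrors the `pressure` and `buttons` fields a real PointerEvent carries, so the function can run outside a browser.

```typescript
// Minimal shape of the PointerEvent fields used here (assumed subset);
// a real PointerEvent satisfies this interface structurally.
interface PressureSample {
  pressure: number; // normalized 0..1 per the Pointer Events spec
  buttons: number;  // non-zero while the contact is active
}

type PressureChannel = "no-sensor" | "measured" | "unknown";

// Hypothetical helper: decide whether a session's pressure values look like
// the spec default for pressure-less hardware (exactly 0.5 while active).
function classifyPressureChannel(samples: PressureSample[]): PressureChannel {
  const active = samples.filter((s) => s.buttons !== 0);
  if (active.length < 2) return "unknown"; // too little data to decide
  const allDefault = active.every((s) => s.pressure === 0.5);
  return allDefault ? "no-sensor" : "measured";
}
```

A result of "no-sensor" says something about the device, not the finger, so a detector would probably want to drop the pressure features for that session rather than score them.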

Native code sees more than the browser does, and earlier. On Android a MotionEvent carries getPressure() and getSize() per pointer, where pressure is an abstract value the framework documents as “typically” between 0 and 1 but which can exceed 1 depending on calibration, and size is a scaled contact-area estimate. The browser sits on top of this and reshapes it into the normalized web fields. The translation throws away precision. A continuous-authentication SDK embedded in a banking app gets the native stream; a fraud script running in mobile Safari gets the normalized, often-zeroed web version. That asymmetry shapes which side of the fence a given detector can stand on.

One more channel deserves attention because it governs everything timing-related: the sample rate. The browser coalesces pointer moves for performance, firing one pointermove per animation frame while the underlying hardware sampled several positions in between. getCoalescedEvents() hands back the merged-away samples, and without it your velocity and curvature estimates are quantized to the frame clock rather than the touch controller’s true rate. The controller might sample at 120 or 240 Hz; the default, frame-coalesced event stream might deliver only 60. For a biometric that lives on the shape of a velocity curve, that difference is not cosmetic.
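To see which rate a given stream actually delivers, the median gap between timestamps is a reasonable first estimate. This is a sketch under assumptions: `estimateSampleRateHz` is a hypothetical helper, and the timestamps are assumed to be in milliseconds, as the `timeStamp` of each event returned by `getCoalescedEvents()` would be in a browser.

```typescript
// Estimate the effective sample rate (Hz) of one stroke from its timestamps
// in milliseconds, using the median gap between consecutive samples.
function estimateSampleRateHz(timestampsMs: number[]): number {
  const gaps: number[] = [];
  for (let i = 1; i < timestampsMs.length; i++) {
    const gap = timestampsMs[i] - timestampsMs[i - 1];
    if (gap > 0) gaps.push(gap); // ignore duplicate or out-of-order stamps
  }
  if (gaps.length === 0) return NaN;
  gaps.sort((a, b) => a - b);
  const mid = Math.floor(gaps.length / 2);
  const median =
    gaps.length % 2 === 1 ? gaps[mid] : (gaps[mid - 1] + gaps[mid]) / 2;
  return 1000 / median;
}

// In a browser, the input could be gathered like this (not executed here):
//   element.addEventListener("pointermove", (ev) => {
//     for (const sample of ev.getCoalescedEvents()) stamps.push(sample.timeStamp);
//   });
```

Comparing the estimate from the coalesced samples against the estimate from the plain pointermove stream shows how much timing detail the frame clock hides on a given device.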

2013: Touchalytics and the 30-feature vector

The work that turned this from folklore into a measured result is Touchalytics, published in IEEE Transactions on Information Forensics and Security in 2013 by Mario Frank, Ralf Biedert, Eugene Ma, Ivan Martinovic, and Dawn Song. The question they posed was narrow and answerable: can a classifier authenticate a user continuously from nothing but the way they scroll? Not a deliberate gesture password. Ordinary up-down and left-right navigation while reading.

Their unit of analysis is the stroke. A stroke begins when the finger touches and ends when it lifts, and it is a sequence of samples each carrying position, timestamp, pressure, contact area, finger orientation, and the phone’s orientation. From that sequence they compute 30 features. Some are geometric: start and end coordinates, the direct end-to-end distance, the direction of the end-to-end line, the largest deviation of the actual path from that straight line and which side of it the deviation falls on. Some are kinematic: velocity at several points, the 20th-percentile of stroke velocity, acceleration, mid-stroke pressure, mid-stroke contact area. Some are temporal: stroke duration, and the inter-stroke time between consecutive strokes.

*The single stroke is the atom of touch dynamics. Most of the 30 Touchalytics features are simple geometric or kinematic readings off this one trajectory.*
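A few of the features described above can be computed in a handful of lines. The sketch below is not the Touchalytics implementation; the `StrokePoint` and `StrokeFeatures` shapes, the function names, and the nearest-rank percentile are all assumptions chosen for illustration, and it covers only a small subset of the 30 features.

```typescript
interface StrokePoint {
  x: number;        // position in pixels
  y: number;
  t: number;        // timestamp in milliseconds
  pressure: number;
  area: number;     // contact-area reading, whatever scale the device uses
}

interface StrokeFeatures {
  durationMs: number;
  endToEndDistance: number;
  directionRad: number;  // direction of the end-to-end line
  maxDeviation: number;  // signed: the sign encodes the side of the line
  velocityP20: number;   // 20th-percentile pairwise speed, px per ms
  midPressure: number;
  midArea: number;
}

// Nearest-rank percentile, p in [0, 1].
function percentile(values: number[], p: number): number {
  const sorted = values.slice().sort((a, b) => a - b);
  if (sorted.length === 0) return NaN;
  const idx = Math.min(sorted.length - 1, Math.max(0, Math.ceil(p * sorted.length) - 1));
  return sorted[idx];
}

function extractStrokeFeatures(points: StrokePoint[]): StrokeFeatures {
  if (points.length < 2) throw new Error("a stroke needs at least two samples");
  const first = points[0];
  const last = points[points.length - 1];
  const dx = last.x - first.x;
  const dy = last.y - first.y;
  const dist = Math.hypot(dx, dy);

  // Signed perpendicular distance of each sample from the end-to-end line.
  let maxDeviation = 0;
  for (const p of points) {
    const signed = dist === 0 ? 0 : (dx * (p.y - first.y) - dy * (p.x - first.x)) / dist;
    if (Math.abs(signed) > Math.abs(maxDeviation)) maxDeviation = signed;
  }

  // Pairwise speeds between consecutive samples.
  const speeds: number[] = [];
  for (let i = 1; i < points.length; i++) {
    const dt = points[i].t - points[i - 1].t;
    if (dt > 0) {
      speeds.push(Math.hypot(points[i].x - points[i - 1].x, points[i].y - points[i - 1].y) / dt);
    }
  }

  const mid = points[Math.floor(points.length / 2)];
  return {
    durationMs: last.t - first.t,
    endToEndDistance: dist,
    directionRad: Math.atan2(dy, dx),
    maxDeviation,
    velocityP20: percentile(speeds, 0.2),
    midPressure: mid.pressure,
    midArea: mid.area,
  };
}
```

Everything here is a simple reading off one trajectory, which is the point: the discriminative power comes from the person, not from clever feature engineering.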

What they found about which features carry the load is the part worth memorizing. Ranked by mutual information, the most informative single features were the contact area covered by the fingertip, the 20th-percentile of stroke velocity, fingertip pressure, and stroke direction. So the two channels that the web platform most often zeroes out, area and pressure, are precisely the two that the foundational study found most discriminative. That is the central tension of the whole field stated in one sentence.

The numbers were good enough to be taken seriously and modest enough to be honest. The classifier hit a median equal error rate of 0% for intra-session authentication, meaning within one continuous sitting it essentially never confused users. Across different sessions the EER rose to 2 to 3%, and when the test was run a week after enrollment it stayed below 4%. The authors were explicit that this disqualifies touch dynamics as a standalone long-term authenticator. It is a second factor, a way to extend a screen-lock timeout, or one modality inside a fused system. Not a password replacement. That framing has held up; nearly every serious deployment since treats touch as one signal among many rather than the gate itself.
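Since the equal error rate carries the whole argument, it helps to see how one is computed. The sketch below is a simple threshold sweep, assuming each stroke yields a score where higher means more likely genuine; `equalErrorRate` is a hypothetical helper, not the procedure from the paper.

```typescript
// Equal error rate from two score lists. Higher score = more likely genuine.
// Sweeps every observed score as a threshold and returns the error at the
// point where false-accept and false-reject rates are closest.
function equalErrorRate(genuine: number[], impostor: number[]): number {
  if (genuine.length === 0 || impostor.length === 0) {
    throw new Error("need at least one genuine and one impostor score");
  }
  const thresholds = Array.from(new Set(genuine.concat(impostor))).sort((a, b) => a - b);
  let best = { gap: Infinity, eer: NaN };
  for (const th of thresholds) {
    // A sample is accepted when its score is >= the threshold.
    const far = impostor.filter((s) => s >= th).length / impostor.length;
    const frr = genuine.filter((s) => s < th).length / genuine.length;
    const gap = Math.abs(far - frr);
    if (gap < best.gap) best = { gap, eer: (far + frr) / 2 };
  }
  return best.eer;
}
```

With perfectly separated score lists the sweep finds a threshold where both error rates are zero, which is what a 0% intra-session EER means in practice.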

If you want to go deeper on the sibling modalities that get fused with touch, the same logic of dwell, flight, and rhythm applies to typing in keystroke dynamics, and the velocity-and-curvature analysis maps almost directly onto the desktop cursor in mouse-movement biometrics.

What the signal actually encodes

It helps to be precise about why this works at all, because “everyone swipes differently” is true but unsatisfying. The discriminative power comes from a stack of fairly mundane physical facts that happen to be stable per person and variable across people.

Hand and finger geometry set the contact area and its rotation. A larger thumb lays down a larger, differently oriented ellipse than a slender index finger, and people tend to use the same digit for the same gesture out of habit. Grip and reach set where on the screen strokes start and end, and how far they travel before the thumb runs out of comfortable arc. Neuromotor control sets the velocity profile, the smoothness of the curve, and how the stroke decelerates into its endpoint. There is a reason the 20th-percentile velocity ranked so high: it captures the slow part of the motion, the controlled approach and release, which is more personal than the fast ballistic middle.

Pressure is the channel people overweight in intuition and the hardware underdelivers in practice. When a true force sensor is present, mid-stroke pressure is genuinely discriminative, because how firmly someone presses is a stable personal habit. But most capacitive phones never had a dedicated force layer, and Apple removed 3D Touch from iPhones starting with the XR and 11 generation, so the web force/pressure value on a huge installed base is either a flat default or a coarse area-derived proxy. This is exactly the gap a 2019 BioCatch patent claims to fill. The press release describes a method for “estimating force applied to a touch surface” without dedicated hardware, letting a software agent infer pressure from signals the device already exposes. The patent text does not spell out the inference, and the public materials do not either, so the precise mechanism is not documented. The plausible route, inferred from what is measurable, is to derive a pressure proxy from contact-area changes and from the micro-motion the press induces in the accelerometer and gyroscope. A firmer press spreads the contact ellipse and shoves the device a little harder, and both are observable without a force sensor.

That accelerometer link is not hypothetical, and it is where touch biometrics bleeds into the motion-sensor side. A press, a tap, and a scroll each jolt the phone in a slightly different way, and those jolts are readable. The TouchSignatures work showed that JavaScript in a mobile web page could read device motion and orientation streams, at the time without any permission prompt, and from them classify which touch action a user performed, tap versus scroll versus hold, and in a second phase recover digits of a PIN. The browser-delivered sample rate ran several times slower than a native app could achieve, yet the attack still worked. The relationship between touch and motion is symmetric: the same coupling that lets a defender estimate pressure from the accelerometer lets an attacker estimate keystrokes from it. For the motion-sensor side of this in a bot-detection context, see device-orientation and accelerometer signals.

The browser response to that class of attack was to gate the sensors. Starting with iOS 12.2, in-browser motion and orientation access defaulted to off, and iOS 13 in 2019 replaced the toggle with a requestPermission() call on DeviceMotionEvent and DeviceOrientationEvent that returns a promise and must be triggered by a user gesture. That permission wall is now the main reason a web-based touch biometric cannot freely reach for the accelerometer to firm up its pressure estimate, while a native SDK inside an app faces no such wall.

Stability: the part the demos skip

A biometric is only as good as its consistency, and touch dynamics is less consistent than a controlled study makes it look. The signal drifts for reasons that have nothing to do with whether you are you.

Posture changes the whole stroke distribution. Swiping one-handed with a thumb while walking produces shorter, more curved, lower-velocity strokes than swiping two-handed with an index finger while sitting. The same person generates two different signatures depending on context, and if enrollment happened in one posture, verification in the other looks like an impostor. Device geometry matters too. Move to a larger screen and the reachable region shifts, the comfortable stroke length grows, and the start/end coordinate features that ranked highly in Touchalytics no longer mean what they meant. Even a screen protector or a case that changes the touch surface can move the contact-area readings.

This is where a 2022 evaluation paper called FETA earns its place, because it is one of the few that went looking for the ways the field flatters itself. The authors reviewed 30 touch-dynamics papers and reported that every single one overlooked at least one methodological pitfall that inflates accuracy. Their list of pitfalls is a checklist of how to lie to yourself: too few users or sessions, mixing different phone models into training data so the classifier learns to recognize the phone instead of the person, sampling training data from non-contiguous time periods so that training swipes interleave with test swipes and the two share short-term artifacts, leaking attacker samples into training, and aggregating multiple swipes per decision in a way that quietly relaxes the threat model. To measure the effects honestly they collected a longitudinal set: a remote dataset of 470 users producing 6,017 sessions and over 1.16 million unique swipes, plus a smaller in-person set of 45 users.

*The FETA pitfalls. The worst one is training across mixed phone models: the classifier scores well because it learned the device, and that score evaporates the moment users share a model.*

The mixed-phone-model pitfall is the sharpest. If your training set has user A on a Pixel and user B on a Samsung, a classifier can separate them by learning the sensor calibration of each phone, then post a beautiful accuracy number that has nothing to do with how A and B swipe. The mechanism is mundane: different touch controllers quantize coordinates differently, report contact area on different scales, and clamp pressure with different curves, so two devices leave distinguishable fingerprints on the data before a human ever touches them. Put both users on the same model and the trick collapses. Several headline results in the literature lean on this without saying so. The lesson for anyone reading a touch-dynamics accuracy claim is to ask immediately whether the users shared hardware, and to distrust any number where they did not. This is the touch-modality version of a general truth about behavioral biometrics, that uniqueness and stability pull in opposite directions and every detector lives on a budget between them, which is the subject of the entropy-budget post.

The temporal pitfalls are subtler and arguably worse, because they survive even when everyone uses identical phones. Drawing the training and test swipes from one continuous recording session lets both halves share transient state: the same grip, the same sitting posture, the same battery temperature, even the same slightly oily screen. A model trained and tested inside that bubble looks excellent and then degrades the first time a real day passes between enrollment and verification. The honest design is to enroll on one day and test on a later one, which is exactly why FETA’s longitudinal collection ran across 31 days rather than a single sitting. Almost no consumer-facing claim is evaluated that way, and the ones that are quote distinctly less flattering numbers. A practical reading of all this: treat any single published EER as a best case measured on a good day, and assume the production number under posture drift, device change, and a week of elapsed time sits several points worse.
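The two splits that matter most, same phone model and enroll-before-test, are easy to enforce in code. The sketch below is a minimal illustration, not FETA's tooling; the `SwipeRecord` shape, the `chronologicalSplit` name, and the day-number convention are all assumptions.

```typescript
interface SwipeRecord {
  userId: string;
  deviceModel: string;
  day: number; // days since the start of data collection
}

interface EvalSplit {
  train: SwipeRecord[];
  test: SwipeRecord[];
}

// Keep one phone model only, enroll on days up to enrollThroughDay,
// and test on strictly later days, so the two halves never share a sitting.
function chronologicalSplit(
  records: SwipeRecord[],
  deviceModel: string,
  enrollThroughDay: number
): EvalSplit {
  const sameModel = records.filter((r) => r.deviceModel === deviceModel);
  return {
    train: sameModel.filter((r) => r.day <= enrollThroughDay),
    test: sameModel.filter((r) => r.day > enrollThroughDay),
  };
}
```

An accuracy number computed on a split like this will usually look worse than one from a random shuffle, and that gap is roughly the amount the shuffle was flattering the model.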

Turning auth around: telling a human from a script

Everything above was built to verify a known user. The same machinery, pointed at a different question, asks whether the toucher is human at all. That is the bot-detection use, and it is where touch dynamics meets the anti-bot stacks.

The detection-friendly fact is that synthetic touches look wrong in the raw channels before any sophisticated modeling. A touch injected through Android’s input system or a debugging bridge tends to arrive with pressure and contact size at degenerate values, because the injector simply does not populate them with realistic numbers. Published agent-detection logic keys on exactly this, flagging events where the starting contact size and pressure sit near zero as automation. A real fingertip lands with a nonzero, slightly noisy contact area and a pressure reading that wanders. A scripted tap often lands with size and pressure pinned at zero or some constant. On the web side the same shape appears: events synthesized by dispatchEvent or by the DevTools protocol carry pressure and contact geometry at their spec defaults, and a stream of perfectly identical default values across an entire session is itself the tell. The mechanism that catches scripted mouse input via timing and impossible smoothness, covered in synthesizing human input events, has a direct touch analogue.
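The degenerate-value check is simple enough to sketch directly. The version below is an illustration under assumptions, not any vendor's published logic: the `ContactSample` shape, the `looksInjected` name, the epsilon, and the five-sample minimum are all invented for the example.

```typescript
interface ContactSample {
  pressure: number;
  size: number; // contact-size reading, e.g. getSize() or a radius
}

// Flag a stroke as likely injected when it starts with near-zero pressure
// and size, or when both channels stay frozen at one value throughout.
function looksInjected(samples: ContactSample[], epsilon = 1e-3): boolean {
  if (samples.length === 0) return false;
  const first = samples[0];
  const startsDegenerate = first.pressure < epsilon && first.size < epsilon;

  const isConstant = (pick: (s: ContactSample) => number): boolean =>
    samples.every((s) => Math.abs(pick(s) - pick(first)) < epsilon);

  // Require a few samples before calling a flat stream suspicious.
  const frozen =
    samples.length >= 5 &&
    isConstant((s) => s.pressure) &&
    isConstant((s) => s.size);

  return startsDegenerate || frozen;
}
```

Hardware that cannot measure either channel will also trip this check, so in practice it would need to be combined with what is known about the device before it counts as evidence of automation.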

Beyond the degenerate-value check there is the trajectory check, and this is where the BeCAPTCHA line of work fits. That method, from a group at Universidad Autonoma de Madrid, asks a user to perform a single drag-and-drop and then decides human-or-bot from the touchscreen path plus the accelerometer trace it induces. They evaluated it on HuMIdb, a multimodal mobile database of 14 sensors collected from 600 users, and tested it against fake swipes generated two ways: a handcrafted synthesizer and a generative adversarial network trained to mimic human gestures. From the touchscreen path alone the detector reached the 80 to 90% accuracy range, and adding the accelerometer channel pushed bot detection above 90%. The reason the accelerometer helps is the coupling described earlier. A genuine human drag jostles the phone in a way that is hard to fake if you are only synthesizing the on-screen coordinate stream, because the synthesizer has to produce two correlated signals that a real hand produces for free.

*The pipeline runs cheap to expensive. A flat-pressure or zero-area stream is rejected before any model runs; only ambiguous human-shaped input reaches the trajectory and sensor-correlation stages.*
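The cheap-to-expensive ordering amounts to a short-circuiting chain of checks. The sketch below shows only that control flow; the `Stage` and `Verdict` types and the `runStages` function are hypothetical, and the stage bodies are left to the caller.

```typescript
type Verdict = "bot" | "human" | "unsure";

interface Stage<T> {
  name: string;
  check: (input: T) => Verdict;
}

interface StageResult {
  verdict: Verdict;
  decidedBy: string | null; // name of the stage that decided, if any
}

// Run stages in the order given (cheapest first) and stop at the first
// stage that returns a definite verdict.
function runStages<T>(input: T, stages: Stage<T>[]): StageResult {
  for (const stage of stages) {
    const verdict = stage.check(input);
    if (verdict !== "unsure") return { verdict, decidedBy: stage.name };
  }
  return { verdict: "unsure", decidedBy: null };
}
```

Ordering the stages by cost means the trajectory model and the sensor-correlation step only ever see input that already passed the degenerate-value check.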

There is a quieter signal in the timing that is harder to fake than the trajectory shape. A human swipe arrives as an irregular burst of samples whose spacing reflects the touch controller’s real clock and the small jitter of finger motion, then ends with a finger lift that produces a characteristic last sample before the up event. A scripted sequence often arrives at suspiciously even intervals, or at exactly the frame cadence the synthesizer chose, with no sub-frame variation because there was never any analog motion to sample. This is where the coalesced-event machinery cuts both ways. A defender who calls getCoalescedEvents() and finds an empty or impossibly regular set of intermediate samples is looking at input that never passed through a real digitizer. The detail-oriented detectors compare the reported sample timestamps against the device’s claimed touch sample rate and flag streams that are too clean to be physical. None of this requires identifying who is touching; it only requires confirming that something with the noise profile of a finger touched the glass.
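One way to put a number on "too clean to be physical" is the coefficient of variation of the gaps between samples. This is a sketch, not a production detector: `intervalJitter`, `tooRegular`, and the 1% threshold are assumptions, and real browsers may round timestamps in ways that would need to be accounted for.

```typescript
// Coefficient of variation (standard deviation / mean) of the gaps between
// consecutive sample timestamps, in milliseconds.
function intervalJitter(timestampsMs: number[]): number {
  const gaps: number[] = [];
  for (let i = 1; i < timestampsMs.length; i++) {
    gaps.push(timestampsMs[i] - timestampsMs[i - 1]);
  }
  if (gaps.length < 2) return NaN; // not enough intervals to measure spread
  const mean = gaps.reduce((sum, g) => sum + g, 0) / gaps.length;
  if (mean <= 0) return NaN;
  const variance =
    gaps.reduce((sum, g) => sum + (g - mean) * (g - mean), 0) / gaps.length;
  return Math.sqrt(variance) / mean;
}

// Flag a stream whose sample spacing varies less than minJitter
// (an assumed threshold) as suspiciously regular.
function tooRegular(timestampsMs: number[], minJitter = 0.01): boolean {
  const cv = intervalJitter(timestampsMs);
  return !Number.isNaN(cv) && cv < minJitter;
}
```

A threshold like this would need tuning per device class, since a touch controller with a very stable clock can look regular without being synthetic.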

This is the same arms race as everywhere else in bot detection, and the GAN result is the warning shot. Once an adversary trains a generator on real human swipes, the synthetic trajectories stop being trivially degenerate and start to sit inside the human distribution. Some 2023 evaluations found deep-learning detectors held up surprisingly well against GAN-generated swipes, but “surprisingly well” is not “solved,” and the gap between a generator trained on a public dataset and a detector trained on a private one is exactly the kind of gap that closes over time. The defensive value of the accelerometer correlation is that it forces the adversary to fake two coupled signals at once, which a coordinate-only injector cannot do. That advantage lasts only until the adversary is running on a real device that produces the coupling for free, which is the entire premise of mobile residential-proxy and real-device fraud farms. The commercial behavioral-biometrics vendors, BioCatch among them with its roughly 2,000 tracked parameters, lean on this by never resting a verdict on touch alone; touch is one tributary into a risk score that also weighs network, device, and session history. The broader pattern across these vendors lives in behavioral biometrics in fraud detection.

Where it sits in 2026

Touch dynamics is a mature signal with a known ceiling. The academic ceiling has barely moved since 2013, because the limit was never the classifier. It is the physics of how much stable, device-independent entropy a few swipes contain, and that quantity is fixed. Deep nets squeezed the error rates a little and made the features automatic instead of hand-designed, but nobody has shown that touch alone authenticates a person reliably enough to drop the password, and the honest evaluations like FETA suggest several of the impressive numbers were partly the classifier reading the phone model off the sensor calibration rather than reading the human off the swipe.

What changed is the deployment context around it. The web platform spent the last decade narrowing the channels, leaving pressure and contact area pinned at spec defaults on most hardware, gating the motion sensors behind a permission prompt after the TouchSignatures class of attack, and coalescing the event stream so timing precision now requires an explicit getCoalescedEvents() call. The native side kept its full-resolution access, so the richest touch biometrics now live inside app SDKs rather than in the browser, and the browser-side detectors increasingly settle for the cheaper job of catching synthetic input by its degenerate pressure and contact-size values rather than the harder job of identifying a specific person. The two uses have drifted apart: authentication retreated into native apps where the signal is rich, and bot detection stayed on the web where the signal is thin but a flat 0.5 pressure or a zero contact area is still a perfectly good confession that no finger touched the glass.

The durable observation is the one the foundational paper made and then declined to oversell. Within a single sitting, a person’s touch is almost perfectly self-consistent and the equal error rate falls to zero; stretch the same test across a week, a posture change, or a new phone, and it climbs past the point where you would bet an account on it. Everything built since has been an argument about which of those two numbers to quote.

Sources & further reading

Frank, M., Biedert, R., Ma, E., Martinovic, I., Song, D. (2013), Touchalytics: On the Applicability of Touchscreen Input as a Behavioral Biometric for Continuous Authentication — IEEE TIFS paper proposing the 30-feature stroke vector and reporting 0% intra-session and 2-4% inter-session EER.
W3C (2026), Pointer Events Level 4 Working Draft — the spec defining pressure, width/height, tilt, twist, and angle fields, including the 0.5 default for pressure-less hardware.
W3C Touch Events Community Group (2024), Touch Events Level 2 — the legacy spec defining force, radiusX, radiusY, and rotationAngle, with their zero-when-unknown defaults.
MDN Web Docs, PointerEvent.pressure — reference confirming the normalized range and the synthetic 0.5 value for no-pressure devices.
MDN Web Docs, PointerEvent.getCoalescedEvents() — explains the coalescing that quantizes velocity/curvature unless the un-coalesced samples are requested.
Georgiev, M., Eberz, S., et al. (2022), FETA: Fair Evaluation of Touch-based Authentication — surveys 30 papers, names five evaluation pitfalls, and collects a 515-user longitudinal dataset.
Acien, A., Morales, A., Fierrez, J., Vera-Rodriguez, R., Delgado-Mohatar, O. (2020), BeCAPTCHA: Behavioral Bot Detection using Touchscreen and Mobile Sensors benchmarked on HuMIdb — drag-and-drop bot detection on 600-user HuMIdb, 80-90% from touch and above 90% with accelerometer.
Mehrnezhad, M., Toreini, E., Shahandashti, S.F., Hao, F. (2016), TouchSignatures: Identification of User Touch Actions and PINs Based on Mobile Sensor Data via JavaScript — JavaScript reads motion/orientation to classify touch actions and recover PINs.
BioCatch (2019), BioCatch Obtains 39th US Patent for Authenticating Users Based on Screen Pressure — software-only force estimation without dedicated pressure hardware, part of a ~2,000-parameter system.
Android Open Source Project, MotionEvent — native API exposing per-pointer getPressure() and getSize(), the full-resolution stream behind the web fields.
