
Akamai's sensor_data payload: the fields and their telemetry sources

Copyright: MIT

Open the developer console on any site behind Akamai Bot Manager, type bmak.sensor_data, and you get back a long opaque string. A few kilobytes of digits, commas, and the occasional base64 blob. It looks like noise. It is not noise. That string is a structured recording of everything the page watched you do and everything it could measure about your browser, serialized into a flat format and wrapped in a layer of obfuscation that has been rebuilt at least three times since 2016. The server reads it, scores it, and decides whether to mint you a valid _abck cookie or quietly mark you as a bot.

This post is about that payload and only that payload. Not the _abck cookie validation handshake, not the server-side scoring pipeline, not the bm_sz pixel challenge, though all three connect to it. The question here is narrower and more concrete. What is actually inside sensor_data? What telemetry does each segment carry, where does the browser get it from, and how is the whole thing encoded so that a static read of the script tells you almost nothing? The exact byte layout is not published by Akamai and changes between deployments, so where the detail comes from reverse-engineering rather than vendor documentation, this post says so plainly.

The walk goes like this. First, where the payload comes from and how it gets to the server. Then the field-marker grammar that gives the string its shape. Then the four telemetry categories the markers carry: device and environment, behavioral events, timing, and the consistency probes that look for lies. After that, the obfuscation and encryption layers and how they hardened across versions. Finally, the mobile SDK variant, which solves the same problem with completely different machinery, and what the whole design tells you about where detection lives.

Where the payload comes from

The collector is a single obfuscated JavaScript file. People in the reverse-engineering community call it the bmak script, after the global object it installs, and it tends to run on the order of half a megabyte after deobfuscation. Akamai serves it from a path on the protected origin, often something ending in a random-looking filename, and once it executes it attaches a global object named bmak to window. That object holds the collector state and the methods that drive it. The behavioral parts hook event listeners on the document. The static parts probe the browser environment once at startup. A timer batches what has accumulated, runs the serialization and encryption, and the result lands in bmak.sensor_data ready to be sent.

The transport has shifted over time. In older deployments the script POSTs a JSON body whose sensor_data key holds the encoded string, typically to a path under the site’s own domain that Akamai’s edge intercepts. More recent web deployments carry the same payload as an akamai-bm-telemetry value, which is the encoded sensor string wrapped in base64. From the server’s side the mechanics are the same either way. A request arrives carrying a telemetry blob, the edge decodes and scores it, and the response either issues or refreshes the _abck cookie. Akamai’s own description of the mechanism stays at this altitude: behavior anomaly detection, it says, is “configured automatically by injecting a simple script into each monitored page,” and the score it produces runs “from 0 (human) to 100 (bot), looking at all anomalies, starting with the very first request.” The product material confirms the telemetry comes “from client input devices, such as mouse movements and keyboard strokes from a desktop, or gyroscope and accelerometer readings from a mobile device.” It does not publish the field grammar. That part is community work.

[Figure: pipeline diagram. bmak script (event hooks + environment probes) -> serialize (field markers + values) -> encrypt (obfuscate + base64) -> POST sensor_data -> edge decodes + scores -> _abck issued.]

*The browser-side pipeline: collect, serialize into the field-marker grammar, encrypt and base64, POST. The edge decodes, scores, and returns an _abck cookie that the next request must carry.*

The field-marker grammar

The decoded payload is, at bottom, a flat string of comma-separated integers and strings. There is no JSON, no key-value object, no nesting in the wire format. Structure comes instead from a repeating separator sequence that announces the start of each field, followed by a numeric code that says which field it is. The separator that the community has documented for the desktop script is the literal sequence -1,2,-94 followed by a comma and the field’s identifying code. So a segment that begins -1,2,-94,-100, is the field tagged -100; the values for that field follow until the next -1,2,-94 separator appears.

This is a deliberately awkward format to parse if you do not already know the codes. The separator looks like data. The codes are negative integers that carry no semantic hint. And the same three-token separator is reused for every field, so you cannot find field boundaries by looking for unique delimiters. You have to know that -1,2,-94 is special. Reverse-engineering write-ups have mapped a number of these codes by instrumenting the script and watching which browser events change which segments. The mapping that recurs across independent analyses ties -100 to the user-agent string, -101 to a sensor-status or feature block, -108 to keyboard actions, and -110 to mouse actions. These are observed assignments, not documented ones, and they have drifted across script versions, so treat the specific numbers as illustrative of the grammar rather than a stable contract.
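
To make the grammar concrete, here is a minimal sketch of a field splitter for an already-decoded string. It assumes only what is described above: the literal separator followed by a comma, then the code, then the value. The function name `split_fields` and the sample string are invented for illustration and are not a captured payload, and a real build may number or order fields differently.

```python
SEPARATOR = "-1,2,-94,"  # the reused three-token separator plus its trailing comma


def split_fields(decoded: str) -> dict:
    """Map each field code to its raw value string."""
    fields = {}
    # Text before the first separator is treated as a preamble and skipped.
    for segment in decoded.split(SEPARATOR)[1:]:
        # Only the first comma splits code from value; values contain commas too.
        code, _, value = segment.partition(",")
        fields[code] = value  # if a code repeats, the last occurrence wins
    return fields


# Hypothetical decoded string, invented for illustration.
sample = (
    "1.70"
    "-1,2,-94,-100,Mozilla/5.0 (X11; Linux x86_64),uaend"
    "-1,2,-94,-108,0,1,532,65;"
    "-1,2,-94,-110,0,1,32,410,220;1,1,48,415,224;"
)

fields = split_fields(sample)
print(fields["-100"])
print(sorted(fields))
```

The sketch also shows why the format is awkward: nothing in it can tell a separator from a value that happens to contain the same three tokens, so a robust parser has to know the field layout in advance.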

[Figure: one segment of decoded sensor_data, labeled as separator (-1,2,-94), code (-100), value (Mozilla/5.0 (...)), then the next separator. Observed code assignments, which drift across versions: -100 user-agent; -101 sensor / feature block; -108 keyboard actions; -110 mouse actions; -117 touch; -70 / -115 / -129 coherence + fp.]

*The separator-and-code grammar. Every field opens with the literal `-1,2,-94` sequence, then an integer code, then the field value, until the next separator. Code numbers are observed, not documented, and shift between script builds.*

Beyond the behavioral codes, older v1 analysis tooling carved the parsed payload into named buckets. A widely used v1.70-era parser organized fields into browser information, automation detection, browser detection, screen size, events, a coherence check, challenges, fingerprinting, target info, and a block of miscellaneous variables. That same tooling cross-referenced a few numeric fields to script internals: field 70 to bmak.fpcf.fpValstr, the fingerprint value string; field 115 to the coherence check; field 129 to an internal variable the parser labels w. Those identifiers are specific to that version of the script. Newer builds renumber and restructure. The lasting point is the shape: a long flat string, sectioned by a reused separator, carrying a fixed set of telemetry categories.

Device and environment fields

The largest static contribution to the payload is the device fingerprint. This is the part of the script that runs once and asks the browser to describe itself. Canvas rendering produces a hash. The script draws text and shapes to an offscreen canvas, reads the pixels back, and hashes the result. Two machines with the same GPU, driver, and font stack produce the same hash, and the value is stable enough that reverse-engineering write-ups consistently flag it as one of the two fields you cannot fabricate from thin air, the other being a convincing motion trace. WebGL contributes the GPU vendor and renderer strings, the literal text the driver reports for the graphics adapter. AudioContext fingerprinting runs a short synthetic audio graph and reads back the floating-point output, which varies subtly by audio stack.

Around those three sits a wider inventory. Screen width, height, available dimensions, color depth, and pixel depth. Hardware-concurrency core count and device-memory size. The plugin and mime-type list, or its modern emptiness. Installed-font enumeration. Timezone and language. The list of navigator properties and their values. Each of these is cheap to read and, taken alone, weak. Taken together they form a fingerprint with enough entropy to recognize the same browser across requests and, more importantly, to notice when the browser’s self-description does not hang together. A reader walking the deobfuscated v2 script counted on the order of a hundred distinct signals feeding the payload, grouped roughly into browser fingerprint, hardware, behavioral, JavaScript-environment, timing, and network buckets. That hundred-signal figure is an analyst’s count of one build, not a number Akamai publishes, but it matches the breadth you see when you watch the script run.

If you want the comparison point for how a different vendor structures the same first-request inventory, the DataDome detection model post walks the equivalent signal set, and the DataDome JS tag post covers how a competing client payload is assembled. The categories overlap heavily because the browser exposes the same surfaces to everyone. What differs is the serialization and the obfuscation, which is where Akamai’s design is distinctive.

Behavioral fields

The behavioral segments are what make sensor_data grow over the life of a page. Where the device fields are captured once, the behavioral fields accumulate. Every mouse move, every keystroke, every scroll and touch and focus change appends to an in-memory buffer that gets flushed into the payload on the next serialization. This is the telemetry Akamai’s product material is describing when it talks about analyzing “mouse movements and keyboard strokes,” and it is the part the server’s machine-learning behavior model actually consumes.

The encoding is compact and relative. A mouse event does not store an absolute timestamp and an absolute coordinate for every sample. It stores deltas: the time since the previous event, the change in position. This keeps the payload small while preserving the shape of the motion, which is the thing the model cares about. Human pointer movement has a characteristic profile. It accelerates and decelerates, it overshoots and corrects, it has micro-jitter that does not come from a for loop interpolating between two points. Keystroke timing carries the same signal in a different dimension: the dwell time a key is held and the flight time between keys form a rhythm that is hard to fake convincingly. The keyboard segment records the cadence, not the characters, which is why the field can exist without the script logging what you actually typed.
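
The delta idea can be sketched in a few lines. This is an illustration of relative encoding in general, not the script's actual record layout: the tuple shape `(timestamp_ms, x, y)`, the choice to store the first sample relative to zero, and the function names are all assumptions made for the example.

```python
from typing import List, Tuple

Sample = Tuple[int, int, int]  # (timestamp_ms, x, y), an assumed shape


def to_deltas(samples: List[Sample]) -> List[Sample]:
    """Store each sample as its change from the previous one."""
    deltas = []
    prev = (0, 0, 0)  # the first sample is stored relative to zero
    for s in samples:
        deltas.append((s[0] - prev[0], s[1] - prev[1], s[2] - prev[2]))
        prev = s
    return deltas


def from_deltas(deltas: List[Sample]) -> List[Sample]:
    """Rebuild absolute samples by running sums over the deltas."""
    samples = []
    t = x = y = 0
    for dt, dx, dy in deltas:
        t, x, y = t + dt, x + dx, y + dy
        samples.append((t, x, y))
    return samples


def dwell_and_flight(keys: List[Tuple[int, int]]):
    """keys holds (down_ms, up_ms) per key press; no characters are needed."""
    dwell = [up - down for down, up in keys]
    flight = [keys[i + 1][0] - keys[i][1] for i in range(len(keys) - 1)]
    return dwell, flight


moves = [(1000, 100, 200), (1016, 104, 203), (1033, 111, 205), (1049, 115, 204)]
print(to_deltas(moves))
print(dwell_and_flight([(0, 95), (180, 260), (410, 520)]))
```

After the first entry, every number in the delta list is small, which is where the size saving comes from, and the keystroke function never sees which key was pressed.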

This is also why an empty or synthetic behavioral block is such a strong tell. A payload that arrives with zero mouse events, zero scroll, a dead-straight pointer path, or keystroke timing with no natural variance reads as non-human almost immediately. The behavior model is trained to catch exactly the artifacts a naive automation produces. And because the server keeps state across a session, it can also catch replay: a previously valid telemetry block, resubmitted, will not match the new request’s context. Akamai’s own material claims the model detects “minuscule differences in behavioral characteristics” even when a bot tries to “mimic human behavior or replay previously validated telemetry.” The replay resistance is the part that makes the behavioral fields more than a checkbox.
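
Two of the artifacts named above, a dead-straight path and timing with no variance, are easy to measure by hand. The sketch below computes both as plain features. It is not Akamai's model, which is described only as machine-learned. The function names are invented, and no thresholds are suggested because none are published.

```python
import math
import statistics


def straightness(points):
    """Ratio of end-to-end distance to travelled path length (1.0 = straight)."""
    path = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    if path == 0:
        return 1.0  # no movement at all: treat as maximally straight
    return math.dist(points[0], points[-1]) / path


def interval_cv(timestamps):
    """Coefficient of variation of the gaps between consecutive events."""
    if len(timestamps) < 2:
        return 0.0
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    mean = statistics.fmean(gaps)
    if mean == 0:
        return 0.0
    return statistics.pstdev(gaps) / mean


interpolated = [(0, 0), (10, 10), (20, 20), (30, 30)]  # a loop between two points
wandering = [(0, 0), (10, 4), (22, 5), (30, 0)]        # curves, then corrects

print(straightness(interpolated), straightness(wandering))
print(interval_cv([0, 16, 32, 48]), interval_cv([0, 14, 37, 49, 80]))
```

A trace that scores as perfectly straight with zero timing variance is the naive-automation signature the paragraph describes. Real traces land somewhere messier on both measures.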

[Figure: four categories, one flat string. Device + environment: canvas hash, WebGL strings, audio, screen, fonts, plugins, navigator. Behavioral: mouse deltas, key dwell/flight, scroll, touch, focus/blur. Timing: navigation + paint timestamps, script execution duration. Coherence probes: webdriver flag, window.chrome shape, callPhantom, prototype tamper checks.]

*The four telemetry categories the markers carry. Device and environment are captured once; behavioral and timing accumulate over the page's life; the coherence probes look for self-contradiction.*

Timing and consistency fields

Two smaller categories do disproportionate work. The first is timing. The script reads the Navigation Timing and Paint Timing APIs and folds in its own measurements: how long the page took to reach DOM-ready, when first paint fired, how long the serialization routine itself ran. These values are cheap to collect and they cross-check the behavioral story. A headless environment that renders nothing produces paint timings that do not look like a real page load. A script running far faster or far slower than a real CPU and event loop would produce leaves a mark. Timing also seeds parts of the obfuscation, which ties the anti-tamper layer to a live-execution assumption.

The second is the consistency or coherence block, and this is the part that hunts for lies rather than measuring behavior. The script probes for the presence and shape of things a real browser has and an automated one often gets wrong. It reads navigator.webdriver, the standardized flag that is supposed to be true under automation. It inspects the exact shape of the window.chrome object, which stealth tooling frequently stubs incompletely. It checks for telltale globals like callPhantom and window.opera and non-standard properties such as mozInnerScreenY, whose presence betrays a specific automation stack, a spoofing library, or a browser other than the one the user-agent claims. It can detect prototype poisoning, where a script has overwritten a native function to lie about a value, because an overwritten toString or a function with the wrong arity does not match what a genuine native implementation returns. The coherence check is the field that catches a browser claiming to be Chrome on Windows while exposing a Linux GPU string and a webdriver flag it forgot to hide. A single such contradiction can be enough to score the payload as a bot, and the reverse-engineering literature is consistent that a single mismatch against the observed request fingerprint can invalidate the cookie immediately. This is the same philosophy DataDome’s HTTP/2 fingerprinting applies at the network layer: agreement across independent signals is the actual test, and any one of them disagreeing is fatal.
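
Seen from the scoring side, a coherence check amounts to comparing independent claims and listing the ones that disagree. The sketch below is a toy version under loud assumptions: the dictionary keys (`user_agent`, `webgl_renderer`, `webdriver`, `has_window_chrome`, `has_call_phantom`) are hypothetical names, the substring tests are crude stand-ins, and the real rules are neither published nor this simple.

```python
def find_contradictions(env: dict) -> list:
    """Return human-readable mismatches between a browser's claims."""
    issues = []
    ua = env.get("user_agent", "")
    renderer = env.get("webgl_renderer", "").lower()

    if env.get("webdriver") is True:
        issues.append("navigator.webdriver is true")
    # Mesa and llvmpipe are used here as illustrative markers of a Linux stack.
    if "Windows" in ua and ("mesa" in renderer or "llvmpipe" in renderer):
        issues.append("Windows user-agent with a Linux-style GPU string")
    if "Chrome/" in ua and not env.get("has_window_chrome", False):
        issues.append("Chrome user-agent without window.chrome")
    if env.get("has_call_phantom", False):
        issues.append("callPhantom global present")
    return issues


CHROME_WINDOWS_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)

spoofed = {
    "user_agent": CHROME_WINDOWS_UA,
    "webgl_renderer": "Mesa llvmpipe",
    "webdriver": True,
    "has_window_chrome": True,
}
consistent = {
    "user_agent": CHROME_WINDOWS_UA,
    "webgl_renderer": "ANGLE (NVIDIA)",
    "webdriver": False,
    "has_window_chrome": True,
}

print(find_contradictions(spoofed))
print(find_contradictions(consistent))
```

Each rule reads two fields that a real browser fills from the same underlying machine, which is why a spoof that fixes one of them in isolation tends to create a new mismatch somewhere else.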

Obfuscation and encoding

A flat string of fields would be trivial to read off the wire, so the last stage of the pipeline exists to make that read expensive. The obfuscation operates at two levels: the script that builds the payload is itself obfuscated, and the payload it emits is encrypted before transport.

The script obfuscation is conventional but thorough. Strings are not stored as literals; they live in a single rotated array and get decoded at runtime by a custom function that takes a numeric offset, so reading the source shows you _0x3a1f(0x1e) where a clear build would show "navigator". Control flow is flattened into a dispatcher loop so the linear logic is hard to follow. Dead code is injected to pad the real branches. And the script carries timing traps that detect a debugger: if execution between two checkpoints takes too long, which is what stepping through in DevTools does, the script can tell it is being watched. None of this is unique to Akamai, but the combination is dense enough that static analysis alone gets you very little, and the practical path through it is dynamic instrumentation.
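
The string-array trick is simple to model outside the browser. The sketch below imitates the pattern in Python: a shipped array in scrambled order, a rotation applied once at startup, and a lookup function that subtracts a base offset. The array contents, the rotation count, and the base `0x1c` are all invented for the example. Real builds choose their own values and layer further encoding on top.

```python
def rotate(items, count):
    """Rotate a list left by count positions."""
    count %= len(items)
    return items[count:] + items[:count]


def make_decoder(shipped, rotation, base):
    """Build the lookup function the obfuscated code calls with numeric offsets."""
    table = rotate(shipped, rotation)  # done once, when the script starts

    def decode(index):
        return table[index - base]

    return decode


# Hypothetical values: the order as it sits in the source, before rotation.
SHIPPED = ["userAgent", "navigator", "webdriver", "plugins"]
decode = make_decoder(SHIPPED, rotation=3, base=0x1C)

print(decode(0x1E))  # what a call such as _0x3a1f(0x1e) stands for in this toy
```

Reading the source statically shows only the shipped order and opaque offsets. The mapping exists only after the rotation has run, which is one reason dynamic instrumentation is the practical route.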

The payload encryption is where the version history matters, because it has been rebuilt to defeat exactly the static-extraction shortcut that worked on earlier builds. In the v2 generation the serialized field string was concatenated and then encrypted with a scheme involving data shuffling and character substitution before base64. The decisive change in v3 was to make the encryption depend on a hash of the script file itself. The serialized JSON is turned into a colon-delimited string, and the elements of that string are shuffled by a pseudo-random number generator seeded from a hash derived from Akamai’s own JavaScript. Because the seed comes from the live script, you cannot precompute the transform from a captured payload alone; you have to be running the current script to reproduce it, and Akamai rotates the script. There is a bootstrap wrinkle worth noting: on the very first request, before a server-issued value exists, the script uses a default hash documented in the community as 8888888, and only once the response returns a valid cookie do subsequent payloads switch to a hash derived from that cookie. The effect is a chicken-and-egg dependency that ties each payload to a specific session and a specific script build.
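
The property that matters here, a shuffle only the holder of the seed can undo, can be shown with a generic seeded Fisher-Yates shuffle. To be clear about the assumptions: the linear congruential generator below is a textbook one chosen for brevity, not the PRNG the script uses; the seed is passed in directly rather than derived from any script or cookie hash; and the element values are invented.

```python
def lcg(seed):
    """Textbook 32-bit linear congruential generator (an assumed stand-in)."""
    state = seed & 0xFFFFFFFF
    while True:
        state = (state * 1664525 + 1013904223) & 0xFFFFFFFF
        yield state


def swap_plan(length, seed):
    """The Fisher-Yates swap sequence that a given seed produces."""
    rng = lcg(seed)
    return [(i, next(rng) % (i + 1)) for i in range(length - 1, 0, -1)]


def shuffle(items, seed):
    out = list(items)
    for i, j in swap_plan(len(out), seed):
        out[i], out[j] = out[j], out[i]
    return out


def unshuffle(items, seed):
    """Undo shuffle() by replaying the same swaps in reverse order."""
    out = list(items)
    for i, j in reversed(swap_plan(len(out), seed)):
        out[i], out[j] = out[j], out[i]
    return out


elements = "ver:ua:screen:mouse:keys:timing".split(":")  # invented elements
seed = 8888888  # the bootstrap default mentioned above, used only as a sample

mixed = shuffle(elements, seed)
print(":".join(mixed))
print(":".join(unshuffle(mixed, seed)))
```

Undoing the shuffle requires regenerating exactly the same swap plan, so a decoder built for one seed generally stops working once the seed changes. That is the staleness the script-hash seeding is designed to cause.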

[Figure: three encryption generations. v1: flat field string, lighter encoding. v2: shuffle + substitution, then base64. v3: PRNG shuffle seeded from live script hash. Each version binds the payload tighter to a live, rotating script and the current session's cookie hash.]

*Three encryption generations. The direction of travel is toward making the transform impossible to precompute from a captured payload alone, by seeding it from a hash of the script that Akamai keeps rotating.*

The through-line across versions is consistent. v1 hid the field layout behind a reused separator and light encoding. v2 added a real encryption step over the serialized string. v3 bound that encryption to the script file’s own hash so that a static decryptor goes stale every time the script rotates. None of this stops a sufficiently patient analyst from instrumenting the live script and watching the payload form, which is how the community mapping exists at all. What it does is raise the per-deployment, per-version cost of doing so, and tie any extracted understanding to a moving target. The relationship between this payload and the cookie it earns is covered in the _abck cookie post; how the decoded telemetry becomes a score is the subject of the Bot Manager scoring post.

The mobile variant

The web sensor has a sibling that solves the same problem on phones, and it is worth a section because the engineering is so different. Akamai’s mobile SDK for Bot Manager Premier, the BMP SDK, runs as native code inside an app rather than as JavaScript in a page. A teardown of one mobile build, version 4.1.3, found the telemetry collected through a set of native classes rather than DOM event listeners. Motion data, the accelerometer, gyroscope, and magnetometer readings that a desktop browser cannot supply, comes from a dedicated sensor class, with a second class processing higher-order motion such as jerk across multiple axes. Touch events record down, move, and up coordinates. A device-metadata class gathers dozens of device-information fields. A system-fingerprinting class produces a device ID.

The serialization rhymes with the web format. The same -1,2,-94,{code},{value} separator-and-code structure appears, opening with the SDK version string as the first field, which on that build is the literal 4.1.3. The encryption is heavier and native: the analysis describes the serialized data passing through AES-128-CBC, an HMAC-SHA256 authentication tag, and base64, with parts of the pipeline implemented in ARM64 assembly reached through JNI calls so that the cryptographic constants never sit in readable Java. The output is a compound string carrying RSA-wrapped material, the base64 of the IV-ciphertext-HMAC bundle, and a timing component. The field grammar is shared with the web sensor; the obfuscation moves from JavaScript tricks to native-code and assembly hardening, which is the right call on a platform where you control the binary.
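
The IV-ciphertext-HMAC bundle described above follows the common encrypt-then-MAC framing, which can be sketched with the standard library alone. Several things here are assumptions rather than findings from the teardown: that the tag covers the IV plus the ciphertext, that the tag is the full 32-byte digest, and the function names. The AES step is not implemented. Random bytes stand in for the ciphertext, since Python's standard library has no AES.

```python
import base64
import hashlib
import hmac
import os

IV_LEN = 16   # AES block size in bytes
TAG_LEN = 32  # SHA-256 digest size in bytes


def pack(iv: bytes, ciphertext: bytes, mac_key: bytes) -> str:
    """Append an HMAC-SHA256 tag over iv + ciphertext, then base64 the bundle."""
    tag = hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()
    return base64.b64encode(iv + ciphertext + tag).decode("ascii")


def unpack(bundle: str, mac_key: bytes):
    """Split the bundle and verify the tag before trusting the contents."""
    raw = base64.b64decode(bundle)
    if len(raw) < IV_LEN + TAG_LEN:
        raise ValueError("bundle too short")
    iv, ciphertext, tag = raw[:IV_LEN], raw[IV_LEN:-TAG_LEN], raw[-TAG_LEN:]
    expected = hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("HMAC mismatch")
    return iv, ciphertext


mac_key = os.urandom(32)
iv = os.urandom(IV_LEN)
ciphertext = os.urandom(48)  # placeholder for AES-128-CBC output (3 blocks)

bundle = pack(iv, ciphertext, mac_key)
print(len(bundle), unpack(bundle, mac_key) == (iv, ciphertext))
```

The tag lets the receiver reject a modified bundle before attempting decryption, which suits a payload whose purpose is to be trusted as an unaltered recording.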

What the payload tells you about the design

Step back from the field codes and a clear design intent shows through. The payload is built so that no single signal carries the decision and so that the signals constrain each other. The device fields establish what the browser claims to be. The behavioral fields establish whether a human appears to be driving it. The timing fields cross-check the execution environment. And the coherence block exists purely to catch the gap between claim and reality, because the failure mode of every spoofing effort is internal contradiction. A bot can fake any one field. Faking all of them in mutual agreement, continuously, across a stateful session, against a model trained on the artifacts of the easy fakes, is the actual problem the payload poses.

The obfuscation history is the other half of the story. Each version moved the encryption closer to the live script and the live session, away from anything an attacker could compute once and reuse. v3’s script-hash seeding is the clearest statement of that: the payload is not just encrypted, it is encrypted in a way that only the current, rotating script can reproduce, with a bootstrap that ties the very first request into the chain. That is a design optimizing against a specific adversary, the one who extracts a transform once and replays it at scale, and it shifts the economics toward whoever is willing to keep re-instrumenting a moving target.

The honest caveat stays in force throughout. Akamai does not publish the field grammar, the code assignments, or the encryption internals, and the specifics here come from independent reverse-engineering of particular builds that have since changed. The numbers will drift. What does not drift is the categories and the philosophy: a browser’s self-description, a recording of how it was driven, a clock to check both against, and a set of probes whose only job is to notice when the story does not hold together. Read the codes as a snapshot of a system that is rebuilt on a schedule, and the snapshot is still worth having.


Sources & further reading

  • Akamai (2024), Bot Manager — vendor product page describing behavior anomaly detection, the injected script, and the 0-to-100 anomaly score.
  • Akamai Developer (2024), Akamai Bot Manager — overview of telemetry from client input devices and standard versus inline telemetry collection.
  • Edioff (2024), akamai-analysis — educational deep dive into Bot Manager v2 signal categories, the ~100-signal count, and string-array-rotation obfuscation.
  • Edioff (2024), sensor_data structure (sanitized) — documents the pipe- and comma-delimited section layout and the version/device/browser/behavior/timing/checksum breakdown.
  • OXDBXKXO (2023), akamai-toolkit — v1.70 sensor_data parser mapping numeric fields (70 = fpValstr, 115 = coherence, 129 = w) to named categories.
  • glizzykingdreko (2023), Akamai v3 sensor data: encryption and structure — the colon-delimited serialization, PRNG shuffle seeded from a script-file hash, and the 8888888 bootstrap default.
  • 小伟 (2024), Decoding Akamai 2.0: sensor_data and akamai-bm-telemetry — the 58-element array, base64-wrapped akamai-bm-telemetry, and the canvas / motion-trajectory primacy.
  • xve-e (2024), Analyzing Akamai BMP 4.1.3, part 2 — mobile SDK teardown covering native sensor classes, the shared separator grammar, and the AES-128-CBC / HMAC-SHA256 / base64 native pipeline.
  • Dima Kynal (2026), The hidden fingerprints of bot protection — the bmak global object, the _abck and bm_sz cookies, and the Akamai response-header tells.
  • Scrapfly (2026), Bypass Akamai Bot Manager — survey of the telemetry categories the sensor collects and the single-mismatch invalidation behavior.
