AudioContext fingerprinting: the OscillatorNode signature explained
Generate a 10 kHz triangle wave, push it through a dynamics compressor, render the result offline, and sum the absolute value of the samples that come out. You get a number. On Chrome for macOS it lands somewhere around 101.46. On Firefox it is closer to 80.95, on Safari closer to 79.59. Nobody hears any of this. No speaker moves, no microphone opens, no permission prompt fires. The audio is computed into a buffer and thrown away the instant the sum is read. What survives is the float, and that float is remarkably consistent on one machine and frustratingly different across machines that look identical from the outside.
The strange part is that the math is supposed to be deterministic. A triangle wave is a closed-form function. A compressor is a published algorithm. Two browsers running the same spec on the same input should produce the same output to the last bit, and they do not. The gap between “should be identical” and “is not” is the whole signal. This post traces where that gap comes from, why a triangle wave through a compressor turns into an identifier, how much it actually distinguishes you, and what the browsers that noticed have done about it.
A roadmap
We start with the 2016 measurement that found audio fingerprinting running on real sites and named the two node configurations in the wild. Then we walk the actual API calls a collector makes, node by node, and the two ways the buffer gets reduced to a number. After that comes the part most explainers skip: why deterministic floating-point DSP produces different bytes on different machines, which is a story about FFT twiddle factors, denormals, and per-platform math libraries. Then the entropy question, the move from a research curiosity to a standard line in every commercial fingerprinting SDK, and finally the defenses, the noise-injection arms race, and where the vector sits in 2026.
The 2016 origin
The technique has a clean birth date. In October 2016, Steven Englehardt and Arvind Narayanan presented “Online Tracking: A 1-million-site Measurement and Analysis” at ACM CCS. The paper crawled the top million sites with an instrumented browser and catalogued tracking behavior at a scale nobody had reached before. Most of it was about cookies and canvas. Buried in Section 6.4 was a technique they described as never measured at scale: fingerprinting by abusing the Web Audio API.
Their description is the one worth quoting because it frames the whole idea. Audio signals processed on different machines or browsers may have slight differences due to hardware or software differences between the machines, while the same combination of machine and browser will produce the same output. That is canvas fingerprinting moved from the GPU to the audio DSP path. Same logic, different subsystem. Render a fixed input through the browser’s own processing pipeline, read the output, and let the machine-specific quirks of that pipeline become the identifier.
The paper found the technique used three different ways across three scripts. The simplest, from a company called Liverail, did not render anything. It just checked whether AudioContext and OscillatorNode existed and added a single bit to a larger fingerprint. The more sophisticated scripts, served from cdn-net.com, pxi.pub, and ad-score.com, actually processed an oscillator signal and hashed the result. The paper’s Figure 8 lays out two distinct node graphs: one that runs an oscillator into an AnalyserNode and reads an FFT, another that runs an oscillator into a compressor inside an OfflineAudioContext. Both end the same way, by reading the resulting samples and hashing them with SHA-1 into a device audio fingerprint.
To find out whether the thing actually worked, they built a demo page and let it collect. The numbers they reported are modest and honest. Across 18,500 devices with distinct cookies, the audio fingerprint resolved into 713 distinct values, which they estimated at 5.4 bits of entropy. They were careful to say this was a lower bound from a limited sample and left a full evaluation to future work. The technique was rare in 2016. The Liverail check appeared on 512 sites, the rendering scripts on as few as 6. The point of the section was not that audio fingerprinting was widespread. It was that a one-million-site crawl could bootstrap off a handful of known scripts to discover and quantify a brand-new vector before it spread.
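To make the entropy figure concrete, here is a minimal sketch of how a Shannon entropy estimate falls out of a fingerprint distribution. The function name and the toy distributions are illustrative assumptions, not the paper's code or data, and it assumes the 5.4-bit figure is a Shannon entropy.

```javascript
// Shannon entropy, in bits, of an observed fingerprint distribution.
// counts[i] = number of devices that produced the i-th distinct value.
function shannonEntropyBits(counts) {
  const total = counts.reduce((a, b) => a + b, 0);
  let bits = 0;
  for (const c of counts) {
    if (c === 0) continue; // an empty bucket contributes nothing
    const p = c / total;
    bits -= p * Math.log2(p);
  }
  return bits;
}

// Hypothetical toy distributions, not data from the paper.
const uniform = shannonEntropyBits([25, 25, 25, 25]); // 2 bits
const skewed = shannonEntropyBits([97, 1, 1, 1]);     // well under 2 bits

// Ceiling for 713 distinct values if all were equally common.
const ceiling713 = Math.log2(713); // about 9.48 bits
```

Under that assumption, the distance between the roughly 9.48-bit ceiling and the reported 5.4 bits says the 713 values were far from evenly spread: a few common values covered most devices.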
What the script actually does
Strip the rendering variant to its skeleton and there are five steps. Create an offline context. Make an oscillator. Make a compressor. Wire oscillator into compressor into the context destination, start the oscillator, and render. Then read the buffer back and reduce it to a number.
The standard reference implementation today is the one in FingerprintJS, whose audio source has been copied, near-verbatim, into most open-source fingerprinting libraries. Its parameters are worth stating exactly because they are the de facto convention. The context is a single channel at a 44,100 Hz sample rate, with a render length of 5,000 samples, which works out to roughly 113 milliseconds of audio that never plays. The oscillator is a triangle wave at 10,000 Hz. The compressor runs with threshold at -50 dB, knee at 40 dB, ratio at 12, attack at 0, and release at 0.25 seconds.
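The parameters above translate into a node graph along these lines. This is a sketch under stated assumptions, not the FingerprintJS source: `buildGraph` and `renderSamples` are hypothetical names, and the render step assumes the promise-based form of `startRendering()`.

```javascript
// Configure the oscillator -> compressor -> destination graph on an
// already-created offline context, using the parameters quoted above.
function buildGraph(ctx) {
  const osc = ctx.createOscillator();
  osc.type = "triangle";
  osc.frequency.value = 10000; // Hz

  const comp = ctx.createDynamicsCompressor();
  comp.threshold.value = -50; // dB
  comp.knee.value = 40;       // dB
  comp.ratio.value = 12;
  comp.attack.value = 0;      // seconds
  comp.release.value = 0.25;  // seconds

  osc.connect(comp);
  comp.connect(ctx.destination);
  osc.start(0);
  return { osc, comp };
}

// Browser-only glue. Assumes the promise-based startRendering();
// older prefixed contexts signal completion through an oncomplete event.
async function renderSamples(OfflineCtx) {
  const ctx = new OfflineCtx(1, 5000, 44100); // channels, frames, sample rate
  buildGraph(ctx);
  const buffer = await ctx.startRendering();
  return buffer.getChannelData(0); // Float32Array of 5000 samples
}
```

In a page this would be called as `renderSamples(window.OfflineAudioContext)`. Nothing is connected to a real output device; the destination of an offline context is just the buffer.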
Those compressor numbers are not arbitrary and they are not chosen for how the audio sounds, because nothing ever hears it. A DynamicsCompressorNode lowers the volume of the loudest parts of a signal, and its behavior is governed by a gain-reduction curve with a soft knee around the threshold. Pushing the threshold low, the knee wide, and the ratio high forces the signal deep into the nonlinear part of that curve, where the per-sample gain depends on a running envelope follower with the configured attack and release. That nonlinearity is the point. A linear pass-through would expose only the oscillator’s quirks. Running the tone through an aggressive compressor multiplies the number of floating-point operations each sample passes through and amplifies any tiny per-platform difference in how those operations round.
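To see what “deep into the nonlinear part of the curve” means, here is a textbook soft-knee gain curve in the dB domain. It is an assumption-laden stand-in: the exact curve and envelope follower a browser uses are implementation details, and `compressDb` is a hypothetical helper, not a Web Audio API.

```javascript
// Static soft-knee compression curve in the dB domain (textbook form).
// NOT the exact curve the Web Audio spec or any browser implements.
// Assumes kneeDb > 0 and ratio >= 1.
function compressDb(xDb, thresholdDb, kneeDb, ratio) {
  const over = xDb - thresholdDb;
  if (2 * over < -kneeDb) return xDb;                       // below the knee: unchanged
  if (2 * over > kneeDb) return thresholdDb + over / ratio; // above the knee: full ratio
  // Inside the knee: quadratic blend between the two straight segments.
  const t = over + kneeDb / 2;
  return xDb + ((1 / ratio - 1) * t * t) / (2 * kneeDb);
}

// With the collector's settings, a full-scale (0 dB) input sits far above the knee.
const outDb = compressDb(0, -50, 40, 12); // -50 + 50/12, about -45.83 dB
```

With a -50 dB threshold and a 40 dB knee, everything louder than -70 dB is already being reshaped, so essentially the entire tone passes through the curved or compressed region rather than the linear one.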
The reduction step is deliberately crude. The FingerprintJS variant does not read all 5,000 samples. It takes the slice from index 4,500 to 5,000, the tail of the buffer after the compressor’s envelope has settled, and sums the absolute value of each sample.
    let hash = 0
    for (let i = 4500; i < 5000; i++) {
      hash += Math.abs(sample[i])
    }
    return hash // e.g. 124.0434

That is the entire fingerprint reducer. There is no SHA needed, because the collector is not trying to compress the buffer; it is trying to collapse 500 floats into one float that is stable on a machine and differs across machines. Summing absolute values does that and shrugs off a single sample landing one ULP off. The earlier research scripts hashed with SHA-1 over the FFT magnitudes instead, which gives a fixed-length identifier but is more brittle, since one flipped low bit changes the whole digest. Both approaches survive in the wild. Sum-of-absolutes is the one that won, because stability beats cryptographic neatness when the input is itself approximate.
*The compressor variant in full. The tail 500 samples are summed by absolute value into one float; the buffer is discarded.*

One implementation wrinkle matters in practice. Safari historically required webkitOfflineAudioContext rather than the unprefixed name, so portable collectors feature-detect both. Mobile Safari 11 and earlier suspended audio contexts unless a user gesture resumed them, which blocked silent fingerprinting outright on those versions. And on current iOS the context can refuse to start until the tab is in the foreground, so the collector retries. These are the kinds of edge cases that pad a real implementation to a few hundred lines despite the core being a dozen.
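The robustness claim is easy to check on a toy buffer. The sketch below nudges one float32 sample to its neighboring representable value and compares the sums; `bumpOneUlp` and the three-sample buffer are illustrative assumptions, not part of any real collector.

```javascript
function sumAbs(samples) {
  let sum = 0;
  for (let i = 0; i < samples.length; i++) sum += Math.abs(samples[i]);
  return sum;
}

// Return a copy with one sample nudged to the adjacent float32 value
// (one ULP larger in magnitude). Assumes a finite, non-maximal sample.
function bumpOneUlp(samples, index) {
  const copy = new Float32Array(samples);
  new Uint32Array(copy.buffer)[index] += 1; // increment the raw bit pattern
  return copy;
}

const original = Float32Array.from([0.5, -0.25, 0.125]); // toy buffer
const nudged = bumpOneUlp(original, 0);

// One float32 ULP at 0.5 is 2^-24, about 6e-8: invisible at four decimals.
const drift = Math.abs(sumAbs(nudged) - sumAbs(original));
// The raw sample differs, so any digest over the bytes would change completely.
const bitsDiffer = nudged[0] !== original[0];
```

A sum that is later rounded or bucketed absorbs that drift; a digest over the raw bytes cannot.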
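The constructor feature-detection can be as small as this. It is a sketch: `pickOfflineContext` is a hypothetical helper name, and it deliberately leaves out the suspended-context and retry handling described above.

```javascript
// Pick whichever offline-context constructor the environment exposes.
// `globalObj` would be `window` in a page; null means no offline audio support.
function pickOfflineContext(globalObj) {
  return globalObj.OfflineAudioContext || globalObj.webkitOfflineAudioContext || null;
}
```

A collector would bail out, or record a sentinel value, when this returns null rather than throwing.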
Why deterministic math gives different bytes
This is the question the surface-level write-ups wave at and the one that actually explains the vector. If the input is a fixed function and the algorithm is a published spec, why does Chrome on macOS sum to 101.46 while Firefox sums to 80.95? The answer is that “the algorithm” hides an enormous amount of per-platform freedom in how the floating-point arithmetic is actually carried out, and the Web Audio spec never nailed it down.
Start with the oscillator. A triangle wave at 10 kHz is not stored as a lookup table in most engines; it is synthesized, and band-limited synthesis to avoid aliasing leans on transcendental functions. The compressor’s gain computer evaluates its curve with pow, exp, and log style operations on every sample. The FFT path runs a fast Fourier transform whose twiddle factors come from sin and cos. None of sin, cos, pow, exp, or log is specified to the last bit by IEEE 754. The standard mandates correct rounding for add, multiply, divide, and square root. It explicitly does not require it for the transcendental functions, which means every platform’s math library is free to return a slightly different result for sin(x), as long as it is close. Apple’s libm, glibc, the MSVC runtime, and the bundled fdlibm all disagree in the last bit or two for many inputs. Run a few thousand samples through a chain of those functions and the rounding differences accumulate into a visible difference in the sum.
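The accumulation effect can be imitated without two real math libraries. In the sketch below, the second “library” is just the engine's own sine rounded to float32, which is an assumption standing in for a libm that disagrees in the low bits; it uses a plain sine rather than a band-limited triangle, and all names are illustrative.

```javascript
// Sum |sin| over the collector's 5000-sample grid using a supplied sine.
function sumAbsSine(sinFn, frames = 5000, freq = 10000, rate = 44100) {
  let sum = 0;
  for (let i = 0; i < frames; i++) {
    sum += Math.abs(sinFn((2 * Math.PI * freq * i) / rate));
  }
  return sum;
}

const libA = (x) => Math.sin(x);              // the engine's own sine
const libB = (x) => Math.fround(Math.sin(x)); // stand-in "other library": float32-rounded

const sumA = sumAbsSine(libA);
const sumB = sumAbsSine(libB);
const gap = Math.abs(sumA - sumB); // tiny, but not zero
```

Each individual sample disagrees by at most a few parts in a hundred million, yet the two sums no longer match exactly. The real pipeline layers a compressor on top, which gives those low-bit differences more operations to travel through.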
The hardware compounds it. A fused multiply-add rounds once where a separate multiply and add round twice, and whether a given build uses it depends on the compiler and the vectorized code path, so the same expression can give different low bits on different machines. x86 with AVX, an Arm core with NEON, and Apple silicon each take different routes through the same compressor loop. The browser engineers know this. The FingerprintJS write-up notes that Chrome ships a separate FFT implementation on macOS, where it can call Apple’s Accelerate framework, and that its vector operations differ across CPU architectures specifically in the compressor path. That is why the audio fingerprint correlates so strongly with the operating-system-and-CPU bucket: it is, in large part, a fingerprint of which math library and which SIMD path the browser linked against.
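The one-rounding-versus-two difference can be reproduced in a few lines. JavaScript has no fused multiply-add primitive, so this sketch emulates one for float32 operands by keeping the product exact in a double; the operand values are chosen by hand to land on a rounding tie and are not taken from any browser's compressor.

```javascript
// float32 operands chosen so the exact product falls halfway between two float32 values.
const a = Math.fround(1 + 2 ** -12);
const b = Math.fround(1 + 2 ** -12);
const c = -1;

// Unfused: round the product to float32, then round the sum to float32.
const unfused = Math.fround(Math.fround(a * b) + c);

// Fused (emulated): the product of two float32 values is exact in a double,
// so only the final result is rounded to float32.
const fused = Math.fround(a * b + c);

const differ = unfused !== fused; // true: 2^-11 versus 2^-11 + 2^-24
```

Same inputs, same expression, two different float32 answers, and the only thing that changed is where the rounding happened.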
Then there are denormals. When the compressor’s envelope decays toward zero, the sample values get extremely small, into the subnormal range of the float format. Some CPUs and some build configurations enable flush-to-zero, where subnormals are clamped to zero instead of processed at full precision, because handling true denormals is slow. Whether flush-to-zero is on changes the decay tail of the compressed signal, which is exactly the region the collector sums. This is one reason the tail samples are the useful ones and also one reason the same browser can occasionally produce two different values: a context that runs while the CPU is in a different power state can take a different denormal path.
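Flush-to-zero is simple to imitate on a toy decay. The sketch below is an emulation in float32, with `decayTail` as a hypothetical helper; it shows the mechanism only and says nothing about how large the effect is in a real compressor tail.

```javascript
const MIN_NORMAL_F32 = 2 ** -126; // smallest normal float32 magnitude

// Repeatedly scale a float32 value, optionally clamping subnormals to zero
// the way a flush-to-zero floating-point mode would.
function decayTail(start, factor, n, flushToZero) {
  const out = new Float32Array(n);
  let v = Math.fround(start);
  for (let i = 0; i < n; i++) {
    v = Math.fround(v * factor);
    if (flushToZero && v !== 0 && Math.abs(v) < MIN_NORMAL_F32) v = 0;
    out[i] = v;
  }
  return out;
}

const gradual = decayTail(2 ** -124, 0.5, 6, false); // keeps subnormals
const flushed = decayTail(2 ** -124, 0.5, 6, true);  // clamps them to zero
```

The two tails agree while the values are normal and part ways at the first subnormal: `gradual[2]` is 2^-127 while `flushed[2]` is 0.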
*Four independent sources of divergence stack inside one render. The audio fingerprint is, to a large degree, a fingerprint of which math library and SIMD path the browser linked.*

The honest caveat: the exact byte-level contribution of each of these sources on a given device is not something a page can read out directly, and browser vendors do not publish a breakdown. What follows from the public engineering discussions is the mechanism, not a per-device attribution. The clearest confirmation that this is the real cause comes from the fix Mozilla shipped, which is the next part of the story.
From curiosity to standard SDK line
Between 2016 and now the vector went from a research footnote to a default signal in nearly every commercial device-intelligence product. The reason is not that audio is especially distinguishing on its own. It is not. The 5.4 bits the original study measured is low, and a 2021 study from the University of New Orleans, “A Study of Feasibility and Diversity of Web Audio Fingerprints,” found it lower still: across 2,093 users they saw only 95 distinct fingerprints from the standard vectors, far less diverse than canvas or WebGL. Their phrasing for the stability problem is worth keeping. The audio vectors showed fickleness, with some browsers handing back differing fingerprints on repeated attempts, and the authors had to build a graph-based clustering step to recover a stable identifier from the noise.
So why is it everywhere? Because it is cheap, it is fast, and it is orthogonal. The whole render takes on the order of a hundred milliseconds and can run multiple iterations in well under a second. It needs no permission and shows no visible sign. And critically, it splits the population along a different axis than the visual fingerprints. Canvas and WebGL key heavily on the GPU and the font stack. Audio keys on the CPU’s math library and SIMD path. A collector that already has 18 bits from canvas gains real separating power from an orthogonal 5 bits of audio, even when those 5 bits are individually weak, because the combined entropy is what matters. In a multi-signal fingerprint, a cheap orthogonal vector earns its place. That is why the FingerprintJS audio module, and the dozens of forks of it, sit in fingerprinting stacks that feed device identification inside anti-bot systems and the broader JavaScript runtime fingerprint a detector assembles on the first page load. It also sits next to siblings that key on different subsystems: the navigator object, screen and device-pixel-ratio entropy, and the font enumeration side channel, each contributing its own orthogonal slice.
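The arithmetic behind “combined entropy is what matters” is short enough to write down. This sketch assumes the signals are statistically independent, which makes the sum an upper bound rather than a measurement; the helper names are illustrative.

```javascript
// Upper bound on combined entropy, ASSUMING the signals are independent.
function combinedBits(bitsPerSignal) {
  return bitsPerSignal.reduce((a, b) => a + b, 0);
}

// Number of equally likely buckets a given bit count can distinguish.
function bucketCount(bits) {
  return 2 ** bits;
}

const canvasOnly = bucketCount(18);                   // 262144
const withAudio = bucketCount(combinedBits([18, 5])); // 8388608, 32 times as many
```

Five extra bits look small next to eighteen, but they multiply the number of distinguishable buckets by 32. Any correlation between the signals eats into that factor, which is why orthogonality is the property collectors care about.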
For an automation operator the audio vector has a particular bite. A headless Chrome on a Linux server in a datacenter produces an audio sum that matches Linux-Chrome-on-that-CPU, and that is a different bucket from the Windows and macOS consumer machines the traffic is pretending to be. Spoofing the user-agent does nothing to the float. The compressor still runs through glibc and whatever SIMD the server CPU has, and the resulting value betrays the real platform. Patching it convincingly means intercepting the rendered buffer and substituting a value consistent with the claimed device, which the anti-detect tooling does at the engine level, with mixed success, because a substituted constant that never varies is itself a tell.
Defenses and the noise arms race
The browsers that took fingerprinting seriously split into two camps, and the split is instructive.
Firefox tried to make the output identical everywhere. The relevant work is Bugzilla 1358149, opened in 2017 after DoubleClick was caught running a silent AudioContext to fingerprint users, and closed in 2023. The fix was to route Web Audio’s math through fdlibm, the Freely Distributable Math Library, so that sin, cos, and the FFT twiddle factors return bit-identical results regardless of the host’s system libm. The commits replaced the trigonometric and hyperbolic functions in the audio path with fdlibm equivalents and even switched the FFT order calculation to integer log2 to dodge a floating-point precision difference. Under privacy.resistFingerprinting, Firefox also hardcodes the sample rate and channel count so those cannot leak either. The goal is a single, platform-independent audio value, which removes the entropy rather than hiding it. The Tor Browser, built on this work, aims to make every user’s audio fingerprint the same.
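The integer-log2 detail is a good miniature of the whole approach: swap a floating-point call whose last bit can vary for pure integer arithmetic that cannot. The sketch below shows the idea only; it is not Mozilla's code, and `fftOrder` is a hypothetical name.

```javascript
// Integer log2 for a positive power-of-two FFT size, with no floating point involved.
// Intended for sizes that fit in 32 bits.
function fftOrder(size) {
  if (!Number.isInteger(size) || size <= 0 || (size & (size - 1)) !== 0) {
    throw new RangeError("size must be a positive power of two");
  }
  return 31 - Math.clz32(size); // index of the single set bit
}

const order = fftOrder(2048); // 11
```

Counting leading zeros gives the same answer on every platform, whereas a floating-point logarithm followed by a truncation can land on the wrong side of an integer if the library returns a value a hair too low.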
Safari went the other way and added noise. Starting with Safari 17, WebKit perturbs the values read out of the Web Audio API by a small random factor, so that each read of a buffer comes back slightly different. The published behavior multiplies each sample by 1 + magnitude * (2 * r - 1) for a uniform random r, with the magnitude set per API: 0.001 for AudioBuffer.getChannelData and copyFromChannel and the worklet path, and a much larger 0.25 for getFloatFrequencyData, which is the analyser read the FFT-based collectors depend on. The protection is on by default in private browsing and off in normal mode, on both desktop and mobile. The effect is that a sum-of-absolutes fingerprint fluctuates by a fraction of a percent on every call in private mode, which breaks exact-match keying. It does not break a collector that quantizes or clusters, which is why the protection is best understood as raising the cost, not closing the channel. FingerprintJS documented that its own library detects Safari 17 and Samsung Internet 26 and simply skips the audio source rather than feed noise into the device identifier.
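A toy version of that perturbation shows both halves of the argument: the exact sum moves on every read, and a coarse bucket does not. The sketch applies the published formula to a made-up buffer; the buffer, the `Math.round` quantizer, and the function names are assumptions for illustration, not WebKit's or any collector's code.

```javascript
// Multiply each sample by 1 + magnitude * (2r - 1), with r uniform in [0, 1).
function perturb(samples, magnitude, rng = Math.random) {
  return samples.map((s) => s * (1 + magnitude * (2 * rng() - 1)));
}

function sumAbs(samples) {
  let sum = 0;
  for (let i = 0; i < samples.length; i++) sum += Math.abs(samples[i]);
  return sum;
}

const clean = new Array(500).fill(0.2); // toy tail whose sum is about 100
const noisy = perturb(clean, 0.001);    // the magnitude quoted for getChannelData

const cleanSum = sumAbs(clean);
const noisySum = sumAbs(noisy); // within 0.1% of cleanSum, different on every call

// Coarse quantization absorbs the noise for this buffer.
const sameBucket = Math.round(noisySum) === Math.round(cleanSum);
```

A value sitting near a bucket edge would still flip, which is one reason a collector that wants to survive the noise has to cluster repeated reads instead of rounding a single one.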
Chrome, the largest target, has done the least. As of 2026 it ships neither fdlibm normalization nor sample noise by default, which is consistent with its general reluctance to break Web Audio for the sake of fingerprinting resistance and its preference for tackling tracking through the Privacy Sandbox at the storage layer rather than the API layer. The practical consequence is that the audio vector remains fully live in the browser that most automation traffic impersonates, which is precisely why it stays valuable to detectors. The same asymmetry showed up with the Battery Status API, where the privacy fix was eventual removal rather than normalization, and it is the recurring shape of every fingerprinting channel: the entropy is a side effect of an API doing exactly what it was designed to do, and closing it means either flattening the output or accepting a permission prompt nobody wants.
What the float really is
The audio fingerprint is a small, stubborn fact about a machine that no amount of header spoofing touches. It is not high entropy. On its own, 5 bits separates one device from a few dozen others, not from millions. But it is cheap to collect, invisible to the user, stable across sessions and incognito on the browsers that have not patched it, and orthogonal to the visual fingerprints sitting beside it in the same payload. Those properties are what keep a 2016 research curiosity in production a decade later. A collector does not run the oscillator because audio is uniquely identifying. It runs it because five orthogonal bits for a hundred milliseconds of silent compute is a good trade in a stack that is already summing entropy from a dozen sources.
The most telling detail is the fix Mozilla had to ship. To make the audio fingerprint go away, they did not throttle an API or add a prompt. They replaced the host system’s math library with a bundled one, because the leak was the system math library all along. That is the whole vector in one sentence. The value a browser computes for sin of an angle is not a constant of the universe; it is a property of the code that browser was linked against, and a triangle wave through a compressor is just a way of reading that property out a hundred thousand times and summing the difference.
Sources & further reading
- Englehardt, S. and Narayanan, A. (2016), Online Tracking: A 1-million-site Measurement and Analysis — ACM CCS 2016; Section 6.4 is the first at-scale measurement of AudioContext fingerprinting, reporting 18,500 devices, 713 fingerprints, and 5.4 bits of entropy.
- Princeton Web Census (2016), Web Transparency and Accountability Project — project page and data for the 1-million-site crawl behind the CCS paper.
- Chalise, S. and Vadrevu, P. (2021), A Study of Feasibility and Diversity of Web Audio Fingerprints — University of New Orleans study of 2,093 users finding audio fingerprints far less diverse and more fickle than canvas, and a graph-based recovery method.
- FingerprintJS (2019, updated), How the Web Audio API is used for audio fingerprinting — the canonical explainer with the exact node parameters and example sums across Chrome, Firefox, and Safari.
- FingerprintJS source (current), src/sources/audio.ts — the de facto reference implementation: triangle 10 kHz, compressor at -50/40/12, tail-sum of samples 4500–5000.
- FingerprintJS (2024), How We Bypassed Safari 17’s Advanced Audio Fingerprinting Protection — documents WebKit’s per-sample noise injection, the 0.001 and 0.25 magnitudes, and the private-mode-only behavior.
- Mozilla (2017–2023), Bug 1358149: Address fingerprinting issues with AudioContext — Firefox’s fdlibm normalization of the Web Audio math path and the RFP sample-rate/channel hardcoding.
- WebAudio CG (2018), Issue 1500: Fingerprinting Based on DynamicsCompressor and OscillatorNode — the spec-side acknowledgment requesting the privacy section address the risk.
- W3C (2021), Web Audio API — DynamicsCompressorNode — the normative definition of the compressor and oscillator nodes whose under-specified math produces the divergence.
- MDN, DynamicsCompressorNode — reference for the threshold, knee, ratio, attack, release, and reduction parameters.