The history of browser fingerprinting: from Panopticlick to the entropy economy
A cookie is a name the server gives you. You can delete it. A fingerprint is a name the server computes from what you already are, and you cannot delete the thing you are. That asymmetry is the whole story. The moment a server can recognize a returning visitor without storing anything on the visitor’s machine, every defense built around clearing state, from private windows to fresh profiles, becomes a speed bump instead of a wall.
Browser fingerprinting is the practice of building a stable identifier out of the small differences between one browser and another: the exact fonts installed, how a GPU rasterizes a curve, the exact sample values an audio stack produces for a synthesized tone, the timezone, the screen geometry, the list of supported codecs. None of these is secret. Each one leaks a few bits. Stacked together they routinely single out one machine in hundreds of thousands. This post follows how that idea went from a thought experiment to a measured reality to a billion-dollar industry, and how the browser vendors who shipped the leaky APIs spent the next fifteen years trying to plug them.
The route runs through six stops. The 2009 experiments that first asked whether a browser could be a name. Eckersley’s 2010 Panopticlick, which put a number on it. The 2012 canvas discovery that turned the GPU into an oracle. The measurement studies that found the technique in the wild and then argued about how unique anyone really is. The commercial entropy economy that grew up around anti-fraud and FingerprintJS. And the slow, partial counter-push from Mozilla, Apple, and eventually Google.
2009: the question, asked twice
The idea that a browser configuration could identify a person was not new in 2009, but that year it got tested. Two efforts matter.
The first was an undergraduate senior thesis at Princeton by Jonathan Mayer, titled Any person… a pamphleteer. Mayer collected the navigator object, the screen object, and the contents of navigator.plugins and navigator.mimeTypes from visitors, and asked a simple question: how many of these clients are distinguishable from each other? Out of 1,328 clients, 1,278 were unique. That is 96 percent, from four JavaScript objects, with no cookies and no special privileges. The thesis is largely forgotten now, but it framed the problem correctly: anonymity online is a property of the crowd, and the crowd is smaller than it looks.
The second effort was practical. The same year, the Electronic Frontier Foundation began building a public experiment that would let anyone measure their own browser against a growing corpus. That became Panopticlick, and it shipped in 2010.
What both efforts shared was a refusal to treat fingerprinting as hypothetical. You could argue in 2008 that browsers were diverse but that nobody was actually using the diversity to track. By the end of 2009 the counterargument had data behind it.
2010: Panopticlick puts a number on uniqueness
Peter Eckersley’s How Unique Is Your Web Browser? was presented at the Privacy Enhancing Technologies Symposium in Berlin in July 2010. It is the paper everyone cites, and for a good reason: it stopped the argument from being about whether and made it about how much.
Panopticlick measured eight things that a website can read without permission: the User-Agent string, the HTTP Accept headers, whether cookies are enabled, screen resolution and color depth, the timezone, the list of browser plugins, the list of system fonts (enumerated through Flash or Java), and whether the browser blocks supercookies. It then treated each browser’s combined configuration as a value drawn from a distribution and measured the distribution’s Shannon entropy.
The numbers are worth stating exactly. Across roughly half a million browsers in the dataset, 83.6 percent had a configuration that was unique in the sample. Among browsers that had Flash or Java installed, which exposed the font list, 94.2 percent were unique. The full fingerprint carried at least 18.1 bits of entropy, which means that for a browser picked at random, at best one in 286,777 others would share its fingerprint. Plugins and fonts were the heaviest contributors. The User-Agent string alone, which people tend to think of as the identifier, carried about 10 bits.
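To make the arithmetic concrete, here is a minimal Python sketch of the two quantities involved: the surprisal of a single fingerprint and the Shannon entropy of a whole sample. The function names and the toy sample are illustrative assumptions, not code or data from the study.

```python
import math
from collections import Counter


def surprisal_bits(count: int, total: int) -> float:
    """Bits of identifying information in a fingerprint seen `count` times among `total` browsers."""
    return math.log2(total / count)


def shannon_entropy_bits(fingerprints: list[str]) -> float:
    """Expected surprisal over the whole sample."""
    total = len(fingerprints)
    counts = Counter(fingerprints)
    return sum((c / total) * surprisal_bits(c, total) for c in counts.values())


def unique_fraction(fingerprints: list[str]) -> float:
    """Share of browsers whose fingerprint appears exactly once in the sample."""
    counts = Counter(fingerprints)
    return sum(1 for fp in fingerprints if counts[fp] == 1) / len(fingerprints)


# Hypothetical toy sample: each letter stands for one browser's full fingerprint.
sample = ["a", "a", "b", "c", "d", "d", "d", "e"]
entropy = shannon_entropy_bits(sample)
unique = unique_fraction(sample)  # b, c and e occur once: 3 of 8

# The "one in 286,777" figure expressed in bits: a little over 18.1.
bits_for_one_in_286777 = math.log2(286_777)
```

The same conversion runs both ways: n bits of surprisal means a fingerprint shared by roughly one browser in 2^n, which is how an 18.1-bit result becomes a six-digit crowd size.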
*Approximate per-component entropy from the 2010 study. The plugin list and the font list (the latter enumerated through Flash or Java) dominated; the User-Agent string alone was a minority of the total.*
The paper also made an observation that has aged into the central tension of the field. Fingerprints are not static. Browsers update, fonts get installed, plugins come and go. Eckersley reported that a returning visitor’s fingerprint changed often, but that a simple heuristic could still re-link most changed fingerprints to their previous identity, because the components rarely all change at once. So the practical identifier is not the fingerprint itself but the fingerprint plus a linking rule. That detail is exactly what the commercial industry would later turn into a product.
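A linking rule of this kind can be sketched in a few lines of Python. This is a simplified illustration, not the paper's exact algorithm: the component names, the "exactly one component changed" rule, and the 0.85 similarity threshold are all assumptions chosen for the example.

```python
from difflib import SequenceMatcher

# Hypothetical component names for the eight measurements.
COMPONENTS = ("user_agent", "accept", "cookies", "screen",
              "timezone", "plugins", "fonts", "supercookies")


def differing(a: dict, b: dict) -> list[str]:
    """Names of the components whose values differ between two fingerprints."""
    return [key for key in COMPONENTS if a[key] != b[key]]


def link(new: dict, known: list[dict], min_similarity: float = 0.85):
    """Return the previously seen fingerprint that `new` most plausibly evolved from, or None.

    A candidate must differ from `new` in exactly one component, the candidate
    must be unambiguous, and the changed value must still resemble the old one.
    """
    candidates = []
    for old in known:
        diff = differing(new, old)
        if len(diff) == 1:
            candidates.append((old, diff[0]))
    if len(candidates) != 1:
        return None  # no candidate, or too ambiguous to guess
    old, key = candidates[0]
    similarity = SequenceMatcher(None, str(old[key]), str(new[key])).ratio()
    return old if similarity >= min_similarity else None


# Hypothetical visitor whose browser takes a point release between visits.
old = {
    "user_agent": "Mozilla/5.0 (X11; Linux x86_64) Firefox/3.6.3",
    "accept": "text/html", "cookies": "yes", "screen": "1280x800x24",
    "timezone": "-60", "plugins": "Flash 10.0 r45", "fonts": "Arial,Verdana",
    "supercookies": "no",
}
updated = dict(old, user_agent="Mozilla/5.0 (X11; Linux x86_64) Firefox/3.6.4")
relinked = link(updated, [old])
```

The exact-hash identifier breaks on the version bump, but the linking rule recovers the visitor because seven of eight components are unchanged and the eighth is nearly identical.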
If you want the experiment in detail, including what it could and could not prove, we wrote it up separately in the EFF Panopticlick experiment.
2012: canvas turns the GPU into an oracle
Panopticlick’s entropy came largely from plugins and the Flash-enumerated font list. That was a problem for the trackers, because Flash was dying. The font list was about to go dark.
The replacement arrived in May 2012. Pixel Perfect: Fingerprinting Canvas in HTML5, by Keaton Mowery and Hovav Shacham at UC San Diego, presented at the Web 2.0 Security and Privacy workshop, described a technique that needed no plugin at all. You draw text and shapes to an off-screen HTML5 <canvas> element, then read the pixels back with toDataURL. The bytes you get are not identical across machines. They depend on the GPU, the graphics driver, the operating system’s font rasterizer, the antialiasing algorithm, and the sub-pixel hinting. Two machines rendering the same instruction to draw the phrase “Cwm fjordbank” at the same size produce subtly different pixels, and the difference is stable for a given machine.
The paper’s own description is the clearest summary of why it works. Tying the browser more closely to operating-system functionality and system hardware, the authors wrote, means websites have more access to those resources, and browser behavior varies depending on the behavior of those resources. The canvas reads out that variance. They noted the fingerprint was consistent, high entropy, orthogonal to other fingerprints, transparent to the user, and easy to obtain. That last property is the one that should worry you. The user sees nothing. No prompt, no permission dialog, no visible canvas.
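The logic can be illustrated without a browser. The Python sketch below is a toy simulation, not canvas code: a tiny rasterizer draws the same line on every "machine", and a single hypothetical parameter, aa_softness, stands in for all the driver, GPU, and font-rasterizer differences a real readback passes through.

```python
import hashlib


def rasterize_line(width: int, height: int, aa_softness: float) -> bytes:
    """Toy rasterizer: draw one antialiased diagonal into a grayscale buffer.

    `aa_softness` is a stand-in for machine-specific rendering behavior.
    """
    pixels = bytearray()
    for y in range(height):
        for x in range(width):
            # Vertical distance from this pixel's center to the ideal line.
            ideal_y = (x + 0.5) * height / width
            distance = abs((y + 0.5) - ideal_y)
            coverage = max(0.0, 1.0 - distance / aa_softness)
            pixels.append(round(coverage * 255))
    return bytes(pixels)


def readback_hash(pixels: bytes) -> str:
    """Hash the pixel buffer, as a tracker would hash the toDataURL output."""
    return hashlib.sha256(pixels).hexdigest()[:16]


# Identical draw instruction, two hypothetical machines.
machine_a = readback_hash(rasterize_line(16, 12, aa_softness=1.00))
machine_a_again = readback_hash(rasterize_line(16, 12, aa_softness=1.00))
machine_b = readback_hash(rasterize_line(16, 12, aa_softness=1.05))
```

A five percent difference in how edges are softened is invisible to the eye, yet it changes the hash completely, while the same machine reproduces its own hash on every run. That combination of stability and difference is what makes the readback usable as an identifier.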
*The canvas pipeline. The draw call is identical on every machine; the readback is not, because it passes through hardware and OS layers that differ. The mechanism is covered in depth in the canvas post linked below.*
Canvas was the template for a whole family of techniques that followed the same logic: find an API that touches hardware or the OS, give it a deterministic input, and read the machine-specific output. WebGL exposed the GPU even more directly, including the unmasked renderer string. AudioContext did the same with the audio stack. We cover canvas, WebGL, and AudioContext in separate posts, but they are all children of Pixel Perfect.
2014–2016: found in the wild, then measured properly
For two years canvas fingerprinting was a paper. In 2014 it became a headline.
The Web Never Forgets, presented at CCS 2014, ran the first large-scale crawl looking for canvas fingerprinting in production. It found the technique on 5,542 of the top 100,000 websites, about 5.5 percent. More striking, it traced roughly 95 percent of those scripts back to a single company, AddThis, a social-sharing widget vendor that had started experimenting with canvas as a cookie replacement early that year. One widget, embedded across a long tail of sites, quietly fingerprinting visitors. The press coverage was sharp, AddThis backed off, and the episode established the pattern for the decade: a tracking technique moves from research to deployment faster than the public notices, and only a measurement study drags it into the light.
The 2016 follow-up, Online Tracking: A 1-million-site Measurement and Analysis by Steven Englehardt and Arvind Narayanan at Princeton, scaled the crawl up to a million sites and built the open-source OpenWPM platform to do it. It documented several fingerprinting vectors that had never been measured in the wild before, including audio-based fingerprinting through the OscillatorNode in the Web Audio API. The mechanism is the same as canvas: drive an oscillator through the audio processing graph and read back the resulting buffer, which varies by platform.
Meanwhile a parallel research line was asking a harder question: is fingerprinting actually as powerful as Panopticlick suggested, or was that a small, skewed sample? Two studies bracketed the answer.
The optimistic-for-trackers end came from Beauty and the Beast, by Pierre Laperdrix and colleagues at IEEE S&P 2016, the paper behind the AmIUnique project. Using a richer attribute set that included canvas and WebGL, they found 89.4 percent of fingerprints unique in their dataset, and showed that the new HTML5 attributes, canvas especially, carried discriminating power that more than replaced the dying Flash font list.
*The headline uniqueness figure is a function of the sample. A small, opt-in, desktop-heavy population looks very identifiable; a large general-audience one with many phones looks far less so.*
The deflationary end came in 2018. Hiding in the Crowd, by Alejandro Gómez-Boix, Laperdrix, and Baudry at the Web Conference, collected over 2 million fingerprints from one of the most-visited French websites, a general-audience news and weather site rather than a privacy-curious opt-in crowd. Using the same 17 attributes as AmIUnique, only 33.6 percent of fingerprints were unique. The reason was phones. Mobile devices come in a small number of models running near-identical software, with no installed-font diversity and locked-down hardware, so a modern iPhone looks like every other iPhone of its model. Among mobile fingerprints, uniqueness collapsed. The takeaway is not that fingerprinting fails; it is that desktop and mobile are different threat models, and the scary 84-to-90-percent numbers come from desktop-heavy, self-selected samples. On a phone you really can hide in the crowd, at least from the basic attribute set.
Uniqueness is only half of what a tracker needs. The other half is stability, and the tension between the two is the field’s permanent constraint; it has its own entropy-budget post. A signal that is perfectly unique but changes every session is useless for tracking. A signal that is perfectly stable but shared by millions is useless for distinguishing. The valuable signals sit in the narrow band that is both fairly unique and fairly stable, and the whole commercial game is finding more of them.
The entropy economy: fingerprinting becomes a product
Research proved fingerprinting worked. Two industries turned it into revenue, and they pull in opposite directions on the privacy question, which is why the same technique can be described as both a tracking menace and a fraud defense.
The advertising-and-analytics use is the one privacy advocates worry about: identify a user across sites without consent, build a profile, target ads, survive cookie deletion. The AddThis episode was this category. Most of the GDPR and ePrivacy attention since has been aimed here, because this is the use that touches ordinary browsing.
The anti-fraud use is different in intent and, mostly, in social standing. A bank wants to know whether the device logging into your account is the device that always logs into your account. An e-commerce site wants to know whether a thousand “different” checkout sessions are really one machine running a card-testing script. Here the fingerprint is a security signal, and the same property that makes it creepy for ad tracking (recognition without stored state) makes it useful against attackers who clear state on purpose. This is the world of device fingerprinting in anti-bot stacks and fingerprint-based fraud scoring.
The clearest single artifact of the entropy economy is FingerprintJS. The open-source library started in 2012, written by Valentin Vasilyev and hosted on his personal GitHub. It collected the obvious signals, hashed them into an identifier, and made fingerprinting a three-line include for any developer. It was popular precisely because it was simple. The library went through a rewrite as FingerprintJS2, and the project around it incorporated as a company about eight years after the first commit, keeping the brand. The commercial product, sold as a fraud and bot-detection identifier, added server-side signals and a proprietary identifier the open-source library does not produce, and the company rebranded from FingerprintJS to Fingerprint in 2022.
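The collect-then-hash step is simple enough to sketch in Python. This is an illustration of the idea only, not the library's code: the attribute names are hypothetical, and SHA-256 from the standard library is swapped in for whatever hash a given fingerprinting library uses.

```python
import hashlib
import json


def fingerprint_id(components: dict[str, object]) -> str:
    """Collapse a set of collected attributes into one short identifier.

    Keys are sorted so the result does not depend on collection order.
    """
    canonical = json.dumps(components, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:32]


# Hypothetical attribute set; a real collector would gather many more.
visitor = {
    "timezone": "Europe/Paris",
    "screen": [1920, 1080, 24],
    "languages": ["fr-FR", "en-US"],
    "canvas": "9f2c41d07ab3e655",
}
visitor_id = fingerprint_id(visitor)
```

The weakness of a bare hash is visible in the same sketch: change any one attribute and the identifier changes entirely, which is why the linking logic described in the Panopticlick section matters more commercially than the collection itself.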
The licensing history is its own small lesson in how valuable this turned out to be. In 2023 the open-source library moved to a Business Source License for version 4.0, restricting commercial use; version 5.0 later returned to the MIT license. Companies do not fight over the license of code nobody wants. The exact composition of the Pro identifier, the server-side signals and the smoothing logic that re-links a device whose fingerprint drifted, is not fully public; what follows in any vendor analysis is inferred from the open-source signals plus observed behavior, and the vendors are explicit that the valuable part is the linking, not the raw collection. We separate the open and proprietary halves in the FingerprintJS internals post.
*The through-line. Research on the left, deployment and measurement in the middle, vendor pushback at the bottom. Each defensive move trails its trigger by years.*
The anti-bot vendors built fingerprinting into much larger collection pipelines, where the device fingerprint is one signal among dozens that include network-layer and behavioral data. The DataDome, Akamai, and Cloudflare systems we document elsewhere all collect a fingerprint as part of a first-request scoring decision, and treat its drift over a session as itself a signal. A fingerprint that changes mid-session is a tell. So is one that matches a known automation framework. The field has expanded well past the browser, into TLS fingerprinting at the handshake layer and behavioral biometrics at the input layer, but the browser fingerprint is still the anchor most stacks start from.
The pushback: vendors against their own APIs
Every signal in a fingerprint is a feature someone shipped on purpose. The plugin list existed so pages could detect Flash. The canvas existed for graphics. The Battery Status API existed so a web app could dim its UI on low power. Each was reasonable in isolation and a leak in aggregate. The history of the defense is the history of browser vendors slowly accepting that their own APIs were the problem.
The cleanest early example is the Battery Status API. It was standardized so pages could read charge level and charging state. It turned out that the combination of charge percentage with time-to-full and time-to-empty in seconds produced a short-lived but high-resolution identifier that could re-link a user across the brief window before the values changed. The privacy analysis landed in 2016, and Firefox removed the JavaScript-exposed API in 2017. WebKit did not ship it. A standardized web feature was withdrawn specifically because it leaked entropy. We give it its own post because it is the rare case where the vector was killed outright rather than merely dulled.
Apple took the most aggressive posture overall. WebKit’s approach has been to make Safari deliberately boring: report a simplified, low-entropy system configuration so that more machines look identical, decline to expose installed fonts beyond a fixed set, and refuse to add high-entropy APIs in the first place. Intelligent Tracking Prevention, from 2017 onward, attacked the stateful side by capping script-writable storage lifetimes and disabling third-party cookies, which pushed trackers toward fingerprinting, which Safari then went after directly. The most recent escalation is noise injection. Safari’s Advanced Fingerprinting Protection adds small per-site, per-session perturbations to the readback from 2D canvas, WebGL, and Web Audio, so the canvas hash a tracker computes is salted differently on every site and every session. Introduced for private browsing first, it became a default-on protection in the latest Safari. A salted canvas is no longer a stable cross-site identifier, which is the whole point.
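The effect of salting a readback can be sketched in Python. This is a model of the idea only and makes no claim about WebKit's actual algorithm: the session key, the use of SHAKE-256 to derive a noise mask, and the choice to flip only the lowest bit of each value are all assumptions for illustration.

```python
import hashlib


def noise_mask(session_key: bytes, site: str, length: int) -> bytes:
    """Derive a deterministic pseudo-random byte stream from the session key and the site."""
    seed = session_key + b"|" + site.encode("utf-8")
    return hashlib.shake_256(seed).digest(length)


def noisy_readback(pixels: bytes, session_key: bytes, site: str) -> bytes:
    """Flip the lowest bit of each value wherever the mask says so."""
    mask = noise_mask(session_key, site, len(pixels))
    return bytes(p ^ (m & 1) for p, m in zip(pixels, mask))


def canvas_hash(pixels: bytes) -> str:
    """What a tracker would compute from the readback."""
    return hashlib.sha256(pixels).hexdigest()[:16]


true_pixels = bytes(range(64))          # stand-in for the real canvas readback
session = b"hypothetical-session-key"   # regenerated for every browsing session

hash_on_site_a = canvas_hash(noisy_readback(true_pixels, session, "a.example"))
hash_on_site_b = canvas_hash(noisy_readback(true_pixels, session, "b.example"))
```

Each value moves by at most one step, so the rendered image is visually unchanged, and a single site sees a consistent result within a session. Across sites or sessions the hashes no longer match, so they cannot be joined into one identity.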
Mozilla’s Firefox took a similar but more configurable line, shipping fingerprinting-script blocking through its tracking-protection lists and a stricter privacy.resistFingerprinting mode that, among other things, spoofs a common screen size and timezone and clamps timer precision. The aggressive mode breaks enough sites that it is not the default, which is the recurring problem with fingerprinting defenses: the strongest mitigations also degrade the experience, so they live behind a toggle most users never find.
Google’s position was the most conflicted, because Google sells advertising. Its largest fingerprinting-relevant change was User-Agent Reduction, a multi-year effort to freeze and trim the Chrome User-Agent string so it stopped broadcasting the exact OS version, device model, and full browser build. The rollout ran in phases: the minor version digits were zeroed in Chrome 101 in 2022, the reduced desktop string rolled out from Chrome 107 in late 2022, the reduced Android string from Chrome 110 in early 2023, and the deprecation trial that let sites keep the old string ended around Chrome 113 in May 2023. The high-entropy detail did not disappear, though. It moved behind the User-Agent Client Hints API, where Sec-CH-UA and friends deliver only low-entropy values by default and a site must explicitly request the full OS version or device model through getHighEntropyValues. Reduction lowered the entropy a passive observer gets for free, while keeping it available to sites that ask. That is a real privacy improvement against passive collection and a much smaller one against a determined fingerprinter, which is roughly the shape of every Google privacy change.
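The two halves of the change can be sketched from the server's side in Python. The header names are the real Client Hints headers, and Accept-CH is the response header a server uses to opt in to high-entropy hints, the header-level counterpart to calling getHighEntropyValues in script. The function names, the sample User-Agent, and the regular expression are illustrative assumptions, and the sketch covers only the minor-version zeroing, not the later platform-token freezing.

```python
import re

# Sent by default on every request.
LOW_ENTROPY_HINTS = ("Sec-CH-UA", "Sec-CH-UA-Mobile", "Sec-CH-UA-Platform")
# Sent only after the server asks for them.
HIGH_ENTROPY_HINTS = ("Sec-CH-UA-Platform-Version", "Sec-CH-UA-Model",
                      "Sec-CH-UA-Full-Version-List")


def zero_minor_version(user_agent: str) -> str:
    """Mimic the first reduction step: keep the major version, zero the rest."""
    return re.sub(r"(Chrome/\d+)\.\d+\.\d+\.\d+", r"\1.0.0.0", user_agent)


def accept_ch_header(requested: list[str]) -> dict[str, str]:
    """Response header a server sends to request specific high-entropy hints."""
    unknown = [hint for hint in requested if hint not in HIGH_ENTROPY_HINTS]
    if unknown:
        raise ValueError(f"not a known high-entropy hint: {unknown}")
    return {"Accept-CH": ", ".join(requested)}


def hints_received(request_headers: dict[str, str]) -> dict[str, str]:
    """Pick the Client Hints out of a request; header names are case-insensitive."""
    lowered = {name.lower(): value for name, value in request_headers.items()}
    wanted = LOW_ENTROPY_HINTS + HIGH_ENTROPY_HINTS
    return {hint: lowered[hint.lower()] for hint in wanted if hint.lower() in lowered}


full_ua = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
           "(KHTML, like Gecko) Chrome/101.0.4951.41 Safari/537.36")
reduced_ua = zero_minor_version(full_ua)
opt_in = accept_ch_header(["Sec-CH-UA-Platform-Version", "Sec-CH-UA-Model"])
```

The asymmetry the paragraph describes shows up in the opt_in value: the detail is one response header away, but the request for it is now explicit and observable rather than free on every connection.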
The standards bodies tried to get ahead of the next leak rather than chase the last one. The W3C’s Mitigating Browser Fingerprinting in Web Specifications, maintained by the Privacy Interest Group, is now a checklist that new web-platform features are supposed to pass: does this API add passive fingerprinting surface, can the entropy be reduced, should it require a permission prompt. It will not undo the canvas, but it raises the cost of shipping the next one. The guidance distinguishes passive surface, which a server gets without running code, from active surface, which requires JavaScript, and treats passive additions as the more serious sin because they cannot be detected or blocked at the point of collection.
Where it stands
Fingerprinting won the technical argument and lost most of the moral one, and the result is a stalemate that favors whoever has more engineers. On desktop, a determined script can still single out a large fraction of visitors, because desktops are diverse and the entropy is real. On mobile, the basic attribute set has been quietly defeated by sameness: a phone that looks like ten million identical phones is hard to fingerprint with canvas and fonts alone, which is why the mobile fraud industry leaned harder into network and behavioral signals. The browser vendors have made passive, free fingerprinting meaningfully harder, through User-Agent reduction, noise injection, and storage caps, while leaving active fingerprinting mostly intact because killing it would break the web features people use.
The honest assessment is that no single defense works and no single attack works, so both sides accumulate. A tracker stacks more weak signals to claw back the entropy that vendors removed. A vendor salts or clamps one more API. The user, who started this whole story being told that clearing cookies would protect them, ends it with a browser that fights on their behalf in ways they cannot see and cannot fully trust, against trackers they also cannot see. The asymmetry from the first paragraph never resolved. It just got more instruments on both sides.
What is genuinely new since 2022 is that the defense finally accepted it cannot win by removing signals, because there are too many, and switched to corrupting them instead. Safari’s salted canvas is a different philosophy from Firefox’s spoofed screen size or Chrome’s frozen User-Agent. It does not try to make you look like everyone else, and it does not try to hide the signal. It makes the signal lie, differently every time you ask. That is the move you make when you have given up on plugging the holes and decided to poison the well instead, and it is probably the shape of the next decade.
Sources & further reading
- Eckersley, P. (2010), How Unique Is Your Web Browser? — the Panopticlick paper, PETS 2010; 18.1 bits of entropy and 83.6% unique across ~500K browsers.
- Electronic Frontier Foundation (2010), Is Every Browser Unique? Results from the Panopticlick Experiment — EFF’s plain-language writeup of the headline findings.
- Mowery, K. and Shacham, H. (2012), Pixel Perfect: Fingerprinting Canvas in HTML5 — W2SP 2012; the canvas-as-oracle technique and its properties.
- Acar, G. et al. (2014), The Web Never Forgets — CCS 2014; first in-the-wild measurement of canvas fingerprinting, 5.5% of the top 100K, ~95% traced to AddThis.
- Englehardt, S. and Narayanan, A. (2016), Online Tracking: A 1-million-site Measurement and Analysis — CCS 2016; the OpenWPM crawl that found audio fingerprinting in the wild.
- Laperdrix, P., Rudametkin, W. and Baudry, B. (2016), Beauty and the Beast: Diverting Modern Web Browsers to Build Unique Browser Fingerprints — IEEE S&P 2016; the AmIUnique study, 89.4% unique, canvas replacing the Flash font list.
- Gómez-Boix, A., Laperdrix, P. and Baudry, B. (2018), Hiding in the Crowd — WWW 2018; 2M+ fingerprints, only 33.6% unique, and why mobile breaks the technique.
- Olejnik, Ł. et al. (2016), The Leaking Battery — the Battery Status API privacy analysis that led Firefox to remove the API.
- WebKit (2024), Private Browsing 2.0 — Safari’s Advanced Fingerprinting Protection and per-site noise injection.
- Chromium Project (2023), User-Agent Reduction — the phased plan to freeze and trim Chrome’s User-Agent string.
- Chrome for Developers, User-Agent Client Hints — where the high-entropy UA detail moved after reduction.
- W3C Privacy Interest Group, Mitigating Browser Fingerprinting in Web Specifications — the standards-level checklist for new web features.
Further reading
The EFF Panopticlick experiment and what it proved about browser uniqueness
Traces the 2010 EFF Panopticlick experiment and Eckersley's 'How Unique Is Your Web Browser?' paper: the 18.1-bit result, the eight measurements, the entropy math, the fingerprint-tracking heuristic, and the Cover Your Tracks rebrand. (23 min read)
Browser extension fingerprinting: how installed extensions leak through the DOM
Traces how websites enumerate a visitor's installed extensions: web-accessible-resource probing, DOM and stylesheet artifacts, intra-browser messages, and timing side channels, plus the Chrome and Firefox mitigations that close some of those doors. (19 min read)
Canvas fingerprinting: how a single toDataURL call identifies a device
Traces how rendering text and shapes to an HTML5 canvas and hashing the toDataURL output yields a stable per-device value, the GPU, driver and font causes behind the variation, the 2012 origin, and how much entropy it really carries. (22 min read)