
Font fingerprinting: enumeration, measurement, and the @font-face side channel

Copyright: MIT

Pick a string. Render it once in a font you know the browser does not have, and once in a font you want to test for. If the two render at the same width, the test font is missing and the browser fell back to the same default both times. If the widths differ, the test font exists, drew its own glyphs, and gave itself away. That is the entire idea. No permission prompt, no plugin, no API that announces itself. A few hidden spans and a read of offsetWidth, and a page learns which of a few hundred typefaces your machine has installed.

The set it learns is surprisingly personal. Your fonts reflect your operating system and its version, your locale and language packs, the office suite you installed, the design tools you paid for, the corporate font your employer pushes to every laptop. Two people on the same Chrome build and the same Windows release can still carry different font sets, and the difference is stable across sessions because people do not reshuffle their fonts. That stability is what makes the installed-font set one of the oldest and most durable entropy sources in the browser fingerprinting toolkit, and it is why it survived the death of Flash that was supposed to retire it.

This post stays on one vector. It walks through how the font set became entropy, the original Flash enumeration and the JavaScript width-measurement that replaced it, the finer-grained glyph-metric attack that needs no list of font names at all, the @font-face and local() side channel that pulls the same trick through CSS, and the browser defenses that tried to shut it down. Where the internal behavior is documented, the sources are below; where a detail is inferred from observed behavior rather than a spec, the text says so.

Why the font set is entropy

The argument that an installed-font list identifies people is not new and it is not subtle. Peter Eckersley’s 2010 Panopticlick study measured it directly. Across 470,161 browsers sampled at panopticlick.eff.org, the fonts variable carried 13.9 bits of entropy, the second most identifying attribute he measured after the plugin list. The whole fingerprint reached a lower bound of 18.1 bits, enough that picking a browser at random gave at best a one-in-286,777 chance of a collision. Among browsers that exposed Flash or Java, the average climbed to 18.8 bits, and 94.2% of those browsers were unique in the dataset.
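The link between "bits" and "one in N" is a base-2 logarithm. A minimal sketch of that arithmetic follows; the helper names are ours for illustration, and the figures are the ones quoted above.

```javascript
// Convert between "bits of identifying information" and "one in N browsers".
// Helper names are illustrative, not taken from the Panopticlick paper.

// Bits of information in an outcome shared by one in n browsers.
function bitsFromOneIn(n) {
  return Math.log2(n);
}

// Size of the crowd that a given number of bits picks one browser out of.
function oneInFromBits(bits) {
  return Math.pow(2, bits);
}

console.log(bitsFromOneIn(286777).toFixed(1)); // "18.1"
console.log(Math.round(oneInFromBits(13.9)));  // roughly 15,000
```

Read this way, 13.9 bits from fonts alone already narrows a browser to about one in fifteen thousand before any other attribute is considered.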

Eckersley collected the font list through the Flash and Java plugin APIs, which at the time handed a page the complete ordered list of system font names for free. That was the easy era. A site embedded a tiny SWF, called the font enumeration method, and got back an array of every installed family with no measurement and no guesswork. The list arrived in whatever order the platform chose, and Eckersley noted that the ordering itself leaked information, since some platforms returned fonts in installation order rather than alphabetically. He flagged font order as an unnecessary source of entropy and found that the fonts variable alone held 17.1 bits in the Flash-capable subset, with room to grow as the sample grew. For context, across the full dataset the plugin list, the only attribute that beat fonts, sat at 15.4 bits, the User-Agent at 10.0, and timezone at roughly 3.

Two things changed after 2010. Browsers killed the plugin list, so the highest-entropy attribute mostly evaporated, and they killed Flash and the Java applet, so the free enumeration path closed with it. The font set did not become less identifying. The channel that read it just got narrower and noisier, which pushed attackers from clean enumeration toward measurement.

Measuring instead of asking

When you cannot ask the browser for the font list, you make it render text and watch what comes out. The canonical technique predates the Flash purge and is usually credited to a small JavaScript font detector written by Lalit Patel. It rests on a single observation: every browser resolves the generic CSS families to some real font, and the detector uses three of them: monospace, sans-serif, and serif. Render a test string in each generic family, record its pixel dimensions, then render the same string in "TargetFont, monospace". If the target font exists, it replaces the generic and the dimensions shift. If it does not, the string falls back to the same monospace glyphs and the dimensions match the baseline exactly.

The measurement reads offsetWidth and offsetHeight on a hidden span, set to a large font size so that small per-glyph differences accumulate into whole pixels and survive integer rounding. The test string is chosen for width sensitivity. Patel’s detector used mmmmmmmmmmlli, a run of the widest lowercase glyph padded with thin ones, because a typeface’s character widths are where families diverge most. Comparing against all three generics rather than one matters, because a target font might happen to match monospace widths while differing from serif, and a single baseline would miss it.

Figure: Width comparison, present vs. absent

font-family: monospace → mmmmmmmmmmlli, width = 218px (baseline)
font-family: "Helvetica Neue", monospace → mmmmmmmmmmlli, width = 184px ≠ baseline → PRESENT
font-family: "Nonexistent Font", monospace → mmmmmmmmmmlli, width = 218px = baseline → ABSENT

The detector resolves a candidate against a generic fallback. Same width as the baseline means the candidate never rendered; a different width means it did.
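The comparison is short enough to sketch. This is a minimal version, not Patel's original code: the measurer is passed in so the logic can run anywhere, and the 72px size and off-screen positioning in measureWithSpan are our assumptions.

```javascript
// Sketch of width-comparison font detection. `measure` is injected so the
// comparison logic is independent of the DOM.
const GENERICS = ["monospace", "sans-serif", "serif"];
const TEST_STRING = "mmmmmmmmmmlli";

// measure(fontFamily, text) must return { width, height } in pixels.
function isFontPresent(candidate, measure) {
  return GENERICS.some((generic) => {
    const baseline = measure(generic, TEST_STRING);
    const probe = measure(`"${candidate}", ${generic}`, TEST_STRING);
    // Any difference from the baseline means the candidate drew its own glyphs.
    return probe.width !== baseline.width || probe.height !== baseline.height;
  });
}

// Browser-only measurer (assumes a DOM; it is defined here but never called
// outside a browser). Font size and positioning are illustrative choices.
function measureWithSpan(fontFamily, text) {
  const span = document.createElement("span");
  span.style.cssText = "position:absolute;left:-9999px;font-size:72px;";
  span.style.fontFamily = fontFamily;
  span.textContent = text;
  document.body.appendChild(span);
  const size = { width: span.offsetWidth, height: span.offsetHeight };
  document.body.removeChild(span);
  return size;
}
```

In a page, isFontPresent("Helvetica Neue", measureWithSpan) would return true only if the candidate changed the rendered size against at least one of the three generics.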

The same comparison runs through several APIs that all read the same underlying layout. The oldest reads offsetWidth and offsetHeight on a positioned element. A cleaner variant calls getBoundingClientRect, which returns sub-pixel width and height as floats and so resolves finer differences than the integer offsets, the same primitive that drives ClientRects and text-metrics fingerprinting. A third path uses the Canvas 2D context: set ctx.font, call ctx.measureText(string), and read the returned TextMetrics.width, which never touches the DOM and never paints anything visible. The DOM-free Canvas route overlaps heavily with canvas fingerprinting, though here the script only wants the metrics, not the rasterized pixels.
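The Canvas route can be sketched the same way. The version below assumes only an object with a font property and a measureText method, which is what a CanvasRenderingContext2D provides; the function names and the 72px size are ours.

```javascript
// Sketch of the Canvas 2D variant. `ctx` is any object with a `font` property
// and a `measureText(text)` method, e.g. a CanvasRenderingContext2D.
function canvasTextWidth(ctx, fontFamily, text, sizePx = 72) {
  ctx.font = `${sizePx}px ${fontFamily}`;
  return ctx.measureText(text).width; // a float, not an integer offset
}

function isFontPresentCanvas(ctx, candidate, text = "mmmmmmmmmmlli") {
  return ["monospace", "sans-serif", "serif"].some((generic) => {
    const baseline = canvasTextWidth(ctx, generic, text);
    const probe = canvasTextWidth(ctx, `"${candidate}", ${generic}`, text);
    return probe !== baseline;
  });
}

// In a browser (assumption: a DOM is available):
//   const ctx = document.createElement("canvas").getContext("2d");
//   isFontPresentCanvas(ctx, "Helvetica Neue");
```

Nothing is attached to the document and nothing is drawn, which is why this path is harder to spot than a stack of hidden spans.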

A 2013 crawl of the top million sites put numbers on how common this already was. The FPDetective study by Acar and colleagues found 404 of the top million sites running JavaScript-based font probing, mostly the width-comparison method against a dictionary of font names. The dictionaries were not subtle. Detectors iterated a fixed list of a few hundred well-known families and recorded a present/absent bit for each, then hashed the result into a font fingerprint. Modern detectors test on the order of 150 to several hundred families, weighted toward fonts that vary by platform and by installed software.
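Collapsing the per-font results into one value takes a few lines. The sketch below uses 32-bit FNV-1a purely as a stand-in, since the hash varies between detectors, and the four-entry dictionary and installed set are hypothetical.

```javascript
// Sketch: turn per-font present/absent results into a compact fingerprint.
// FNV-1a is our choice for illustration; it is not tied to any one detector.
function fontBitString(dictionary, isPresent) {
  return dictionary.map((name) => (isPresent(name) ? "1" : "0")).join("");
}

// 32-bit FNV-1a over the characters of a string, returned as 8 hex digits.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

const dictionary = ["Arial", "Calibri", "Helvetica Neue", "Ubuntu"]; // hypothetical
const installed = new Set(["Arial", "Calibri"]);                      // hypothetical
const bits = fontBitString(dictionary, (name) => installed.has(name));
console.log(bits);        // "1100"
console.log(fnv1a(bits)); // 8 hex digits, stable for the same font set
```

The hash adds no information; it only makes the bit vector cheap to store and compare.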

The glyph-metric attack: no font names required

Width comparison still needs a list of font names to test. The 2015 work by David Fifield and Serge Egelman at Berkeley, Fingerprinting web users through font metrics, removed even that dependency. Their insight: you do not have to name the fonts. You can measure the rendered size of individual Unicode code points in the generic families the browser already exposes, and the variation in those sizes alone is enough to fingerprint.

The mechanism is to render a single character, measure its onscreen bounding box, and treat the box dimensions as a feature. Because browsers map the five generic CSS families (sans-serif, serif, monospace, cursive, fantasy) to whatever real fonts the system provides, and because those mappings differ across platforms and configurations, the same code point comes out at different sizes on different machines. The paper rendered each code point once in each of the five generic families and recorded only the bounding box, never any pixel data. The measurement is invisible, performed offscreen, and Fifield and Egelman timed the whole fingerprint at under a few milliseconds.

The numbers are precise. Out of their input set, there were 444 distinct complete font-metric measurements with an entropy of 7.599 bits. 349 submissions, 34%, were identified uniquely by font metrics alone, and another 84, 8%, fell into anonymity sets of size two. Combining font metrics with the User-Agent string, restricted to cases where the metrics matched, raised the count to 531 distinct submissions and the entropy to 8.058 bits, with 43% then uniquely identified. The authors stress that these figures are bounded by sample size; the true entropy in a population the size of Panopticlick’s would be higher.
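For readers who want to see how figures like these are derived, here is a small sketch of the two calculations involved: Shannon entropy over the observed fingerprints, and a count of submissions by anonymity-set size. The four-item sample is a toy, not the study's data.

```javascript
// Shannon entropy (in bits) of a list of observed fingerprints.
function entropyBits(fingerprints) {
  const counts = new Map();
  for (const fp of fingerprints) counts.set(fp, (counts.get(fp) || 0) + 1);
  let bits = 0;
  for (const count of counts.values()) {
    const p = count / fingerprints.length;
    bits -= p * Math.log2(p);
  }
  return bits;
}

// Number of submissions whose fingerprint is shared by exactly `size` of them.
function submissionsInSetsOfSize(fingerprints, size) {
  const counts = new Map();
  for (const fp of fingerprints) counts.set(fp, (counts.get(fp) || 0) + 1);
  let total = 0;
  for (const count of counts.values()) if (count === size) total += count;
  return total;
}

const sample = ["a", "b", "c", "c"]; // toy data: two unique, one pair
console.log(entropyBits(sample));                // 1.5
console.log(submissionsInSetsOfSize(sample, 1)); // 2
console.log(submissionsInSetsOfSize(sample, 2)); // 2
```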

Figure: One code point, five generics, one feature vector

U+20B9 INDIAN RUPEE SIGN (highest individual entropy in the study)
sans-serif → (14.0 × 21.3)
serif → (13.2 × 20.8)
monospace → (12.0 × 19.0)
cursive → (15.1 × 22.0)
fantasy → (14.0 × 21.3)
hash(box vector) = partial fingerprint

Box sizes are illustrative; the study measured real onscreen bounding boxes and recorded only their dimensions. Fifield and Egelman measured per-code-point bounding boxes across the generic families. The currency and ligature code points carried the most entropy; whitespace and formatting characters the least.
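A rough sketch of how such a feature vector could be assembled is below. It is not the authors' code: the measurer is injected, the feature string format is ours, and the DOM measurer's font size and positioning are assumptions.

```javascript
// Sketch: build a glyph-metric feature vector from bounding boxes alone.
const GENERIC_FAMILIES = ["sans-serif", "serif", "monospace", "cursive", "fantasy"];

// measureBox(character, family) must return { width, height }.
function glyphFeatureVector(codePoints, measureBox) {
  const features = [];
  for (const cp of codePoints) {
    const ch = String.fromCodePoint(cp);
    for (const family of GENERIC_FAMILIES) {
      const { width, height } = measureBox(ch, family);
      features.push(`${cp.toString(16)}:${family}:${width}x${height}`);
    }
  }
  return features;
}

// Browser-only measurer (assumes a DOM; not called outside a browser).
function measureBoxInDom(ch, family) {
  const span = document.createElement("span");
  span.style.cssText = "position:absolute;left:-9999px;font-size:72px;";
  span.style.fontFamily = family;
  span.textContent = ch;
  document.body.appendChild(span);
  const rect = span.getBoundingClientRect(); // sub-pixel floats
  document.body.removeChild(span);
  return { width: rect.width, height: rect.height };
}

// In a browser: glyphFeatureVector([0x20b9, 0x20b8], measureBoxInDom)
```

No font name appears anywhere in the input, which is the point of the technique.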

The code points that mattered were not the obvious ones. The single highest-entropy character was U+20B9 INDIAN RUPEE SIGN at 4.908 bits, followed by U+20B8 TENGE SIGN, then a run of Arabic presentation-form ligatures and Private Use Area code points. The reason is glyph coverage. A newer currency sign or a rare ligature is present in some font versions and absent in others, so whether and how it renders splits the population cleanly, where a common Latin letter that every font draws nearly identically tells you almost nothing. The authors found a subset of just 43 code points that captured essentially all the variation in their sample, selected greedily by conditional entropy: take the highest-entropy code point first, then the one that adds the most given what you already know, and stop when the remaining subsets are uniform.
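The greedy selection can be sketched in a few dozen lines. This is our reading of the procedure described above, not the authors' implementation, and the four-machine data set is hypothetical.

```javascript
// Greedy code point selection by conditional entropy (a sketch).
// Each sample maps a code point label to a measured box string.
function entropyBits(keys) {
  const counts = new Map();
  for (const key of keys) counts.set(key, (counts.get(key) || 0) + 1);
  let bits = 0;
  for (const count of counts.values()) {
    const p = count / keys.length;
    bits -= p * Math.log2(p);
  }
  return bits;
}

function selectCodePoints(samples, candidates) {
  let keys = samples.map(() => ""); // partition induced by the chosen set
  let currentBits = 0;
  const chosen = [];
  const remaining = new Set(candidates);
  while (remaining.size > 0) {
    let best = null;
    let bestBits = currentBits;
    for (const cp of remaining) {
      // Joint entropy of (chosen + cp); maximizing it maximizes H(cp | chosen).
      const joint = entropyBits(samples.map((s, i) => `${keys[i]}|${s[cp]}`));
      if (joint > bestBits + 1e-9) {
        best = cp;
        bestBits = joint;
      }
    }
    if (best === null) break; // nothing left splits the sample any further
    chosen.push(best);
    remaining.delete(best);
    keys = samples.map((s, i) => `${keys[i]}|${s[best]}`);
    currentBits = bestBits;
  }
  return chosen;
}

// Hypothetical toy data: four machines, three code points.
const samples = [
  { "U+20B9": "14x21", "U+20B8": "13x20", "U+0041": "12x19" },
  { "U+20B9": "15x21", "U+20B8": "13x20", "U+0041": "12x19" },
  { "U+20B9": "0x0", "U+20B8": "13x20", "U+0041": "12x19" },
  { "U+20B9": "0x0", "U+20B8": "0x0", "U+0041": "12x19" },
];
console.log(selectCodePoints(samples, ["U+0041", "U+20B8", "U+20B9"]));
// → [ 'U+20B9', 'U+20B8' ]
```

In the toy data the Latin letter renders identically everywhere and is never selected, while the two currency signs between them separate all four machines.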

Fifield and Egelman were explicit that this is weaker than canvas fingerprinting, which reads full pixel data rather than bounding boxes. They pursued it anyway because it was effective against Tor Browser, which at the time resisted canvas extraction but not metric measurement. That gap is the recurring theme of font defenses: closing the loud channels first and leaving the quiet measurement path open.

The @font-face and local() side channel

The measurement attacks run in JavaScript. CSS alone can pull a related trick, and it is arguably more elegant because it needs no script at all to detect a font, only to observe the result.

The @font-face rule lets a page declare a font and point src at a source. The src descriptor accepts two kinds of source: url(...) for a downloadable web font, and local(...) for a font already installed on the system. The original intent was a fallback chain: prefer the local copy if the user has it, otherwise fetch it from the server. The detection trick inverts that intent. Declare an @font-face whose src is local("Some Font") and nothing else, then apply that family to an element. If the named font is installed, the browser uses it. If it is not, the @font-face resolves to nothing and the element falls back. From there the probe has two readouts. The client-side version measures the element. The network variant points src at both a local() name and a url() that hits a server you control, so that the presence or absence of the network request tells you whether the local font was found.

The technique was demonstrated as a CSS font detector by Stephen Robinson in 2009, cited in Fifield and Egelman’s related work. It is purely declarative on the detection side. A more aggressive version measures glyphs of the loaded @font-face on a canvas, which is how a tracker can probe for narrow, identifying fonts. Chrome’s own developer documentation gives the example plainly: a site can test for a large set of known corporate fonts, like a company’s bespoke brand typeface installed only on employee laptops, by rendering text in each suspected font and measuring the glyphs. A hit on Google Sans or an internal corporate face is a strong signal about where the visitor works.

Figure: The local() / url() fallback as a probe

@font-face { font-family: probe; src: local("Corp Brand"), url(https://tracker/hit.woff2); }

Font installed: local() resolves, no request; server sees nothing → PRESENT
Font absent: fall through to url(), request fires; server logs the hit → ABSENT

The absence of a network request is itself the signal; no JavaScript is needed to read it. A single @font-face rule turns font presence into a server-observable event. The clean version measures the rendered element instead, keeping the whole probe client-side.
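To make the bookkeeping concrete, here is a sketch that generates one probe rule per font name and interprets which beacons a server saw. The probe-N family naming, the ?i= query parameter, and the tracker.example URL are all hypothetical, and the rule only does anything once its family is applied to rendered text.

```javascript
// Sketch: build an @font-face probe rule for one font name.
// Naming scheme and beacon URL format are assumptions for illustration.
function buildProbeRule(fontName, beaconBase, index) {
  // Escape backslashes and double quotes so the name stays inside the string.
  const escaped = fontName.replace(/\\/g, "\\\\").replace(/"/g, '\\"');
  return (
    `@font-face { font-family: "probe-${index}"; ` +
    `src: local("${escaped}"), url("${beaconBase}?i=${index}"); }`
  );
}

// Server side: fonts whose beacon never fired are the ones local() resolved.
function presentFonts(fontNames, requestedIndexes) {
  const hit = new Set(requestedIndexes);
  return fontNames.filter((_, i) => !hit.has(i));
}

const names = ["Corp Brand", "Arial", "Nonexistent Font"]; // hypothetical list
console.log(buildProbeRule(names[0], "https://tracker.example/hit.woff2", 0));
console.log(presentFonts(names, [2])); // [ 'Corp Brand', 'Arial' ]
```

The inference runs on absence, so a blocked or cached request would be misread as a present font; a real deployment would need to account for that.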

The reason local() earned a reputation as a fingerprinting vector is that it gives CSS the same enumeration power Flash once had, one font at a time. It is slower than Flash, which dumped the whole list at once, but it is precise, scriptless on the detection side, and works against fonts the JavaScript dictionaries do not bother to include. The newer Local Font Access API closes the loop the other way: window.queryLocalFonts(), shipped in Chrome 103 on desktop, returns an array of FontData objects with postscriptName, fullName, family, and style, which is the clean enumerated list Flash used to give. The difference is the gate. queryLocalFonts() requires the local-fonts permission, prompts the user on first call, and is not implemented in Firefox or Safari, so it is not the broad fingerprinting channel the old plugin path was. It is a deliberate, permissioned capability for design tools, not a silent probe.
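A hedged sketch of what calling the permissioned API looks like is below. The window object is passed in as win so the helper can be exercised outside a browser; the return shape and helper names are ours, and the real call is generally expected to come from a user gesture such as a click.

```javascript
// Sketch of the permission-gated enumeration path. `win` stands in for the
// browser's window object.
async function listLocalFamilies(win) {
  if (typeof win.queryLocalFonts !== "function") {
    return { supported: false, families: [] }; // e.g. Firefox or Safari
  }
  try {
    const fonts = await win.queryLocalFonts(); // may show a permission prompt
    return { supported: true, families: groupByFamily(fonts) };
  } catch (err) {
    // Permission denied or blocked by policy: no list comes back.
    return { supported: true, families: [], error: err.name };
  }
}

// Collapse FontData-like records into sorted, unique family names.
function groupByFamily(fonts) {
  return [...new Set(fonts.map((font) => font.family))].sort();
}

// In a browser, from a click handler:
//   const result = await listLocalFamilies(window);
```

Every failure mode here ends with an empty list rather than a partial one, which is the practical difference from the measurement channels.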

What the font set actually reveals

The entropy numbers describe how much the font set distinguishes users. They do not describe what it leaks, and that second question is often the more interesting one for a tracker.

The presence of Cambria and Calibri says Microsoft Office on Windows. Helvetica Neue and the San Francisco system fonts say macOS. Ubuntu or Liberation Sans say a Linux distribution and which one. Beyond the operating-system tell, which carries about as much information as identifying the OS by any other route, the long tail is where individuals separate. Adobe’s fonts ride along with a Creative Cloud install and mark a designer. A pile of programming ligature fonts marks a developer. CJK or Arabic or Devanagari faces mark a locale and language. And the corporate-font case is the sharpest of all: a font that only ships on one company’s managed laptops turns a present/absent bit into an employer label, which is why Chrome’s documentation singles it out as the motivating abuse for the local() channel.

This is why the font set composes so well with other vectors. On its own it places you in a crowd; combined with the timezone and locale signals, the screen and device-pixel-ratio entropy, and a canvas or WebGL fingerprint, it shrinks the crowd to one. Commercial fingerprinting libraries treat it exactly this way, as one stable component among a dozen. The font component is prized because it changes slowly. A canvas hash can shift after a driver update; an installed-font list usually holds steady for months, because installing or removing a font is a deliberate act most people perform rarely.

How the browsers fought back

The defenses split into three families, and none of them fully solved the problem.

The first is enumeration removal. Killing Flash and the Java applet took away the bulk-list path, and dropping navigator.plugins down to a near-empty stub took away the highest-entropy attribute Eckersley measured. That left measurement, which no plugin purge can touch, because measuring rendered text is the browser’s normal job.

The second is allowlisting the available fonts, which Tor Browser pursued furthest. The roadmap is tracked across several Tor tickets, and the core mechanism is the font.system.whitelist preference, which accepts a list of font names and hides every other family from the page. Separate allowlists were defined for macOS, Windows, and Linux. On the Linux bundle the preference is not even used; instead Tor ships its own fonts and a fonts.conf that restricts the browser to the bundled set, so every Tor user on Linux presents an identical font list regardless of what is actually installed. The allowlist covers font-family, src: local(), and the Canvas font property together, which is the part that matters: it closes the measurement path and the CSS side channel at once, not just the enumeration API. Firefox inherited a version of the same control. Both block local font files, but Tor bans all local files outright while Firefox bans only those not on its allowlist.
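The effect of an allowlist is easy to model. The sketch below is a toy, not Tor Browser's or Firefox's code: the page sees the intersection of what is installed and what is allowed, and the font lists are hypothetical.

```javascript
// Toy model of font allowlisting: the page sees only allowlisted families.
function visibleFonts(installed, allowlist) {
  const allowed = new Set(allowlist.map((name) => name.toLowerCase()));
  return installed.filter((name) => allowed.has(name.toLowerCase())).sort();
}

const allowlist = ["Arial", "Times New Roman", "Courier New"]; // hypothetical
const alice = ["Arial", "Times New Roman", "Courier New", "Corp Brand Sans"];
const bob = ["Courier New", "Arial", "Fira Code", "Times New Roman"];

console.log(visibleFonts(alice, allowlist));
// → [ 'Arial', 'Courier New', 'Times New Roman' ]
console.log(
  visibleFonts(alice, allowlist).join() === visibleFonts(bob, allowlist).join()
); // true
```

The model also shows the limit of allowlisting alone: two users look identical only if both have every allowlisted font, which is what bundling the fonts with the browser guarantees.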

The third is randomization, which is Brave’s approach. Rather than present a fixed allowlist, Brave perturbs the font list per session and per eTLD+1, removing entries pseudo-randomly so that a fingerprinter never gets a stable view of the available families. The randomization is keyed to a per-session, per-site seed, the same farbling machinery Brave uses for canvas and audio. Brave’s font defenses landed after the broader farbling work; the May 2020 fingerprinting-defenses post listed font enumeration as still to come, and the protection shipped in later releases.
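The idea of a seed keyed to session and site can be illustrated with a toy model. This is emphatically not Brave's algorithm: the FNV-1a hash, the small seeded generator, and the 20% drop rate are all assumptions chosen to keep the sketch short.

```javascript
// Toy model of seeded, per-site font-list perturbation (not Brave's code).
function hashString(text) {
  let hash = 0x811c9dc5; // 32-bit FNV-1a
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// Small deterministic PRNG (mulberry32-style), returns floats in [0, 1).
function seededRandom(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Drop each font with probability dropRate, keyed to session and eTLD+1.
function perturbedFonts(installed, sessionKey, site, dropRate = 0.2) {
  const rand = seededRandom(hashString(`${sessionKey}|${site}`));
  return installed.filter(() => rand() >= dropRate);
}

const fonts = ["Arial", "Calibri", "Cambria", "Fira Code", "Ubuntu"]; // hypothetical
console.log(perturbedFonts(fonts, "session-1", "example.com"));
console.log(perturbedFonts(fonts, "session-1", "example.org"));
```

Within one session a site gets the same answer every time it asks, so pages do not flicker, but the answer is keyed differently for other sites and for the next session.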

Safari took the bluntest line. Since Safari 12 in 2018, under Intelligent Tracking Protection, the browser exposes only fonts that came with the operating system and the user’s current language, plus any web font a site downloads itself. User-installed fonts simply are not visible to the page. The cost is real for design and document tools, which is the same tension the W3C has been working through. A 2024 W3C note by Chris Lilley on fonts and privacy lays out the spec position: CSS Fonts 4 explicitly leaves undefined which installed fonts the font-matching algorithm may see, permitting a user agent to ignore user-installed fonts entirely, and it floats a privacy-budget model that would penalize a page probing a large number of fonts while letting a page that tests only a few proceed. The note frames modern enumeration as slow enough to mostly reveal the operating system rather than a unique identity, which is true for the width-comparison method and less true for the glyph-metric method that needs no font names.

Figure: Attack and defense, 2009–2024

2009: CSS local()
2010: Panopticlick, 13.9 bits
2013: FPDetective, 404 sites
2015: glyph metrics
2018: Safari 12, OS fonts
2022: Chrome queryLocalFonts()
2024: W3C note

The attack side matured early and the defenses arrived in waves. Allowlisting and randomization close the measurement path; permission-gating the new enumeration API keeps it from reopening the old one.

The state of it in 2026

Font fingerprinting is in an unusual position among fingerprinting vectors. The loud channels are closed. Flash is gone, the Java applet is gone, the plugin list is a stub, and the one clean enumeration API that returns a real font list sits behind a permission prompt that most pages will never get a user to accept. By that accounting the vector looks retired.

It is not, because the quiet channel was never about enumeration. As long as a browser renders text and lets a page measure the result, the installed-font set leaks through glyph dimensions, and the Fifield-Egelman result showed you do not even need a dictionary of font names to read it. The defenses that actually work are the ones that attack measurement rather than enumeration: Tor’s bundled-and-allowlisted font set, which makes every user on a platform render identically, Brave’s per-site randomization, which denies a stable reading, and Safari’s flat refusal to expose anything past the OS fonts. Each of those carries a usability cost, which is why mainstream Chrome and Firefox without the resist-fingerprinting flag still render whatever you have installed and still measure to the sub-pixel. The font set on a default desktop browser remains worth its double-digit bits, and unlike a canvas hash it does not drift, so a tracker that reads it once can count on it months later. The cleanest signal in browser fingerprinting is often the one nobody had to break a security boundary to get, and the width of a string is exactly that.


Sources & further reading
