The emoji and Unicode rendering fingerprint across platforms
Type the four bytes F0 9F 91 A8 into any text field and you have written one codepoint: U+1F468, MAN. It is the same codepoint everywhere. The Unicode Consortium assigned it, every conformant system agrees on what it means, and the wire bytes are identical whether you typed them on an iPhone, a Pixel, or a Windows laptop. What is not identical is the picture. Apple draws a man with a particular beard and skin shading. Google draws a flatter one. Microsoft draws something in a different style again. And when you glue that codepoint to a woman, a girl, and a boy with three invisible joiner characters to make a family, the platforms diverge much harder: some collapse the whole thing into one tidy glyph, and some give up and render four separate little people in a row.
That divergence is measurable from JavaScript without any permission prompt. Paint the emoji to a canvas and read the pixels back, or lay it out in the DOM and ask for its bounding rectangle, and the width comes back different depending on which OS emoji font did the drawing. The codepoint is a constant. The rendering is a variable. A fingerprinting script reads the variable and infers the platform behind it, and it works even when the User-Agent string has been edited to say something else.
This post traces that one vector end to end. It starts with why the glyphs differ at all, down to the font formats each vendor ships. Then it walks the shaping pipeline that turns a stream of codepoints and zero-width joiners into glyph boxes, because that pipeline is exactly where the cross-platform width differences are born. It covers the two readout channels, canvas pixels and getClientRects geometry, and what each one actually leaks. It looks at how emoji fingerprinting catches a lie, the spoofed device whose claimed OS does not match the font it rendered with. And it closes on the defenses, the one browser that solved it cleanly and why almost nobody else has.
Why the same codepoint is a different picture
The Unicode standard is deliberate about one thing: it does not draw the emoji. Unicode Technical Standard #51, the document that governs emoji, assigns codepoints and properties and describes behaviour, but it leaves the artwork to vendors. The phrasing in the implementation literature is blunt about the consequence. The exact appearance of an emoji is not prescribed and can vary between fonts and platforms, much like different typefaces vary for ordinary letters. A capital A from Helvetica and a capital A from Times are the same character and different shapes. U+1F600 GRINNING FACE works the same way. Same character, different shapes, by design.
So each platform ships its own emoji font. Apple ships Apple Color Emoji. Google ships Noto Color Emoji. Microsoft ships Segoe UI Emoji. Twitter open-sourced Twemoji, which many web properties and some Linux distributions adopt. These are not skins over a shared master set. They are independent artwork drawn by different design teams, redrawn periodically, with different metrics, different glyph bounding boxes, and different opinions about how wide a given pictograph should be.
Underneath the artwork the fonts do not even use the same storage format, and the format itself is a fingerprint surface because it decides how the glyph scales. Apple’s font uses the sbix table, a raster format that embeds PNG (or JPEG or TIFF) bitmaps for each glyph. Google’s Android font uses CBDT/CBLC, also raster, also bitmap PNGs packed into the font. Microsoft introduced the vector COLR/CPAL format with Windows 8.1, which builds each emoji from layered colored shapes rather than a baked image. Mozilla and Adobe went a fourth way with SVG-in-OpenType, a full SVG document per glyph. Four formats, four rendering paths, and a bitmap font scaled to an off-native size blurs in a way a vector font never does. That blur is itself measurable.
The format gap is not frozen in time either. In January 2022, Chrome 98 added support for COLRv1, a vector color format with gradients and shape reuse, and Google shipped a COLRv1 build of Noto Color Emoji that dropped from roughly 9 MB in the bitmap CBDT form to about 1.85 MB compressed. A browser that renders emoji through COLRv1 produces subtly different pixels and metrics than one rendering the same Noto set as bitmaps. So the rendering you measure depends not only on the OS but on the browser version and which font build it picked up, which is more entropy, not less.
Codepoints, joiners, and the shaping pipeline
To see where the width differences are actually manufactured, you have to follow what happens between “here is a string of codepoints” and “here are glyphs on a line.” That job belongs to the text shaper. On Apple platforms it is Core Text. On Android and most Linux desktops it is HarfBuzz. On Windows it is DirectWrite. The shaper reads the codepoints, consults the font’s tables, and decides how many glyph slots the run needs and how wide each one is.
For a lone emoji this is simple. For a sequence it is where platforms part ways. The mechanism is the Zero-Width Joiner, U+200D, an invisible control character that carries no width and no glyph of its own. Its only job is to tell the shaper “fuse the emoji on either side of me into one.” A family emoji is not a single codepoint at all. It is man, joiner, woman, joiner, girl, joiner, boy, four people stitched with three joiners into one seven-codepoint sequence that is supposed to read as a single grapheme cluster and occupy a single glyph slot.
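The anatomy of that sequence can be checked directly in JavaScript. What follows is a minimal sketch, not taken from any fingerprinting script: the helper name describeSequence is hypothetical, and the grapheme count assumes the engine provides Intl.Segmenter (the field is null when it does not).

```javascript
// Code points for MAN, WOMAN, GIRL, BOY and the zero-width joiner.
const MAN = 0x1f468, WOMAN = 0x1f469, GIRL = 0x1f467, BOY = 0x1f466;
const ZWJ = 0x200d;

// The family sequence: four people stitched together with three joiners.
const family = String.fromCodePoint(MAN, ZWJ, WOMAN, ZWJ, GIRL, ZWJ, BOY);

const hex = (n, width) => n.toString(16).toUpperCase().padStart(width, "0");

// describeSequence is a hypothetical helper name, not a platform API.
function describeSequence(text) {
  // Iterating a string yields whole code points, not UTF-16 code units.
  const codePoints = Array.from(text, (ch) => ch.codePointAt(0));
  const utf8 = Array.from(new TextEncoder().encode(text));

  // Grapheme clusters as the text segmenter sees them (assumes Intl.Segmenter).
  let graphemes = null;
  if (typeof Intl !== "undefined" && typeof Intl.Segmenter === "function") {
    const segmenter = new Intl.Segmenter("en", { granularity: "grapheme" });
    graphemes = Array.from(segmenter.segment(text)).length;
  }

  return {
    codePoints: codePoints.map((cp) => "U+" + hex(cp, 4)),
    utf16Units: text.length,
    utf8Bytes: utf8.map((b) => hex(b, 2)),
    graphemes,
  };
}

console.log(describeSequence(String.fromCodePoint(MAN)).utf8Bytes.join(" "));
console.log(describeSequence(family));
```

None of these numbers depend on the installed font. The segmenter should report the family as a single grapheme cluster on engines with current Unicode data, whatever the font ends up drawing, which is exactly why the string is a constant and only the rendering varies.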
Whether it actually does depends entirely on the font. The shaper checks the font’s glyph-substitution table for a ligature covering that exact sequence. If the font has the family ligature, the whole run collapses to one wide glyph. If it does not, UTS #51 spells out the fallback explicitly: when a ZWJ sequence reaches a system without a corresponding single glyph, the joiners are ignored and a fallback sequence of separate emoji is displayed. So the unsupported case renders four small people side by side, and four glyphs at one width is geometrically nothing like one glyph at another width. The bounding box you measure is wildly different between a platform that knows the sequence and one that does not.
*Same seven codepoints, two outcomes. The presence or absence of one ligature in the font roughly doubles the measured width. A script reading that width learns whether the platform supports the sequence, which narrows the OS and its version.*
Two more pieces of the sequence grammar matter for the same reason. Variation selectors decide presentation style. U+FE0F (VS-16) requests the colorful emoji form; U+FE0E requests the monochrome text form. A heart or a digit followed by VS-16 becomes a full-color emoji glyph, often a different width than the plain-text version. Skin-tone modifiers add another axis. The five Fitzpatrick-scale modifiers, U+1F3FB through U+1F3FF, immediately follow a base emoji to recolor it, and a font that lacks the modified glyph falls back to base-plus-swatch, two glyphs where there should be one. Every one of these is a fork in the shaping path, and every fork is a width a script can read.
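Building the probe strings for those forks takes a few lines. This is an illustrative sketch only; presentationPair and skinToneVariants are hypothetical helper names, and the two base characters are arbitrary examples rather than anything a known detector uses.

```javascript
const VS15 = "\uFE0E"; // variation selector requesting text presentation
const VS16 = "\uFE0F"; // variation selector requesting emoji presentation

// The five Fitzpatrick-scale modifiers, U+1F3FB through U+1F3FF.
const SKIN_TONES = [0x1f3fb, 0x1f3fc, 0x1f3fd, 0x1f3fe, 0x1f3ff];

// Same base character, two requested presentations.
function presentationPair(base) {
  return { text: base + VS15, emoji: base + VS16 };
}

// One string per skin tone: base emoji immediately followed by the modifier.
function skinToneVariants(base) {
  return SKIN_TONES.map((cp) => base + String.fromCodePoint(cp));
}

const heart = presentationPair("\u2764");    // HEAVY BLACK HEART
const waves = skinToneVariants("\u{1F44B}"); // WAVING HAND SIGN

console.log(heart, waves);
```

Each string is a separate measurement target. A page would lay them out and compare the widths of the two presentations, or of a modified emoji against its unmodified base.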
The version axis sits on top of all of it. Unicode and the emoji spec keep adding codepoints; Emoji 17.0 ships with Unicode 17.0 as of late 2025. A platform whose font predates a given emoji has no glyph for it and renders the missing-glyph box, the notorious “tofu” rectangle. So a script can probe a ladder of emoji introduced in successive versions and watch where tofu starts. The version at which a device falls off the ladder is a tight signal about its OS release date, and it is one a User-Agent edit does not touch, because the font is older than the lie.
The tofu probe is worth dwelling on because it is so cheap. Tofu is not a blank space. It is a rendered glyph, the font’s .notdef box, and it has its own metrics. A device that knows a codepoint renders artwork at one width; a device that does not renders the box at another. So even without distinguishing Apple from Google, a script can date the font by binary-searching the version ladder: pick an emoji from Emoji 13, one from Emoji 14, one from Emoji 15, and read which ones produced real glyphs and which produced tofu. The boundary is the font’s age. On iOS that boundary tracks the OS release closely, because Apple ships emoji font updates with point releases. On Android it is muddier, because the emoji font can update through Google Play system updates independently of the OS version, which is itself a tell: an Android claiming a recent OS but rendering an old emoji set has an update story that does not hold together. The version ladder turns a single OS-version claim into something a script can verify against physics rather than take on trust.
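The ladder search can be sketched as a plain binary search. Everything here is an assumption made for illustration: the three probe emoji are examples I picked (one each from Emoji 13.0, 14.0, and 15.0), the function names are hypothetical, and the search assumes support is monotonic, meaning a font that has a newer emoji also has the older ones. The isRendered predicate is injected; in a page it might compare a probe's measured width against the width of a known-unassigned codepoint.

```javascript
// Hypothetical ladder, oldest first. One probe per Emoji version.
const LADDER = [
  { version: "13.0", probe: "\u{1FAC2}" }, // PEOPLE HUGGING
  { version: "14.0", probe: "\u{1FAE0}" }, // MELTING FACE
  { version: "15.0", probe: "\u{1FAE8}" }, // SHAKING FACE
];

// Returns the newest ladder version whose probe rendered, or null if none did.
// Assumes monotonic support: once a probe fails, all newer probes fail too.
function newestSupportedVersion(ladder, isRendered) {
  let lo = 0;
  let hi = ladder.length; // entries before lo rendered; entries from hi on did not
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (isRendered(ladder[mid].probe)) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  return lo === 0 ? null : ladder[lo - 1].version;
}

// Stand-in for a device whose font stops at Emoji 14.0.
const knownGlyphs = new Set(["\u{1FAC2}", "\u{1FAE0}"]);
console.log(newestSupportedVersion(LADDER, (p) => knownGlyphs.has(p)));
```

With a three-rung ladder the binary search saves nothing over a linear scan. It starts to matter when the ladder has one rung per emoji release and each probe costs a layout pass.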
There is a subtler cousin to tofu that matters for the ZWJ case. A platform can have glyphs for all four people in a family sequence and still lack the ligature that fuses them. So it renders four valid emoji with no tofu anywhere, yet the total width is double what a supporting platform produces. The script does not see a missing-glyph box; it sees a width that says “this font knows the components but not the combination.” That distinction, full support versus component-only support versus no support, is three measurable states per sequence, and there are hundreds of ZWJ sequences in the modern emoji set. Each one is a small classifier for the font behind it.
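Those three states can be told apart from widths alone. The sketch below is one way to do it under stated assumptions: the function name, the tolerance, and every width in the example calls are hypothetical, and it assumes the tofu box has a width distinct from real emoji glyphs in the same font, which will not hold everywhere.

```javascript
// Classify support for one ZWJ sequence from measured widths.
//   sequenceWidth   - width of the whole sequence laid out as one run
//   componentWidths - widths of each component emoji measured on its own
//   tofuWidth       - width of a codepoint the font is known not to have
function classifyZwjSupport(sequenceWidth, componentWidths, tofuWidth, tolerance = 0.5) {
  const near = (a, b) => Math.abs(a - b) <= tolerance;
  const sum = componentWidths.reduce((total, w) => total + w, 0);

  // A component that measures like tofu means the font lacks that emoji.
  if (componentWidths.some((w) => near(w, tofuWidth))) return "no-support";
  // The sequence is as wide as its parts: glyphs exist, the ligature does not.
  if (near(sequenceWidth, sum)) return "components-only";
  // The sequence is narrower than its parts: the run collapsed into one glyph.
  if (sequenceWidth < sum - tolerance) return "full-support";
  return "unknown";
}

console.log(classifyZwjSupport(300, [300, 300, 300, 300], 180));
console.log(classifyZwjSupport(1200, [300, 300, 300, 300], 180));
console.log(classifyZwjSupport(720, [180, 180, 180, 180], 180));
```

Comparing against the sum of the components, rather than against a fixed ratio, keeps the check independent of how wide any particular vendor draws its glyphs.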
Reading the variation: canvas pixels
There are two ways to turn this rendering variation into a number. The first is the canvas. The technique is the same one behind general canvas fingerprinting: draw text to an offscreen <canvas>, call toDataURL() or getImageData(), and hash the pixels. The difference with emoji is that you are no longer probing antialiasing and subpixel hinting on Latin glyphs. You are probing entire vendor artwork sets. The same emoji is a genuinely different image on each platform, so the pixel hash separates platforms cleanly rather than by a few smoothed edges.
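A minimal sketch of the canvas read, assuming a browser-like document object. The function names, canvas size, font size, and the choice of FNV-1a as the hash are all illustrative assumptions; real payloads differ and are not documented. The document is passed in as a parameter so the function can be exercised outside a browser with a stub.

```javascript
// 32-bit FNV-1a over UTF-16 code units, returned as 8 hex digits.
// Data URLs are ASCII, so this matches a byte-wise FNV-1a for them.
function fnv1a(text) {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16).padStart(8, "0");
}

// Draw the text to an offscreen canvas and hash the serialized pixels.
// doc is expected to behave like the browser's document.
function canvasEmojiHash(doc, text, fontPx = 24) {
  const canvas = doc.createElement("canvas");
  canvas.width = 600;
  canvas.height = 60;
  const ctx = canvas.getContext("2d");
  ctx.textBaseline = "top";
  ctx.font = `${fontPx}px sans-serif`; // emoji fall through to the OS emoji font
  ctx.fillText(text, 4, 4);
  return fnv1a(canvas.toDataURL());
}

// In a page:
//   canvasEmojiHash(document, "Cwm fjordbank glyphs vext quiz \u{1F600}");
```

The hash itself carries no meaning. It only has to be stable on one device and different across devices whose rendering differs, so any fast non-cryptographic hash would serve the same purpose here.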
This is why emoji turn up in canvas fingerprinting payloads in the wild. The example string that floats around production fingerprinting code is the pangram “Cwm fjordbank glyphs vext quiz” with a grinning emoji appended, and the emoji is the part doing the heaviest lifting. The reasoning, as practitioners put it, is that emoji rendering depends on the OS or even the model of phone, which makes the canvas more unique because it folds in more information about the device, and it can also be used to detect a client lying about the true nature of its device. A flat Latin pangram tells you about the rasterizer. The emoji tells you which vendor’s art is installed.
Google formalized the device-detection angle in 2016 with Picasso, a system from Elie Bursztein and colleagues on the anti-abuse team. Picasso defines a device class as the combination of browser, OS, and graphics hardware, and it is a challenge-response scheme rather than a passive read. The server sends a random seed, an iteration count, and a set of drawing instructions including curves and fonts; the client renders them iteratively and hashes the canvas output, folding each frame’s hash into the next. Because rendering output depends on the whole graphics stack, the result acts like a physically unclonable function for that device class, and the seed makes captured responses useless for replay. The relevant claim for this post is the detection one: Picasso can tell an authentic iPhone running Safari from an emulator or a desktop client spoofing the same configuration, because the spoofed stack cannot reproduce the genuine stack’s pixels. Emoji and font rendering are part of what it draws. DataDome and others have since described using Picasso-style canvas challenges in production.
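The chaining idea, each round's output feeding the next, can be illustrated in a few lines. This is not Picasso's algorithm: the hash, the state encoding, and the function names are assumptions made for the sketch, and the rendering step is an injected callback (renderRound) that in a real page would draw seeded primitives to a canvas and return the pixels.

```javascript
// 32-bit FNV-1a, used here only as a stand-in for whatever hash a real scheme uses.
function fnv1a(text) {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16).padStart(8, "0");
}

// seed and rounds come from the server; renderRound(state) returns a string
// describing what this client's graphics stack drew for that state.
function respondToChallenge(seed, rounds, renderRound) {
  let state = String(seed);
  for (let i = 0; i < rounds; i++) {
    const pixels = renderRound(state);
    // Fold this round's output into the state that drives the next round.
    state = fnv1a(state + "|" + pixels);
  }
  return state;
}

// Two stand-in "device classes" that render the same instructions differently.
const genuine = (state) => "apple-emoji:" + state;
const spoofed = (state) => "noto-emoji:" + state;

console.log(respondToChallenge(1234, 5, genuine));
console.log(respondToChallenge(1234, 5, spoofed));
```

Because the seed changes per challenge, a response recorded from a genuine device does not answer the next challenge, and because every round depends on the previous one, a client cannot skip the rendering work.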
Reading the variation: getClientRects geometry
The second channel needs no pixel read at all, which makes it the quieter one. Lay an emoji out in the DOM, in a <span> at a large font size, and ask the browser for its geometry with document.getElementById(...).getClientRects() or getBoundingClientRect(). You get back DOMRect objects, the bounding boxes of the rendered element, as floating-point numbers. This is the same family of measurements covered in ClientRects and text-metrics fingerprinting, applied specifically to emoji glyph boxes. No canvas, no toDataURL, no image-extraction permission. Just geometry that the layout engine hands out for free, because pages legitimately need it for scroll math and hit-testing.
The precision is the part that makes it dangerous. DOMRect values are not rounded to whole pixels. They come back as floats carried to many decimal places, which is plenty of resolution to distinguish one font’s glyph metrics from another’s. The emoji fingerprinting demo built by researchers at LRZ takes this to its logical end: it lays out the full emoji list from Unicode 12.0, 2,575 distinct codepoints with skin tones excluded, each at a 300-pixel font size, and reads a DOMRect for every one. The width of each box depends on the glyph the local font produced. Hash the 2,575 widths together and you have a compact identifier that is mostly a readout of the device’s emoji font. Codepoints the local font lacks fall back, shifting their widths; codepoints it supports as ligatures collapse to a different width than codepoints it splits. The hash captures the whole pattern at once.
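The measurement loop is short. The sketch below is an assumption-laden illustration, not the LRZ demo's code: function names, styling, and the hash are mine, and the document is injected so the logic can run against a stub.

```javascript
// 32-bit FNV-1a, an arbitrary choice of compact hash for the sketch.
function fnv1a(text) {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16).padStart(8, "0");
}

// Lay each emoji out in a hidden span and read the width of its box.
// doc is expected to behave like the browser's document.
function measureEmojiWidths(doc, emojis, fontPx = 300) {
  const span = doc.createElement("span");
  span.style.fontSize = fontPx + "px";
  span.style.position = "absolute";  // keep it out of the page flow
  span.style.visibility = "hidden";  // hidden elements are still laid out
  span.style.whiteSpace = "nowrap";
  doc.body.appendChild(span);

  const widths = emojis.map((emoji) => {
    span.textContent = emoji;
    return span.getBoundingClientRect().width; // unrounded float
  });

  doc.body.removeChild(span);
  return widths;
}

// Collapse the whole width pattern into one short identifier.
function widthSignature(widths) {
  return fnv1a(widths.join(","));
}

// In a page:
//   widthSignature(measureEmojiWidths(document, ["\u{1F468}", "\u{1F600}"]));
```

Joining the raw floats before hashing preserves every decimal place the layout engine reported, which is where the distinguishing power comes from.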
*The geometry path. Lay out the emoji, read the boxes, hash the widths. Sub-pixel floats give it the resolution, and the absence of a pixel read keeps it under the radar of canvas defenses.*
Browser vendors have known the geometry channel is a problem for years. Jose Carlos Norte demonstrated getClientRects fingerprinting against Tor Browser back in March 2016. Mozilla’s own bug 1507879, “Investigate getClientRects for fingerprinting,” lays out the surface in the engineers’ own words: the APIs leak rendering measurements affected by fonts, display scaling, DPI, and the rendering of elements including emoji. The proposals to round, floor, or ceil the values were rejected as ineffective, because an attacker can incrementally grow an element to binary-search the true sub-pixel value back out. The bug author’s conclusion was that the API probably cannot be neutered into returning safe data, so it would need to be gated behind a permission prompt like canvas extraction, or disabled outright. As of this writing the bug is still open. The geometry channel has no clean fix that does not break layout-dependent pages.
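Why rounding fails can be shown with a simpler variant of the growth trick than the binary search the bug describes: repeat the glyph many times and divide. The sketch assumes a hypothetical oracle, roundedWidthOf(n), that returns the whole-pixel width of an element holding n copies of the glyph, and it assumes widths add linearly (no wrapping, no kerning between copies). Neither the oracle nor the function name comes from the bug report.

```javascript
// Recover a sub-pixel glyph width from an API that only reports whole pixels.
// Rounding changes the total by at most 0.5px, so with n copies the per-glyph
// error is at most 0.5 / n. Double n until that bound meets the target.
function recoverSubpixelWidth(roundedWidthOf, precision = 0.001) {
  let copies = 1;
  while (0.5 / copies > precision) {
    copies *= 2;
  }
  return { estimate: roundedWidthOf(copies) / copies, copies };
}

// Stand-in for a "defended" browser: true width is fractional, output is rounded.
const TRUE_WIDTH = 312.34375; // hypothetical glyph width in pixels
const defended = (n) => Math.round(n * TRUE_WIDTH);

console.log(defended(1));                    // what a single rounded read shows
console.log(recoverSubpixelWidth(defended)); // what repetition recovers
```

Flooring or ceiling instead of rounding changes the error bound from 0.5 to 1 pixel on the total, which costs the attacker one more doubling and nothing else.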
What this catches: the inconsistent device
The reason anti-bot and anti-fraud systems care about emoji rendering is not that one device’s emoji hash is unique, though it adds entropy. It is that the hash is hard to fake in a way that stays consistent with everything else the device claims. A fingerprint is only as convincing as its internal agreement. A scraper or anti-detect browser can set navigator.userAgent to a Windows Chrome string in one line. Making the emoji render the way real Windows Chrome renders is a different order of problem, because the glyphs come from a font file the spoofing layer does not control.
This is the inconsistency check, and it is the same logic that powers the broader fingerprint-coherence tests anti-bot vendors run. A stack that claims Windows but renders Apple Color Emoji metrics has told two stories. A headless Linux server claiming to be an iPhone renders neither Apple’s artwork nor a believable mobile emoji set, and the family-sequence fallback behaviour gives it away because the desktop Linux font splits sequences the real iPhone fuses. Picasso was built precisely to surface this, separating an authentic iPhone-Safari-iOS stack from an emulator or a desktop pretending to be one, on the strength of pixel and metric differences the spoof cannot reproduce. The same emoji-rendering signal that identifies a device also audits whether its self-description is true.
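Reduced to its core, the check compares two independent answers to the same question. The sketch below is a deliberately crude illustration: the function names are hypothetical, the User-Agent patterns are simplified, and the OS-to-font table is an assumption (Android vendors that ship their own emoji font, for example, would need their own entries). How a detector infers the font from measurements is out of scope here and is passed in as a string.

```javascript
// Simplified expectation table: which emoji font each claimed OS should render with.
const EXPECTED_EMOJI_FONT = {
  windows: "Segoe UI Emoji",
  macos: "Apple Color Emoji",
  ios: "Apple Color Emoji",
  android: "Noto Color Emoji",
};

// Very rough OS extraction from a User-Agent string.
// iPhone UAs contain "like Mac OS X", so iOS must be tested before macOS.
function claimedOs(userAgent) {
  if (/iPhone|iPad/.test(userAgent)) return "ios";
  if (/Android/.test(userAgent)) return "android";
  if (/Windows NT/.test(userAgent)) return "windows";
  if (/Mac OS X/.test(userAgent)) return "macos";
  return "unknown";
}

// inferredFont is whatever the rendering measurements pointed to.
function checkCoherence(userAgent, inferredFont) {
  const claimed = claimedOs(userAgent);
  const expected = EXPECTED_EMOJI_FONT[claimed] ?? null;
  return {
    claimed,
    expected,
    inferredFont,
    // null means the table has no opinion; false means two stories were told.
    coherent: expected === null ? null : expected === inferredFont,
  };
}

const iphoneUa = "Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) AppleWebKit/605.1.15";
console.log(checkCoherence(iphoneUa, "Noto Color Emoji"));
```

A real detector would weigh this alongside other signals rather than treat a single mismatch as proof, since users can install fonts and some sites serve their own emoji artwork.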
That audit is why emoji rendering pairs naturally with the other rendering-derived signals a detector collects. WebGL fingerprinting reads the GPU renderer string and shader precision; font fingerprinting enumerates the installed text fonts by measuring their metrics. Emoji rendering is the color-font sibling of that font work, and it cross-checks the same OS claim from a different angle. When three independent rendering channels all have to agree on one operating system and the spoof only patched one of them, the disagreement is the detection. The cost of a coherent lie across all of them is high, and it climbs every time a vendor adds a channel.
There is a stability property that makes the emoji channel especially useful as a check rather than a one-shot identifier. The rendering is deterministic. Unlike some canvas reads that vendors have started to perturb at the GPU level, the width a font assigns to a glyph does not change between page loads on the same device. That cuts both ways for a spoofer. It means the value is a reliable cross-session anchor for a tracker, which is the privacy harm. It also means a defense that randomizes the emoji output per load, the way some anti-detect browsers fuzz canvas, immediately fails a different test: a real device returns the same emoji metrics twice in a row, and a device whose metrics drift between reads has announced that it is tampering. Stability is the signal in both directions. The honest device is boringly consistent; the spoof has to choose between matching one platform exactly or looking unstable, and most cannot do the former because they do not have the font.
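The stability test is the simplest of the lot: read twice, compare. A minimal sketch, with a hypothetical function name and an injected readMetrics callback standing in for whatever measurement the page performs.

```javascript
// Returns true when repeated reads of the same metrics are identical.
// readMetrics() should return a JSON-serializable value, for example an
// array of emoji widths.
function isStable(readMetrics, reads = 3) {
  const first = JSON.stringify(readMetrics());
  for (let i = 1; i < reads; i++) {
    if (JSON.stringify(readMetrics()) !== first) return false;
  }
  return true;
}

// An honest device: the font assigns the same widths every time.
const honest = () => [300, 300, 412.5];

// A device that adds fresh noise on every read.
let counter = 0;
const fuzzed = () => [300 + (counter++) * 0.01, 300, 412.5];

console.log(isStable(honest)); // true
console.log(isStable(fuzzed)); // false
```

Noise that is fixed per session would pass this particular test, which is why the stability check is paired with the coherence check rather than used alone.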
A note on honesty here. The exact per-emoji width tables for each OS font version are not published by the platform vendors, and neither DataDome nor the other commercial detectors document which specific codepoints they probe or how they weight the result. What is public is the mechanism: the font formats, the shaping fallback rules in UTS #51, the DOMRect precision, and the Picasso design. The specific production payloads are inferred from the documented mechanism and from the demos that researchers have published, not from vendor specs that lay out a field-by-field probe list. Where this post says “a detector probes X,” read it as “the mechanism makes probing X straightforward and the public demos do exactly that,” not as a leaked vendor schema.
The defenses, and why one browser actually won
The clean fix is conceptually simple and almost nobody ships it: make every install render emoji identically by bundling one emoji font and ignoring the OS font. Tor Browser does this. It carries Twemoji Mozilla and routes emoji through it on every platform, so Tor users on Windows, macOS, and Linux all produce the same emoji metrics. Combined with Tor’s font allowlist, which restricts the browser to a fixed set of fonts and hides the system’s installed fonts entirely, this collapses the emoji and font channels into a single value shared by the whole Tor population. The history is bumpy. Tor’s font defenses broke emoji rendering outright in the 5.5 era, and getting the bundled Twemoji to actually load on every platform took multiple iterations through their tracker. But the principle holds: uniform font, uniform rendering, no signal.
Mainstream browsers have not followed, and the reasons are practical rather than mysterious. Bundling a full color emoji font costs megabytes and means your emoji visibly differ from the rest of the user’s OS, a trade-off that platform vendors with their own emoji designs are not eager to make. Firefox’s Resistance to Fingerprinting mode addresses parts of the surface but, as the engineers noted, DPI and scaling still leak through the geometry APIs, and getClientRects has no agreed safe form. The rounding approaches that look like an obvious fix fail to the binary-search attack. Permission-gating geometry the way canvas extraction is gated would break a large fraction of the web that uses these APIs for ordinary layout. So the channel stays open in the browsers most people use.
There is a slow structural shift working against fingerprinters, though, and it is worth ending on because it is concrete rather than hopeful. As vendors converge on vector color formats, the rendering becomes more reproducible and slightly less platform-bound. Chrome’s 2022 move to a COLRv1 Noto Color Emoji is a step in that direction, and a web-delivered emoji font rendered by the same engine on every OS narrows the gap that bitmap-per-vendor fonts opened. But convergence is partial and slow. Apple, Google, and Microsoft still ship distinct artwork in distinct formats, old devices still carry old fonts that tofu out on new codepoints, and the family-sequence fallback still splits on the platforms that lack the ligature. For now the four bytes of a single man emoji still resolve to four different pictures, four different widths, and a script that reads the width still learns which machine drew it.
Sources & further reading
- Unicode Consortium (2025), UTS #51: Unicode Emoji — the governing standard: ZWJ sequences, variation selectors, skin-tone modifiers, and the explicit rule that unsupported sequences fall back to separate emoji.
- Laperdrix, Bielova, Baudry, Avoine (2020), Browser Fingerprinting: A Survey — ACM Transactions on the Web survey that names emoji as a high-entropy fingerprint vector because of distinct per-platform representations.
- Bursztein, Malyshev, Pietraszek, Thomas (2016), Picasso: Lightweight Device Class Fingerprinting for Web Clients — Google’s canvas challenge that separates an authentic iPhone-Safari stack from emulators and spoofed desktops via rendering output.
- LRZ Security (Browserize) (2020), Fingerprinting Emoji — demo hashing DOMRects of the full Unicode 12.0 emoji set (2,575 codepoints) at 300px to identify the device.
- Mozilla (2018–), Bug 1507879: Investigate getClientRects for fingerprinting — Firefox engineers conclude the geometry API likely cannot be neutered and would need gating or disabling.
- Chrome Developers (2022), COLRv1 Color Gradient Vector Fonts in Chrome 98 — the January 2022 move to vector color fonts, including the Noto Color Emoji size drop from 9 MB to 1.85 MB.
- Wikipedia (2025), Implementation of emojis — reference on the four color-font formats (sbix, CBDT/CBLC, COLR/CPAL, SVG-in-OpenType) and which vendor uses which.
- Castle (2024), Canvas fingerprinting in the wild — production analysis of why emoji are embedded in canvas payloads and how they expose the OS.
- Tor Project (2017–), Investigate problems with Twemoji Mozilla (#40966) — the work to bundle one emoji font across all platforms so Tor users share a single emoji rendering.
- Anish Shobith P S (2024), The hidden architecture of emoji — walkthrough of Core Text, HarfBuzz, and DirectWrite shaping ZWJ sequences through GSUB ligatures.