The Accept, Accept-Language, and Accept-Encoding triad as a browser signature
Open a browser, load a page over HTTPS, and three request headers go out that you never typed and never configured. Accept, Accept-Language, Accept-Encoding. They were designed in the 1990s for content negotiation, so a server could send WebP instead of PNG, French instead of English, Brotli instead of raw bytes. That is still what they do. But the exact strings a browser puts in them are not chosen by you, the page, or the server. They are baked into the browser’s network stack, they differ from build to build, and they are stable enough per client that a detection system can read the three of them together as a label. The question this post asks is a narrow one: how much of a browser’s identity is sitting in those three header values, and what happens to a client whose triad does not look right.
The short answer is that the triad is a weak fingerprint on its own and a strong tripwire in combination. A single header rarely pins down a client. But the three together, read against the User-Agent that claims to have sent them, fail in characteristic ways for HTTP libraries, for headless tooling, and for anyone hand-rolling requests. A real Chrome has a document Accept that ends in application/signed-exchange;v=b3;q=0.7. A Python client that copied a Chrome User-Agent but kept its own Accept: */* has just contradicted itself, and the contradiction is cheap to check on the first request before any JavaScript runs.
The sections below run in order. First the three headers and what each negotiates. Then the exact default values per browser, the part with the most citable detail. Then q-values and the RFC syntax that governs them, because the decimal weights are themselves a fingerprint. Then Accept-Encoding specifically, where Brotli and Zstandard arrival dates split clients by age. Then how the triad gets folded into a header-order fingerprint like JA4H, how mismatches surface, and what the signal is worth in 2026 against an adversary who has read the same MDN page you have.
Three headers, three negotiations
The three headers share a grammar and a job. Each tells the server what representations the client can accept along one axis, ranked by preference, so the server can pick the best one it has. Accept negotiates media type. Accept-Language negotiates natural language. Accept-Encoding negotiates content coding, which in practice means the compression algorithm. RFC 9110, the 2022 consolidation of HTTP semantics, defines all three under proactive content negotiation in section 12.
The mechanics are symmetric. The client offers a comma-separated list of values, optionally weighted, and the server responds with one choice plus a header naming it: Content-Type, Content-Language, Content-Encoding. If the client sends nothing, the server is free to guess. RFC 9110 is explicit that an absent Accept-Encoding lets the server “assume that the client will accept any content coding,” and that the identity coding, meaning no compression, “is always acceptable, unless explicitly refused by the client.” That permissiveness matters for fingerprinting, because it means a bare HTTP client can legally omit the triad entirely and still get a working response. Browsers never omit it. So omission itself is a tell.
What makes the triad a signature rather than just a negotiation is that browsers do not compute these values from anything you can see. The document Accept is a compile-time constant in the browser’s network code. Accept-Language is derived from your UI language setting, but the exact serialization, including which fallback tags get appended and what q-values they carry, is the browser’s own convention. Accept-Encoding is a fixed list gated only by which compression libraries that build was shipped with. None of the three depends on the page being loaded, which is what makes them comparable across sites and stable enough to label a client.
The default values, browser by browser
This is where the citable detail lives, and where most hand-rolled clients fall down. MDN maintains a reference page of default Accept values per browser and per request type, and the strings are specific enough to act as near-exact matches.
For a top-level document navigation, current Chrome (version 131 and the Chromium family around it) sends:
text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7
That trailing application/signed-exchange;v=b3;q=0.7 is a Chrome-ism. Signed HTTP Exchanges are a Google-driven feature, and no other engine advertises them in the document Accept. Its presence is close to a Chromium signature on its own; its absence from a request that claims to be Chrome is a contradiction.
Firefox tells a different story, and the story changed recently. Firefox 132 and later send the lean form:
text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
But Firefox 128 through 131 sent a longer string that inlined image types into the document Accept:
text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/png,image/svg+xml,*/*;q=0.8
The version cut at 132 is the kind of detail that turns a coarse signal into a sharp one. A request advertising the image-laden Firefox Accept is claiming to be a Firefox from a specific, now-superseded window. If the User-Agent says Firefox 134, the two disagree, and a detector that keeps a table of known-good triads per version flags the row.
Safari has been steady. From Safari 13.1 through 18.1 and later, document navigations carry the same minimal value as modern Firefox:
text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Which means Safari and recent Firefox are indistinguishable on the document Accept alone. That is the first lesson of the triad: a single header is often shared across engines, and you need the other two plus the User-Agent to separate them.
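As a minimal sketch of how those three strings could be used, the lookup below maps an exact document Accept value to the engines that send it. The table holds only the values quoted above; the `candidates_for` name and the version labels are illustrative, not any vendor's API.

```python
# Document-navigation Accept strings quoted in this post, keyed to the
# engines said to send them. Labels are illustrative, not exhaustive.
DOCUMENT_ACCEPT = {
    "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,"
    "image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7":
        ["Chrome 131 (Chromium family)"],
    "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,"
    "image/webp,image/png,image/svg+xml,*/*;q=0.8":
        ["Firefox 128-131"],
    "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8":
        ["Firefox 132+", "Safari 13.1-18.1+"],
}


def candidates_for(accept: str) -> list[str]:
    """Return the engines whose document Accept matches byte for byte."""
    # Exact match on purpose: extra spaces or reordered types are a miss.
    return DOCUMENT_ACCEPT.get(accept, [])


if __name__ == "__main__":
    lean = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
    print(candidates_for(lean))   # two engines share this string
    print(candidates_for("*/*"))  # no browser sends this for a navigation
```

The lean string returning two candidates is the point of the paragraph above: the document Accept narrows the field but does not settle it.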
The picture gets richer when you leave the document request. The browser sends a different Accept for an image than for a stylesheet than for the navigation, and each subresource type has its own per-engine string. Fetching an image, Chrome and Edge 121 and later send image/avif,image/webp,image/apng,image/*,*/*;q=0.8. Firefox 128 and later send image/avif,image/webp,image/png,image/svg+xml,image/*;q=0.8,*/*;q=0.5. Safari since Big Sur sends image/webp,image/png,image/svg+xml,image/*;q=0.8,video/*;q=0.8,*/*;q=0.5, and note the video/* in there, which is a Safari quirk no other engine carries on an image request. A stylesheet request collapses the engines back together: Firefox, Safari, and Chrome all send text/css,*/*;q=0.1.
That per-resource variation is itself a check. A real browser loading a page emits a navigation Accept, then a stream of image Accepts, then a CSS Accept, each correctly shaped for its destination. An HTTP client that hardcodes one Accept for every request, no matter what it is fetching, has no way to reproduce that variation without modeling the browser’s resource-type-to-header mapping. The single-Accept-for-everything pattern is one of the oldest tells in the book, and it is visible across a session rather than on any one request.
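One way to sketch the session-level check: group the Accept values seen in a session by resource destination and ask whether a single string was reused across different destinations. How the destination is learned (for example from a Sec-Fetch-Dest header or the URL) is left as an assumption, and `looks_hardcoded` is a hypothetical helper name.

```python
from collections import defaultdict


def accepts_by_destination(session):
    """Group Accept values by destination.

    `session` is an iterable of (destination, accept) pairs. Where the
    destination label comes from is an assumption of this sketch.
    """
    seen = defaultdict(set)
    for destination, accept in session:
        seen[destination].add(accept)
    return seen


def looks_hardcoded(session) -> bool:
    """True when several destination types all carried one identical Accept."""
    seen = accepts_by_destination(session)
    if len(seen) < 2:
        return False  # only one destination type, nothing to compare
    distinct = set().union(*seen.values())
    return len(distinct) == 1


if __name__ == "__main__":
    chrome_like = [
        ("document", "text/html,application/xhtml+xml,application/xml;q=0.9,"
                     "image/avif,image/webp,image/apng,*/*;q=0.8,"
                     "application/signed-exchange;v=b3;q=0.7"),
        ("image", "image/avif,image/webp,image/apng,image/*,*/*;q=0.8"),
        ("style", "text/css,*/*;q=0.1"),
    ]
    library_like = [("document", "*/*"), ("image", "*/*"), ("style", "*/*")]
    print(looks_hardcoded(chrome_like))
    print(looks_hardcoded(library_like))
```

This only catches the crudest case. A stricter version would compare each destination's value against the per-engine strings listed above.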
Q-values: the decimal weights are a fingerprint too
Every value in the triad can carry a weight, written ;q= followed by a number. RFC 9110 section 12.4.2 pins the syntax tightly. A qvalue is either 0 followed by up to three decimal digits, or 1 followed by up to three zeros. The ABNF is exact:
weight = OWS ";" OWS "q=" qvalue
qvalue = ( "0" [ "." 0*3DIGIT ] ) / ( "1" [ "." 0*3("0") ] )
So the legal range is 0 through 1, with at most three decimal places, and a value with no q is treated as q=1, the top preference. The server reads the weights as a ranking and serves the highest-weighted representation it can produce.
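The qvalue rule translates almost directly into a regular expression. The sketch below is one such translation; `is_valid_qvalue` is an illustrative name, and the pattern covers only the qvalue itself, not the surrounding `;q=` and optional whitespace.

```python
import re

# qvalue = ( "0" [ "." 0*3DIGIT ] ) / ( "1" [ "." 0*3("0") ] )
QVALUE = re.compile(r"0(?:\.[0-9]{0,3})?|1(?:\.0{0,3})?")


def is_valid_qvalue(text: str) -> bool:
    """Check a string against the RFC 9110 qvalue grammar."""
    return QVALUE.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["1", "0.8", "1.000", "0.", "0.1234", "1.5", ".5"]:
        print(sample, is_valid_qvalue(sample))
```

Note that the grammar accepts a bare trailing dot such as `0.`, because `0*3DIGIT` allows zero digits, while `.5` and `1.5` are both illegal.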
The fingerprinting value is that browsers pick particular weights, and they pick them consistently. Chrome’s document Accept runs four weight tiers: the explicit types at the implicit q=1, then application/xml;q=0.9, then */*;q=0.8, then application/signed-exchange;v=b3;q=0.7. Firefox runs three, stopping at */*;q=0.8. The choice of 0.8 versus 0.9 versus 0.5 at each position is part of the constant string, not something the browser recomputes, so a client that gets the media types right but the weights wrong has still missed. Reproducing a triad is not just listing the right MIME types. It is listing them in the right order with the right decimal weights and the right wildcard tail.
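To make the weight structure visible, here is a small sketch that splits an Accept value into its q tiers. It is a deliberately naive parser (it assumes no quoted parameters containing commas or semicolons), and `weight_tiers` is an illustrative name.

```python
def weight_tiers(accept: str) -> dict[float, list[str]]:
    """Group media ranges by q-value, highest weight first.

    Naive split on "," and ";", which is fine for the browser defaults
    quoted in this post but not a full RFC 9110 parser.
    """
    tiers: dict[float, list[str]] = {}
    for item in accept.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_range, params = parts[0], parts[1:]
        q = 1.0  # a value with no q parameter is treated as q=1
        kept = []
        for param in params:
            if param.lower().startswith("q="):
                q = float(param[2:])
            else:
                kept.append(param)  # e.g. v=b3 on signed-exchange
        tiers.setdefault(q, []).append(";".join([media_range] + kept))
    return dict(sorted(tiers.items(), reverse=True))


if __name__ == "__main__":
    chrome = ("text/html,application/xhtml+xml,application/xml;q=0.9,"
              "image/avif,image/webp,image/apng,*/*;q=0.8,"
              "application/signed-exchange;v=b3;q=0.7")
    for q, ranges in weight_tiers(chrome).items():
        print(q, ranges)
```

Running it on a copied string is a quick way to see whether both the media types and their weights line up with the browser being imitated.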
Accept-Language is where weights do the most visible work, because here the values come from your settings rather than a compile-time constant. The header lists the locales you prefer, highest first, with decreasing q-values down the list. MDN’s canonical example is fr-CH, fr;q=0.9, en;q=0.8, de;q=0.7, *;q=0.5: Swiss French at the implicit q=1, then any French at 0.9, English at 0.8, German at 0.7, anything at 0.5. The header generally lists the same locales as the navigator.languages JavaScript property, with decreasing weights, and a browser commonly appends a language-only fallback after a region-specific tag. MDN gives the shape directly: when navigator.languages is ["en-US", "zh-CN"], the header serializes as en-US,en;q=0.9,zh-CN;q=0.8,zh;q=0.7.
That serialization convention is a tell, because the spacing of the weights is the browser’s own. Chrome steps the q-values in a fixed pattern as it appends fallback tags. A hand-built Accept-Language that lists the same locales but spaces the weights differently, or omits the language-only fallback after a region tag, or rounds to a different number of decimals, does not match the convention. The locales tell you the user’s region. The weight pattern tells you which browser serialized them.
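A rough sketch of that convention, reproducing the MDN example: append a language-only fallback after the last tag that shares its base language, then step the weights down by 0.1 per entry. The fallback placement and the 0.1 floor are assumptions about Chromium-style behavior, not a copy of any browser's source, and `serialize_accept_language` is a hypothetical name.

```python
def serialize_accept_language(languages: list[str]) -> str:
    """Serialize a navigator.languages-style list into an Accept-Language value."""
    expanded: list[str] = []
    for index, tag in enumerate(languages):
        if tag not in expanded:
            expanded.append(tag)
        base = tag.split("-")[0]
        # Assumption: the bare language goes after the LAST tag sharing it.
        shared_later = any(
            later.split("-")[0] == base for later in languages[index + 1:]
        )
        if base != tag and base not in expanded and not shared_later:
            expanded.append(base)

    parts = []
    for position, tag in enumerate(expanded):
        if position == 0:
            parts.append(tag)  # implicit q=1
        else:
            tenths = max(10 - position, 1)  # 0.9, 0.8, ... floored at 0.1
            parts.append(f"{tag};q=0.{tenths}")
    return ",".join(parts)


if __name__ == "__main__":
    print(serialize_accept_language(["en-US", "zh-CN"]))
    # en-US,en;q=0.9,zh-CN;q=0.8,zh;q=0.7
```

A hand-built header that lists the same locales with, say, `q=0.5` on the second entry would carry the same regional information and still fail to match this shape.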
There is a privacy countercurrent worth naming. Because Accept-Language leaks locale and is high-entropy when it lists several uncommon languages, browsers have been trimming it. Safari sends a single language by default rather than the full list, and Chrome’s incognito mode may reduce to one as well. A short Accept-Language is no longer evidence of a bot; it can be a privacy-conscious browser. That is the recurring tension in the whole triad. The same trimming that reduces a user’s fingerprint also erases the detail a defender used to lean on.
Accept-Encoding, and the compression timeline that dates a client
Accept-Encoding is the most mechanical of the three and, for client identification, often the most useful, because the set of compression tokens a browser advertises is a function of when that browser was built. The directives are a small fixed vocabulary: gzip is LZ77 with a CRC, deflate is zlib, br is Brotli, zstd is Zstandard, compress is the old LZW coding nobody ships, identity means no compression, and * is a wildcard. RFC 9110 defines the lot.
The modern default that current Chromium sends is:
Accept-Encoding: gzip, deflate, br, zstd
The presence of zstd is a hard date stamp. Chrome shipped Zstandard as a Content-Encoding in Chrome 123, which reached beta on 21 February 2024 and released in March 2024. Before that, the default was gzip, deflate, br. So a request advertising zstd is a browser from spring 2024 or later, and a request claiming to be a 2025 Chrome that omits zstd has contradicted its own User-Agent. Brotli, the br token, dates a client the same way one generation back. Chrome added Brotli around 2016, Firefox in the same era, so by the early 2020s br was universal among browsers and its absence marked an old or non-browser client.
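The date-stamp argument can be sketched as a check between the claimed Chrome major version and the presence of the zstd token. This assumes an HTTPS request and a User-Agent containing a `Chrome/<major>` token; the function names are illustrative, and the 123 threshold is the one cited above.

```python
import re

ZSTD_MIN_CHROME = 123  # Chrome version cited above as first shipping zstd


def encoding_tokens(accept_encoding: str) -> list[str]:
    """Split an Accept-Encoding value into lowercase coding tokens."""
    return [
        token.split(";")[0].strip().lower()
        for token in accept_encoding.split(",")
        if token.strip()
    ]


def chrome_major(user_agent: str):
    """Extract the claimed Chrome major version, or None if absent."""
    match = re.search(r"Chrome/(\d+)", user_agent)
    return int(match.group(1)) if match else None


def zstd_contradiction(user_agent: str, accept_encoding: str) -> bool:
    """True when the claimed Chrome version and the zstd token disagree."""
    major = chrome_major(user_agent)
    if major is None:
        return False  # not claiming Chrome, so this check does not apply
    has_zstd = "zstd" in encoding_tokens(accept_encoding)
    return (major >= ZSTD_MIN_CHROME) != has_zstd


if __name__ == "__main__":
    ua = "Mozilla/5.0 (X11; Linux x86_64) Chrome/134.0.0.0 Safari/537.36"
    print(zstd_contradiction(ua, "gzip, deflate, br"))        # claims 134, no zstd
    print(zstd_contradiction(ua, "gzip, deflate, br, zstd"))  # consistent
```

The check is symmetric on purpose: an old claimed version that advertises zstd is as inconsistent as a new one that omits it.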
Order is part of the signature here too, and it is rigid. Browsers send gzip, deflate, br, zstd in that sequence. The tokens carry no q-values in the browser default. An HTTP library that sends Accept-Encoding: deflate, gzip or Accept-Encoding: gzip alone is advertising a different, smaller, differently-ordered set than any browser, and that is detectable without decoding anything. There is a subtle trap on the receiving end: a client that advertises br or zstd must actually be able to decode the response if the server compresses with it. A scraper that copies the browser Accept-Encoding string but lacks a Brotli or Zstd decoder is requesting a coding it cannot read, and it will fail to parse the body the moment the server takes it at its word. The advertised header and the set of codings the client can actually decode have to agree, which is a constraint a careless client violates.
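From the client side, the agreement constraint can be sketched as building the header from what the local Python environment can actually decode. The check below assumes the third-party `brotli` and `zstandard` packages are the decoders in use; other decoder packages exist and are not detected here.

```python
import importlib.util


def decodable_accept_encoding() -> str:
    """Build an Accept-Encoding value limited to codings this process can decode."""
    # gzip and deflate are always available via the standard library (zlib).
    tokens = ["gzip", "deflate"]
    # Assumption: Brotli support comes from the third-party `brotli` package.
    if importlib.util.find_spec("brotli") is not None:
        tokens.append("br")
    # Assumption: Zstandard support comes from the `zstandard` package.
    if importlib.util.find_spec("zstandard") is not None:
        tokens.append("zstd")
    return ", ".join(tokens)


if __name__ == "__main__":
    print(decodable_accept_encoding())
```

On a machine without those packages this prints a shorter list than any current browser sends, which is exactly the gap the paragraph above describes: the honest header and the browser-shaped header only coincide when the decoders are really there.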
There is a further wrinkle arriving now: shared dictionary compression. Chrome’s compression-dictionary-transport feature lets a client advertise dictionary-aware codings, and when a matching dictionary is available the request carries extra tokens. The Chrome team’s own blog originally named these br-d and zstd-d, then renamed them to dcb and dcz, with an Available-Dictionary header carrying a hash of the dictionary the client holds. This is bleeding-edge as of 2026 and not universal, so it is more useful as a positive signal of a very recent Chromium than as something whose absence means much. The exact rollout state varies by Chrome channel, and I would treat any claim about which fraction of traffic carries dcb/dcz as something to measure rather than assume.
Folding the triad into a header-order fingerprint
The triad’s values matter, but where they sit in the request and how they are cased and ordered matters just as much, and that is the layer modern fingerprints capture. JA4H, the HTTP-request member of FoxIO’s JA4+ suite, is the most concrete public example. It hashes the shape of an HTTP request, and two of its components are built directly from the triad.
JA4H has an a_b_c_d structure. The a section is a readable prefix. Reading FoxIO’s reference and the behavior of open implementations, it encodes the HTTP method as two characters (ge for GET, po for POST), the version as two digits (11 for HTTP/1.1), a flag for whether a Cookie header is present (c or n), a flag for whether a Referer is present (r or n), a two-digit count of headers that explicitly does not count Cookie and Referer, and then the first four characters of the primary Accept-Language value with the hyphen stripped. The reference go-ja4h implementation makes this legible: a bare curl produces ge11nn020000, the same curl with Accept-Language: en-us produces ge11nn03enus, and a POST of that with a cookie produces po11cn03enus. The enus tail is the language tag folded straight into the fingerprint.
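The prefix described above is simple enough to sketch directly. This follows the field layout as read in this post, not FoxIO's reference code; the header lists passed in are assumptions about what each curl invocation sends (with Host left out of the count, as in the Go example), and `ja4h_a` is an illustrative name.

```python
def ja4h_a(method: str, http_version: str, headers: list[tuple[str, str]]) -> str:
    """Build a JA4H-style 'a' prefix from a method, version, and ordered headers."""
    names = [name.lower() for name, _ in headers]
    cookie_flag = "c" if "cookie" in names else "n"
    referer_flag = "r" if "referer" in names else "n"
    # Cookie and Referer are excluded from the header count.
    count = sum(1 for name in names if name not in ("cookie", "referer"))

    accept_language = next(
        (value for name, value in headers if name.lower() == "accept-language"), ""
    )
    primary = accept_language.split(",")[0].split(";")[0].strip().lower()
    lang = primary.replace("-", "")[:4].ljust(4, "0")  # "0000" when absent

    version = http_version.replace(".", "")
    if len(version) == 1:
        version += "0"  # assumption: "2" is written as "20"
    return f"{method[:2].lower()}{version}{cookie_flag}{referer_flag}{min(count, 99):02d}{lang}"


if __name__ == "__main__":
    bare = [("User-Agent", "curl/8.0"), ("Accept", "*/*")]
    with_lang = bare + [("Accept-Language", "en-us")]
    with_cookie = with_lang + [("Cookie", "a=b")]
    print(ja4h_a("GET", "1.1", bare))          # ge11nn020000
    print(ja4h_a("GET", "1.1", with_lang))     # ge11nn03enus
    print(ja4h_a("POST", "1.1", with_cookie))  # po11cn03enus
```

The three printed values match the go-ja4h examples quoted above under these assumed header lists.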
So Accept-Language is doubly exposed. Its value is a signal, and its first four characters are a literal, human-readable field in the JA4H prefix. A bot that sets Accept-Language: zh-CN while presenting a User-Agent and IP geolocation that say United States writes zhcn into its own fingerprint prefix, a mismatch a defender can pattern-match without any hashing.
The b section of JA4H is where ordering lives. It is a truncated SHA-256 over the request header names in the order they appear, which means the relative position of Accept, Accept-Language, and Accept-Encoding among the other headers is captured in the hash. FoxIO’s spec notes the header count excludes Cookie and Referer, since those are session-dependent and would make the fingerprint unstable. The ordering point connects to a broader truth about HTTP/1.1: clients emit headers in a characteristic sequence. Firefox tends toward Host, User-Agent, Accept, Accept-Language, Accept-Encoding. Chrome interleaves its sec-ch-ua client hints earlier. Python’s default client orders things its own way, with Accept-Encoding and Accept in positions no browser uses. The triad’s place in that sequence is part of the label, which is exactly the territory covered in header order and casing as a fingerprint.
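A sketch of the ordering hash shows why position matters. The details here are assumptions rather than FoxIO's reference behavior: header names are joined with commas, Cookie and Referer are dropped, and the SHA-256 hex digest is truncated to 12 characters.

```python
import hashlib


def ja4h_b(header_names: list[str]) -> str:
    """Hash header names in the order sent (a JA4H 'b'-style component)."""
    # Cookie and Referer are session-dependent, so they are left out.
    kept = [n for n in header_names if n.lower() not in ("cookie", "referer")]
    digest = hashlib.sha256(",".join(kept).encode("utf-8")).hexdigest()
    return digest[:12]  # assumption: 12 hex characters of truncation


if __name__ == "__main__":
    firefox_like = ["Host", "User-Agent", "Accept",
                    "Accept-Language", "Accept-Encoding"]
    reordered = ["Host", "User-Agent", "Accept-Encoding",
                 "Accept", "Accept-Language"]
    print(ja4h_b(firefox_like))
    print(ja4h_b(reordered))  # same names, different order, different hash
```

Two clients sending identical triad values but placing them differently among the other headers end up with different hashes, which is the property the paragraph above relies on.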
JA4H is one implementation, and the JA4+ family it belongs to runs deeper at the TLS and HTTP/2 layers, which is the subject of TLS fingerprinting from ClientHello bytes to JA4. The reason the triad gets folded into request-level fingerprints rather than checked in isolation is that the combination is what carries information. Each header alone is low-entropy and easy to copy. The order, the casing, the q-value spacing, and the cross-header agreement are harder to get all right at once.
How a mismatched triad gives a client away
Put the pieces together and the detection logic is less about any single value and more about consistency across the request. A few failure modes recur.
The first is omission. A browser always sends all three headers. An HTTP client that sends a request line, a Host, and nothing else, or sends only Accept-Encoding: gzip, is advertising a header set no browser produces. RFC 9110 permits it; browsers never do it. The absence of Accept-Language, in particular, is unusual for a real browser navigation, and its absence under a browser User-Agent is a cheap red flag.
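The omission check is the simplest of the lot. A minimal sketch, with `missing_triad` as an illustrative name and header names compared case-insensitively:

```python
TRIAD = ("accept", "accept-language", "accept-encoding")


def missing_triad(headers: dict[str, str]) -> list[str]:
    """Return the triad headers that a request did not send."""
    present = {name.lower() for name in headers}
    return [name for name in TRIAD if name not in present]


if __name__ == "__main__":
    bare_client = {"Host": "example.com", "Accept-Encoding": "gzip"}
    print(missing_triad(bare_client))  # ['accept', 'accept-language']
```

Any non-empty result under a browser User-Agent is the cheap red flag described above.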
The second is the wrong default value. This is the Python requests problem in its purest form. The library’s default Accept has historically been */*, which no browser sends for a document navigation. A scraper that sets a Chrome User-Agent but leaves the library’s Accept: */* in place has a triad that does not match the engine it claims to be. The fix is to copy the exact browser strings, which raises the bar but does not eliminate the check, because the strings have to match the claimed version, and they drift.
The third is version drift. The Firefox 132 cut and the Chrome 123 zstd cut are both examples. A triad frozen from a 2023 browser capture, replayed in 2026, advertises an outdated Accept-Encoding set or an outdated Firefox Accept, and a detector with a current table notices the staleness. Maintaining a copied triad means tracking the browser’s release cadence, which is real ongoing work rather than a one-time copy.
Framed as a check a defender runs, the logic is a lookup rather than a model. The server keeps a table of known-good triads keyed by browser family and version, populated from real browser captures and refreshed as new builds ship. For each incoming request it parses the claimed browser out of the User-Agent, looks up the expected triad, and compares. In pseudocode the shape is this:
expected = triad_table[ua.family][ua.major_version]
if request.accept != expected.accept:
    flag("accept-mismatch")
if request.accept_encoding != expected.accept_encoding:
    flag("encoding-mismatch")
if not consistent(request.accept_language, ua, ip_geo):
    flag("language-geo-mismatch")
That last comparison is the loose one, because Accept-Language legitimately varies per user while Accept and Accept-Encoding are fixed per build. So the language check is a consistency test against the rest of the request rather than an exact-string match, while the other two are exact. The table is the maintenance burden: it has to track every browser’s release cadence, or it starts flagging real users whose browsers updated past the captured row. That maintenance cost is the mirror image of the forger’s, and whichever side falls behind the release schedule pays for it.
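A runnable version of that lookup might look like the sketch below. The table holds a single row built from the Chrome values quoted in this post; `language_consistent` is a deliberately naive stand-in that compares the region of the primary language tag with an IP country code, and every name here is hypothetical rather than any vendor's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triad:
    accept: str
    accept_encoding: str


# One illustrative row; a real table would be refreshed from browser captures.
TRIAD_TABLE = {
    ("chrome", 131): Triad(
        accept=("text/html,application/xhtml+xml,application/xml;q=0.9,"
                "image/avif,image/webp,image/apng,*/*;q=0.8,"
                "application/signed-exchange;v=b3;q=0.7"),
        accept_encoding="gzip, deflate, br, zstd",
    ),
}


def language_consistent(accept_language: str, ip_country: str) -> bool:
    """Naive placeholder: primary tag's region must equal the IP country."""
    primary = accept_language.split(",")[0].split(";")[0].strip()
    if "-" not in primary:
        return True  # no region to compare, so nothing to contradict
    return primary.split("-")[1].upper() == ip_country.upper()


def check_triad(family: str, major: int, headers: dict[str, str],
                ip_country: str) -> list[str]:
    """Compare a document request's triad with the row for the claimed build."""
    expected = TRIAD_TABLE.get((family, major))
    if expected is None:
        return ["unknown-build"]
    flags = []
    if headers.get("accept") != expected.accept:
        flags.append("accept-mismatch")
    if headers.get("accept-encoding") != expected.accept_encoding:
        flags.append("encoding-mismatch")
    if not language_consistent(headers.get("accept-language", ""), ip_country):
        flags.append("language-geo-mismatch")
    return flags


if __name__ == "__main__":
    library_like = {"accept": "*/*", "accept-encoding": "gzip, deflate",
                    "accept-language": "en-US"}
    print(check_triad("chrome", 131, library_like, "US"))
```

The exact-match comparisons are the strict half of the design. The language check would need far more tolerance in practice, since travelers and multilingual users legitimately break a region-equals-country rule.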
The fourth, and the one that survives the most effort, is cross-signal disagreement. The triad does not live alone. It rides on a TLS ClientHello whose JA3 or JA4 hash names a client library, over a TCP stack whose options name an operating system, behind an IP whose geolocation names a country. The Accept-Language can say one thing, the TLS fingerprint another, the OS a third, the IP a fourth. A request whose Accept-Language: ja-JP arrives over a ClientHello that fingerprints as Go’s standard library, from a datacenter IP in Virginia, is internally inconsistent in a way no real Japanese browser user produces. This is the same class of contradiction that detecting a proxy by OS mismatch exploits at the network layer, applied to the application layer. The triad’s job in a modern stack is not to be a unique fingerprint. It is to be one more axis that has to agree with all the others, and the more axes a detector checks, the harder it is for a forged client to be consistent on every one of them at once. That same cross-layer logic is why HTTP/2 multiplexing changed what servers can fingerprint: each new layer adds a constraint the others have to honor.
What the triad is worth in 2026
The three Accept headers are a weak identifier and a strong consistency check, and the gap between those two roles is the whole point. On its own, a document Accept shared between Safari and modern Firefox tells you almost nothing. The same Accept-Encoding: gzip, deflate, br, zstd goes out from hundreds of millions of Chromium installs. Measured as raw entropy, the triad is thin. But entropy is the wrong frame for what it does. Its value is conditional: given a User-Agent claiming Chrome 134, the triad either matches what that build actually sends or it does not, and the mismatch is what gets scored. A defender is not asking who you are. The defender is asking whether the headers you sent are the headers the browser you claim to be would have sent, and that question has a crisp answer the client cannot dodge by being generic.
Two forces are pulling on this signal at once. Browsers are trimming the triad for privacy, Safari sending one language, Chrome reducing Accept-Language in private modes, which erases detail and makes a short header less suspicious than it used to be. At the same time the values keep drifting on every release, the Firefox 132 document Accept, the Chrome 123 zstd, the dcb/dcz dictionary tokens now arriving, which keeps a frozen copied triad detectable for as long as the copier fails to track the cadence. The triad is not getting stronger as a standalone fingerprint. It is getting cheaper to maintain on the forging side and cheaper to cross-check on the defending side, and the side with more axes to compare wins more of the marginal cases. The most reliable thing a triad still does in 2026 is contradict the rest of the request, and a header that contradicts its own User-Agent before a single line of JavaScript has run is the kind of evidence that does not require a challenge to collect.
Sources & further reading
- MDN Web Docs (2026), List of default Accept values — the per-browser, per-resource default Accept strings, including the Chrome 131, Firefox 128-132, and Safari 13.1-18.1 values quoted here.
- MDN Web Docs (2026), Accept-Encoding header — the directive vocabulary (gzip, deflate, br, zstd, identity, wildcard) and the q-value weighting examples.
- MDN Web Docs (2026), Accept-Language header — the fr-CH example, the navigator.languages serialization convention, and the Safari/Chrome single-language privacy behavior.
- IETF (2022), RFC 9110: HTTP Semantics — section 12.4.2 qvalue ABNF, and the content-negotiation rules for absent Accept-Encoding and the always-acceptable identity coding.
- FoxIO-LLC (2024), JA4+ Network Fingerprinting suite — the JA4H HTTP-request fingerprint, its a_b_c_d structure, and the rule that the header count excludes Cookie and Referer.
- lum8rjack (2024), go-ja4h: a JA4H implementation in Go — worked examples (ge11nn020000, ge11nn03enus, po11cn03enus) showing the Accept-Language tag folded into the JA4H prefix.
- Chrome for Developers (2024), Shared dictionary compression — the dcb/dcz dictionary-aware Accept-Encoding tokens and the Available-Dictionary header.
- Chrome Platform Status (2024), Zstd Content-Encoding — the feature entry for Zstandard support that shipped in Chrome 123.
- lwthiker (2022), HTTP/2 fingerprinting — why HTTP/1.1 header values like Accept are easy to fake and why the pseudo-header and frame layers carry more reliable signal.
- Electronic Frontier Foundation, About Cover Your Tracks — the browser-uniqueness methodology that counts HTTP ACCEPT headers and the User-Agent among the attributes it measures.
- Cloudflare (2024), JA4 signals — how JA4-family fingerprints and inter-request signals are used together for client classification at the edge.