
The TLS ClientHello, field by field: a fingerprinting reference

· 19 min read
Copyright: MIT

A TLS handshake begins with the client talking first, in the clear, before any key has been agreed. That first message is the ClientHello, and it is the single richest unencrypted artefact a bot-detection system gets on the very first packet of a connection. No JavaScript has run. No cookie has been set. The TCP handshake just finished. And already the client has volunteered a structured list of everything its TLS stack supports, in an order it chose, padded the way it pads, with the quirks of whatever library built it.

That list is a fingerprint whether the client wants it to be one or not. Two questions follow from there. Which exact bytes carry identity, and which are noise the spec forces everyone to send identically? This post answers the first by walking every field a fingerprinting scheme reads, and answers the second by being precise about which fields are fixed by RFC 8446 and which are a free choice the implementation makes. If you want the history of how the fingerprints themselves were built, the companion post TLS fingerprinting: from ClientHello bytes to JA4 covers that arc. This one is the field reference.

The sections below follow the wire order of the message. First the fixed envelope: the legacy version, the random, the session ID, the compression byte. Then the part everyone reads: cipher suites. Then the extension block, where most of the modern signal lives, and inside it the four extensions that matter most for identity (supported_versions, supported_groups, signature_algorithms, and key_share) plus ALPN. The last two sections cover how JA3 and JA4 turn those bytes into a string, and what GREASE and Chrome’s extension permutation did to the whole exercise.

The fixed envelope: version, random, session ID, compression

The ClientHello struct is defined in RFC 8446 section 4.1.2. Stripped to its declaration it looks like this:

struct {
    ProtocolVersion legacy_version = 0x0303;
    Random random;
    opaque legacy_session_id<0..32>;
    CipherSuite cipher_suites<2..2^16-2>;
    opaque legacy_compression_methods<1..2^8-1>;
    Extension extensions<8..2^16-1>;
} ClientHello;

The word legacy appears twice, and that is the first thing to internalize. TLS 1.3 was designed to look like TLS 1.2 on the wire so that middleboxes which had only ever seen 1.2 would pass it through untouched. The real version negotiation moved into an extension. So legacy_version MUST be set to 0x0303, the version number for TLS 1.2, on every TLS 1.3 ClientHello. It is a fixed value. A fingerprint that reads this field learns almost nothing, because almost every modern client writes the same two bytes there. JA3 reads it anyway (more on that below), which is part of why JA3 was never as discriminating as the cipher list it sits next to.

Figure: ClientHello, wire order (fixed-envelope fields): legacy_version (03 03), random (32 bytes), session_id (0..32 B), cipher_suites ("identity here"), compression (00), extensions <8..2^16-1> ("most modern signal lives in here"). Grey boxes are RFC-fixed or low-signal. Orange boxes are where the client's choices leak. Not to scale; the extension block is usually the largest part of the message.

*The fixed envelope. legacy_version is always 03 03 on a TLS 1.3 hello, and compression is always a single zero byte. The two orange fields are where implementations differ.*

The random is 32 bytes from a secure random number generator. In TLS 1.2 the first four bytes were a Unix timestamp by convention, which leaked clock skew, but TLS 1.3 dropped that and the whole field is now random. It is per-connection by definition, so it carries no cross-connection identity and no fingerprint reads it. It exists for key derivation, not for you.

legacy_session_id is the vestige of TLS 1.2 session resumption. In TLS 1.3, resumption moved to pre-shared keys, so this field is dead weight, except for one detail: when a client runs in “compatibility mode” it puts a fresh 32-byte value here so the handshake superficially resembles a 1.2 resumption and middleboxes stay happy. Whether a client sends an empty session ID or a 32-byte one is itself a small behavioural tell, and it interacts with the separate question of session resumption signals covered in the TLS session resumption fingerprint. The field’s length is the signal, not its contents, since the contents are random.

Then legacy_compression_methods. This one is almost comically fixed. TLS once supported compression, CRIME killed it in 2012, and TLS 1.3 buried it: for every TLS 1.3 ClientHello this vector MUST contain exactly one byte, set to zero, the null compression method. So the field is always 01 00 on the wire (a length byte of one, then the single method, value zero). A non-zero compression list in 2026 means either a genuinely ancient client or a hand-rolled stack that did not get the memo. JA3 does not include compression among its five fields, but a detector can check the byte directly, and this is one of the rare fixed fields that actually discriminates, because almost nothing legitimate varies it anymore.
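The envelope walk above can be sketched as a small parser. This is a minimal illustration, not a hardened implementation: it assumes the input is the ClientHello body with the record and handshake headers already stripped, it does no bounds checking, and the names Envelope and parse_client_hello_body are made up for this example. The sample bytes are a toy hello, shorter than anything a real client sends.

```python
import struct
from dataclasses import dataclass


@dataclass
class Envelope:
    legacy_version: int
    random: bytes
    session_id: bytes
    cipher_suites: list[int]
    compression_methods: bytes
    extensions: bytes  # raw extension block, left unparsed here


def parse_client_hello_body(body: bytes) -> Envelope:
    """Walk the fixed envelope in wire order (hypothetical helper, no bounds checks)."""
    pos = 0
    (legacy_version,) = struct.unpack_from("!H", body, pos)
    pos += 2
    random = body[pos:pos + 32]
    pos += 32
    sid_len = body[pos]  # one-byte length prefix, 0..32
    pos += 1
    session_id = body[pos:pos + sid_len]
    pos += sid_len
    (cs_len,) = struct.unpack_from("!H", body, pos)  # length in bytes, not entries
    pos += 2
    cipher_suites = [c for (c,) in struct.iter_unpack("!H", body[pos:pos + cs_len])]
    pos += cs_len
    comp_len = body[pos]
    pos += 1
    compression_methods = body[pos:pos + comp_len]
    pos += comp_len
    (ext_len,) = struct.unpack_from("!H", body, pos)
    pos += 2
    extensions = body[pos:pos + ext_len]
    return Envelope(legacy_version, random, session_id,
                    cipher_suites, compression_methods, extensions)


# Toy hello: 0x0303, zeroed random, 32-byte session ID, three suites,
# null compression, and one supported_versions extension.
sample = (
    bytes.fromhex("0303") + bytes(32)
    + bytes([32]) + bytes(32)
    + bytes.fromhex("0006130113021303")
    + bytes.fromhex("0100")
    + bytes.fromhex("0007002b0003020304")
)
hello = parse_client_hello_body(sample)
print(hex(hello.legacy_version), len(hello.session_id), hello.compression_methods)
```

The two cheap checks a detector gets from this are the session ID length (0 or 32) and whether compression_methods is anything other than a single zero byte.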

cipher_suites: the oldest signal and still a good one

The cipher suites list is where TLS fingerprinting started, and it is still the load-bearing field. Each entry is a two-byte code identifying a record-protection algorithm and a hash for HKDF, and the client sends them in descending order of preference. That ordering is the whole game. The RFC requires the server to ignore suites it does not recognize and pick from what remains, which means the client is free to list its suites in whatever order it likes, and every TLS stack has its own opinion about that order.

Order is identity here in a way that bears repeating, because it is the single most important idea in TLS fingerprinting and it generalizes to every list-valued field below. Chrome’s cipher order is not Firefox’s. Neither matches Go’s crypto/tls default, or OpenSSL’s, or NSS’s. The set of suites overlaps heavily (everyone supports the same handful of AEAD ciphers in 2026) but the sequence is a near-unique signature of the library and its version. A client that advertises the right suites in the wrong order is, from a detector’s point of view, wearing a convincing mask with the eye-holes in the wrong place. The dedicated post on cipher suite ordering as a fingerprint goes deeper on why the order, specifically, is so hard to fake without reimplementing the target’s stack.

Figure: Same suites, different order = different fingerprint. Client A sends 1301 1302 1303 c02b; Client B sends 1302 1303 1301 c02b. Identical set, different sequence, so a different JA3 hash.

*JA3 hashes the suites in the order sent, so A and B above produce different fingerprints despite advertising exactly the same ciphers. JA4 sorts them, which trades that sensitivity for stability.*
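The Client A and Client B comparison can be reproduced in a few lines. This is a sketch under assumptions: the JA3 side builds only the cipher field (hyphen-joined decimals), and the JA4 side follows my reading of the FoxIO description (four-character lowercase hex, sorted, comma-joined, SHA-256 truncated to 12 characters). GREASE filtering is left out because neither toy list contains any, and the function names are illustrative.

```python
import hashlib

client_a = [0x1301, 0x1302, 0x1303, 0xC02B]
client_b = [0x1302, 0x1303, 0x1301, 0xC02B]


def ja3_cipher_field(suites):
    """Cipher portion of a JA3 string: decimals in send order."""
    return "-".join(str(s) for s in suites)


def ja4_b(suites):
    """Sorted-cipher hash in the style of JA4_b (assumed format, see lead-in)."""
    joined = ",".join(sorted(f"{s:04x}" for s in suites))
    return hashlib.sha256(joined.encode()).hexdigest()[:12]


print(ja3_cipher_field(client_a))  # 4865-4866-4867-49195
print(ja3_cipher_field(client_b))  # 4866-4867-4865-49195
print(ja4_b(client_a) == ja4_b(client_b))  # True: sorting erases the order
```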

There is a wrinkle that trips up naive parsers: not every two-byte value in this list is a cipher. Some are signalling values. 00 FF is the TLS_EMPTY_RENEGOTIATION_INFO_SCSV marker, and 56 00 is TLS_FALLBACK_SCSV, the downgrade-protection marker from RFC 7507. JA4 keeps SCSV in the list it hashes but strips GREASE values, which is the correct behaviour, because SCSV is a deliberate, stable client choice while GREASE is random by design. Conflating the two is a common implementation bug.
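A filter that separates the two cases might look like the sketch below. The GREASE test relies on the reserved pattern from RFC 8701 (both bytes equal, low nibble 0xA); the function names and the sample list are illustrative.

```python
TLS_EMPTY_RENEGOTIATION_INFO_SCSV = 0x00FF


def is_grease(value: int) -> bool:
    """True for the sixteen reserved values 0x0A0A, 0x1A1A, ..., 0xFAFA."""
    high, low = value >> 8, value & 0xFF
    return high == low and (low & 0x0F) == 0x0A


def strip_grease(values):
    """Drop GREASE but keep everything else, including SCSV markers."""
    return [v for v in values if not is_grease(v)]


offered = [0x1A1A, 0x1301, 0x1302, TLS_EMPTY_RENEGOTIATION_INFO_SCSV]
print([f"{v:04x}" for v in strip_grease(offered)])  # ['1301', '1302', '00ff']
```

A parser that instead filtered on "is this a known cipher" would drop 00 FF along with the GREASE value and produce a fingerprint that matches nobody.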

The extension block and why order moved there

After compression comes the extensions vector, and in a modern ClientHello it is the largest part of the message. Each extension is a two-byte type, a two-byte length, and a body. TLS 1.3 pushed a great deal into extensions that used to live in fixed fields, including the actual protocol version, so reading the extension block is no longer optional for a detector that wants to know what it is really talking to.

For years the extension order was as good a fingerprint as the cipher order, for the same reason: each stack laid its extensions out in a fixed sequence, and that sequence was a signature. JA3 hashed the extension list in send-order precisely to capture this. Then Chrome changed the rules, and the story of the extension block is now inseparable from that change, so it gets its own section near the end. For now the important structural facts are these. There is no required ordering of extensions in the spec, with one exception: pre_shared_key, if present, MUST be the last extension, because the PSK binder is computed over everything that precedes it. Everything else is free to move. And the set of extension types present, independent of order, is still strong signal even after order became unreliable.
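The type-length-body layout and the one ordering rule can be sketched as follows. It assumes the input is the raw extension block with its outer two-byte length already removed; the helper names and the toy bytes are illustrative, and 0x0029 is the registered type for pre_shared_key.

```python
import struct

PRE_SHARED_KEY = 0x0029


def parse_extensions(block: bytes):
    """Split an extension block into (type, body) pairs in send order."""
    out, pos = [], 0
    while pos + 4 <= len(block):
        ext_type, length = struct.unpack_from("!HH", block, pos)
        pos += 4
        out.append((ext_type, block[pos:pos + length]))
        pos += length
    return out


def psk_is_last(exts) -> bool:
    """pre_shared_key, if present, must be the final extension."""
    types = [t for t, _ in exts]
    return PRE_SHARED_KEY not in types or types[-1] == PRE_SHARED_KEY


# Toy block: supported_versions followed by a placeholder pre_shared_key body.
block = bytes.fromhex("002b0003020304" "00290002abcd")
exts = parse_extensions(block)
print([f"{t:04x}" for t, _ in exts], psk_is_last(exts))
# The order-independent signal is simply the set of types present.
print(sorted({t for t, _ in exts}))
```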

Four extensions inside that block deserve individual treatment, because they carry the most identity and because they are where a spoofed client most often gets caught lying.

supported_versions: where the real version lives

This is the extension, type 0x002b, that actually negotiates TLS 1.3. The client lists every TLS version it is prepared to use, in preference order, and the server picks from that list rather than from the legacy_version field. A TLS 1.3 client puts 0x0304 (1.3) first, usually followed by 0x0303 (1.2). The presence of this extension is what makes a hello a 1.3 hello at all.

For fingerprinting this matters in a specific, easily-missed way. JA4 derives the version digits in its first segment from this extension, not from legacy_version. The spec is explicit: if extension 0x002b exists, the version is the highest non-GREASE value in it, mapped to a two-character code (13 for 1.3, 12 for 1.2, down through 10, s3, d1 and so on). So a client that wrote 0x0303 in the legacy field but lists 0x0304 in supported_versions is correctly read as a 1.3 client. JA3, which predated this extension’s ubiquity, reads only the legacy field and therefore records 771 (0x0303 in decimal) for nearly everyone. That difference is a small worked example of why JA4 is more honest about the modern stack.
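The version rule can be sketched like this. The mapping table covers only the TLS and SSL 3.0 codes; the DTLS and SSL 2.0 codes are deliberately omitted, and returning "00" for an unknown value is an assumption of this example. The function names are illustrative.

```python
JA4_VERSION_CODES = {
    0x0304: "13",
    0x0303: "12",
    0x0302: "11",
    0x0301: "10",
    0x0300: "s3",
}


def is_grease(value: int) -> bool:
    high, low = value >> 8, value & 0xFF
    return high == low and (low & 0x0F) == 0x0A


def parse_supported_versions(body: bytes):
    """ClientHello form of the extension: one length byte, then 2-byte versions."""
    n = body[0]
    return [int.from_bytes(body[i:i + 2], "big") for i in range(1, 1 + n, 2)]


def ja4_version(legacy_version: int, supported_versions=None) -> str:
    """Prefer the highest non-GREASE supported_versions entry, else the legacy field."""
    if supported_versions:
        real = [v for v in supported_versions if not is_grease(v)]
        if real:
            return JA4_VERSION_CODES.get(max(real), "00")
    return JA4_VERSION_CODES.get(legacy_version, "00")


versions = parse_supported_versions(bytes.fromhex("062a2a03040303"))
print(ja4_version(0x0303, versions))  # 13
print(ja4_version(0x0303))            # 12, no extension to consult
```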

supported_groups and key_share: the curves you offer and the ones you commit to

These two extensions are best understood as a pair because they answer related questions. supported_groups (type 0x000a, the old “elliptic curves” extension) lists every key-exchange group the client will accept, by codepoint, in preference order. key_share (type 0x0033) goes further and actually includes ephemeral public keys for a subset of those groups, so the server can complete the Diffie-Hellman exchange in one round trip without asking the client to commit to a group first.

The gap between the two lists is itself a fingerprint. A client typically advertises many groups in supported_groups but only generates key shares for the one or two it expects the server to pick. Which groups it offers, in what order, and which subset it pre-generates shares for, are all stack-specific. JA3 reads supported_groups as its fourth field (the “elliptic curves” list) and its fifth field is the related EC point formats extension. JA4 hashes extension types rather than extension bodies, so it records that supported_groups is present but not the groups inside it; a detector that wants the group list and its order reads them alongside the JA4 string. Either way, the group list and its order are part of the identity.

The post-quantum transition made this field a moving target worth dating precisely. Chrome shipped the standardized hybrid group X25519MLKEM768, codepoint 0x11ec, in Chrome 131 in November 2024, replacing an earlier pre-standard X25519Kyber768 codepoint. A client advertising 0x11ec in supported_groups and carrying a 1216-byte hybrid share in key_share (1184 bytes of ML-KEM-768 encapsulation key plus 32 bytes of X25519) is a recent Chrome or a stack deliberately copying one. The size of that key_share is a structural giveaway in its own right, because a 1216-byte share looks nothing like a classical 32-byte X25519 share on the wire. The exact codepoint assignments here are public (IANA registry and the Chrome release notes), but vendor detection rules keyed to them are not, so what follows for any given anti-bot product is inferred from observed behaviour rather than documented internals.

Figure: supported_groups advertises; key_share commits. supported_groups (0x000a): 11ec (X25519MLKEM), 001d (X25519), 0017 (P-256), 0018 (P-384). key_share (0x0033): 11ec + 1216 B hybrid share, 001d + 32 B classical share. Only the groups it expects the server to choose get a share.

*A 2025-era Chrome offers four-plus groups but pre-generates shares for only the top one or two. The 1216-byte hybrid share is a structural tell that no 32-byte classical client produces.*
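Reading the gap between the two lists needs only the group codepoints and the share lengths, not the key material. The sketch below assumes the ClientHello form of key_share (a two-byte list length, then group, two-byte key length, key bytes per entry). The sample is synthetic, with zero bytes standing in for real keys, and the helper name is illustrative.

```python
import struct


def parse_key_share(body: bytes):
    """Return (group, key_length) pairs from a ClientHello key_share body."""
    (total,) = struct.unpack_from("!H", body, 0)
    pos, end, shares = 2, 2 + total, []
    while pos + 4 <= end:
        group, key_len = struct.unpack_from("!HH", body, pos)
        shares.append((group, key_len))
        pos += 4 + key_len
    return shares


# Synthetic body: a 1216-byte hybrid share and a 32-byte X25519 share.
entries = (struct.pack("!HH", 0x11EC, 1216) + bytes(1216)
           + struct.pack("!HH", 0x001D, 32) + bytes(32))
key_share_body = struct.pack("!H", len(entries)) + entries

supported_groups = [0x11EC, 0x001D, 0x0017, 0x0018]
shares = parse_key_share(key_share_body)
committed = {group for group, _ in shares}
advertised_only = [g for g in supported_groups if g not in committed]

print([(f"{g:04x}", n) for g, n in shares])
print([f"{g:04x}" for g in advertised_only])  # offered, but no share generated
print(1184 + 32)  # hybrid share size: ML-KEM-768 key plus X25519 key
```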

signature_algorithms: the field JA4 keeps unsorted on purpose

The signature_algorithms extension (type 0x000d) lists the signature schemes the client will accept for certificate verification and for the handshake signature, in preference order. The codepoints encode hash-and-signature pairs (0x0804 for rsa_pss_rsae_sha256, 0x0403 for ecdsa_secp256r1_sha256, and so on). The set and order are stack-specific, like everything else in this message.

JA4 does something deliberate with this field that is worth flagging because it cuts against the rest of the scheme. JA4 sorts cipher suites and sorts extensions before hashing them, to survive reordering. But it appends the signature algorithms unsorted, in send order, to the sorted extension list before that combined string is hashed. The reasoning is that signature-algorithm order is a stable, library-controlled property that browsers do not randomize, so preserving its order recovers discriminating power that the sorting elsewhere threw away. It is a neat acknowledgement that not every list in the ClientHello is equally volatile, and the designers tuned the scheme list by list.
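The mixed treatment is easy to see in code. This sketch follows my reading of the FoxIO description and should be checked against it before use: extension types are rendered as four-character hex, GREASE is dropped, SNI (0000) and ALPN (0010) are excluded from the hashed list, the rest is sorted, and the signature algorithms are appended after an underscore in the order sent. The function names are illustrative.

```python
import hashlib

SNI, ALPN = 0x0000, 0x0010


def is_grease(value: int) -> bool:
    high, low = value >> 8, value & 0xFF
    return high == low and (low & 0x0F) == 0x0A


def ja4_c_raw(extension_types, signature_algorithms) -> str:
    """String that gets hashed: sorted extensions, then sigalgs in send order."""
    exts = sorted(f"{t:04x}" for t in extension_types
                  if t not in (SNI, ALPN) and not is_grease(t))
    sigs = [f"{s:04x}" for s in signature_algorithms if not is_grease(s)]
    raw = ",".join(exts)
    if sigs:
        raw += "_" + ",".join(sigs)
    return raw


def ja4_c(extension_types, signature_algorithms) -> str:
    raw = ja4_c_raw(extension_types, signature_algorithms)
    return hashlib.sha256(raw.encode()).hexdigest()[:12]


exts = [0x0033, 0x000D, 0x002B, 0x0000]
sigs = [0x0403, 0x0804]
print(ja4_c_raw(exts, sigs))  # 000d,002b,0033_0403,0804
print(ja4_c(exts, sigs) == ja4_c(exts[::-1], sigs))  # True: extension order ignored
print(ja4_c(exts, sigs) == ja4_c(exts, sigs[::-1]))  # False: sigalg order kept
```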

ALPN: the protocol you ask for next

The Application-Layer Protocol Negotiation extension (type 0x0010, defined in RFC 7301, published July 2014) lets the client say which application protocol it wants to run once TLS is up, inside the handshake, with no extra round trip. The client sends a ProtocolNameList; a browser in 2026 sends h2 then http/1.1, advertising HTTP/2 first. The server echoes its pick in the ServerHello under TLS 1.2, or in EncryptedExtensions under TLS 1.3.

ALPN earns special attention in JA4 because JA3 ignored it entirely and JA4 promotes it into the visible, unhashed part of the fingerprint. The first segment of a JA4 string ends with the first and last alphanumeric characters of the first ALPN value, so an h2-first client contributes the literal h2 to the fingerprint and an HTTP/1.1-only client contributes h1 (the h and the final 1 of http/1.1). A client whose ALPN does not match its claimed identity (an http/1.1-only hello claiming to be Chrome, say) is an easy catch. There is more on how protocol negotiation leaks the client in ALPN and NPN in fingerprinting.
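A sketch of the ALPN rule, under two stated assumptions: it takes the first and last character of the first protocol name without handling names that begin or end with a non-alphanumeric byte, and it returns "00" when there is no ALPN extension, which is how I read the JA4 description. The parser assumes the standard ProtocolNameList layout, and both function names are illustrative.

```python
def parse_alpn(body: bytes):
    """ProtocolNameList: 2-byte list length, then 1-byte length + name per entry."""
    total = int.from_bytes(body[:2], "big")
    pos, end, names = 2, 2 + total, []
    while pos < end:
        n = body[pos]
        names.append(body[pos + 1:pos + 1 + n].decode("ascii", "replace"))
        pos += 1 + n
    return names


def ja4_alpn_code(protocols) -> str:
    """Two-character ALPN marker for the readable JA4 prefix (simplified)."""
    if not protocols or not protocols[0]:
        return "00"
    first = protocols[0]
    return first[0] + first[-1]


names = parse_alpn(b"\x00\x0c\x02h2\x08http/1.1")
print(names)                             # ['h2', 'http/1.1']
print(ja4_alpn_code(names))              # h2
print(ja4_alpn_code(["http/1.1"]))       # h1
print(ja4_alpn_code([]))                 # 00
```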

How JA3 reads these bytes

JA3, published 25 July 2017 by John Althouse, Jeff Atkinson and Josh Atkins at Salesforce, was the scheme that made TLS fingerprinting routine. It reads five fields and concatenates their decimal values with a fixed punctuation rule: commas between the five fields, hyphens between the values inside each field. The field order is exactly:

SSLVersion,Ciphers,Extensions,EllipticCurves,EllipticCurvePointFormats

A real JA3 string from the original documentation reads 769,47-53-5-10-49161-49162-49171-49172-50-56-19-4,0-10-11,23-24-25,0. The leading 769 is decimal 0x0301; the cipher block is hyphen-joined decimals; then the extension types, then the named groups, then the point formats. That whole string is MD5-hashed to a 32-character value, which is the part everyone shares and stores.
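The punctuation rule is mechanical enough to write down directly. The sketch below rebuilds the example string from its five fields and hashes it; it assumes GREASE values have already been removed from each list, and the function names are illustrative.

```python
import hashlib


def ja3_string(version, ciphers, extensions, curves, point_formats) -> str:
    """Commas between the five fields, hyphens between values inside a field."""
    fields = [[version], ciphers, extensions, curves, point_formats]
    return ",".join("-".join(str(v) for v in field) for field in fields)


def ja3_hash(ja3: str) -> str:
    """MD5 of the JA3 string, as a 32-character hex digest."""
    return hashlib.md5(ja3.encode()).hexdigest()


example = ja3_string(
    769,
    [47, 53, 5, 10, 49161, 49162, 49171, 49172, 50, 56, 19, 4],
    [0, 10, 11],
    [23, 24, 25],
    [0],
)
print(example)
print(ja3_hash(example))
```

An empty list simply yields an empty field, so a hello with no extensions still produces four commas.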

Figure: JA3 versus JA4. JA3: five fields (version, ciphers, extensions, curves, point formats), send order, MD5; extensions are hashed in send order, so randomization breaks it. JA4: three parts, sorted, SHA-256 truncated: t13d1516h2 (JA4_a: readable prefix), 8daaf6152771 (JA4_b: sorted ciphers), e5627efa2ab1 (JA4_c: sorted exts + sigalgs). Sorting the lists survives reordering; the readable prefix keeps version, counts and ALPN in the clear; signature algorithms are appended unsorted, in send order.

*JA3 commits the extension list to its hash in send order. JA4 splits the fingerprint so the human-readable facts stay legible and the volatile lists get sorted before hashing.*

The scheme has one defensive design choice that is easy to overlook. JA3 ignores GREASE values completely, in every field, so that a client which sprinkles GREASE into its ciphers and extensions still produces one stable hash rather than a new one per connection. That worked, because GREASE values are drawn from a known, fixed set. What did not survive was the assumption that extension order was stable. That assumption held from 2017 until early 2023, and then it did not.

GREASE and the extension permutation that broke JA3

GREASE, standardized as RFC 8701 in January 2020 and shipped in Chrome 55 in late 2016, reserves a set of dummy codepoints that a client sprinkles into its cipher list, extensions, supported groups, signature algorithms, versions and ALPN. The reserved cipher-and-ALPN values are the sixteen two-byte patterns 0x0A0A, 0x1A1A, 0x2A2A through 0xFAFA; the extension, named-group, sigalg and version values follow the same pattern as four-hex-digit codepoints. The point is to keep the ecosystem honest: servers MUST NOT negotiate a GREASE value and MUST ignore it, which forces every server to handle unknown values gracefully and stops the protocol from ossifying around whatever happens to be deployed today. Fingerprinting schemes treat GREASE as noise and strip it, which is why both JA3 and JA4 filter it before hashing. The dedicated post on GREASE values in the ClientHello covers why a scheme that forgets to strip GREASE generates a useless new hash on every connection.

The bigger disruption came from the same person who built GREASE. Around Chrome 110, taking effect for observers on 20 January 2023, Chrome began permuting the order of its TLS extensions on every connection. The mechanism is a shuffle of the extension list, with pre_shared_key pinned last because the spec requires it. The motivation was anti-ossification, identical in spirit to GREASE: if servers cannot depend on a fixed extension order, Chrome stays free to change that order later without breaking anyone. The side effect was that JA3, which hashed extensions in send order, started producing a different fingerprint on essentially every Chrome connection. Fastly measured the fall: the dominant Chrome JA3 value, cd08e31494f9531f560d64c695473da9, dropped off sharply from that date, and with roughly 15-factorial possible orderings a single browser build could mint on the order of a trillion distinct JA3 hashes. A fingerprint that explodes into a trillion values is, for blocklist purposes, no fingerprint at all.
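Both the arithmetic and the effect are quick to check. The sketch below is a toy model, not Chrome’s actual shuffling code: it uses Python’s random.shuffle with a seeded generator, pins pre_shared_key (0x0029) last, and uses an illustrative list of fifteen movable extension types. It builds only the extension field of a JA3 string, once in send order and once sorted.

```python
import math
import random

PRE_SHARED_KEY = 0x0029

# Illustrative extension set: fifteen movable types plus pre_shared_key.
EXTENSIONS = [0x0000, 0x0017, 0xFF01, 0x000A, 0x000B, 0x0023, 0x0010, 0x0005,
              0x000D, 0x0012, 0x0033, 0x002D, 0x002B, 0x001B, 0x4469,
              PRE_SHARED_KEY]


def shuffled_with_psk_last(ext_types, rng):
    """Shuffle everything except pre_shared_key, which stays at the end."""
    movable = [t for t in ext_types if t != PRE_SHARED_KEY]
    rng.shuffle(movable)
    if PRE_SHARED_KEY in ext_types:
        movable.append(PRE_SHARED_KEY)
    return movable


def ext_field(types):
    return "-".join(str(t) for t in types)


rng = random.Random(0)  # seeded so the run is repeatable
orders = [shuffled_with_psk_last(EXTENSIONS, rng) for _ in range(50)]

send_order_fields = {ext_field(o) for o in orders}
sorted_fields = {ext_field(sorted(o)) for o in orders}

print(math.factorial(15))      # 1307674368000, about 1.3 trillion orderings
print(len(send_order_fields))  # many distinct values across 50 connections
print(len(sorted_fields))      # 1: sorting collapses them again
```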

Two responses followed. The lighter one was JA3N, a normalized JA3 that sorts the extension list before hashing so permutation no longer matters. The heavier one was JA4, launched September 2023 by FoxIO, which rebuilt the scheme from scratch around the lesson. JA4 sorts both the cipher list and the extension list before hashing, includes ALPN that JA3 ignored, reads the real version from supported_versions, and splits the output into a readable prefix plus two truncated SHA-256 hashes so a human can eyeball the version, cipher count, extension count and ALPN without decoding anything. Cloudflare adopted it and reported computing signals across more than 15 million unique JA4 fingerprints drawn from over 500 million user agents in a single hour of traffic, which gives a sense of how much identity still survives sorting. The full anatomy of the JA4 family, including the server, HTTP and TLS-client-library variants, is in the JA4+ suite. The full Chrome-randomization story is in TLS extension ordering and the Chrome randomization that broke JA3.
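The readable prefix is fixed-width, so decoding it is a matter of slicing. This sketch assumes the JA4 layout as I understand it from the FoxIO description: one transport character (t for TCP, q for QUIC), two version characters, one SNI character (d when a domain is present, i when not), two digits of cipher count, two digits of extension count, and two ALPN characters, followed by the two truncated hashes. The function name and dictionary keys are illustrative.

```python
def decode_ja4(fingerprint: str) -> dict:
    """Split a JA4 string into its readable prefix fields and two hashes."""
    prefix, cipher_hash, extension_hash = fingerprint.split("_")
    return {
        "transport": {"t": "tcp", "q": "quic"}.get(prefix[0], prefix[0]),
        "version": prefix[1:3],
        "sni": {"d": "domain", "i": "ip"}.get(prefix[3], prefix[3]),
        "cipher_count": int(prefix[4:6]),
        "extension_count": int(prefix[6:8]),
        "alpn": prefix[8:10],
        "cipher_hash": cipher_hash,        # JA4_b: sorted ciphers
        "extension_hash": extension_hash,  # JA4_c: sorted extensions + sigalgs
    }


decoded = decode_ja4("t13d1516h2_8daaf6152771_e5627efa2ab1")
for key, value in decoded.items():
    print(f"{key:16} {value}")
```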

The detection arms race did not stop at sorting, of course. Once order stopped being reliable, the set of extensions, the cipher set, the group list, the sigalg order, ALPN, and the key_share sizes all stayed informative, and detectors started cross-checking them against each other and against higher layers. A ClientHello that perfectly mimics Chrome’s TLS but then negotiates HTTP/2 with frame settings no real Chrome sends is caught one layer up; the companion piece on detecting curl-impersonate and uTLS covers those second-order tells, and tools like uTLS exist precisely because matching one field is no longer enough.

What the message actually tells you

Walk back through the fields and a pattern emerges. The ones RFC 8446 nailed down (the legacy version, the random, the compression byte) carry almost no identity, because the spec forced everyone to write the same bytes. The ones the spec left to the implementation (the cipher order, the extension set, the group list and its key_share commitments, the sigalg order, the ALPN preference) carry nearly all of it, because every TLS library made its own choices there and those choices are stable across a build. Fingerprinting is just the practice of reading the free fields and ignoring the fixed ones. JA3 read them in send order and MD5’d the result; JA4 sorts the volatile lists, keeps the stable ones in order, and exposes the legible facts in the clear. The difference between the two schemes is entirely a difference of opinion about which fields are stable enough to commit to a hash.

The deeper point is that the ClientHello is a confession the client cannot avoid making. It has to advertise its real capabilities to complete a handshake, in an order its library chose, before encryption begins. A spoofed client can copy the bytes, but copying them perfectly means reimplementing a specific build of a specific TLS stack, including the parts that are easy to forget: the SCSV marker that is not a cipher, the GREASE values in the right slots, the key_share that commits to only the groups it offered, the version that lives in an extension rather than the field named version. Each of those is a place where an imperfect copy reveals itself. The fields are public and the RFCs are open, which is exactly why getting all of them right at once, and keeping them right as Chrome ships its next build, is the hard part.

