Server-side TLS fingerprinting libraries: how the edge captures the handshake
A JA3 or JA4 string is trivial to describe. Take a handful of fields out of the ClientHello, sort some of them, hash the rest, glue it together. The algorithm fits on a napkin. What the napkin leaves out is the hard part: by the time your application code runs, the ClientHello is gone. The TLS library has already parsed it, negotiated a cipher, derived keys, and thrown the original bytes away. Your request handler sees a decrypted HTTP stream and, if you are lucky, a couple of summary fields the library decided to keep. The raw handshake that carried the fingerprint is not in scope.
So the real engineering question for server-side TLS fingerprinting is not “what is the JA4 formula.” It is “where in the stack can I still see the ClientHello bytes, and what does the TLS library let me do with them before it discards them.” That answer lives inside OpenSSL and BoringSSL callbacks, inside a small set of nginx and HAProxy and Envoy extensions that hook those callbacks, and inside a couple of constraints (buffer sizes, session resumption, TLS termination points) that quietly decide whether you get a fingerprint at all. This post walks that capture path from the socket up.
The sections below go in capture order. First, why the bytes are hard to reach and where in the handshake the window opens. Then the two library primitives every server-side implementation is built on. Then the concrete modules for nginx, HAProxy, and Envoy, including what each one patches and what it exposes. Then how JA3 and JA4 get computed once you have the raw fields, and why the move from one to the other was forced by a change inside Chrome. The post closes on the blind spots that make a server-side fingerprint less reliable than it looks on paper.
Why the bytes are hard to reach
The ClientHello is sent in the clear. That is the only reason any of this works. Before the client and server share a key, the client has to announce, in plaintext, which TLS versions it speaks, which cipher suites it offers, and which extensions it carries. Anyone on the wire can read it. A passive sniffer (Zeek, Suricata, Wireshark) reads it straight off the packet and never touches the connection.
A server is in a worse position than a passive sniffer, which is the counterintuitive part. The sniffer keeps a copy of every byte it sees. A TLS server hands the socket to its crypto library, and that library is built to consume the ClientHello, not to preserve it. OpenSSL reads the record, parses the cipher list into its internal preference structures, picks a suite, and moves on. The byte offsets, the original ordering, the GREASE values, the exact extension layout: none of that survives into the SSL object in a form your code can read back out. By the time the handshake completes and your HTTP handler runs, you are several layers removed from the ClientHello and the evidence has been recycled.
That is the gap every server-side fingerprinting library has to close. There is exactly one moment when the raw ClientHello is both fully received and not yet thrown away: the instant the library finishes reading the message and before it acts on it. TLS libraries expose that instant as a callback. Hook it, copy the bytes, and you have your fingerprint material. Miss it, and you are reduced to whatever summary fields the library kept, which is usually the negotiated cipher and the SNI, and nothing about ordering.
*The handshake has one short window where the full ClientHello exists and is still reachable. Server-side fingerprinting is the art of hooking that window.*
There is a second, narrower path that avoids the problem entirely: do not terminate TLS at all. If you sit in front of the real server and only peek at the handshake to make a routing decision, you can read the ClientHello off the wire like a sniffer and pass the connection through untouched. nginx ships exactly this as a stream module, and it is the cleanest place to start.
ssl_preread: reading the handshake without terminating it
nginx has shipped ngx_stream_ssl_preread_module since version 1.11.5. It does one thing: it inspects the ClientHello of a stream connection without terminating TLS, so nginx can route the connection based on what it sees and then proxy the raw, still-encrypted bytes to a backend. It was built for SNI-based and ALPN-based routing, not fingerprinting, but it is the canonical example of reading the handshake in passing.
Out of the box it exposes three variables. $ssl_preread_server_name is the host from the SNI extension. $ssl_preread_alpn_protocols (added in 1.13.10) is the comma-separated ALPN list. $ssl_preread_protocol (added in 1.15.2) is the highest TLS version the client advertised. You enable it with ssl_preread on; in a stream server block, and you point routing at the variables through a map. Note that nginx never holds the private key in this mode; it is reading plaintext handshake fields that any observer could read.
That is the model, but the stock module stops at three fields. SNI, ALPN, and version are a thin slice of the ClientHello. They do not include the cipher list, the full extension list, the supported groups, or the ordering of any of it, which is to say they do not include the parts that make a JA3 or JA4. To get those, someone has to extend the parser to walk the rest of the ClientHello and stash the raw fields. That is what the community fingerprint modules do, and they split into two camps depending on whether they re-parse the handshake themselves or lean on the TLS library to hand it over.
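To make "walk the rest of the ClientHello" concrete, here is a minimal Python sketch of that byte walk. It is not taken from any of the modules discussed here; the function name and the returned dictionary layout are illustrative assumptions, and it assumes the whole ClientHello arrived in a single TLS record.

```python
import struct

def parse_client_hello(record: bytes) -> dict:
    """Walk one TLS record holding a complete ClientHello and return
    the raw fields in the order they appeared on the wire."""
    # Record header: type(1) version(2) length(2).
    # Handshake header: type(1) length(3). The body starts at offset 9.
    if len(record) < 9 or record[0] != 0x16 or record[5] != 0x01:
        raise ValueError("not a handshake record carrying a ClientHello")
    pos = 9
    legacy_version = struct.unpack_from("!H", record, pos)[0]
    pos += 2 + 32                       # version, then the 32-byte random
    pos += 1 + record[pos]              # session_id (length-prefixed)

    cipher_bytes = struct.unpack_from("!H", record, pos)[0]
    pos += 2
    ciphers = [struct.unpack_from("!H", record, pos + i)[0]
               for i in range(0, cipher_bytes, 2)]
    pos += cipher_bytes
    pos += 1 + record[pos]              # compression methods

    extensions = []                     # (type, raw data) in wire order
    if pos < len(record):
        ext_total = struct.unpack_from("!H", record, pos)[0]
        pos += 2
        end = pos + ext_total
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack_from("!HH", record, pos)
            pos += 4
            extensions.append((ext_type, record[pos:pos + ext_len]))
            pos += ext_len
    return {"version": legacy_version,
            "ciphers": ciphers,
            "extensions": extensions}
```

The sketch keeps GREASE values and original ordering on purpose, because those are the details a fingerprint needs. A truncated input surfaces as a `struct.error` or `IndexError` rather than a silent partial result.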
The two library primitives
Every server-side TLS fingerprint, no matter which proxy wraps it, bottoms out in one of two things the TLS library gives you: a callback that fires with the parsed ClientHello, or a raw copy of the ClientHello message that you parse yourself.
OpenSSL exposes the callback as SSL_CTX_set_client_hello_cb. It registers a function that fires during the handshake, after the ClientHello has been received and parsed but before the server has chosen its parameters. Inside that callback you can ask OpenSSL for the pieces it parsed out: the offered cipher list, the list of extension types present, and the contents of any individual extension by its IANA number through SSL_client_hello_get0_ext. This is the primitive the early nginx fingerprint module by fooinha used. It is clean because OpenSSL does the byte-walking for you, and it is limited because you only get back what OpenSSL chose to expose through accessor functions.
BoringSSL, the fork Google maintains and ships inside Chrome and increasingly inside server stacks, exposes a parallel SSL_CTX_set_select_certificate_cb. Its callback hands you an SSL_CLIENT_HELLO struct with a pointer straight at the raw ClientHello bytes and their length, plus helpers to pull individual extensions. Because you get the raw buffer, you can compute a fingerprint that depends on exact byte layout and original ordering, which is the part OpenSSL’s higher-level accessors smooth over. Envoy, which links BoringSSL, builds its JA3 support on this callback.
The split matters because of GREASE and ordering. GREASE (defined in RFC 8701) seeds the cipher list and extension list with reserved junk values so that servers do not ossify on a fixed set. A fingerprinting implementation has to strip those values back out, or the same client looks different every connection. Whether you can even see the GREASE values, and the exact order everything arrived in, depends on whether your primitive handed you the parsed summary or the raw bytes. Raw bytes give you everything; the parsed accessors sometimes normalize the very details you wanted.
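The GREASE-stripping step is small enough to show in full. This is a sketch in Python with helper names of my own choosing; it relies only on the shape of the reserved values in RFC 8701, which are the sixteen 16-bit values 0x0A0A, 0x1A1A, and so on through 0xFAFA.

```python
def is_grease(value: int) -> bool:
    """True for the 16 reserved RFC 8701 values 0x0A0A, 0x1A1A, ..., 0xFAFA."""
    # Both bytes are identical and each has a low nibble of 0xA.
    return (value & 0x0F0F) == 0x0A0A and (value >> 8) == (value & 0xFF)

def strip_grease(values):
    """Drop GREASE entries while preserving the order of everything else."""
    return [v for v in values if not is_grease(v)]
```

Applied to a cipher or extension list taken from the raw bytes, `strip_grease` removes the per-connection junk and leaves the real entries in their original order, which is the input JA3 expects.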
*Both library callbacks fire in the same window. The difference is what they hand back, and that difference is exactly the ordering detail a fingerprint depends on.*
A subtlety that bites people: many of these modules require building nginx against a patched OpenSSL, or against BoringSSL, because the stock builds either did not expose the ClientHello early enough or did not keep the raw buffer around. The early fooinha module needed a master-branch OpenSSL plus an nginx patch. The widely used phuslu module ships patches for both OpenSSL and nginx and tracks specific version pairs (recent releases target OpenSSL 3.5.x and nginx 1.29.x). The BoringSSL-based modules avoid the OpenSSL patch but pull in a whole different TLS library. None of this is a drop-in dynamic module. You are rebuilding your TLS terminator.
nginx: the patched-build approach
The most-used nginx fingerprint module is phuslu’s nginx-ssl-fingerprint, BSD-2-Clause licensed. It patches OpenSSL to preserve the ClientHello bytes through the handshake and patches nginx to compute and expose the fingerprint as variables. On the HTTP side it gives you $http_ssl_ja3, $http_ssl_ja3_hash, $http_ssl_ja4, an $http_ssl_greased flag, and an $http2_fingerprint variable that captures the HTTP/2 layer as well. The stream module mirrors these as $stream_ssl_ja3 and friends. You can then set them as headers and forward them to a backend, or log them, the same way you would $ssl_preread_server_name.
The reason it patches OpenSSL rather than using SSL_CTX_set_client_hello_cb cleanly is fidelity. To compute a JA3 the way the original tool did, and to produce the raw JA4 with original ordering, you want the actual bytes in order, including the cipher list exactly as offered. The patch keeps that buffer alive long enough for nginx to read it. A BoringSSL variant of the same idea exists (wdslb’s nginx-boringssl-fingerprint) and exposes ssl_preread_ja3 and ssl_preread_ja3_hash in the stream context, trading the OpenSSL patch for a BoringSSL dependency. Other forks (HanadaLee, hsw) carry the same lineage with JA4 and HTTP/2 support bolted on.
The practical shape is the same across all of them. Build your TLS terminator against the patched library, turn the module on, and the negotiated handshake leaves a fingerprint string in an nginx variable. From there it is ordinary nginx: stuff it in a request header to your app, write it to the access log, or feed it into a map for an allow or deny decision at the edge. If you want the algorithm-level walk from ClientHello bytes to the final hash, the companion piece on TLS fingerprinting from ClientHello bytes to JA4 covers the field-by-field computation; here the point is only that nginx is handing you those fields because someone patched the TLS library to keep them.
HAProxy: sample fetches and the capture buffer
HAProxy took a different design path, and it is the most instructive one because it makes the constraint visible. HAProxy 2.5, released 23 November 2021, added a set of SSL sample fetches that let you pull the raw ClientHello fields and assemble a JA3-compatible string in configuration, without a dedicated JA3 fetcher and without a custom module. The building blocks are these fetches:
- ssl_fc_protocol_hello_id returns the protocol version from the ClientHello.
- ssl_fc_cipherlist_bin([exclude_grease]) returns the offered cipher list in binary; there is a hex variant too.
- ssl_fc_extlist_bin([exclude_grease]) returns the extension list.
- ssl_fc_eclist_bin([exclude_grease]) returns the supported elliptic curves.
- ssl_fc_ecformats_bin returns the EC point formats.
Each of the list fetches takes an optional argument to drop GREASE values per RFC 8701, which is the GREASE-stripping step made into a flag. You convert each binary field to the dash-joined decimal form JA3 expects with a be2dec converter, concatenate the five fields with commas, and run the result through a digest(md5),hex converter chain. The output is a standard JA3 string and hash, built entirely in the config language. Because the pieces are separate fetches rather than one opaque function, you can just as easily assemble a JA3N or a custom variant by reordering or sorting before you hash.
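The conversion chain described above is easier to follow when emulated outside HAProxy. The Python below is only a model of that pipeline, not HAProxy code: the `be2dec` function here is my own stand-in for the converter of the same name, and I am assuming two-byte chunks for ciphers, extensions, and curves and one-byte chunks for the EC point formats.

```python
import hashlib

def be2dec(data: bytes, sep: str = "-", chunk: int = 2) -> str:
    """Split data into big-endian integers of `chunk` bytes each and
    join their decimal forms with `sep` (a stand-in for the converter)."""
    return sep.join(
        str(int.from_bytes(data[i:i + chunk], "big"))
        for i in range(0, len(data), chunk)
    )

def ja3_from_binary(version: int, ciphers: bytes, exts: bytes,
                    curves: bytes, ec_formats: bytes):
    """Assemble the five comma-joined JA3 fields and their MD5 hex digest."""
    fields = [
        str(version),
        be2dec(ciphers),              # 16-bit cipher suite IDs
        be2dec(exts),                 # 16-bit extension types
        be2dec(curves),               # 16-bit supported groups
        be2dec(ec_formats, chunk=1),  # point formats are single bytes
    ]
    ja3 = ",".join(fields)
    return ja3, hashlib.md5(ja3.encode("ascii")).hexdigest()
```

For example, cipher bytes `13 01 c0 2f` become `4865-49199`. An empty field yields an empty string between its commas, which keeps the five-field layout intact. The digest here is lowercase hex; if your proxy emits a different letter case, normalize before comparing against published JA3 hashes.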
There is one hard prerequisite, and it is the detail that catches people. The fetches return nothing unless tune.ssl.capture-buffer-size is set greater than zero. That global setting controls how many bytes of the ClientHello HAProxy bothers to keep around after the handshake. Default behavior does not preserve the message, because preserving it costs memory on every connection. Set it too small and a long ClientHello (a real browser’s, with many extensions) gets truncated and your fingerprint is computed over a partial message, which produces a stable but wrong hash. This is the capture window from earlier, exposed as a tunable. You are explicitly telling HAProxy to spend memory holding the handshake bytes long enough to fingerprint them.
The HAProxy community has also published Lua plugins for the newer variants where the native fetches do not cover the format. The OXL project maintains a JA3N plugin (which sorts extensions before hashing to survive randomization), plus JA4 and JA4H plugins, all driven from the same captured ClientHello and the same tune.ssl.capture-buffer-size requirement. The native fetches give you JA3; the Lua layer gives you the sorted and next-generation variants on top.
Envoy: the TLS inspector listener filter
Envoy links BoringSSL, so it took the raw-buffer route. Its tls_inspector listener filter has always sniffed SNI and ALPN off the ClientHello for routing, exactly like nginx’s preread. JA3 support was added in pull request 18853, merged 16 November 2021, behind a config field. Set enable_ja3_fingerprinting to true on the TLS inspector and the filter computes a JA3 hash from the ClientHello and stashes it on the connection. The default is false, because computing it costs work on every connection that the common case does not need.
Once computed, the fingerprint is reachable through Envoy’s connection info, and the access-log substitution formatter exposes it as %TLS_JA3_FINGERPRINT%. From there it behaves like any other connection attribute: log it, route on it, pass it upstream as a header. There is an initial_read_buffer_size knob on the inspector that sets the size of the first read; if the ClientHello is not complete within that many bytes, the buffer doubles as needed up to a 64 KiB ceiling. This is Envoy’s version of the same capture-buffer question HAProxy makes explicit. A real browser ClientHello with a large extension set can be several hundred bytes to a couple of kilobytes, comfortably under that ceiling but well over a naive small default.
The Envoy implementation reaches into BoringSSL’s SSL_CLIENT_HELLO to walk the ciphers, extensions, supported groups, and EC point formats in the order they arrived, strips GREASE, and formats the classic JA3 string before hashing. Because it is reading the raw struct, it preserves the original ordering, which is what JA3 needs and, as the next section explains, is also what made JA3 fragile.
Computing JA3 and JA4 once you have the fields
The capture is the hard part. The arithmetic on top is small, and it has barely changed since 2017. JA3 takes five fields out of the ClientHello in a fixed order: version, cipher list, extension list, supported groups, and EC point formats. Each list field becomes the decimal values of its entries (a cipher suite or extension type is one 16-bit number, not two separate bytes), joined with dashes; the five fields are joined with commas; the string is MD5-hashed into a 32-character key. The MD5 is not a security choice. It is a database key, where repeatability matters and collisions do not. GREASE values are stripped before hashing so a GREASE-emitting client still lands on one stable hash.
JA4 keeps the same source material but reorganizes it into a readable a_b_c form. The a section is human-legible metadata: the protocol (t for TLS over TCP, q for QUIC, d for DTLS), the TLS version the client offers (the highest value in supported_versions when that extension is present; nothing has been negotiated yet when the ClientHello is sent), a d or i marker for whether SNI was present, two-digit counts of ciphers and extensions with GREASE excluded, and the first and last character of the first ALPN value. The b section is a 12-character truncated SHA-256 of the cipher list. The c section is a 12-character truncated SHA-256 of the extension list followed by the signature algorithms. The canonical example from the spec is t13d1516h2_8daaf6152771_e5627efa2ab1.
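The a_b_c layout can be sketched in a few lines of Python. This is a simplified reading of the FoxIO description, not a reference implementation. Several details are assumptions drawn from my reading of that spec rather than from the text above: SNI (0x0000) and ALPN (0x0010) are counted in the a section but left out of the hashed extension list, signature algorithms are kept in their original order, and empty lists hash to twelve zeros. Edge cases such as non-alphanumeric ALPN bytes and the QUIC and DTLS version codes are left out.

```python
import hashlib

# The 16 reserved GREASE code points from RFC 8701: 0x0a0a, 0x1a1a, ..., 0xfafa.
GREASE = {(b << 8) | b for b in range(0x0A, 0x100, 0x10)}
VERSION_CODES = {0x0304: "13", 0x0303: "12", 0x0302: "11", 0x0301: "10"}
SNI, ALPN = 0x0000, 0x0010

def _hash12(text: str) -> str:
    """First 12 hex characters of the SHA-256 of text."""
    return hashlib.sha256(text.encode("ascii")).hexdigest()[:12]

def ja4(version, ciphers, extensions, sig_algs, first_alpn="", proto="t"):
    """Build a JA4-style string from already-parsed ClientHello fields.
    `version` is the highest version the client offered."""
    ciphers = [c for c in ciphers if c not in GREASE]
    extensions = [e for e in extensions if e not in GREASE]
    sig_algs = [s for s in sig_algs if s not in GREASE]

    # Section a: readable metadata.
    alpn = first_alpn[0] + first_alpn[-1] if first_alpn else "00"
    a = "{}{}{}{:02d}{:02d}{}".format(
        proto, VERSION_CODES.get(version, "00"),
        "d" if SNI in extensions else "i",
        min(len(ciphers), 99), min(len(extensions), 99), alpn)

    # Section b: sorted cipher list, hashed and truncated.
    b = _hash12(",".join(sorted(f"{c:04x}" for c in ciphers))) if ciphers else "0" * 12

    # Section c: sorted extensions (minus SNI and ALPN), then signature
    # algorithms in original order, hashed and truncated.
    hashed = sorted(f"{e:04x}" for e in extensions if e not in (SNI, ALPN))
    c_input = ",".join(hashed)
    if sig_algs:
        c_input += "_" + ",".join(f"{s:04x}" for s in sig_algs)
    c = _hash12(c_input) if extensions else "0" * 12
    return f"{a}_{b}_{c}"
```

Because both hashed lists are sorted first, passing the same extensions in a different order returns the same string, which is the property the rest of this post turns on.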
The detail that matters for server-side capture is one word in the JA4 definition: the cipher and extension lists in the b and c sections are sorted before hashing. JA3 hashed them in the order the client sent them. JA4 sorts them. That single change is the whole reason the industry moved, and it was forced by a change inside one browser.
What Chrome did, and why it forced JA4
Starting in early 2023, Chrome began shuffling the order of the extensions in its ClientHello on every connection. The Chromium change is “TLS ClientHello extension permutation,” carried in BoringSSL, and Fastly’s network data pinned the rollout to 20 January 2023: the long-dominant Chrome JA3 hash cd08e31494f9531f560d64c695473da9 fell off a cliff and was replaced by a spray of one-off hashes. The change was intended to ship around Chrome 110 but the behavior showed up in 108 and 109 builds as the rollout ramped. TLS 1.3 already permits extensions in any order, except that pre_shared_key must come last, so permuting them is standards-compliant. The motivation was anti-ossification: stop servers and middleboxes from hard-coding an expectation about Chrome’s extension order, so Google keeps room to change its stack later.
For JA3 it was fatal. JA3 hashes the extension list in sent order, so a client that permutes its extensions produces a fresh JA3 on every connection. A server-side JA3 implementation that was reading the raw ordered buffer, exactly the high-fidelity path the nginx and Envoy modules took, now got a different answer each time the same browser connected. The fingerprint still computed correctly; it just no longer identified anything stable. Firefox followed with its own randomization, and the most precise capture pipeline in the world could not put the pieces back in a canonical order, because the information about the original order had been deliberately destroyed at the source.
Two responses emerged, and both are realizable server-side with the same captured bytes. The narrow fix is JA3N: sort the extensions before hashing, leave the rest of JA3 alone, and you recover a stable hash that survives permutation. The OXL HAProxy Lua plugin implements exactly this on top of the same captured ClientHello. The broader fix is JA4, which builds sorting into the standard and adds discriminating dimensions JA3 never had, including the ALPN value and explicit counts. Cloudflare, which computes JA4 at its edge with a Rust crate it calls client-hello-parser, describes JA4 plainly as resistant to the extension randomization that undermined JA3, and layers per-fingerprint behavioral signals on top: the share of browser user-agents, the share of cacheable responses, and the HTTP/2-and-3 ratio seen for a given JA4 over the last hour. That last move is the tell about where this is going. A raw fingerprint identifies a stack; aggregating behavior per fingerprint is what turns it into a bot signal. The relationship between the network fingerprint and the higher-layer signals is covered in the companion post on how Cloudflare uses TLS and HTTP/2 fingerprints in bot scoring.
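The difference between JA3 and the JA3N fix is one `sorted()` call, which a short Python sketch makes visible. The function and parameter names are mine, and the inputs are assumed to be already-parsed integer lists with GREASE removed.

```python
import hashlib

def ja3(version, ciphers, extensions, curves, formats, sort_extensions=False):
    """Classic JA3 hash; with sort_extensions=True this is the JA3N variant."""
    # JA3N differs only here: extensions are put in numeric order first.
    exts = sorted(extensions) if sort_extensions else list(extensions)
    lists = (ciphers, exts, curves, formats)
    parts = [str(version)] + ["-".join(str(v) for v in f) for f in lists]
    return hashlib.md5(",".join(parts).encode("ascii")).hexdigest()
```

Feed it the same client twice with the extension list in a different order: the plain call returns two different hashes, while `sort_extensions=True` returns the same hash both times. The cipher list is left in sent order in both variants, matching the description of JA3N as a narrow fix.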
The blind spots
A server-side TLS fingerprint looks authoritative because it is computed from bytes the client cannot lie about at the TLS layer without changing its actual TLS stack. That is true and it is the source of the technique’s value. It also oversells what you get in production, in three concrete ways.
The first is session resumption. The fingerprint comes from the full ClientHello sent during a fresh handshake. When a returning client resumes a session (TLS 1.3 PSK, or a session ticket), the ClientHello is different and often abbreviated, and the negotiated path skips most of the parameters the fingerprint reads. Cloudflare’s own documentation notes that the JA3 and JA4 identifiers become unavailable once a connection rides a resumed session rather than a full initial handshake. So your fingerprint coverage has holes precisely where your most frequent, best-behaved clients live, because they are the ones whose sessions resume.
The second is the termination point. Every module above assumes you terminate TLS, or at least preread it, at the box doing the fingerprinting. Put a CDN, a load balancer, or a TLS-terminating proxy in front and the ClientHello your origin sees belongs to that intermediary, not the real client. The fingerprint then identifies your own CDN’s TLS stack, uniformly, for all traffic. This is why fingerprinting has gravitated to the edge: the edge is the only place that holds the client’s original handshake. If you run your own edge, you have to make sure the capture happens at the first TLS-terminating hop and gets forwarded inward as a header, which reintroduces a trust boundary, because now a downstream service is believing a header that an upstream hop could have set to anything.
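One way to narrow that trust boundary is for the capturing hop to discard any fingerprint header the client sent before writing its own. The Python below is a sketch of that rule only; the header names `X-JA3` and `X-JA4` are hypothetical, and a real deployment would do this in the proxy configuration rather than in application code.

```python
from typing import Dict, Optional

# Hypothetical header names used between the edge and internal services.
FINGERPRINT_HEADERS = ("x-ja3", "x-ja4")

def headers_for_upstream(client_headers: Dict[str, str],
                         ja3: Optional[str],
                         ja4: Optional[str]) -> Dict[str, str]:
    """Return the headers to forward inward: client-supplied fingerprint
    headers are dropped, then the edge-computed values are added."""
    out = {k: v for k, v in client_headers.items()
           if k.lower() not in FINGERPRINT_HEADERS}
    if ja3:
        out["X-JA3"] = ja3
    if ja4:
        out["X-JA4"] = ja4
    return out
```

If the edge has no fingerprint for a connection, the header is simply absent upstream, so a downstream service can tell "not captured" apart from a value a client tried to inject. This does nothing about an intermediate hop that is itself untrusted; it only removes the client as a source of the header.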
The third is the capture buffer itself, the recurring detail in this whole post. HAProxy will not give you the fields unless tune.ssl.capture-buffer-size is non-zero and large enough; Envoy’s inspector reads up to a 64 KiB ceiling but starts smaller; the nginx modules need a TLS library patched to keep the bytes at all. Every one of these is a place where a real, long ClientHello can get truncated and produce a confidently wrong fingerprint that is stable enough to look correct. A fingerprint computed over a partial handshake is worse than no fingerprint, because it will match across many distinct clients and quietly collapse them into one bucket.
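A cheap guard against the truncation case is to compare the length the ClientHello declares for itself with the number of bytes you actually captured. The Python sketch below assumes you hold the raw record bytes and that the ClientHello fits in a single TLS record; a message split across records would need the record headers stitched out first. The function name is illustrative.

```python
def client_hello_complete(captured: bytes) -> bool:
    """True if the buffer holds the whole ClientHello it claims to carry."""
    # Need the 5-byte record header plus the 4-byte handshake header,
    # with record type 0x16 (handshake) and message type 0x01 (ClientHello).
    if len(captured) < 9 or captured[0] != 0x16 or captured[5] != 0x01:
        return False
    declared = int.from_bytes(captured[6:9], "big")  # handshake body length
    return len(captured) - 9 >= declared
```

Refusing to fingerprint when this returns `False` turns a confidently wrong hash into an explicit gap, which is easier to reason about downstream.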
None of this makes server-side TLS fingerprinting weak. It makes it a measurement with a defined aperture. You are reading the one plaintext message the client must send, in the one window before the library consumes it, through a buffer you sized yourself, at the one hop that still holds the original bytes. Get all four right and you have a stable network-layer identifier that survives IP rotation and user-agent spoofing. Get any one wrong and you have a hash that looks like a fingerprint and identifies nothing. The libraries in this post are all, in the end, ways of getting those four things right.
Sources & further reading
- Althouse, Atkinson and Atkins (2017), TLS Fingerprinting with JA3 and JA3S — the original Salesforce post defining the five-field JA3 string, JA3S, MD5 hashing, and GREASE handling.
- Salesforce (2017), salesforce/ja3 — the reference JA3 implementation and field-order documentation.
- FoxIO-LLC (2023), JA4 technical details — the JA4 algorithm: the a_b_c structure, GREASE exclusion, extension sorting, SHA-256 truncation, and the JA4_r raw variant.
- FoxIO-LLC (2023), ja4 — JA4+ network fingerprinting suite — the full JA4+ suite (JA4, JA4S, JA4H, JA4L, JA4X, JA4SSH, JA4T) and the BSD-3 versus FoxIO License 1.1 split.
- nginx (2016), Module ngx_stream_ssl_preread_module — reading SNI, ALPN, and version from the ClientHello without terminating TLS, the basis of edge handshake inspection.
- phuslu (2024), nginx-ssl-fingerprint — patched-OpenSSL nginx module exposing JA3, JA4, and HTTP/2 fingerprints as nginx variables.
- HAProxy Technologies (2021), Announcing HAProxy 2.5 — the release that added the SSL ClientHello sample fetches for building JA3 in configuration.
- HAProxy mailing list (2021), [PATCH] JA3 TLS Fingerprinting (take 2) — the patch defining ssl_fc_cipherlist_bin, ssl_fc_extlist_bin, the exclude_grease argument, and the capture-buffer requirement.
- Envoy project (2021), tls_inspector: create JA3 client fingerprint (PR #18853) — the BoringSSL-based JA3 implementation in Envoy’s TLS inspector listener filter and the enable_ja3_fingerprinting field.
- Foote, Kumar, Woodson and the Fastly Security Research Team (2023), A first look at Chrome’s TLS ClientHello permutation in the wild — network data dating the 20 January 2023 rollout that broke the dominant Chrome JA3.
- Chromium (2023), TLS ClientHello extension permutation — the Chrome Platform Status entry for the BoringSSL change that randomizes extension order.
- Cloudflare (2024), Advancing threat intelligence: JA4 fingerprints and inter-request signals — edge JA4 computation with the client-hello-parser crate and the per-fingerprint browser, cache, and h2/h3 ratios.
- O-X-L (2024), haproxy-ja3n-fingerprint — a HAProxy Lua plugin that sorts extensions to produce a permutation-stable JA3N from the captured ClientHello.