Post-quantum TLS: ML-KEM, X25519MLKEM768, and the hybrid handshake

A large fraction of the encrypted traffic crossing the internet right now is being recorded by someone who cannot read it. That is fine, for now. RSA-2048 and the elliptic-curve key exchanges underneath TLS hold against every computer that exists in 2026. The problem is the recording. An adversary who keeps the ciphertext and waits for a cryptographically relevant quantum computer can come back years later, run Shor’s algorithm against the captured key exchange, recover the session keys, and decrypt the lot. The data does not have to be valuable today. It only has to still be valuable on the day the machine arrives.

That single threat model, harvest now and decrypt later, is the reason your browser already sends roughly an extra kilobyte in its very first TLS packet. The fix is in production. Chrome, Firefox, Safari, OpenSSL, Cloudflare, and Google all negotiate a post-quantum key exchange by default for a majority of connections. This post walks the whole thing end to end: why the elliptic-curve key exchange is the weak link, what ML-KEM is and how FIPS 203 specifies it, how the X25519MLKEM768 hybrid glues a classical and a lattice scheme together, why the resulting ClientHello no longer fits in one packet, and where the 2024-2026 rollout actually stands. The companion piece on how post-quantum key exchange changes the TLS fingerprint surface covers what all this does to JA3 and JA4; here the focus is the crypto and the standards.

Why the key exchange, and only the key exchange

Quantum computers do not break all of cryptography equally. Two algorithms matter. Grover’s algorithm gives a quadratic speedup against unstructured search, which halves the effective key length of a symmetric cipher in bits: AES-256 keeps a comfortable 128-bit margin, so symmetric crypto and hashes mostly survive by using bigger parameters. Shor’s algorithm is the dangerous one. It factors integers and computes discrete logarithms in polynomial time, and those are exactly the hard problems holding up RSA, finite-field Diffie-Hellman, and elliptic-curve Diffie-Hellman. Every public-key primitive in a normal TLS 1.3 handshake rests on a problem Shor’s algorithm solves.

Inside the handshake, those primitives do two different jobs, and the jobs have very different deadlines. Authentication, the server proving it owns the certificate, happens live. A signature forged after the connection closes is worthless; nobody can retroactively impersonate a server in a session that already ended. Key exchange is the opposite. The (EC)DHE exchange that establishes the session keys is recorded in the clear in the ClientHello and ServerHello, and if an attacker can later recover the shared secret, they decrypt everything that flowed under it. So the urgent half of the migration is key exchange, not signatures. Break the signature in ten years and you have wasted your time. Break the recorded key exchange in ten years and you read a ten-year-old session.

This is why the browser rollout is all about key agreement and barely touches certificates. The certificate chain still uses classical ECDSA or RSA signatures in 2026, and post-quantum signatures (ML-DSA from FIPS 204, SLH-DSA from FIPS 205) are a later, separate problem with their own size headaches. The thing shipping at scale today, the thing that grew the ClientHello, is one component: the post-quantum KEM bolted onto the existing elliptic-curve exchange.

It is worth being precise about why a quantum attacker who can read a recorded ECDHE exchange wins the whole session. TLS 1.3 derives every key it uses from the (EC)DHE shared secret through HKDF: the handshake traffic keys that protect the rest of the handshake, the application traffic keys that protect the data, and the resumption secrets that protect future sessions. Recover that one shared secret and the entire key schedule unrolls deterministically from the recorded handshake transcript, which is itself in the clear up to the point the handshake keys take over. There is no per-message secret an attacker still has to guess. The forward secrecy that ephemeral key exchange buys against a classical attacker, where a stolen long-term key does not expose past sessions, evaporates against a quantum one, because the ephemeral secret is no longer ephemeral to someone who can compute discrete logs from the recorded shares. That is the precise reason the recording is worth keeping.
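To make the "unrolls deterministically" claim concrete, here is a minimal sketch of the first part of the TLS 1.3 key schedule as I read RFC 8446 section 7.1, assuming SHA-256 and no PSK. The function names are my own, and the transcript hash is treated as an input the attacker already has from the recording. Nothing in it is secret except `shared_secret`.

```python
import hashlib
import hmac

HASH = hashlib.sha256
HASH_LEN = HASH().digest_size  # 32 bytes for SHA-256


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract (RFC 5869): PRK = HMAC-Hash(salt, IKM)."""
    return hmac.new(salt, ikm, HASH).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand (RFC 5869): chain HMAC blocks until enough output exists."""
    out, block, counter = b"", b"", 1
    while len(out) < length:
        block = hmac.new(prk, block + info + bytes([counter]), HASH).digest()
        out += block
        counter += 1
    return out[:length]


def hkdf_expand_label(secret: bytes, label: str, context: bytes, length: int) -> bytes:
    """HKDF-Expand-Label: the info string is the serialized HkdfLabel struct."""
    full_label = b"tls13 " + label.encode("ascii")
    info = (
        length.to_bytes(2, "big")
        + bytes([len(full_label)]) + full_label
        + bytes([len(context)]) + context
    )
    return hkdf_expand(secret, info, length)


def derive_secret(secret: bytes, label: str, transcript_hash: bytes) -> bytes:
    return hkdf_expand_label(secret, label, transcript_hash, HASH_LEN)


def handshake_traffic_secrets(shared_secret: bytes, transcript_hash: bytes):
    """Derive the client and server handshake traffic secrets (no PSK case)."""
    zeros = bytes(HASH_LEN)
    early_secret = hkdf_extract(zeros, zeros)  # no PSK: IKM is all zeros
    derived = derive_secret(early_secret, "derived", HASH(b"").digest())
    handshake_secret = hkdf_extract(derived, shared_secret)
    client = derive_secret(handshake_secret, "c hs traffic", transcript_hash)
    server = derive_secret(handshake_secret, "s hs traffic", transcript_hash)
    return client, server
```

Every input other than the shared secret is a constant or a hash of recorded bytes, which is the whole point: whoever recovers the shared secret can rerun this and keep going down the schedule to the application keys.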

*The recording attacker only profits from the key exchange. That is why the 2024-2026 rollout is about key agreement and leaves the certificate signatures classical for now.*

ML-KEM: what FIPS 203 actually specifies

The post-quantum half of the new key exchange is ML-KEM, the Module-Lattice-Based Key-Encapsulation Mechanism, standardized as FIPS 203 and published on 13 August 2024. It is the same algorithm a lot of people still call Kyber. FIPS 203 says so directly: ML-KEM is derived from the round-three version of CRYSTALS-Kyber, a submission to NIST’s Post-Quantum Cryptography Standardization project that began in 2016 with 82 candidate algorithms. The name change was not cosmetic. The standardized ML-KEM differs from the round-three Kyber in a handful of details (domain separation, how the hash inputs are framed), so a conforming ML-KEM implementation is not bit-compatible with old Kyber code. That distinction is what forced the second browser transition in late 2024, which I will come back to.

A KEM is not a signature scheme and it is not plain public-key encryption. It is a narrower, cleaner primitive. FIPS 203 frames it as three algorithms. KeyGen produces a decapsulation key (private) and an encapsulation key (public). Encaps takes someone’s public encapsulation key, draws fresh randomness, and outputs two things: a shared secret and a ciphertext that carries that secret. Decaps takes the ciphertext and the private decapsulation key and recovers the same shared secret. Nobody chooses the shared secret; it falls out of the randomness inside Encaps. That is the whole interface, and it maps cleanly onto a TLS key exchange where the client publishes a public key and the server replies with a ciphertext.
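The three-call interface is easier to see in code than in prose. The sketch below is a toy KEM built from ordinary modular Diffie-Hellman, not ML-KEM and not secure at these parameters; the modulus, generator, and function names are my own choices for illustration. What carries over is the shape: KeyGen returns a key pair, Encaps returns a secret plus a ciphertext, Decaps turns the ciphertext back into the same secret.

```python
import hashlib
import secrets

P = 2**255 - 19  # a well-known prime, used here only as a toy modulus
G = 2            # toy generator, chosen for illustration


def keygen():
    """Return (encapsulation key, decapsulation key)."""
    dk = secrets.randbelow(P - 2) + 1
    ek = pow(G, dk, P)
    return ek, dk


def encaps(ek: int):
    """Draw fresh randomness; return (shared secret, ciphertext)."""
    r = secrets.randbelow(P - 2) + 1
    ct = pow(G, r, P)
    ss = hashlib.sha3_256(pow(ek, r, P).to_bytes(32, "big")).digest()
    return ss, ct


def decaps(dk: int, ct: int) -> bytes:
    """Recover the same shared secret from the ciphertext."""
    return hashlib.sha3_256(pow(ct, dk, P).to_bytes(32, "big")).digest()
```

In TLS terms the client calls `keygen` and sends `ek` in its ClientHello, the server calls `encaps` and returns `ct` in its ServerHello, and the client calls `decaps`. Neither side picks the secret.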

The security rests on the Module Learning With Errors problem, MLWE. The intuition is short. Take a matrix and a secret vector over a polynomial ring, multiply them, then add a small amount of deliberately chosen noise. Recovering the secret from the noisy product is believed to be hard, for classical and quantum computers alike, because the noise destroys the clean linear-algebra structure an attacker would otherwise exploit. ML-KEM works over the ring of polynomials of degree less than 256 with coefficients modulo the prime q = 3329. The “module” qualifier means the secret is a short vector of these polynomials rather than a single ring element, which is what lets the same core machinery scale to three security levels by changing the vector’s dimension.
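A toy instance makes the "matrix times secret plus noise" picture tangible. This sketch keeps the real modulus q = 3329 but shrinks everything else: degree 8 instead of 256, a 2x2 module, and noise drawn uniformly from a small range rather than from the centered binomial distribution FIPS 203 uses. Those reductions and all the names are my assumptions, so treat it as an illustration of the structure only.

```python
import random

Q = 3329   # the ML-KEM modulus
N = 8      # toy degree; ML-KEM uses 256
K = 2      # toy module rank; ML-KEM-768 uses 3
ETA = 2    # bound on the "small" coefficients in this toy


def poly_mul(a, b):
    """Schoolbook product of two polynomials modulo X^N + 1 and Q."""
    out = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k = i + j
            if k < N:
                out[k] = (out[k] + ai * bj) % Q
            else:  # X^N = -1, so the wrapped term flips sign
                out[k - N] = (out[k - N] - ai * bj) % Q
    return out


def poly_add(a, b):
    return [(x + y) % Q for x, y in zip(a, b)]


def small_poly(rng):
    return [rng.randint(-ETA, ETA) % Q for _ in range(N)]


def uniform_poly(rng):
    return [rng.randrange(Q) for _ in range(N)]


def mlwe_sample(rng):
    """Return (A, b, s, e) with b = A*s + e over the module."""
    A = [[uniform_poly(rng) for _ in range(K)] for _ in range(K)]
    s = [small_poly(rng) for _ in range(K)]
    e = [small_poly(rng) for _ in range(K)]
    b = []
    for i in range(K):
        acc = [0] * N
        for j in range(K):
            acc = poly_add(acc, poly_mul(A[i][j], s[j]))
        b.append(poly_add(acc, e[i]))
    return A, b, s, e


def centered(x):
    """Map a residue mod Q into the range (-Q/2, Q/2]."""
    return x - Q if x > Q // 2 else x
```

The public part is `(A, b)`. Without `e`, recovering `s` is linear algebra; with it, the attacker has to find the one short vector that makes `b - A*s` small, and that search is the hard problem.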

*The KEM round trip. The client publishes ek, the server encapsulates a secret into ct and ships it back, both sides hold the same 32-byte ss. Sizes are for ML-KEM-768.*

Inside Encaps and Decaps

ML-KEM is built in two layers, and FIPS 203 keeps them separate. Underneath is K-PKE, a public-key encryption scheme that is only IND-CPA secure: it resists an attacker who can encrypt chosen messages but not one who can probe a decryption oracle. K-PKE on its own is brittle. The outer ML-KEM wrapper turns it into an IND-CCA2 secure KEM using a variant of the Fujisaki-Okamoto transform, the standard recipe for hardening a CPA scheme into one that survives chosen-ciphertext attacks.

The clever part is what Decaps does when a ciphertext looks wrong. It does not reject. It re-encrypts. After decrypting the ciphertext to recover the message, ML-KEM.Decaps runs the encryption again with the derived randomness and checks that it reproduces the exact ciphertext it received. If they match, it returns the real shared secret. If they do not, it returns a deterministic pseudo-random value derived from the ciphertext and a secret seed stored in the private key, the “implicit rejection” value, instead of an error. An attacker who tampers with a ciphertext gets back a secret that is wrong but indistinguishable from a real one, which closes the timing and oracle side channels that the FO transform exists to seal. There is no “decryption failed” signal to mine.
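The re-encrypt-and-compare step can be sketched without any lattice math. Below, a toy ElGamal-style encryption stands in for K-PKE (my substitution, insecure at these parameters), and the hash domain tags `K`, `R`, and `J` are labels I made up to mirror the roles of the functions FIPS 203 uses. The part to look at is `decaps`: it never raises, it always returns 32 bytes.

```python
import hashlib
import hmac
import secrets

P = 2**255 - 19  # toy modulus, NOT a secure parameter choice
G = 2


def _h(*parts: bytes) -> bytes:
    return hashlib.sha3_256(b"".join(parts)).digest()


def _i2b(x: int) -> bytes:
    return x.to_bytes(32, "big")


def _pke_encrypt(ek: int, msg: bytes, coins: bytes) -> bytes:
    """Toy deterministic encryption: same (ek, msg, coins) gives the same ct."""
    r = int.from_bytes(coins, "big") % (P - 1) or 1
    c1 = pow(G, r, P)
    pad = _h(_i2b(pow(ek, r, P)))
    c2 = bytes(a ^ b for a, b in zip(msg, pad))
    return _i2b(c1) + c2


def _pke_decrypt(sk: int, ct: bytes) -> bytes:
    c1 = int.from_bytes(ct[:32], "big")
    pad = _h(_i2b(pow(c1, sk, P)))
    return bytes(a ^ b for a, b in zip(ct[32:], pad))


def keygen():
    sk = secrets.randbelow(P - 2) + 1
    ek = pow(G, sk, P)
    z = secrets.token_bytes(32)  # implicit-rejection seed, kept in the private key
    return ek, (sk, ek, z)


def encaps(ek: int):
    m = secrets.token_bytes(32)
    ss, coins = _h(b"K", m, _i2b(ek)), _h(b"R", m, _i2b(ek))
    return ss, _pke_encrypt(ek, m, coins)


def decaps(dk, ct: bytes) -> bytes:
    sk, ek, z = dk
    m = _pke_decrypt(sk, ct)
    ss, coins = _h(b"K", m, _i2b(ek)), _h(b"R", m, _i2b(ek))
    reject = _h(b"J", z, ct)  # pseudo-random value tied to this exact ciphertext
    ok = hmac.compare_digest(_pke_encrypt(ek, m, coins), ct)
    # A real implementation selects in constant time; this branch is only a sketch.
    return ss if ok else reject
```

A tampered ciphertext decrypts to some other message, the re-encryption fails to reproduce it, and the caller gets `reject`, which is stable for that ciphertext and useless to the attacker.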

Two performance details make ML-KEM fast. The polynomial multiplications run through the Number-Theoretic Transform, a finite-field analogue of the FFT that turns convolution into pointwise multiplication, which is why q = 3329 was chosen (it admits the right roots of unity). And the sampling, expansion, and hashing all use the SHA-3 family from FIPS 202: SHAKE128 and SHAKE256 as extendable-output functions, SHA3-256 and SHA3-512 for the hashes. ML-KEM leans entirely on Keccak for symmetric work, which is why a hardware SHA-3 instruction helps it disproportionately.
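Here is a small demonstration of the "convolution becomes pointwise multiplication" idea over the same modulus. It is deliberately naive: a quadratic-time transform and a cyclic product modulo X^n - 1, whereas ML-KEM uses a fast butterfly NTT for the ring modulo X^256 + 1. The root of unity 17 is the one FIPS 203 names; the function names and the reduced length are my own.

```python
Q = 3329
ZETA = 17  # primitive 256th root of unity modulo Q, per FIPS 203


def ntt(a, omega):
    """Naive O(n^2) transform: evaluate a at the powers of omega, mod Q."""
    n = len(a)
    return [sum(a[j] * pow(omega, i * j, Q) for j in range(n)) % Q
            for i in range(n)]


def intt(A, omega):
    """Inverse transform: same sum with omega^-1, scaled by n^-1."""
    n = len(A)
    inv_n = pow(n, -1, Q)          # modular inverse, Python 3.8+
    inv_omega = pow(omega, -1, Q)
    return [(inv_n * x) % Q for x in ntt(A, inv_omega)]


def cyclic_convolution(a, b):
    """Schoolbook product modulo X^n - 1 and Q, for comparison."""
    n = len(a)
    out = [0] * n
    for i in range(n):
        for j in range(n):
            out[(i + j) % n] = (out[(i + j) % n] + a[i] * b[j]) % Q
    return out


def ntt_multiply(a, b):
    """Multiply via transform, pointwise product, inverse transform."""
    n = len(a)                      # must divide 256
    omega = pow(ZETA, 256 // n, Q)  # a primitive n-th root of unity
    A, B = ntt(a, omega), ntt(b, omega)
    return intt([(x * y) % Q for x, y in zip(A, B)], omega)
```

The modulus matters because the transform only exists when the field contains the needed roots of unity: q - 1 = 3328 = 256 x 13, so 256th roots are available.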

The three parameter sets, and why TLS picked 768

FIPS 203 specifies exactly three parameter sets, distinguished by the dimension of the module (the length of the secret vector) and a couple of noise parameters. ML-KEM-512 targets NIST security category 1, comparable to brute-forcing AES-128. ML-KEM-768 targets category 3, comparable to AES-192. ML-KEM-1024 targets category 5, comparable to AES-256. More dimension means more security and bigger keys.

The byte sizes are where the parameter choice becomes a TLS decision. For ML-KEM-768, the encapsulation key (the public key the client sends) is 1184 bytes and the ciphertext (the server’s reply) is 1088 bytes. The shared secret is 32 bytes regardless of parameter set. Compare that to X25519, where a public key and the ciphertext-equivalent are 32 bytes each, and the size jump is the whole story of the rollout. ML-KEM-768 is roughly 35 times larger than the X25519 key exchange it augments, and that ratio is what the Chromium team cited when they shipped it.

| Parameter set | NIST category | Encaps key (ek) | Ciphertext (ct) | Shared secret |
| --- | --- | --- | --- | --- |
| ML-KEM-512 | 1 (~AES-128) | 800 bytes | 768 bytes | 32 bytes |
| ML-KEM-768 | 3 (~AES-192) | 1184 bytes | 1088 bytes | 32 bytes |
| ML-KEM-1024 | 5 (~AES-256) | 1568 bytes | 1568 bytes | 32 bytes |

The decapsulation key, the private half, is larger still, because FIPS 203 stores not just the secret vector but a copy of the encapsulation key, a hash of it, and the implicit-rejection seed inside it, so that Decaps can run its re-encryption check without any extra inputs. For ML-KEM-768 that private key is 2400 bytes. It never goes on the wire in a TLS handshake, though, so it does not affect packet sizes; only the 1184-byte public key and the 1088-byte ciphertext travel. The asymmetry is the opposite of RSA, where the public modulus is the thing you transmit and it is small. Here the public key is big and the ciphertext is big, and that is the cost lattice cryptography charges for quantum resistance.
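The byte counts above are not arbitrary; they fall out of the parameters. The sketch below recomputes them, assuming the layout as I understand it from FIPS 203: each polynomial is 256 coefficients at 12 bits (384 bytes), the encapsulation key adds a 32-byte seed, and the ciphertext is compressed to `du` and `dv` bits per coefficient. The `du`/`dv` values and the helper name are not from this post, so check them against the standard before relying on them.

```python
# Module rank k and ciphertext compression bits (du, dv) per parameter set.
PARAMS = {
    "ML-KEM-512":  {"k": 2, "du": 10, "dv": 4},
    "ML-KEM-768":  {"k": 3, "du": 10, "dv": 4},
    "ML-KEM-1024": {"k": 4, "du": 11, "dv": 5},
}

POLY_BYTES = 256 * 12 // 8  # 384 bytes for one uncompressed polynomial


def sizes(name: str) -> dict:
    p = PARAMS[name]
    k, du, dv = p["k"], p["du"], p["dv"]
    ek = POLY_BYTES * k + 32       # k polynomials plus the 32-byte matrix seed
    ct = 32 * (du * k + dv)        # k polys at du bits, one poly at dv bits
    dk = POLY_BYTES * k + ek + 32 + 32  # secret vector, ek copy, H(ek), seed z
    return {"ek": ek, "ct": ct, "dk": dk, "ss": 32}


def wire_ratio_vs_x25519(name: str) -> float:
    """Bytes on the wire (ek + ct) relative to two 32-byte X25519 shares."""
    s = sizes(name)
    return (s["ek"] + s["ct"]) / (32 + 32)
```

For ML-KEM-768, `wire_ratio_vs_x25519` comes out at 35.5, which lines up with the "roughly 35 times" figure.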

TLS standardized on ML-KEM-768 as the default. The reasoning is a margin argument. Category 3 already clears the bar most threat models care about, and the lattice estimates that set the categories have a habit of shifting as cryptanalysis improves, so the extra headroom over category 1 buys insurance without the full size cost of category 5. The same logic shows up across the IETF drafts and the browser ship decisions: 768 is the sweet spot, 512 is rarely offered on the public web, 1024 is reserved for the paranoid and for the SecP384r1MLKEM1024 hybrid.

The hybrid: belt and suspenders

Nobody is willing to bet a TLS connection on ML-KEM alone. Lattice cryptography is young by the standards of the field, the security estimates move, and a structural break in MLWE that nobody has found yet would be catastrophic if it were the only thing standing between an attacker and the session keys. The answer is hybrid key exchange: run a classical elliptic-curve exchange and a post-quantum KEM side by side, combine both shared secrets, and design the combiner so the result is secure as long as at least one of the two components is unbroken. Classical stays in to cover a surprise lattice break. ML-KEM goes in to cover the quantum threat. You need to break both to win.

The construction that shipped is X25519MLKEM768, defined in the IETF draft draft-ietf-tls-ecdhe-mlkem (authored by Kris Kwiatkowski, Panos Kampanakis, Bas Westerbaan, and Douglas Stebila). It pairs X25519, the Curve25519 Diffie-Hellman exchange that is already the default in modern TLS, with ML-KEM-768. It gets its own entry in the TLS supported-groups registry with codepoint 0x11EC (decimal 4588). The draft also defines two NIST-curve variants for environments that need FIPS-approved elliptic curves: SecP256r1MLKEM768 at 0x11EB (4587) and SecP384r1MLKEM1024 at 0x11ED (4589).

How the two secrets combine is governed by the more general hybrid-design draft, draft-ietf-tls-hybrid-design, and it is deliberately boring. The two shared secrets are concatenated and the result is dropped into the existing TLS 1.3 key schedule, in the slot where the plain (EC)DHE shared secret normally goes. The draft spells it out:

concatenated_shared_secret = MyECDH.shared_secret || MyPQKEM.shared_secret

That concatenated value feeds straight into HKDF-Extract at the handshake-secret stage. No extra KDF, no separate mixing step, no length prefix. The length prefix is unnecessary precisely because both component secrets are fixed length: X25519 gives 32 bytes, ML-KEM-768 gives 32 bytes, so there is no ambiguity about where one ends and the next begins. The draft is explicit that variable-length inputs would have needed an unambiguous encoding; fixed-length inputs do not. Because both secrets pass through HKDF, which acts as a dual-PRF, the output stays pseudo-random if either input is. That is the formal version of “secure if either survives.”
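In code the combiner is almost nothing, which is the design goal. This is a minimal sketch assuming SHA-256 and the X25519MLKEM768 ordering from the ECDHE-MLKEM draft (ML-KEM secret first, X25519 second); `derived_salt` stands for the value the key schedule derives from the early secret, and the function names are mine.

```python
import hashlib
import hmac


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract (RFC 5869) with SHA-256."""
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def combine_x25519mlkem768(mlkem_ss: bytes, x25519_ss: bytes) -> bytes:
    """Concatenate the two fixed-length secrets; no prefix, no extra KDF."""
    if len(mlkem_ss) != 32 or len(x25519_ss) != 32:
        # Fixed lengths are what make plain concatenation unambiguous.
        raise ValueError("both component secrets must be exactly 32 bytes")
    return mlkem_ss + x25519_ss


def handshake_secret(derived_salt: bytes, mlkem_ss: bytes, x25519_ss: bytes) -> bytes:
    """The 64-byte value goes where the plain (EC)DHE secret used to go."""
    return hkdf_extract(derived_salt, combine_x25519mlkem768(mlkem_ss, x25519_ss))
```

Changing either input changes the output, and an attacker who knows only one half still faces an HMAC keyed input with 32 unknown bytes.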

There is a subtlety in the ordering that trips people up. The component shares inside the key_share extension are concatenated in the order the named group dictates, and that order is not consistent across the three hybrids. For X25519MLKEM768 the client sends the ML-KEM-768 encapsulation key first and the X25519 share second, the reverse of what the name suggests. For the two NIST-curve variants the elliptic-curve share comes first. The shared-secret concatenation follows the same per-group ordering. If you are parsing these by hand, read the draft, not the name.
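If you do end up parsing these by hand, a lookup table keyed on the codepoint is safer than inferring order from the name. The sketch below splits a client key share into its components, assuming uncompressed NIST points (65 bytes for P-256, 97 for P-384) as I read the draft. The table and function names are hypothetical.

```python
# codepoint -> (name, ordered component layout of the client key share)
HYBRID_LAYOUTS = {
    0x11EB: ("SecP256r1MLKEM768",  (("ecdh", 65), ("mlkem_ek", 1184))),
    0x11EC: ("X25519MLKEM768",     (("mlkem_ek", 1184), ("ecdh", 32))),
    0x11ED: ("SecP384r1MLKEM1024", (("ecdh", 97), ("mlkem_ek", 1568))),
}


def split_client_share(group: int, key_exchange: bytes) -> dict:
    """Split a hybrid client key_exchange blob into named components."""
    name, layout = HYBRID_LAYOUTS[group]
    expected = sum(size for _, size in layout)
    if len(key_exchange) != expected:
        raise ValueError(
            f"{name}: expected {expected} bytes, got {len(key_exchange)}"
        )
    parts, offset = {}, 0
    for field, size in layout:
        parts[field] = key_exchange[offset:offset + size]
        offset += size
    return parts
```

For 0x11EC the first 1184 bytes are the ML-KEM-768 encapsulation key and the last 32 are the X25519 share, giving the 1216-byte total.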

*The client share carries the ML-KEM key first, then X25519. Both 32-byte shared secrets are concatenated and fed once into HKDF. No extra KDF, no length prefix, because both halves are fixed length.*

The ClientHello does not fit in one packet anymore

Here is the operational consequence that turned a crypto upgrade into an internet-plumbing project. A classical X25519 ClientHello fits comfortably inside a single network packet, with room to spare. Add a 1184-byte ML-KEM encapsulation key and the ClientHello grows past the size of one TCP segment on a typical 1500-byte-MTU path. Cloudflare put the client share at 1216 bytes (1184 for the ML-KEM-768 key plus 32 for X25519) against the 32 bytes that X25519 alone needs, which is what pushes the whole message over the line. Now the first flight of the handshake spans two packets where it used to be one.
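The arithmetic is short enough to check. This sketch assumes IPv4 with no TCP options, and a 500-byte classical ClientHello, which is a made-up round number for illustration rather than a measurement. The extra entry is the 1216-byte hybrid share plus the 4 bytes of group identifier and length that each key share entry carries.

```python
import math

MTU = 1500
IPV4_HEADER = 20
TCP_HEADER = 20  # no options; timestamps and similar shrink the payload further

CLASSICAL_CLIENT_HELLO = 500   # hypothetical size, including an X25519 share
HYBRID_ENTRY = 4 + 1216        # group id + length prefix + hybrid key share


def segments_needed(client_hello_bytes: int, mtu: int = MTU) -> int:
    """How many TCP segments a ClientHello of this size occupies."""
    mss = mtu - IPV4_HEADER - TCP_HEADER
    return math.ceil(client_hello_bytes / mss)


classical = segments_needed(CLASSICAL_CLIENT_HELLO)
hybrid = segments_needed(CLASSICAL_CLIENT_HELLO + HYBRID_ENTRY)
```

With a 1460-byte payload per segment, anything over roughly 240 bytes of non-key-share ClientHello content is enough to force the second segment once the hybrid share is present.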

That should be a non-event. TCP is a byte stream; splitting a ClientHello across segments is completely legal and always has been. The trouble is that it used to be so rare that a lot of deployed software quietly assumed it never happened. Middleboxes, load balancers, and TLS-inspecting appliances that try to parse the ClientHello sometimes grab only the first segment, fail to find the end of a message they assumed would arrive whole, and either hang or drop the connection. This is protocol ossification: behavior that was technically allowed but never exercised becomes a de facto break when something finally exercises it. The post-quantum ClientHello is the something.
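The fix on the middlebox side is ordinary stream reassembly: buffer until the TLS record header says the record is complete, then parse. Here is a minimal sketch of that loop. The class name is hypothetical, and it handles only the record layer; a production parser would also have to cope with a ClientHello fragmented across several records.

```python
class RecordReassembler:
    """Accumulate TCP payload bytes and emit only complete TLS records."""

    HEADER_LEN = 5  # content type (1) + legacy version (2) + length (2)

    def __init__(self):
        self._buf = bytearray()

    def feed(self, segment: bytes) -> list:
        """Add one TCP segment's payload; return any records now complete."""
        self._buf += segment
        records = []
        while len(self._buf) >= self.HEADER_LEN:
            length = int.from_bytes(self._buf[3:5], "big")
            total = self.HEADER_LEN + length
            if len(self._buf) < total:
                break  # the rest is still in flight; wait for more segments
            records.append(bytes(self._buf[:total]))
            del self._buf[:total]
        return records
```

The brittle appliances effectively skipped the `break` branch: they treated the first segment as the whole record and gave up when the length field pointed past the end of what they had.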

Cloudflare’s measurements put a number on it. During their early experiments, the larger ClientHello broke roughly 0.05 percent of connections to origins when the post-quantum key share was sent directly in the first ClientHello (the eager method described below). Small, but not zero, and concentrated in exactly the enterprise-network gear that is hardest to update. Google’s Chromium team hit the same wall and worked individually with vendors whose appliances choked, naming Vercel, Zscaler, and a PayPal endpoint among the incompatibilities they fixed before the stable-channel rollout. They also shipped an escape hatch: the enterprise policy PostQuantumKeyAgreementEnabled, which let administrators turn the hybrid group off while their middlebox vendor shipped a patch.

That escape hatch was always meant to be temporary, and it is being closed. Google has been phasing the override out: by Chrome 138, around mid-2025, flipping the policy no longer disables ML-KEM, with full removal scheduled through 2026. The message to network operators is blunt. Fix the box that cannot parse a two-packet ClientHello, because the workaround is going away. If you fingerprint or inspect TLS at the edge, the same two-packet reality matters for reassembly; the mechanics are covered in the companion posts on the TLS 1.3 handshake, frame by frame, and on how Cloudflare uses TLS and HTTP/2 fingerprints.

*The split that broke the middleboxes. Splitting a ClientHello across TCP segments was always legal; it was just never common until a kilobyte of lattice key made it routine.*

The rollout, 2023 to 2026

The deployment ran in two waves, and the seam between them is the Kyber-to-ML-KEM rename. The first wave used the pre-standard draft. Chrome ramped a draft post-quantum group to 10 percent of desktop traffic in November 2023, then enabled it by default on desktop in Chrome 124, in April 2024. The Chromium ship was X25519Kyber768Draft00, the hybrid of X25519 with the round-three Kyber-768 that predated FIPS 203. Then FIPS 203 landed in August 2024 with its non-bit-compatible changes, and everyone had to migrate from the Kyber draft codepoint to the standardized X25519MLKEM768 at 0x11EC. Chrome made that switch in late 2024, around Chrome 131. For a stretch in 2024 the two coexisted on the wire, which is why packet captures from that year show both group identifiers.

By late 2024 the second wave was the real one. Chrome on Android and Firefox on desktop both enabled the standardized post-quantum group by default in November 2024 (Firefox 132). OpenSSL turned it on by default in its 3.5 release in April 2025. Apple shipped support across iOS, iPadOS, and macOS 26 in the fall 2025 cycle, rolling out by default in October 2025. On the server side the picture was already ahead of the clients: Cloudflare enabled server-side post-quantum key agreement for all customers back in 2022, and Google switched on most of its servers in 2023.

The adoption numbers from Cloudflare’s State of the post-quantum internet report, dated October 2025, show how far this got. More than half of human-initiated traffic to Cloudflare now uses post-quantum key agreement. Roughly 39 percent of the top 100,000 domains supported it as of September 2025. The laggard is the origin side of the connection, the link between a CDN and the customer’s own server, where only about 3.7 percent of origins supported X25519MLKEM768, up from around 0.5 percent in 2023. The browser-to-edge hop went post-quantum fast. The edge-to-origin hop is still mostly classical, which is why Cloudflare also rolled out post-quantum key agreement to origins as a separate effort.

*Two waves. The pre-standard Kyber draft from 2023-2024, then the FIPS 203 ML-KEM transition that everyone re-shipped against in late 2024.*

What the HelloRetryRequest dance avoids

There is one more wrinkle worth understanding, because it governs how the kilobyte gets onto the wire in the first place. A client that wants post-quantum key agreement has two ways to ask. The eager method puts the full X25519MLKEM768 key share in the very first ClientHello, which costs the extra 1216 bytes on every connection whether or not the server can use them. The polite method only advertises support for the group in the supported-groups list, sends a cheap classical key share, and waits. If the server supports the hybrid, it answers with a HelloRetryRequest asking the client to resend with the real ML-KEM key share. That trades a round trip for not wasting a kilobyte on servers that would have ignored it.

Browsers send eagerly, because for them the latency of an extra round trip is worse than the bandwidth of an extra kilobyte, and they are usually talking to edges that support the group anyway. Cloudflare, sitting in front of millions of origins of unknown capability, does the opposite for its origin connections: it advertises support but waits for a HelloRetryRequest before committing the big key share, so it does not blow up the 0.05 percent of origin paths with brittle middleboxes. Same protocol, opposite defaults, because the cost-benefit flips depending on whether you know the peer can handle it.
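The server's side of that choice fits in a few lines. This is a sketch of one plausible selection policy, a server that strictly follows its own preference order and asks for a retry when its preferred group was advertised but not shared. Real servers differ on this point, and the function name and return shape are my own.

```python
X25519 = 0x001D
X25519MLKEM768 = 0x11EC


def server_select(server_prefs, client_supported, client_key_shares):
    """Pick a group and decide between ServerHello and HelloRetryRequest.

    server_prefs:      groups the server supports, most preferred first
    client_supported:  groups listed in the client's supported_groups
    client_key_shares: groups for which the client actually sent a key share
    """
    for group in server_prefs:
        if group not in client_supported:
            continue
        if group in client_key_shares:
            return ("server_hello", group)        # share already here: proceed
        return ("hello_retry_request", group)     # ask the client to resend
    return ("handshake_failure", None)            # no group in common


both = [X25519MLKEM768, X25519]
eager = server_select(both, both, {X25519MLKEM768, X25519})
polite = server_select(both, both, {X25519})
legacy_server = server_select([X25519], both, {X25519})
```

The eager client gets the hybrid in one round trip, the polite client gets it after a retry, and a classical-only server never sees the kilobyte at all in the polite case.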

Where this leaves things

The quantum computer that motivates all of this does not exist yet, and credible estimates for when a cryptographically relevant one might arrive still range across more than a decade. That gap is exactly why the work is already done. The defenders cannot wait for the threat to materialize, because the threat is retroactive: every session recorded today under a classical-only key exchange is a session that becomes readable the day the machine boots. The only defense against a harvest-now-decrypt-later attack is to stop producing harvestable sessions before the harvest pays off, which means deploying years early, while the attack is still theoretical. The migration is racing a clock nobody can read.

What actually shipped is narrower and more pragmatic than the headlines suggest. Not a wholesale move to post-quantum cryptography, but one carefully chosen graft: ML-KEM-768 spliced onto X25519, combined by plain concatenation through the existing key schedule, secure as long as either half holds, leaving the certificates classical for a later phase. The hard part was never the lattice math. NIST spent eight years and 82 candidates on that. The hard part was that a kilobyte of key broke a generation of network equipment that had silently assumed a ClientHello fits in one packet. The cryptographers solved their problem in 2024. The internet is still paying off the assumption that the rest of us baked into the plumbing, one middlebox firmware update at a time.

Sources & further reading

NIST (2024), FIPS 203: Module-Lattice-Based Key-Encapsulation Mechanism Standard — the primary standard; ML-KEM derived from round-three CRYSTALS-Kyber, MLWE basis, KeyGen/Encaps/Decaps, K-PKE, q=3329, three parameter sets.
Federal Register (2024), Announcing Issuance of FIPS 203, 204, and 205 — the 14 August 2024 effective-date notice for the three post-quantum standards.
Kwiatkowski, Kampanakis, Westerbaan, Stebila (2026), Post-quantum hybrid ECDHE-MLKEM Key Agreement for TLSv1.3 (draft-ietf-tls-ecdhe-mlkem) — defines X25519MLKEM768 (0x11EC), SecP256r1MLKEM768, SecP384r1MLKEM1024, and the per-group concatenation order.
Stebila, Fluhrer, Gueron (IETF), Hybrid key exchange in TLS 1.3 (draft-ietf-tls-hybrid-design) — the concatenation combiner, fixed-length rationale, and dual-PRF security argument.
Bas Westerbaan / Cloudflare (2025), State of the post-quantum Internet in 2025 — adoption percentages, the harvest-now-decrypt-later threat, and the two-packet ClientHello ossification numbers.
Cloudflare (2023), Cloudflare now uses post-quantum cryptography to talk to your origin server — the eager-vs-HelloRetryRequest origin strategy and the split-ClientHello problem.
Jan Schaumann (2024), TLS 1.3 Hybrid Key Exchange using X25519Kyber768 / ML-KEM — concrete byte sizes (32 / 1184 / 1088), the 1216-byte client share, and the counterintuitive concatenation order.
David Benjamin / Chromium (2024), Intent to Ship: X25519Kyber768 key encapsulation for TLS on Desktop — the M124 ship, the ~35x size figure, vendor incompatibilities, and the PostQuantumKeyAgreementEnabled policy.
PostQuantumSecurity.org, From X25519 to X25519+MLKEM768: How Hybrid TLS Is Becoming Real — the “secure if either survives” combiner and the deployment timeline.
Scrapfly (2025), Post-Quantum TLS: Why Scraping Tools Are Now Exposed — how X25519MLKEM768 changes JA3/JA4 surfaces and which clients lack support.