
Request smuggling: how HTTP/1.1 desync attacks exploit parser disagreement

Copyright: MIT

Two servers read the same bytes off the same socket and disagree about where one request ends and the next begins. That is the entire bug. Everything else, the cache poisoning, the credential theft, the mass session hijack, follows from that one disagreement. HTTP/1.1 gives you two different ways to state how long a request body is, and if the proxy in front trusts one of them while the server behind it trusts the other, an attacker can write a request that the front-end reads as complete while the back-end is still waiting for more. The bytes the back-end is waiting for get borrowed from whoever connects next.

It is a strange class of vulnerability because nothing is technically malformed. Every byte is valid HTTP. The attacker does not exploit a buffer or a parser crash. They exploit the fact that two correct-looking parsers, written by different teams, reading the same wire format, can reach different conclusions about a length. This post is a mechanism-level walk through how that happens: the framing ambiguity baked into HTTP/1.1, the CL.TE and TE.CL desync variants that James Kettle’s 2019 research dragged back into relevance, why connection reuse turns one bad request into a weapon against other users, and the defenses that work versus the ones that only look like they work. It is defensive throughout. There are no copy-paste attack strings here, because you do not need them to understand why the system fails.

The sections move from the wire format up to the attack and then to the response. First, how HTTP/1.1 frames a message body and where the two-headers problem comes from. Then the front-end/back-end model and what “desync” actually means on the socket. Then the CL.TE, TE.CL, and TE.TE variants, with the obfuscation tricks that hide a header from one parser. Then the impact, why a single poisoned socket reaches strangers. Then the history, from the 2005 Watchfire paper through 2019 to the 2025 “HTTP/1.1 must die” follow-up. Finally, the defenses, what the RFCs now say, and where the protocol stands in 2026.

Two ways to measure a body

An HTTP/1.1 request is text. A request line, some headers, a blank line, and an optional body. The headers and the blank line tell you where the head ends. Nothing in the body tells you where the body ends. The reader has to be told the length in advance, and HTTP/1.1 offers two mechanisms for that.

The first is Content-Length. It is a byte count. Content-Length: 11 means read exactly eleven bytes after the blank line and stop. Simple, and unambiguous on its own.

The second is Transfer-Encoding: chunked. Instead of declaring a total up front, the body arrives as a sequence of chunks. Each chunk starts with its size written in hexadecimal on its own line, then the chunk data, then a line break. A chunk of size zero terminates the body. This exists because a sender is sometimes streaming a message and does not know the total length when it starts, so it sends the body piece by piece and signals the end with a final zero-sized chunk.
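To make the chunk layout concrete, here is a minimal Python sketch of a chunked encoder and decoder. The function names `encode_chunked` and `decode_chunked` are illustrative, not from any library, and the decoder assumes the whole body is already in memory and ignores the content of chunk extensions and trailers.

```python
HEX_DIGITS = b"0123456789abcdefABCDEF"


def encode_chunked(data: bytes, chunk_size: int = 8) -> bytes:
    """Encode data as size-prefixed chunks followed by the zero-sized last chunk."""
    out = bytearray()
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        # Size in hexadecimal on its own line, then the data, then a line break.
        out += f"{len(chunk):x}".encode("ascii") + b"\r\n" + chunk + b"\r\n"
    out += b"0\r\n\r\n"  # zero-sized chunk, then the empty line that ends the body
    return bytes(out)


def decode_chunked(buf: bytes) -> tuple[bytes, int]:
    """Return (body, bytes_consumed). Raises ValueError on malformed input."""
    body = bytearray()
    pos = 0
    while True:
        line_end = buf.index(b"\r\n", pos)  # ValueError if the size line is missing
        size_token = buf[pos:line_end].split(b";", 1)[0]  # drop any chunk extension
        if not size_token or any(c not in HEX_DIGITS for c in size_token):
            raise ValueError("chunk size is not plain hexadecimal")
        size = int(size_token.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            # Skip optional trailer lines until the empty line.
            while True:
                line_end = buf.index(b"\r\n", pos)
                is_empty = line_end == pos
                pos = line_end + 2
                if is_empty:
                    return bytes(body), pos
        chunk = buf[pos:pos + size]
        if len(chunk) != size or buf[pos + size:pos + size + 2] != b"\r\n":
            raise ValueError("chunk is truncated or not followed by CRLF")
        body += chunk
        pos += size + 2
```

Note that `decode_chunked` reports how many bytes it consumed. That number, not the decoded body, is what a server uses to decide where the next request begins.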

Both mechanisms are legitimate. The trouble starts when a single request carries both. What is the body length then? The byte count in Content-Length, or the chunk stream that ends at the zero chunk? The two can describe completely different boundaries, and a request that contains both is the raw material for every classic desync.

[Figure: Two ways to say where the body ends.] *Content-Length declares a byte count up front; chunked encoding ends at a zero-sized chunk. A request with both headers describes its own length two different ways.*

The specifications have always known this is dangerous. RFC 7230, the 2014 HTTP/1.1 message-syntax document, settled the precedence in section 3.3.3: when both headers are present, “the Transfer-Encoding overrides the Content-Length.” It did not stop there. The same paragraph warns that “such a message might indicate an attempt to perform request smuggling or response splitting and ought to be handled as an error,” and lays down a hard requirement: “A sender MUST remove the received Content-Length field prior to forwarding such a message downstream.” The rule exists. The problem is that the rule lives in the spec, and the attack lives in the gap between the spec and what real servers actually do.
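The forwarding rule quoted above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not any proxy's actual code: `frame_for_forwarding` is a hypothetical name, headers are assumed to be already parsed into `(name, value)` pairs, and the sketch takes the spec's other option of processing the message rather than rejecting it outright.

```python
def frame_for_forwarding(
    headers: list[tuple[str, str]],
) -> tuple[str, list[tuple[str, str]], bool]:
    """Apply the RFC 7230 section 3.3.3 precedence to parsed headers.

    Returns (framing, headers_to_forward, both_present).
    """
    names = [name.lower() for name, _ in headers]
    has_te = "transfer-encoding" in names
    has_cl = "content-length" in names

    if has_te:
        # Transfer-Encoding overrides Content-Length, and the received
        # Content-Length MUST be removed before forwarding downstream.
        forwarded = [(n, v) for n, v in headers if n.lower() != "content-length"]
        return "transfer-encoding", forwarded, has_cl
    if has_cl:
        return "content-length", list(headers), False
    return "none", list(headers), False
```

The third return value flags the dual-header case so a caller can log it or treat it as an error, which is what the spec says such a message "ought" to get.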

Front-end, back-end, and the shared socket

Almost no production web service is a single server. A request hits a CDN edge, or a load balancer, or a reverse proxy, or all three, before it reaches the application that actually answers it. Call the outermost thing the front-end and the application the back-end. The front-end’s job includes routing, TLS termination, sometimes a web application firewall, and then forwarding the request onward.

To forward efficiently, the front-end keeps a pool of connections open to the back-end and reuses them. This is the load-bearing detail. A single TCP connection between the front-end and the back-end carries many requests over its lifetime, from many different end users, one after another. HTTP/1.1 has no stream identifiers and no length-prefixed frames. As Kettle’s 2025 write-up puts it, requests “are simply concatenated on the underlying TCP/TLS socket with no delimiters, and there are multiple ways to specify their length.” The only thing separating user A’s request from user B’s request on that shared back-end socket is both parties agreeing, byte for byte, on where A’s request stops.

Now suppose the front-end and back-end disagree about a request’s length. The front-end reads what it thinks is one complete request and forwards it, but the back-end measures the same bytes differently. If the back-end thinks the request is shorter, it stops reading early and the unread tail stays in its input buffer. If it thinks the request is longer, it keeps waiting and absorbs the start of whatever arrives next. In the first case, the stranded bytes become a prefix glued to the front of the next request on that connection; in the second, the attacker’s unfinished request swallows the beginning of that next request.

That is desynchronization. The two ends have lost their shared sense of message boundaries on a connection that is going to keep carrying other people’s traffic. The attacker controls the leftover bytes. The next victim’s request gets that prefix prepended to it before the back-end ever sees the victim’s real request line.
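The stranded-bytes effect can be modeled without any HTTP at all. The toy Python sketch below uses placeholder bytes and an explicit length to stand in for "what the back-end believes the first message's length is". `BackendSocket` and `simulate` are hypothetical names for illustration only.

```python
class BackendSocket:
    """Toy model of the back-end's read buffer on one reused connection."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def deliver(self, data: bytes) -> None:
        """The front-end writes bytes onto the shared connection."""
        self._buffer += data

    def read_message(self, length: int) -> bytes:
        """The back-end consumes what it believes is one whole message."""
        message = bytes(self._buffer[:length])
        del self._buffer[:length]
        return message

    def pending(self) -> bytes:
        return bytes(self._buffer)


def simulate(first: bytes, backend_length: int, second: bytes) -> tuple[bytes, bytes]:
    """Return what the back-end sees as message one and message two."""
    sock = BackendSocket()
    sock.deliver(first)                      # front-end forwards "one request"
    seen_first = sock.read_message(backend_length)
    sock.deliver(second)                     # the next user's request arrives
    seen_second = sock.read_message(len(sock.pending()))
    return seen_first, seen_second
```

When `backend_length` equals `len(first)` the two ends agree and the second message arrives intact. When it is smaller, the tail of the first message is returned at the front of the second, which is the prefix described above.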

[Figure: A reused back-end connection after a desync.] *The attacker's ambiguous request leaves bytes stranded in the back-end's read buffer. Those bytes become a prefix on the next user's request that happens to reuse the same connection.*

It helps to be precise about what the attacker does and does not control. They do not pick the victim. On a busy site the shared connection serves whoever the pool hands it to next, so the target is random and the technique is probabilistic. Kettle’s own guidance on running these tests is blunt: an attack “on high-traffic sites may require thousands of attempts,” and you should “exercise both caution and restraint,” because every miss can corrupt a real user’s request. That randomness is also why responsible testing is so finicky. The same property that makes the bug dangerous makes a careless proof-of-concept a way to break strangers’ sessions.

CL.TE, TE.CL, and the obfuscation game

The names describe which header each end trusts. The first pair of letters is the front-end’s choice, the second pair the back-end’s.

In a CL.TE desync the front-end honors Content-Length and the back-end honors Transfer-Encoding. The front-end measures the body by the byte count and forwards what it thinks is a whole request. The back-end, reading the same bytes as chunked, hits a zero-sized chunk earlier than the byte count would suggest and decides the request ended there. Whatever the front-end sent past that point is unconsumed. It sits in the buffer as the prefix for the next request. The front-end believes it forwarded one request; the back-end believes a second one has already started arriving.

TE.CL is the mirror image. The front-end trusts Transfer-Encoding and reads the chunk stream; the back-end trusts Content-Length. Here the front-end forwards the full chunked body, but the back-end stops at the byte count, which can land mid-body. The leftover chunk metadata and data become the prefix. Kettle’s description is that TE.CL “results in front-end forwarding incomplete data, leaving back-end waiting for additional chunks,” or the inverse depending on how the counts line up. Either way the boundary the two ends compute is different, and the gap is attacker-controlled.

Both of these require getting the two ends to honor different headers. The blunt version is just sending both headers and hoping the front-end and back-end pick differently, which works against pairs that have not implemented the RFC’s “reject or strip” guidance. The interesting version is TE.TE, where both ends in principle support Transfer-Encoding, and the attacker’s job is to obfuscate the header so that one parser sees it and the other does not. That turns a TE.TE into an effective CL.TE or TE.CL.

The obfuscations Kettle documented are small, ugly mutations of the header that exploit lenient parsing. A horizontal tab between the colon and the value instead of a space. A space before the colon. A value like chunked with trailing junk, or a leading character that makes it xchunked. A vertical tab or form feed in the whitespace. The header split across a line fold. Strictly read, almost none of these are valid: the grammar does allow a space or horizontal tab after the colon, and line folding is obsolete syntax that RFC 7230 tells senders not to generate, but the rest are plain errors. A tolerant parser that was written to “be liberal in what it accepts” might normalize one of them to chunked and act on it, while its neighbor sees garbage and falls back to Content-Length. The follow-up research found even stranger triggers: one intrusion-detection engine treated a truncated value beginning chu as chunked, and a carriage-return-based fold could smuggle the header past systems that split on the wrong byte. The exact set of mutations that works against any given pair is a property of those two specific parsers, which is why this stayed a live research area for years rather than getting fixed once.
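On the defensive side, the opposite of lenient parsing is a header parser that refuses anything it does not recognize exactly. The Python sketch below is one possible strict policy, not a reference implementation: `parse_header_line` and `require_plain_chunked` are hypothetical names, the input is assumed to be a single header line with its CRLF already removed, and the policy of accepting only a bare `chunked` value is deliberately narrower than the RFC, which permits other codings ahead of it.

```python
import re

# RFC 9110 "token" characters: the only bytes allowed in a field name.
TOKEN = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def parse_header_line(line: str) -> tuple[str, str]:
    """Split one header line into (lowercased name, value), or raise ValueError."""
    name, sep, raw_value = line.partition(":")
    # A space before the colon, a leading space (line fold), or a missing
    # colon all leave `name` failing the token grammar.
    if not sep or not TOKEN.fullmatch(name):
        raise ValueError("invalid field name")
    # Only SP and HTAB count as optional whitespace around the value.
    value = raw_value.strip(" \t")
    for ch in value:
        code = ord(ch)
        if (code < 0x20 and ch != "\t") or code == 0x7F:
            # Vertical tab, form feed, bare CR, and similar land here.
            raise ValueError("control character in field value")
    return name.lower(), value


def require_plain_chunked(value: str) -> None:
    """Assumed strict policy: the only accepted Transfer-Encoding is 'chunked'."""
    if value.lower() != "chunked":
        raise ValueError("unsupported Transfer-Encoding value")
```

The design choice is that every rejected line becomes an error rather than a fallback to Content-Length. A parser that silently ignores a header it cannot read is the half of the pair that makes the disagreement possible.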

A later twist removed the headers from the equation entirely. The 2025 research describes “0.CL” and “CL.0” classes where the disagreement is not Content-Length versus Transfer-Encoding but rather whether a request has a body at all. When a front-end treats a request as bodyless while the back-end expects a body, or the reverse, you get the same stranded-bytes outcome without ever sending a Transfer-Encoding header. The detection model also shifted: rather than enumerating header tricks, the newer tooling classifies a discrepancy by whether a given byte is Visible to one parser and Hidden from the other, the “V-H” and “H-V” cases. That generalization matters because it means the problem was never really about two specific headers. It was about any disagreement over message length, and the two-headers case is just the most famous instance.

Why one bad request reaches strangers

A desync by itself is just a confused buffer. The reason it became one of the most serious classes of web vulnerability is what you can do with that prefix.

The most direct use is to capture other people’s requests. If the smuggled prefix is a request whose body the application reflects back, say, a search or a comment endpoint that echoes the submitted value, then when the victim’s real request gets appended and parsed as part of that body, the victim’s request line, cookies, and headers end up stored or reflected as data the attacker can retrieve. Kettle’s write-ups describe using exactly this to pull out X-Forwarded-For, X-Forwarded-Proto, and custom authorization headers that the front-end adds, and in the New Relic case to reach an internal staging system and obtain admin-level API access. The attacker never sees the victim’s screen. They get the victim’s bytes delivered to a place they can read.

The second major use is cache poisoning. If the prefix steers the combined request to a response the attacker controls, and the front-end caches that response under a popular URL, every subsequent visitor to that URL gets the poisoned content until the cache entry expires. In the PayPal case this meant a JavaScript file on the login page, fb-all-prod.pp2.min.js, could be replaced with attacker content. Content-Security-Policy initially blocked direct exploitation, but chaining through unprotected sub-pages bypassed it, and the end state was plaintext password theft from users on certain browsers. PayPal paid two bounties for the chain, $18,900 and $20,000. The point is not the dollar figure. It is that one request, landed once, poisons a shared cache that then serves the attack to a stream of innocent users without any further action.

The third is turning a small reflected flaw into a large one. A reflected cross-site-scripting bug normally requires luring each victim to a crafted link. Smuggled through a desync, the same payload can be delivered to whoever happens to share the connection, with no link, no click, and the attacker’s own cookies stripped out. The research describes converting reflected XSS into persistent, unauthenticated mass exploitation against whatever browsers are active, including theft of HttpOnly cookies that JavaScript is normally forbidden to read, because the theft happens at the HTTP layer, not in the page. Request smuggling also walks straight past a web application firewall sitting at the front-end, because the malicious request only assembles into its dangerous form after the front-end has already inspected and forwarded the innocent-looking pieces. That property is part of why people who study how a WAF actually works treat smuggling as a separate problem class rather than something a rule set can reliably catch, and it overlaps with the broader WAF evasion concepts of fragmenting an attack so no single inspection point sees the whole of it.

A short history of a long-lived bug

The technique is not new. The original “HTTP Request Smuggling” paper came out of Watchfire in 2005, authored by Chaim Linhart, Amit Klein, Ronen Heled, and Steve Orrin. It laid out the core idea, that intermediaries and servers in a chain can disagree about message boundaries, and walked through cache poisoning, firewall and IDS evasion, and request hijacking against the proxy and server software of the day. It was thorough. It was also early, and for more than a decade the class sat mostly dormant in practitioner awareness while the web moved on to CDNs and microservices that, as it turned out, made the problem far worse by multiplying the number of parser pairs on every request path.

The 2019 revival is what made it mainstream. James Kettle’s “HTTP Desync Attacks: Request Smuggling Reborn,” published August 7, 2019 and presented at Black Hat USA and DEF CON, reframed the old technique for modern infrastructure and, just as importantly, shipped tooling that made it findable at scale. The open-source HTTP Request Smuggler extension for Burp Suite automated detection using timing: send a probe whose ambiguous framing forces the back-end to wait for bytes that never come, and measure the resulting timeout. The detection ordering was designed to avoid poisoning real users’ connections during the scan, which is the responsible way to look for a bug that hurts bystanders when triggered carelessly. The named victims were not obscure. PayPal, New Relic, Trello, Red Hat, and the CDNs Akamai, Cloudflare, and Fastly all appear in the disclosure. F5 issued advisory K50375550 on August 25, 2019. Akamai shipped a silent hotfix within roughly 48 hours of the talk. Go’s net/http library got CVE-2019-16276 for a related parsing bug, and HAProxy fixed a vertical-tab normalization bypass in version 2.0.6.

[Figure: Twenty years of the same framing bug. 2005: Watchfire paper. 2019: Desync Reborn. 2022: RFC 9112, browser desync. 2025: HTTP/1.1 must die. Each round patched specific parsers; none removed the underlying ambiguity.] *The same framing ambiguity has been re-exploited every few years since 2005, with each disclosure patching individual parsers rather than the format.*

The 2022 work, “Browser-Powered Desync Attacks,” presented at Black Hat USA 2022 and DEF CON 30, widened the attack surface again by showing that the browser itself could be coerced into sending the malformed request, which removed the need for the attacker to control a raw socket and brought client-side desync into scope. Then in 2025 Kettle published “HTTP/1.1 Must Die: the desync endgame,” dated August 6, 2025. The argument is that six years of patching has hidden the problem without solving it, because the format itself is the flaw. The new bug classes, 0.CL and Expect-based desync, the latter abusing the two-stage Expect: 100-continue handshake to slip past WAFs and header-stripping, came with a fresh round of high-value findings: Cloudflare (around 24 million sites exposed, $7,000), Akamai (CVE-2025-32094, $9,000), and a set across Netlify, LastPass, T-Mobile, GitLab, and EXNESS totaling roughly $350,000 in bounties. The conclusion is in the title. The recommended fix is not a better WAF rule. It is to stop speaking HTTP/1.1 between the proxy and the origin.

What actually fixes it

The defenses fall into two groups: things that close the parser gap, and things that remove the shared socket the gap depends on.

The RFCs picked the first lever and tightened it over time. RFC 7230 in 2014 said Transfer-Encoding overrides Content-Length and that a sender MUST strip the Content-Length before forwarding a dual-header message. RFC 9112, published June 2022 and obsoleting 7230, went further toward “reject, do not reconcile.” Its section on Transfer-Encoding now states that “a server MAY reject a request that contains both Content-Length and Transfer-Encoding or process such a request in accordance with the Transfer-Encoding alone,” and crucially that the server “MUST close the connection after responding to such a request to avoid the potential attacks.” Closing the connection is the part that defangs the desync, because the whole attack depends on the poisoned socket living long enough to carry the next victim’s request. RFC 9112 also requires a 400 and a connection close when a request carries a Transfer-Encoding that does not end in chunked, and the same treatment for an invalid or conflicting Content-Length. The spec moved from “here is the precedence” to “treat ambiguity as a fatal error and tear down the connection.”
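As a sketch of what "treat ambiguity as a fatal error" looks like in code, the Python function below maps a parsed header list to a framing decision along the lines of the RFC 9112 rules just described. `FramingDecision` and `decide_framing` are hypothetical names. The sketch assumes headers are already parsed into `(name, value)` pairs, and it picks the "reject" branch for dual-header requests where the spec also allows processing by Transfer-Encoding alone.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FramingDecision:
    status: Optional[int]    # None means "go ahead and read the body"
    framing: str             # "chunked", "content-length", "none", or a reason
    close_connection: bool


REJECT_INVALID = FramingDecision(400, "invalid", True)
REJECT_AMBIGUOUS = FramingDecision(400, "ambiguous", True)


def decide_framing(headers: list[tuple[str, str]]) -> FramingDecision:
    te = [v for n, v in headers if n.lower() == "transfer-encoding"]
    cl = [v for n, v in headers if n.lower() == "content-length"]

    if te:
        codings = [c.strip(" \t").lower() for v in te for c in v.split(",")]
        if codings[-1] != "chunked":
            # Final coding is not chunked: length cannot be determined reliably.
            return REJECT_INVALID
        if cl:
            # Both headers present. Whichever branch is taken, the
            # connection must be closed after responding.
            return REJECT_AMBIGUOUS
        return FramingDecision(None, "chunked", False)

    if cl:
        values = {part.strip(" \t") for v in cl for part in v.split(",")}
        if len(values) != 1:
            return REJECT_INVALID    # conflicting Content-Length values
        (only,) = values
        if not (only.isascii() and only.isdigit()):
            return REJECT_INVALID    # signs, hex, empty, or non-ASCII digits
        return FramingDecision(None, "content-length", False)

    return FramingDecision(None, "none", False)
```

Every rejecting branch sets `close_connection`, since the teardown is the part that stops leftover bytes from reaching the next request on the socket.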

That covers a server reading its own input. In a chain, the more durable defenses live at the boundaries. The Web Security Academy’s guidance is direct: “Use HTTP/2 end to end and disable HTTP downgrading if possible,” and failing that, “make the front-end server normalize ambiguous requests and make the back-end server reject any that are still ambiguous, closing the TCP connection in the process.” Normalize-then-reject is a belt-and-suspenders arrangement: the front-end rewrites anything ambiguous into a single canonical form, and the back-end refuses anything that still looks ambiguous, on the theory that a normalized request should never reach it looking weird. The CDN approach Cloudflare and Fastly took after 2019 was front-end normalization of exactly this kind.

The cleanest fix is structural. HTTP/2 frames every message with explicit, binary-encoded lengths. There is, in Kettle’s words, “zero ambiguity about the length of each message,” because the length is a number in a frame header, not a text convention that two parsers can read differently. If the connection from the front-end to the back-end speaks HTTP/2 rather than HTTP/1.1, the classic desync has nothing to exploit, because the leftover-bytes condition cannot arise. This is why the 2025 recommendation is to run HTTP/2 upstream universally and why “WAFs cannot thwart desync attacks as effectively as upstream HTTP/2.” A firewall inspects requests; it does not change the framing the back-end trusts.
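For contrast with the text framing above, here is a short Python sketch of the fixed nine-byte HTTP/2 frame header as laid out in RFC 9113: a 24-bit payload length, an 8-bit type, 8 bits of flags, one reserved bit, and a 31-bit stream identifier. The function names are illustrative; a real deployment would use an HTTP/2 library rather than hand-rolled parsing.

```python
import struct

FRAME_HEADER_SIZE = 9
TYPE_DATA = 0x0          # DATA frame type
FLAG_END_STREAM = 0x1    # END_STREAM flag on a DATA frame


def pack_frame_header(length: int, frame_type: int, flags: int, stream_id: int) -> bytes:
    """Build the 9-byte header that precedes every HTTP/2 frame payload."""
    if not 0 <= length < 2**24:
        raise ValueError("length must fit in 24 bits")
    if not 0 <= stream_id < 2**31:
        raise ValueError("stream id must fit in 31 bits")
    # Big-endian 32-bit length with the top byte dropped gives the 24-bit field.
    return struct.pack(">I", length)[1:] + struct.pack(">BBI", frame_type, flags, stream_id)


def parse_frame_header(header: bytes) -> tuple[int, int, int, int]:
    """Return (length, type, flags, stream_id) from exactly nine bytes."""
    if len(header) != FRAME_HEADER_SIZE:
        raise ValueError("frame header must be exactly 9 bytes")
    length = int.from_bytes(header[0:3], "big")
    frame_type = header[3]
    flags = header[4]
    stream_id = int.from_bytes(header[5:9], "big") & 0x7FFFFFFF  # mask reserved bit
    return length, frame_type, flags, stream_id
```

There is no header name to obfuscate and no second way to state the size. Both ends read the same three bytes as the same integer, and the stream identifier says which request the payload belongs to.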

[Figure: Defenses, weakest to strongest. WAF rule against known payloads (brittle); reject dual-header requests and close the connection (RFC 9112); front-end normalize, back-end reject if still ambiguous; HTTP/2 end to end, with explicit framing and no ambiguity to exploit.] *Each step up removes more of the attacker's room. Only end-to-end HTTP/2 takes away the length ambiguity itself; everything below it manages the symptom.*

There are weaker measures that look reassuring and are not. Disabling connection reuse to the back-end removes the shared socket and so removes the bystander problem, but it costs performance and tends to get re-enabled. Making the front-end and back-end identical software with identical configuration narrows the chance they disagree, but it is fragile, because a version skew or a config drift reopens the gap, and it does nothing for the cases where the same parser can be tricked two ways. A WAF rule that matches yesterday’s obfuscation strings is the weakest of all, because the obfuscation space is large, parser-specific, and discovered incrementally; the OWASP Core Rule Set and similar signature sets can catch the crude attempts but were never the right layer to stop a framing disagreement. The honest hierarchy is: HTTP/2 upstream removes the bug, reject-and-close contains it, normalization helps, and signature matching is a speed bump.

Where it stands in 2026

The depressing through-line is that this is the same bug it was in 2005. Two parsers, one wire format, two answers about length. Every revival since has patched the parsers that were caught and left the format alone, which is why the next researcher always finds a new obfuscation or a new length-disagreement class a few years later. The 2025 work made the case that the only durable fix is to retire HTTP/1.1 from the one place it still does damage, the hop between the proxy and the origin, and replace it with a binary protocol that does not let a length be read two ways. That migration is underway at the large CDNs and far from complete across the long tail of self-hosted origins, which means in 2026 the attack surface is smaller at the edge and roughly unchanged behind it.

For anyone defending a service, the practical reading is short. If your front-end and back-end still talk HTTP/1.1 to each other, you have the precondition for a desync, and no amount of request inspection at the front door changes that, because the danger is in how the back-end frames bytes, not in what the bytes say. The configuration that matters is the one you cannot see from the outside: whether the connection behind your proxy is HTTP/2, whether your servers reject ambiguous requests outright and close the connection when they do, and whether a version skew between two pieces of middleware has quietly reopened a gap that a clean install would not have. The bug has survived twenty years of patches because it is not really a bug in any one program. It is a property of a text format that lets the same request mean two lengths, and it will keep being exploitable for exactly as long as two different programs are allowed to disagree about which length is real.


Sources & further reading

  • James Kettle / PortSwigger (2019), HTTP Desync Attacks: Request Smuggling Reborn — the Black Hat USA 2019 research that revived the class; CL.TE/TE.CL/TE.TE definitions, obfuscation tricks, PayPal/New Relic/Akamai case studies.
  • James Kettle / PortSwigger (2019, updated 2022), HTTP Desync Attacks: what happened next — follow-up obfuscations, vendor patches (HAProxy 2.0.6, Go CVE-2019-16276), and tooling refinements.
  • James Kettle / PortSwigger (2025), HTTP/1.1 must die: the desync endgame — the 2025 argument that the format itself is the flaw; 0.CL and Expect-based desync, V-H/H-V detection, Cloudflare/Akamai/LastPass findings, push to HTTP/2 upstream.
  • James Kettle / PortSwigger (2022), Browser-Powered Desync Attacks — Black Hat USA 2022 work bringing client-side desync into scope.
  • PortSwigger Web Security Academy, HTTP request smuggling — the defensive reference: variant definitions in plain terms and the normalize/reject and HTTP/2 end-to-end recommendations.
  • IETF (2014), RFC 7230: HTTP/1.1 Message Syntax and Routing — section 3.3.3 message body length rules, Transfer-Encoding overrides Content-Length, the MUST-strip-Content-Length requirement.
  • IETF (2022), RFC 9112: HTTP/1.1 — the current spec obsoleting 7230; reject-or-process-and-close guidance for dual-header requests and 400-then-close for invalid framing.
  • Linhart, Klein, Heled, Orrin / Watchfire (2005), HTTP Request Smuggling — the original paper defining the technique, cache poisoning, IDS evasion, and request hijacking.
  • F5 (2019), K50375550: HTTP request smuggling vulnerability — vendor advisory issued in the wake of the 2019 disclosure.
  • Snyk (2021), Demystifying HTTP request smuggling — a practitioner walkthrough of the variants and root cause for application developers.

Frequently asked questions

Why does HTTP/1.1 let a front-end and back-end disagree about where a request ends?

HTTP/1.1 offers two ways to state a body's length. Content-Length gives a byte count to read and stop, while Transfer-Encoding: chunked sends the body as size-prefixed chunks that end at a zero-sized chunk. When a single request carries both headers, the two describe different boundaries. If the front-end trusts one and the back-end trusts the other, they reach different conclusions about where the request stops, which is the raw material for a desync.

What is the difference between a CL.TE and a TE.CL desync attack?

The two letter pairs name which header each end trusts, front-end first. In CL.TE the front-end honors Content-Length while the back-end honors Transfer-Encoding, so the back-end hits a zero chunk early and leaves the remaining bytes in its buffer as a prefix. TE.CL is the mirror image: the front-end reads the full chunk stream but the back-end stops at the byte count, leaving leftover chunk data stranded. Either way the boundaries differ and the gap is attacker-controlled.

How does one smuggled request end up affecting other users on a site?

To forward efficiently, a front-end keeps a pool of connections to the back-end and reuses one socket for many users' requests in sequence. After a desync, the attacker's leftover bytes sit in the back-end's read buffer and get prepended to whoever reuses that connection next. The attacker does not pick the victim, so the target is random and the technique is probabilistic, sometimes needing thousands of attempts on high-traffic sites.

Why can't a web application firewall reliably stop request smuggling?

A firewall inspects requests at the front-end, but a smuggled request only assembles into its dangerous form after the front-end has already inspected and forwarded the innocent-looking pieces. The danger is in how the back-end frames bytes, not in what the bytes say. The obfuscation space is also large and parser-specific, so signature rules catch only crude attempts. A WAF manages the symptom rather than removing the framing disagreement itself.

Does switching to HTTP/2 between the proxy and origin actually fix desync attacks?

Yes, and it is described as the cleanest fix because it is structural. HTTP/2 frames every message with explicit binary-encoded lengths, so the length is a number in a frame header rather than a text convention two parsers can read differently. If the front-end-to-back-end connection speaks HTTP/2, the leftover-bytes condition cannot arise and the classic desync has nothing to exploit. Weaker measures like reject-and-close contain the bug, but only end-to-end HTTP/2 removes the ambiguity.
