The HTTP/2 Rapid Reset attack (CVE-2023-44487) explained
A botnet of roughly 20,000 machines should not be able to push a web service past 300 million requests per second. That is a tiny botnet by modern standards, smaller than the ones that ran ordinary volumetric floods years earlier. Yet in late August 2023 a botnet that size drove a Layer 7 flood against Google that peaked at 398 million requests per second, nearly eight times larger than the previous application-layer record. Cloudflare saw 201 million rps from the same campaign. AWS measured over 155 million against CloudFront. The question that mattered was not how big the botnet was. It was how so few machines could generate so many requests.
The answer is a single HTTP/2 frame used the way the specification allows but nobody anticipated at scale. A client opens a stream, sends a request, and immediately cancels it with a reset. The server has already started working. The client has already moved on to the next stream. There is a counter in HTTP/2 that exists precisely to stop one connection from running too many concurrent requests, and this trick walks straight around it, because a cancelled stream stops counting the instant the reset arrives. The vulnerability got the identifier CVE-2023-44487 and the name Rapid Reset. This post is a single-CVE deep dive: what HTTP/2 streams are, why the concurrency limit was supposed to bound the damage, the exact mechanic that bypassed it, why proxy architectures suffered worst, and what the fixes actually changed.
The sections move from the protocol up to the attack and then to the response. First, how HTTP/2 multiplexing and stream IDs work. Then the stream state machine and the SETTINGS_MAX_CONCURRENT_STREAMS limit that was meant to be the safety valve. Then RST_STREAM and the precise moment a stream stops counting. Then the attack itself, the cost asymmetry that makes it pay, and the two variants observed in the wild. Then why reverse proxies and load balancers took the worst of it. Then the records, the coordinated disclosure, and the mitigations. Finally, where the protocol stands in 2026, including the follow-on MadeYouReset variant that the Rapid Reset fixes happened to blunt.
Streams, frames, and multiplexing
HTTP/1.1 had a structural problem that HTTP/2 was built to solve. On a single HTTP/1.1 connection, requests are serialized. You send a request, you wait for the response, then you send the next one. Pipelining tried to relax that and largely failed in practice because of head-of-line blocking and broken intermediaries. So browsers worked around it by opening six or so parallel TCP connections per origin and spreading requests across them. That is wasteful: every connection pays its own handshake, its own slow-start, its own congestion state.
HTTP/2, standardized first in RFC 7540 (2015) and revised in RFC 9113 (2022), replaces the one-request-at-a-time-per-connection model with multiplexing. A single TCP connection carries many independent streams, and the bytes of different streams are interleaved on the wire. The unit of interleaving is the frame. Every HTTP/2 frame has a fixed nine-octet header carrying a 24-bit length, an 8-bit type, 8 bits of flags, and a 31-bit stream identifier (preceded by one reserved bit), followed by a type-specific payload. The stream identifier is the part that makes multiplexing safe. Because every frame is tagged with the stream it belongs to, the receiver can pull frames for many streams off one connection and reassemble each request and response independently.
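The nine-octet header is simple enough to write out by hand. The following is a minimal sketch of the layout described above, assuming the field widths from RFC 9113 section 4.1; the helper names `pack_frame_header` and `parse_frame_header` are hypothetical and not taken from any HTTP/2 library.

```python
import struct


def pack_frame_header(length: int, frame_type: int, flags: int, stream_id: int) -> bytes:
    """Serialize a 9-octet HTTP/2 frame header (illustrative helper)."""
    if not 0 <= length < 2**24:
        raise ValueError("length must fit in 24 bits")
    if not 0 <= stream_id < 2**31:
        raise ValueError("stream identifier must fit in 31 bits")
    # 24-bit length, then 8-bit type, 8-bit flags, and a 32-bit word whose
    # top bit is reserved (left as 0) and whose low 31 bits are the stream ID.
    return length.to_bytes(3, "big") + struct.pack(">BBI", frame_type, flags, stream_id)


def parse_frame_header(header: bytes) -> tuple[int, int, int, int]:
    """Return (length, type, flags, stream_id) from a 9-octet header."""
    if len(header) != 9:
        raise ValueError("an HTTP/2 frame header is exactly 9 octets")
    length = int.from_bytes(header[:3], "big")
    frame_type, flags, raw_id = struct.unpack(">BBI", header[3:])
    # Mask off the reserved bit so only the 31-bit identifier remains.
    return length, frame_type, flags, raw_id & 0x7FFFFFFF
```

A receiver that reads nine octets and calls `parse_frame_header` knows which stream the next `length` octets of payload belong to, which is all that demultiplexing requires.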
A request is mostly a HEADERS frame, possibly followed by DATA frames; a response is the same shape coming back. Header fields are compressed with HPACK, which keeps a shared dynamic table on each connection so repeated header names and values cost almost nothing after the first time. That detail matters later: decompressing a request’s headers is real work the server has to do even if the request is about to be cancelled.
Stream IDs are not arbitrary. Streams that a client initiates use odd-numbered identifiers; streams the server initiates (server push) use even ones. Identifiers are assigned in increasing order, so a client’s first request is stream 1, then 3, then 5, and so on. An identifier is used once and never reused on the same connection. That monotonic, single-use property is exactly what lets a client churn through stream IDs with no practical bound. There is no pool to exhaust on the client side short of the 31-bit identifier space itself, which holds about a billion odd-numbered IDs per connection, and a client that runs out simply opens a new connection. When you finish with stream 5, you do not free it and reuse it; you move on to 7. Nothing about the numbering pushes back on how fast you advance.
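The numbering rule fits in a few lines. This is a sketch only: `client_stream_ids` is a hypothetical generator, and the one hard ceiling it models is the 31-bit identifier space.

```python
from itertools import islice

MAX_STREAM_ID = 2**31 - 1  # largest value a 31-bit stream identifier can hold
# Odd identifiers 1, 3, ..., 2**31 - 1 available to a client on one connection.
CLIENT_IDS_PER_CONNECTION = (MAX_STREAM_ID + 1) // 2


def client_stream_ids(start: int = 1):
    """Yield client-initiated stream IDs: odd, increasing, never reused."""
    if start % 2 == 0:
        raise ValueError("client-initiated streams use odd identifiers")
    stream_id = start
    while stream_id <= MAX_STREAM_ID:
        yield stream_id
        stream_id += 2  # skip the even IDs reserved for server-initiated streams


first_four = list(islice(client_stream_ids(), 4))  # [1, 3, 5, 7]
```

Advancing the generator costs the client nothing, which is the point: the identifier scheme itself applies no back-pressure.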
*HTTP/1.1 serializes a request per connection; HTTP/2 interleaves frames from many streams over one, tagging each frame with its stream ID.*
The stream state machine and the concurrency limit
A stream is not just an ID. It has a lifecycle, and RFC 9113 section 5.1 spells it out as a state machine. A stream begins idle. Sending or receiving a HEADERS frame moves it to open. From open, a frame with the END_STREAM flag set moves the stream to one of the two half-closed states: half-closed (local) for the endpoint that sent the flag, half-closed (remote) for the endpoint that received it. Half-closed means one direction is done. A client in half-closed (local) has finished sending its request and is waiting for the response; it may now only send WINDOW_UPDATE, PRIORITY, or RST_STREAM on that stream. When both directions finish, or when a reset arrives, the stream reaches closed.
The concurrency control sits on top of this state machine. A server advertises, in its SETTINGS frame, a parameter named SETTINGS_MAX_CONCURRENT_STREAMS (setting identifier 0x03). It tells the peer the maximum number of streams the server will permit to be active at once. RFC 9113 section 6.5.2 recommends the value be no smaller than 100 so as not to limit parallelism unnecessarily. Browsers in practice default to about 100 concurrent streams. If a client tries to open a stream that would exceed the advertised limit, the server treats it as an error.
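On the wire, the advertisement is a SETTINGS frame (type 0x04, sent on stream 0) whose payload is a list of six-octet entries: a 16-bit identifier followed by a 32-bit value. The sketch below builds and parses one under those assumptions; `build_settings` and `parse_settings_payload` are hypothetical helper names, not a real library API.

```python
import struct

SETTINGS = 0x04                         # frame type
SETTINGS_MAX_CONCURRENT_STREAMS = 0x03  # setting identifier


def build_settings(params: dict[int, int]) -> bytes:
    """Serialize a SETTINGS frame carrying the given identifier/value pairs."""
    payload = b"".join(struct.pack(">HI", ident, value) for ident, value in params.items())
    # SETTINGS always applies to the whole connection, so the stream ID is 0.
    header = len(payload).to_bytes(3, "big") + struct.pack(">BBI", SETTINGS, 0, 0)
    return header + payload


def parse_settings_payload(payload: bytes) -> dict[int, int]:
    """Decode the 6-octet (identifier, value) entries of a SETTINGS payload."""
    if len(payload) % 6 != 0:
        raise ValueError("SETTINGS payload must be a multiple of 6 octets")
    return {ident: value for ident, value in struct.iter_unpack(">HI", payload)}


frame = build_settings({SETTINGS_MAX_CONCURRENT_STREAMS: 100})
advertised = parse_settings_payload(frame[9:])[SETTINGS_MAX_CONCURRENT_STREAMS]
```

The value 100 here simply mirrors the RFC's recommended floor; a real server picks its own.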
The load-bearing sentence is in section 5.1.2, on stream concurrency. Only streams in the open state or in either of the half-closed states count toward the maximum. Streams in idle, reserved, or closed do not count. Read that carefully, because the whole attack lives in it. The moment a stream goes to closed, it stops occupying a slot. And a RST_STREAM frame moves a stream to closed immediately, with no round trip, no waiting for the server to acknowledge anything.
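The counting rule can be stated as a membership test. This is a minimal sketch of the rule as described above, not any server's actual bookkeeping; `StreamState`, `COUNTED_STATES`, and `active_streams` are names invented for illustration.

```python
from enum import Enum


class StreamState(Enum):
    IDLE = "idle"
    RESERVED_LOCAL = "reserved (local)"
    RESERVED_REMOTE = "reserved (remote)"
    OPEN = "open"
    HALF_CLOSED_LOCAL = "half-closed (local)"
    HALF_CLOSED_REMOTE = "half-closed (remote)"
    CLOSED = "closed"


# Only these states occupy a slot against SETTINGS_MAX_CONCURRENT_STREAMS.
COUNTED_STATES = frozenset(
    {StreamState.OPEN, StreamState.HALF_CLOSED_LOCAL, StreamState.HALF_CLOSED_REMOTE}
)


def active_streams(streams: dict[int, StreamState]) -> int:
    """Number of streams that count toward the concurrency limit right now."""
    return sum(1 for state in streams.values() if state in COUNTED_STATES)


streams = {1: StreamState.OPEN, 3: StreamState.HALF_CLOSED_LOCAL, 5: StreamState.CLOSED}
before_reset = active_streams(streams)
streams[1] = StreamState.CLOSED  # effect of RST_STREAM on stream 1
after_reset = active_streams(streams)
```

The slot is released by a plain state assignment. Nothing in the rule asks whether the work started for that stream has finished.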
The intent of the limit was a reasonable one. A connection should not be able to pin an unbounded number of in-progress requests on the server simultaneously. Cap the concurrency at 100, and a single connection can have at most 100 requests being worked on at any instant. The designers reasoned about the worst case as a snapshot: how many streams can be live at the same time. They did not reason about the rate at which streams can be created and destroyed, because in normal operation a stream stays open for the duration of a request and the round-trip time bounds how fast you can cycle them. Rapid Reset breaks that assumption by making the lifetime of a stream as short as the client wants it to be.
*The reset is a side exit from the state machine. It moves a stream to closed with no acknowledgement, which is why a cancelled stream stops counting against the limit immediately.*
RST_STREAM and the moment a stream stops counting
RST_STREAM is frame type 0x03. Its payload is a single 32-bit error code and nothing else. RFC 9113 section 6.4 describes it as allowing immediate termination of a stream. An endpoint sends it to abort a stream, either because something went wrong (an error condition) or simply because it no longer wants the response. That second case is legitimate and common. A browser navigates away from a page mid-load and cancels the in-flight image and script requests. A client that already has enough data closes a long response early. The protocol needs a cheap, unilateral cancel, and RST_STREAM is it.
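Serializing the frame shows how little there is to it. The sketch below assumes the standard nine-octet frame header and the CANCEL error code (0x08) from RFC 9113; `build_rst_stream` is a hypothetical helper name.

```python
import struct

RST_STREAM = 0x03  # frame type
CANCEL = 0x08      # error code: the sender no longer needs the stream


def build_rst_stream(stream_id: int, error_code: int = CANCEL) -> bytes:
    """Serialize a complete RST_STREAM frame for one stream."""
    if not 0 < stream_id < 2**31:
        raise ValueError("RST_STREAM must name a real stream (nonzero, 31-bit)")
    payload = struct.pack(">I", error_code)  # the entire payload: one 32-bit code
    header = len(payload).to_bytes(3, "big") + struct.pack(">BBI", RST_STREAM, 0, stream_id)
    return header + payload


frame = build_rst_stream(5)
frame_size = len(frame)  # 9-octet header + 4-octet payload
```

The frame defines no flags and carries no body beyond the error code, so every cancellation is the same fixed-size write regardless of how large the abandoned request or response would have been.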
Cheap is the operative word. The frame is thirteen octets total: the nine-octet frame header plus the four-octet error code. Call it a dozen or so bytes on the wire. The sender does not wait for a reply. The stream is closed from the sender’s point of view the moment the frame leaves, and the slot it occupied against the concurrency limit is freed at that instant. The receiver finds out one network transit later, but the client does not care about the receiver’s clock. It cares about its own, and on its own clock the stream is already gone and a new one can take its place.
Put the two facts together. A new stream opens with a HEADERS frame and immediately occupies a concurrency slot. A RST_STREAM closes a stream and immediately frees its slot. Both are client-driven, both are one-shot frames, and neither requires waiting for the server. So a client can open stream 1, send the request, reset it, open stream 3, send a request, reset it, open stream 5, and so on, as fast as it can serialize frames into the connection’s send buffer. At no instant are more than a handful of streams in a counted state. The concurrency limit is satisfied at every snapshot. And yet the server has received a flood of complete requests, each of which it has to at least begin to process.
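A toy replay of that frame sequence makes the snapshot-versus-rate gap concrete. This is a simplified model under stated assumptions, not real server accounting: frames are plain `(kind, stream_id)` tuples, and `replay` and `rapid_reset_frames` are hypothetical names.

```python
def rapid_reset_frames(count: int) -> list[tuple[str, int]]:
    """Open-then-cancel sequence: HEADERS then RST_STREAM on streams 1, 3, 5, ..."""
    frames = []
    for i in range(count):
        stream_id = 2 * i + 1
        frames.append(("HEADERS", stream_id))
        frames.append(("RST_STREAM", stream_id))
    return frames


def replay(frames: list[tuple[str, int]], max_concurrent: int = 100) -> tuple[int, int, int]:
    """Return (requests started, peak counted streams, refused streams)."""
    counted: set[int] = set()
    started = peak = refused = 0
    for kind, stream_id in frames:
        if kind == "HEADERS":
            if len(counted) >= max_concurrent:
                refused += 1          # over the limit: this stream is rejected
                continue
            counted.add(stream_id)    # the stream now occupies a slot
            started += 1              # ...and the server has begun work on it
            peak = max(peak, len(counted))
        elif kind == "RST_STREAM":
            counted.discard(stream_id)  # slot freed immediately
    return started, peak, refused


with_resets = replay(rapid_reset_frames(100_000))
without_resets = replay([("HEADERS", 2 * i + 1) for i in range(1_000)])
```

With resets, all 100,000 requests are started while the counted-stream peak never rises above 1. Without them, the same limit refuses everything past the hundredth stream.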
That is the crux of CVE-2023-44487, in the words of the engineers who wrote the protocol. The ability to reset a stream and instantly open another in its place is what makes the attack work. Stream concurrency on its own cannot mitigate it, because the client can churn requests to create an arbitrarily high request rate no matter what value of SETTINGS_MAX_CONCURRENT_STREAMS the server advertises.
The attack: cost asymmetry per round trip
The reason Rapid Reset is a DDoS primitive and not just a protocol curiosity is cost asymmetry. Look at what each side spends per request.
The attacker spends a HEADERS frame and a RST_STREAM frame. The headers are HPACK-compressed and, after the first request on a connection seeds the dynamic table, near-identical requests compress to a handful of bytes. The reset is a dozen bytes. So a full request-and-cancel costs the attacker on the order of tens of bytes of uplink and effectively no downlink, because the response is cancelled before it is sent. No round trip is required between requests, so the number of in-flight requests is bounded only by the attacker’s bandwidth, not by latency.
The server spends much more. It allocates stream state, decompresses the HPACK headers (which mutates the connection’s shared dynamic table, so it cannot simply be skipped), parses the request, maps the URL to a handler or an upstream, applies access control and routing logic, and in a proxy architecture often dispatches the request to a backend before the reset is even noticed. By the time the RST_STREAM is processed, the work is already in flight. The cancellation does not refund it. At best it stops further work; at worst the backend keeps grinding on a request whose front-end stream no longer exists.
That asymmetry is the multiplier. In an ordinary HTTP/2 flood without resets, an attacker is limited to roughly 100 concurrent requests per connection, and each new request can only start after a previous one frees a slot, which takes a round trip. Rapid Reset removes both limits. There is no concurrency ceiling because resets keep the live count near zero, and there is no per-request round trip because the client never waits. Google’s writeup put the consequence plainly: the number of requests an attacker can have in flight is gated only by available bandwidth, which is why a 20,000-node botnet hit 398 million rps. The bottleneck moved from the protocol’s pacing mechanisms to the raw pipe.
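A back-of-envelope comparison shows the size of the jump. Every number below is an assumption chosen for round arithmetic (a 50 ms round trip, 40 bytes per request-and-cancel pair, a 10 Mbit/s uplink), not a measurement from the attack, and both function names are hypothetical.

```python
def paced_rps(max_concurrent: int, rtt_ms: float) -> float:
    """Per-connection ceiling when each slot frees only after a round trip."""
    return max_concurrent * 1000 / rtt_ms


def bandwidth_bound_rps(uplink_bits_per_s: float, bytes_per_request: float) -> float:
    """Per-connection ceiling when only the sender's uplink paces requests."""
    return uplink_bits_per_s / 8 / bytes_per_request


# Assumed values for illustration only.
ordinary = paced_rps(max_concurrent=100, rtt_ms=50)                                  # 2,000 rps
rapid_reset = bandwidth_bound_rps(uplink_bits_per_s=10_000_000, bytes_per_request=40)  # 31,250 rps
ratio = rapid_reset / ordinary
```

Under these assumptions one connection goes from 2,000 to 31,250 requests per second, roughly a 15x gain. The gain grows with either a longer round trip or a fatter pipe, since only the first formula depends on latency.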
This is the broader anatomy of a Layer 7 DDoS: the attacker forces the server to do expensive per-request work while spending almost nothing per request itself. What makes Rapid Reset notable inside that category is that the amplification comes entirely from a legitimate protocol feature, not from a reflector or a misconfiguration. Nothing the attacker sends is malformed. Every frame is spec-compliant.
*The bar widths are illustrative, not to scale. The point is the ratio: the attacker pays a fixed tiny cost per request, the server pays a large variable one, and nothing paces the loop.*
Two variants and an RFC ambiguity
The simplest form of the attack opens a stream and resets it immediately, request after request. That is the canonical Rapid Reset. But defenders’ first instinct was to count RST_STREAM frames and treat a high reset rate as the signal, so attackers adapted. Google documented two variants seen in the wild that complicate the naive reset-rate defense.
The first is batched cancellation. Instead of resetting each stream the instant it opens, the attacker opens a batch of streams, waits briefly, then cancels them, then opens a fresh batch. The reset rate per connection drops below whatever threshold a defender set, but the aggregate request rate stays enormous. It is slightly less efficient than immediate cancellation, since the streams briefly occupy real slots, but it defeats a mitigation keyed purely on reset frequency.
The second variant does not cancel at all. It exploits a leniency in how the spec says servers should react to a client that opens more streams than the advertised limit. RFC 9113 section 5.1.2 says an endpoint that receives a HEADERS frame pushing it past its advertised SETTINGS_MAX_CONCURRENT_STREAMS must treat that as a stream error (PROTOCOL_ERROR or REFUSED_STREAM), which invalidates only the excess stream rather than the whole connection. Many implementations therefore rejected the excess streams individually instead of closing the connection. An attacker who keeps opening streams past the limit, knowing the server will reject the surplus one by one without dropping the connection, gets a request flood that keeps the pipeline full and never pays the client-proxy round trip either. The server’s own rejection of excess streams becomes the work the attacker is exploiting. This non-cancelling variant is why the eventual guidance is not “rate-limit resets” but “close the connection.”
The lesson the engineers drew is that you cannot mitigate Rapid Reset by reasoning about individual requests, because no individual request is abusive. A single open-then-reset is exactly what a browser does on navigation. The abuse is a property of the connection as a whole: its pattern of opens, resets, and excess-stream attempts over time. So the correct response operates at connection granularity. When a connection trips a heuristic, the server sends a GOAWAY frame to stop new streams and then closes the underlying TCP connection. Crucially the GOAWAY has to be forceful and immediate, not the graceful multi-round-trip shutdown the protocol also supports, because a graceful drain gives the attacker more time to keep churning.
Why proxies and load balancers took the worst of it
The damage was uneven. A monolithic server that handles a request entirely in one process can, in principle, notice the reset and abandon the work cheaply. The architectures that suffered most were the ones built as a front-end proxy talking to a separate backend, which describes most of the modern web: a reverse proxy or load balancer at the edge, application servers behind it.
The problem is separation of concerns. The front-end terminates the HTTP/2 connection, sees the stream, and dispatches the request to an upstream over a separate connection, often a fresh HTTP/1.1 request or a backend HTTP/2 stream. Then the RST_STREAM arrives. The front-end can mark its own stream closed, but the request it already handed to the backend is now an orphan. Cancelling cleanly means propagating the cancellation across an internal hop, which not every proxy does, and even when it does, the backend may have already committed to the work. Cloudflare described servers eagerly reading an enormous chain of requests and resets at the start of a connection and creating enough upstream stress that they could not process new incoming requests. The front-end was healthy; the backends were drowning in work for streams that no longer existed.
There is a second-order effect on connection reuse. Many proxy designs pool backend connections and assume a front-end stream maps cleanly onto a backend request for its full lifetime. A request that is cancelled microseconds after dispatch breaks that assumption thousands of times a second, leaving the proxy to reconcile cancellations it was not built to handle at that rate. The asynchronous gap between “front-end stream closed” and “backend work finished” is exactly the window the attack lives in, and it is widest in precisely the tiered architectures that scale the largest sites. The systems most able to absorb ordinary traffic were the most exposed to this specific trick.
Cloudflare’s customer-impact numbers show how this played out at the edge. During the initial wave, about 1% of requests saw elevated 502 errors, peaking near 12% for a few seconds during the most serious burst on August 29th. And one of their early mitigations backfired: reducing the advertised SETTINGS_MAX_CONCURRENT_STREAMS to 64 to throttle attackers collided with browsers that default to 100, producing a spike in 499 errors for legitimate clients whose extra streams were refused. The fix had to be more surgical than turning the concurrency dial down.
The records, the disclosure, and the timeline
The campaign ran from late August into October 2023. Cloudflare first noticed unusually large attacks starting on August 25, 2023, and watched them climb to a peak just above 201 million requests per second, nearly three times its prior record of 71 million rps set in February 2023. AWS detected the spike against CloudFront between August 28 and 29, peaking over 155 million rps, and over those two days mitigated more than a dozen distinct rapid reset events. Google’s edge absorbed the largest measured flood, 398 million rps, against a prior Layer 7 record of 46 million rps from 2022. None of the three reported a customer outage from the attack itself.
The three companies handled disclosure together. During the process, Google reserved CVE-2023-44487 to track fixes across the many independent HTTP/2 implementations, and the coordinated public announcement landed on October 10, 2023. The NVD entry describes the flaw tersely: the HTTP/2 protocol allows a denial of service because request cancellation can reset many streams quickly, as exploited in the wild from August through October 2023. It carries a CVSS 3.1 base score of 7.5, high, with the vector AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H, which is to say network-reachable, low complexity, no privileges or interaction, and an availability-only impact. CISA added it to the Known Exploited Vulnerabilities catalog with a federal remediation deadline of October 31, 2023.
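The 7.5 follows mechanically from that vector. As a worked check, the sketch below applies the CVSS 3.1 base-score equations for the scope-unchanged case only; the weights are the published metric values, while `base_score` and `roundup` are local helper names written for this example.

```python
import math

# CVSS 3.1 metric weights (PR values are for unchanged scope).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, using the spec's integer-based method."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    metrics = dict(part.split(":") for part in vector.split("/"))
    if metrics["S"] != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


score = base_score("AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H")
```

The impact term comes entirely from availability (0.56), and the exploitability term is at its maximum for every metric, which is the standard profile for a remotely triggerable denial of service.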
The blast radius on the implementation side was wide because HTTP/2 is everywhere. The vulnerability touched nginx, Apache httpd, Node.js, Envoy, HAProxy, Netty, gRPC, Go’s net/http, and a long list of products that embed those stacks, from Kubernetes ingress controllers to commercial load balancers. Each implementation needed its own patch, because the mitigation is not a protocol change but a behavioral one: detect abusive reset patterns and tear down the connection. There was no single switch to flip.
What the fixes actually changed
The patches all converge on the same idea: track per-connection behavior and kill connections that abuse the reset mechanism, rather than trying to rate-limit individual frames or requests. The specifics vary by implementation but the shape is consistent.
Counting resets relative to completed requests is the common heuristic. A connection that sends far more RST_STREAM frames than it lets requests complete is behaving abnormally. Google’s writeup gave an illustrative threshold of more than 100 requests on a connection with a cancellation rate above 50%, but the real values are tuned per deployment and kept fuzzy on purpose, because a published threshold is a published target. nginx, for example, added logic to close a connection when an excessive number of streams are reset before the request completes, exposed through directives that bound resets per connection. Go’s net/http server gained a limit on the number of concurrently executing handler goroutines for streams that have already been reset by the client, so cancelled requests cannot pile up unbounded behind the scenes. Netty, Envoy, HAProxy, and the rest shipped analogous per-connection reset accounting.
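In outline, the per-connection accounting might look like the sketch below. It is not any vendor's implementation: `ConnectionStats` and its fields are hypothetical, and the default thresholds just reuse the illustrative figures quoted above (more than 100 requests, cancellation rate above 50%), which real deployments tune and keep private.

```python
from dataclasses import dataclass


@dataclass
class ConnectionStats:
    """Running totals for one HTTP/2 connection (illustrative only)."""

    requests: int = 0
    early_resets: int = 0          # streams reset before their response completed
    min_requests: int = 100        # assumed: ignore the ratio on short-lived connections
    max_cancel_ratio: float = 0.5  # assumed: tolerated share of cancelled requests

    def on_request(self) -> None:
        self.requests += 1

    def on_early_reset(self) -> None:
        # Counted regardless of which side sent the RST_STREAM.
        self.early_resets += 1

    def should_close(self) -> bool:
        """True once the connection as a whole looks abusive."""
        if self.requests <= self.min_requests:
            return False
        return self.early_resets / self.requests > self.max_cancel_ratio


browser = ConnectionStats(requests=200, early_resets=10)
attacker = ConnectionStats()
for _ in range(101):
    attacker.on_request()
    attacker.on_early_reset()
```

Here `browser.should_close()` is false and `attacker.should_close()` is true. The minimum-request floor exists so that a client cancelling two of its first three requests, as a browser might on navigation, is not punished for a ratio computed over a tiny sample.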
When a connection trips the heuristic, the response is to stop it cleanly and fast. The server sends a GOAWAY frame, which tells the peer not to open new streams and carries the highest stream ID the server will still process, then closes the TCP connection. The guidance is explicit that this GOAWAY should be the immediate kind that halts new stream creation right away, not the graceful variant that keeps the connection alive through a drain. Against an attacker, the drain is just more attack surface.
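The frame itself is as small as the reset. A sketch of its serialization follows, assuming the RFC 9113 layout (type 0x07 on stream 0, with a payload of last stream ID, error code, and optional debug data). Using ENHANCE_YOUR_CALM (0x0b) as the error code is an assumption made for illustration; implementations differ in which code they send, and `build_goaway` is a hypothetical helper.

```python
import struct

GOAWAY = 0x07             # frame type
ENHANCE_YOUR_CALM = 0x0B  # error code signalling excessive load from the peer


def build_goaway(last_stream_id: int, error_code: int = ENHANCE_YOUR_CALM,
                 debug_data: bytes = b"") -> bytes:
    """Serialize a GOAWAY frame naming the highest stream still being processed."""
    if not 0 <= last_stream_id < 2**31:
        raise ValueError("last stream identifier must fit in 31 bits")
    payload = struct.pack(">II", last_stream_id, error_code) + debug_data
    # GOAWAY applies to the connection as a whole, so it is sent on stream 0.
    header = len(payload).to_bytes(3, "big") + struct.pack(">BBI", GOAWAY, 0, 0)
    return header + payload


frame = build_goaway(last_stream_id=41)
```

In the immediate form described above, the server writes this frame and then closes the socket without waiting for in-flight streams to drain.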
Edge providers layered connection-level reputation on top of the per-connection fixes. Cloudflare extended its existing protections to monitor client-sent RST_STREAM frames and added an “IP Jail” mechanism: an IP caught abusing HTTP/2 is forbidden from using HTTP/2 to any domain on the network for a period, on the reasoning that the same abuse is not possible over HTTP/1.x, which has no equivalent of cheap unilateral stream cancellation. They also reworked frame processing, request dispatch, and the queuing and scheduling around cancellation so that a cancelled request consumes less downstream work. The reduce-the-concurrency-limit approach that caused the 499 spike was abandoned in favor of these targeted changes. None of this required changing HTTP/2 on the wire. It is all behavioral hardening of the server side of the same protocol.
It is worth being precise about what was not fixed. The protocol still permits unilateral, immediate stream cancellation, because that feature is genuinely useful and removing it would break legitimate clients. CVE-2023-44487 was resolved by making servers resilient to abuse of the feature, not by removing the feature. That distinction is why the same primitive resurfaced two years later under a different name.
Where it stands in 2026: MadeYouReset and the residue
In August 2025, researchers at Tel Aviv University disclosed a follow-on HTTP/2 denial-of-service technique they named MadeYouReset, tracked as CVE-2025-8671. It targets the same underlying gap, the asynchronous window between a stream being torn down and the server’s work for it finishing, but it reaches that gap from the other side. Rapid Reset relies on client-sent resets. MadeYouReset relies on server-sent resets: the attacker sends frames that deliberately trigger protocol violations, the server responds by resetting the offending streams itself, and the attacker rides those server-initiated resets to the same end. The attacker provokes the cancellation rather than performing it, which sidesteps mitigations that count client-originated RST_STREAM frames specifically.
The interesting part, for the purposes of this CVE, is what happened to the operators who had already done the 2023 work. The proactive measures taken to implement the RFC 9113 guidance and counter Rapid Reset, the connection-level reset accounting and the willingness to tear down abusive connections, also blunted MadeYouReset. Defenders who had moved their logic from “count client resets” to “watch the connection’s overall reset behavior and kill it when it misbehaves” were largely covered, because that broader heuristic does not care which side initiated the reset. The implementations still vulnerable in 2025 were generally the ones that had patched Rapid Reset narrowly. MadeYouReset affected a relatively small number of HTTP/2 stacks for exactly that reason.
Two durable lessons fall out of this CVE. The first is that a concurrency limit bounds a snapshot, not a rate, and any protocol that lets one side both create and destroy work cheaply needs to bound the rate explicitly, because the snapshot count can stay near zero while the throughput runs away. The HTTP/2 designers reasoned correctly about how many streams could be live at once and did not reason about how fast streams could cycle. The second is that the fix was behavioral, not structural, and behavioral fixes generalize only as far as the abstraction you chose to defend at. The operators who patched the symptom (client resets) had to patch again in 2025; the ones who patched the shape of the problem (abusive connection behavior, whoever resets) did not. HTTP/3 carries its own stream-cancellation machinery over QUIC, and the consensus going into 2026 is to build per-connection work limits into those stacks proactively rather than wait for the equivalent flood to arrive. The frame that started all this, thirteen bytes carrying a single error code, is still in the protocol, still useful, and still exactly as cheap to send as it was in August 2023.
Sources & further reading
- Cloudflare (2023), HTTP/2 Rapid Reset: deconstructing the record-breaking attack. The deepest technical breakdown, with stream-state mechanics, the IP Jail mitigation, and the 502/499 customer-impact figures.
- Google Cloud (2023), How it works: The novel HTTP/2 ‘Rapid Reset’ DDoS attack. The 398M rps account, the cost-asymmetry framing, and the two observed attack variants.
- Cloudflare (2023), HTTP/2 Zero-Day vulnerability results in record-breaking DDoS attacks. The 201M rps disclosure, the 20,000-node botnet, and the August timeline.
- AWS (2023), How AWS protects customers from DDoS events. The 155M rps CloudFront measurement and the August 28-29 detection window.
- AWS Security Bulletin (2023), CVE-2023-44487 - HTTP/2 Rapid Reset Attack (AWS-2023-011). AWS’s affected-services list and mitigation status.
- IETF (2022), RFC 9113: HTTP/2. The protocol itself: stream state machine (5.1), concurrency counting (5.1.2), SETTINGS_MAX_CONCURRENT_STREAMS (6.5.2), and the RST_STREAM frame (6.4).
- NIST NVD (2023), CVE-2023-44487 Detail. The official description, CVSS 7.5 vector, and exploitation window.
- Qualys (2023), Understanding the HTTP/2 Rapid Reset Attack (CVE-2023-44487). A vendor-neutral summary of the mechanism and the coordinated disclosure.
- Cloudflare (2025), MadeYouReset: An HTTP/2 vulnerability thwarted by Rapid Reset mitigations. The 2025 server-side-reset variant (CVE-2025-8671) and why the 2023 fixes covered it.
- OWASP (2023), HTTP/2 Reset Attack. A concise reference entry on the attack pattern and defenses.