
The history of HTTP: from 0.9 to HTTP/3, told through its RFCs

Copyright: MIT

The first version of HTTP had no version number. It had no headers, no status codes, no way to send anything but a hypertext file, and no method other than GET. A request was one line of ASCII. The server answered with the document and hung up. You could drive the whole thing by hand from a Telnet prompt, and people did. That protocol, retroactively named 0.9, is the ancestor of every web request your browser makes right now, including the encrypted, multiplexed, UDP-borne ones that never touch TCP at all.

How did a protocol that fit on a napkin become three incompatible wire formats sharing one set of semantics, argued over for thirty-five years across two dozen RFCs? That is the question this post answers, and it answers it the way the protocol was actually built: through the documents. The sections below walk the line from the 1991 one-liner, to RFC 1945 writing down what HTTP/1.0 had already become, to the long HTTP/1.1 era and its two big rewrites (RFC 2616, then the RFC 723x split), to Google’s SPDY experiment that became HTTP/2 (RFC 7540, later RFC 9113), to HTTP/3 binding the same semantics onto QUIC over UDP (RFC 9114). Along the way: why the Host header saved the web’s address space, why server push died, and why a feature of HTTP/2’s multiplexing turned into the largest denial-of-service attack ever recorded.

The one-line protocol, 1991

There is no RFC for HTTP/0.9. The closest thing to a specification is a short W3C document describing “the HTTP protocol as implemented in W3,” and the protocol it describes barely qualifies as a protocol. Tim Berners-Lee designed it at CERN in 1991 with simplicity as the explicit goal. A client opened a TCP connection, sent the word GET followed by a document path, and got back a stream of HTML. No version string. No metadata in either direction. When the document finished, the server closed the connection. That was the entire exchange.

[Figure: the client sends "GET /index.html"; the server replies "<html>...</html>" and the connection closes. No headers, no status line, no version.]

*HTTP/0.9: a single GET line in, a body out, then the socket closes. There is nothing else on the wire.*
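The exchange is small enough to reproduce in a few lines. The following is a minimal sketch, not period software: it runs a toy HTTP/0.9-style server and client over a local socket pair, and the function names (`serve_once`, `fetch`, `demo`) and the sample document are assumptions made up for illustration.

```python
import socket
import threading

def read_line(conn):
    """Read bytes up to and including the first newline."""
    buf = b""
    while not buf.endswith(b"\n"):
        chunk = conn.recv(1)
        if not chunk:
            break
        buf += chunk
    return buf

def serve_once(conn, documents):
    """Answer one HTTP/0.9-style request, then close. No status line, no headers."""
    with conn:
        method, _, path = read_line(conn).strip().partition(b" ")
        if method == b"GET":
            # An unknown path still gets plain HTML back; there is no 404 to send.
            fallback = b"<html>Sorry, not found</html>"
            conn.sendall(documents.get(path.decode("ascii"), fallback))

def fetch(conn, path):
    """Send the one-line request and read until the server hangs up."""
    conn.sendall(f"GET {path}\r\n".encode("ascii"))
    chunks = []
    while True:
        data = conn.recv(4096)
        if not data:          # end of stream is the only "end of document" signal
            break
        chunks.append(data)
    return b"".join(chunks)

def demo(path):
    """Run one request/response over an in-process socket pair."""
    client, server = socket.socketpair()
    docs = {"/index.html": b"<html>hello</html>"}  # hypothetical document store
    worker = threading.Thread(target=serve_once, args=(server, docs))
    worker.start()
    with client:
        body = fetch(client, path)
    worker.join()
    return body
```

Note that the client can only tell the document is complete because the socket closed, which is exactly why one connection could carry only one document.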

The limits are obvious in hindsight, and they shaped everything that came after. With no Content-Type, the server could only ever return HTML, because there was no way to say “this is a GIF.” With no status line, an error was indistinguishable from a short document, so error pages were just HTML that happened to read like an apology. With no request headers, the server learned nothing about the client. And because the connection carried exactly one document, fetching a page with ten inline images meant ten separate TCP handshakes. None of that mattered in 1991, when the web was a few dozen pages of cross-linked physics documents. It mattered enormously by 1995.

RFC 1945: writing down what already happened

Between 1991 and 1995 HTTP grew the features it needed, but it grew them in the wild. Browsers and servers added request methods, headers, and status codes by mutual agreement and trial, not by specification. By the time anyone wrote it all down, “HTTP/1.0” already meant a loose family of implementations that mostly agreed. RFC 1945 appeared in May 1996, and it is unusually honest about this. It does not claim to be a standard. It calls itself a record of “common usage” and is explicitly informational, a description of what the deployed software did rather than a mandate for what it should do.

What HTTP/1.0 added was the machinery that made HTTP a general transport instead of a hypertext-only one. The request line gained a version (GET /page HTTP/1.0). Responses gained a status line, which is where 200 OK and 404 Not Found enter the story as wire-level facts rather than HTML conventions. Both directions gained headers, free-form Name: value lines, and headers are the feature that let the protocol keep growing for the next three decades without a new wire format every time. Content-Type meant a server could finally serve an image and say so. User-Agent meant the client could identify itself, a small line with a long and strange afterlife (see the history of the user-agent string). The cookie, the other load-bearing piece of web identity, arrived in the same window as a Netscape extension rather than part of RFC 1945 itself (see the history of the cookie).
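To make the new message shape concrete, here is a rough sketch of splitting an HTTP/1.0-style response into its status line, headers and body. It is a simplification under stated assumptions: `parse_response` is a hypothetical helper, it expects the whole message in memory, and it ignores folded header lines and repeated header names.

```python
def parse_response(raw):
    """Split a raw response into (version, status, reason, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")   # blank line ends the header block
    lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = lines[0].split(" ", 2)  # e.g. "HTTP/1.0 404 Not Found"
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return version, int(code), reason, headers, body

sample = (
    b"HTTP/1.0 200 OK\r\n"
    b"Content-Type: image/gif\r\n"
    b"Content-Length: 6\r\n"
    b"\r\n"
    b"GIF89a"
)
```

Calling `parse_response(sample)` yields a numeric status and a `content-type` the client can act on, two things an HTTP/0.9 response had no place to put.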

One thing HTTP/1.0 did not fix was the connection model. By default each request still opened a fresh TCP connection and closed it after the response. There was an unofficial Connection: keep-alive header in circulation, but it was a bolt-on, not part of the document, and it interacted badly with the proxies of the era. The cost of a TCP handshake per object, multiplied across a page full of images, was the performance problem HTTP/1.1 would be built to attack.

RFC 2068 and 2616: the long reign of HTTP/1.1

HTTP/1.1 is the version the web actually ran on for most of its life. The first standardized cut was RFC 2068 in January 1997, barely eight months after RFC 1945. Two and a half years of clarifications and corrections followed, and the result, RFC 2616, landed in June 1999. RFC 2616 then stood as the reference for fifteen years. If you learned HTTP from a book before roughly 2015, you learned it from RFC 2616.

Two changes in HTTP/1.1 carried most of the weight. The first was persistent connections by default. Where HTTP/1.0 closed the socket after each response, HTTP/1.1 keeps it open and reuses it, so a page of twenty assets can ride one or a few TCP connections instead of twenty handshakes. Chunked transfer encoding (Transfer-Encoding: chunked) was the companion feature: it let a server start sending a response before it knew the total length, framing the body as a sequence of sized chunks, which is what made dynamically generated pages and streaming responses practical over a reused connection.
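The chunked framing is simple enough to show directly. This is a minimal sketch of an encoder and decoder for the basic format (hex size line, chunk data, terminating zero-size chunk); `encode_chunked` and `decode_chunked` are illustrative names, and the decoder assumes a complete, well-formed body and ignores chunk extensions and trailers.

```python
def encode_chunked(parts):
    """Frame each non-empty part as '<hex size>\\r\\n<data>\\r\\n', then end with a zero chunk."""
    out = bytearray()
    for part in parts:
        if part:  # a zero-length chunk would signal the end of the body
            out += f"{len(part):x}\r\n".encode("ascii") + part + b"\r\n"
    out += b"0\r\n\r\n"
    return bytes(out)

def decode_chunked(data):
    """Reassemble the body from a complete chunked payload."""
    body = bytearray()
    pos = 0
    while True:
        eol = data.index(b"\r\n", pos)
        size = int(data[pos:eol].split(b";")[0], 16)  # drop any chunk extension
        pos = eol + 2
        if size == 0:
            return bytes(body)       # trailers, if present, are ignored here
        body += data[pos:pos + size]
        pos += size + 2              # skip the chunk data and its trailing CRLF
```

Because each chunk announces its own length, the receiver knows where the response ends without a Content-Length and without the connection closing, which is what lets the next request reuse the socket.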

[Figure: HTTP/1.0 opens a new connection per object (SYN, GET, FIN, repeated three times). HTTP/1.1 reuses one persistent connection (SYN, GET, GET, GET, FIN): one handshake amortized across many requests, with responses still coming back in order.]

*Persistent connections were the headline HTTP/1.1 change: one TCP handshake serves many requests instead of one each.*

The second change was the one that quietly saved the web’s address space. HTTP/1.1 made the Host request header mandatory, a requirement already present in RFC 2068 and carried into RFC 2616. In HTTP/1.0 the request only carried a path, so a server at a given IP could host exactly one site, because it had no way to know which hostname the client had typed. Requiring Host meant one IP address could serve thousands of distinct domains, the server reading the header to decide which site to render. Name-based virtual hosting is a large part of the reason the IPv4 address space survived the dot-com explosion, and it is a single header field doing the work. The shared front ends that grew up around it, together with reused connections, are also, decades later, the seam that HTTP request smuggling attacks pry at when a front-end and back-end disagree about where one request ends and the next begins (see request smuggling and HTTP/1.1 desync).
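Name-based virtual hosting comes down to one lookup. The sketch below assumes a hypothetical `SITES` table mapping hostnames to document roots and a helper named `pick_site`; it handles only plain hostnames with an optional port, not IPv6 literals.

```python
from typing import Optional

# Hypothetical mapping from hostname to document root, all served from one IP.
SITES = {
    "example.com": "/srv/example-com",
    "example.org": "/srv/example-org",
}

def pick_site(request_head: str, sites: dict) -> Optional[str]:
    """Return the document root for the request's Host header, or None if absent or unknown."""
    for line in request_head.split("\r\n")[1:]:   # skip the request line
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            host = value.strip().lower().split(":")[0]  # drop an optional :port
            return sites.get(host)
    return None  # an HTTP/1.1 server would answer 400 Bad Request here
```

Two requests for `/` arriving at the same address resolve to different sites purely on the strength of that one header.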

HTTP/1.1 also built out the caching and conditional-request machinery that keeps the web from re-downloading things it already has. The Cache-Control header replaced HTTP/1.0’s blunter Expires with a directive language: max-age, no-cache, public, private, and the rest. Conditional requests let a client say “give me this only if it changed” using If-Modified-Since against a Last-Modified timestamp, or If-None-Match against an ETag, an opaque validator the server attaches to a resource version. A match returns 304 Not Modified with no body, which is the cheapest useful response HTTP has. That whole apparatus, later carved into RFC 7232 (conditional requests) and RFC 7234 (caching) in the 2014 split, is what makes incremental recrawling and CDN revalidation possible (see caching and incremental recrawl). HTTP/1.1 also added byte-range requests, Range: bytes=0-1023, which is the foundation of resumable downloads and video seeking.
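A conditional request can be sketched from the server's side in a few lines. This is an illustration under assumptions, not how any particular server does it: the ETag here is derived from a content hash (servers are free to pick any scheme), `respond` is a hypothetical handler, and weak validators (the `W/` prefix) are not handled.

```python
import hashlib

def make_etag(body):
    """Assumed scheme: a quoted, truncated SHA-256 of the content."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def respond(body, request_headers):
    """Return (status, headers, payload), answering 304 when the client's copy is current."""
    etag = make_etag(body)
    sent = request_headers.get("if-none-match", "")
    candidates = [tag.strip() for tag in sent.split(",")]
    if etag in candidates or "*" in candidates:
        return 304, {"ETag": etag}, b""          # no body: the cached copy is still valid
    return 200, {"ETag": etag, "Content-Length": str(len(body))}, body
```

A first request with no validator gets the full 200; replaying it with the returned ETag in `if-none-match` gets the bodiless 304, and the 200 comes back only once the content, and therefore the tag, changes.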

HTTP/1.1 also specified pipelining: a client could send several requests back to back without waiting for each response. In theory this hid round-trip latency. In practice it was a near-total failure. The protocol required responses to come back in the same order the requests went out, so one slow response stalled everything queued behind it. This is head-of-line blocking, and it is the defining flaw of HTTP/1.1 that the next two versions exist to solve. Buggy proxies that mishandled pipelined requests made it worse, and most browsers ended up disabling pipelining entirely. Instead they opened multiple parallel connections per host, conventionally capped around six, which spread the blocking risk but burned file descriptors and TCP slow-start budget and never actually fixed the underlying problem. Web developers responded with a catalogue of workarounds, spriting many images into one file, inlining CSS, sharding assets across multiple hostnames to dodge the six-connection cap, all of them hacks to paper over a protocol that could not do real concurrency. Those hacks are exactly the things HTTP/2 made obsolete, and the fact that an entire genre of front-end performance advice evaporated overnight is a fair measure of how much the connection model had been distorting the web above it.
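The cost of in-order delivery is easy to see with a little arithmetic. The sketch below is a toy model with made-up numbers: it assumes the server works on all pipelined requests concurrently and ignores transfer time, so each response is "ready" at some instant but may only be delivered once every earlier response has gone out.

```python
from itertools import accumulate

def in_order_delivery(ready_times_ms):
    """Pipelining: response i cannot be delivered before responses 0..i-1."""
    return list(accumulate(ready_times_ms, max))

def independent_delivery(ready_times_ms):
    """Idealized concurrency: each response is delivered as soon as it is ready."""
    return list(ready_times_ms)

# Hypothetical workload: one slow response followed by two fast ones.
ready = [900, 20, 30]
```

Here `in_order_delivery(ready)` gives `[900, 900, 900]` while `independent_delivery(ready)` gives `[900, 20, 30]`: the two cheap responses wait 870 ms or more behind a response they have nothing to do with.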

In June 2014 the IETF’s HTTP Working Group replaced the monolithic RFC 2616 with a set of six documents, edited by Roy Fielding, Julian Reschke and others, that carved the protocol along its natural joints. RFC 7230 covered message syntax and routing, 7231 the semantics (methods, status codes, content), 7232 conditional requests, 7233 range requests, 7234 caching, and 7235 authentication. The wire protocol did not change. The point was maintainability: fifteen years of errata and ambiguity in one giant document had become unworkable, and splitting it let each concern be revised on its own. That split mattered more than it looked, because it pulled HTTP’s semantics apart from HTTP’s wire format, and a version of the web that runs three different wire formats over one shared semantics needed exactly that separation to exist.

SPDY: Google runs the experiment in production

By 2009 the diagnosis was settled. HTTP/1.1 was latency-bound. Pages had grown to dozens of subresources, round trips dominated load time, and the protocol’s answer, more parallel TCP connections, was a workaround with diminishing returns. The question was what to replace it with, and the usual standards process had no running code to argue over.

Google supplied the running code. Mike Belshe and Roberto Peon, engineers at Google, announced SPDY (pronounced “speedy”) in late 2009 and deployed it across Google’s own services and Chrome in 2010. SPDY was not a clean-slate protocol. It kept HTTP’s methods, headers and status codes, and changed the way they were carried on the wire. Its central ideas: a single TCP connection carrying many interleaved request/response streams, header compression to cut the redundant bytes that every request repeats, and request prioritization so the browser could tell the server which resources mattered first. Because it shipped in a browser and a huge server fleet at the same time, SPDY produced something standards bodies rarely get early: real-world latency numbers from real users, on a protocol you could actually deploy.

SPDY also leaned hard on TLS, in a way that quietly changed the web’s defaults. It ran only over encrypted connections, and it used the TLS handshake itself to negotiate the protocol, first through the NPN extension and later through ALPN. That choice made “if you want the fast protocol, you encrypt” a structural fact rather than a recommendation, and HTTP/2 inherited it: while the RFC technically permits cleartext HTTP/2, every major browser ships HTTP/2 only over TLS, so in practice moving to the modern protocol meant turning on HTTPS. The protocol negotiation moved into the ClientHello, which is one reason TLS fingerprinting and HTTP-version fingerprinting ended up so entangled in modern bot detection (see ALPN and NPN in fingerprinting).

That production evidence is what let the HTTP Working Group skip the usual decade of theory. When the group chartered work on HTTP/2 in 2012, the charter named SPDY as the starting point. HTTP/2 then diverged from SPDY in the details, header compression being the clearest example: SPDY used a zlib-based scheme that the CRIME attack proved dangerous, and HTTP/2 replaced it with the purpose-built HPACK. Once HTTP/2 was standardized its backers, Google included, deprecated SPDY in favor of it. Chrome dropped SPDY in 2016, and no modern browser supports it today. The experiment did its job and was retired, which is a cleaner ending than most protocols get. The broader arc, of crawlers and clients chasing whatever the browsers do, is its own long story (see the history of web scraping).

RFC 7540: HTTP/2 goes binary

HTTP/2 was published as RFC 7540 in May 2015. The single most consequential decision in it is that HTTP/2 is a binary protocol. Every version before it was text you could type. HTTP/2 frames everything into a binary layer, and that one change is what makes the rest possible.

[Figure: one TCP connection carrying frames from three streams interleaved: S1 HDR, S3 DATA, S1 DATA, S5 HDR, S3 DATA, S5 DATA. The 9-byte frame header holds length (24 bits), type (8 bits), flags (8 bits), one reserved bit, and the stream identifier (31 bits). The stream id is what lets responses arrive in any order without ambiguity.]

*HTTP/2's binary framing. Every frame carries a stream identifier, so frames from many concurrent streams share one connection and reassemble correctly regardless of arrival order. The field widths shown follow RFC 7540's frame header layout.*

Multiplexing is the payoff. Because every frame is tagged with a stream identifier, a single TCP connection can carry many requests and responses at once, interleaved frame by frame, and the receiver reassembles each stream from its tag. There is no requirement that response three wait for response one. That is the head-of-line blocking problem from HTTP/1.1 solved at the HTTP layer, and it is why the browser convention of six parallel connections per host became unnecessary.
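The framing and the demultiplexing it enables can be sketched with the standard `struct` module. This is a minimal illustration, not an HTTP/2 implementation: `pack_frame`, `iter_frames` and `demux` are made-up helper names, only DATA frames are reassembled, and flags, padding, HEADERS decoding and flow control are all ignored.

```python
import struct

DATA = 0x0  # frame type code for DATA frames

def pack_frame(frame_type, flags, stream_id, payload):
    """Build a frame: 24-bit length, 8-bit type, 8-bit flags, reserved bit + 31-bit stream id."""
    length = len(payload)
    header = struct.pack(
        ">BHBBI", length >> 16, length & 0xFFFF, frame_type, flags, stream_id & 0x7FFFFFFF
    )
    return header + payload

def iter_frames(buf):
    """Yield (type, flags, stream_id, payload) for each complete frame in buf."""
    pos = 0
    while pos + 9 <= len(buf):
        high, low, frame_type, flags, stream_id = struct.unpack_from(">BHBBI", buf, pos)
        length = (high << 16) | low
        stream_id &= 0x7FFFFFFF            # mask off the reserved top bit
        yield frame_type, flags, stream_id, buf[pos + 9:pos + 9 + length]
        pos += 9 + length

def demux(buf):
    """Reassemble each stream's DATA payloads from an interleaved byte sequence."""
    streams = {}
    for frame_type, _flags, stream_id, payload in iter_frames(buf):
        if frame_type == DATA:
            streams.setdefault(stream_id, bytearray()).extend(payload)
    return {sid: bytes(data) for sid, data in streams.items()}
```

Feeding `demux` the frames for streams 1 and 3 in interleaved order returns each stream's bytes intact, which is the whole trick: ordering matters within a stream, never between streams.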

Header compression got its own algorithm, HPACK, specified separately in RFC 7541. HTTP requests are absurdly repetitive: the same User-Agent, the same Accept lines, the same cookies on every request to a host. HPACK keeps a shared dynamic table of header fields on both ends so that a header sent once can be referenced by a small index afterward, with a static table of common fields built in. It was designed with one eye on a specific attack: the CRIME exploit had shown that naive compression of secrets like cookies could be turned into a decryption oracle, so HPACK deliberately avoids the generic-compression patterns that made CRIME work.
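The indexing idea can be shown with a deliberately simplified toy. This is not the HPACK wire format: there is no Huffman coding, no integer encoding, no table size limit or eviction, and the three-entry `STATIC` list stands in for the real 61-entry static table. `ToyTable` is a hypothetical name. What it does preserve is the core mechanism, an encoder and decoder that each grow the same dynamic table (newest entry first) so that repeated fields shrink to small integers.

```python
# Tiny stand-in for the static table (the real one has 61 entries).
STATIC = [(":method", "GET"), (":path", "/"), (":scheme", "https")]

class ToyTable:
    """One side's view of the shared table; encoder and decoder each keep their own copy."""

    def __init__(self):
        self.dynamic = []  # newest entry first

    def _table(self):
        return STATIC + self.dynamic

    def encode(self, headers):
        """Emit an index for known fields, or the literal field (and remember it)."""
        out = []
        for field in headers:
            table = self._table()
            if field in table:
                out.append(table.index(field) + 1)   # 1-based index
            else:
                out.append(field)
                self.dynamic.insert(0, field)
        return out

    def decode(self, items):
        """Mirror of encode: resolve indexes, learn literals in the same order."""
        headers = []
        for item in items:
            if isinstance(item, int):
                headers.append(self._table()[item - 1])
            else:
                headers.append(item)
                self.dynamic.insert(0, item)
        return headers
```

On the first request the uncommon fields travel as literals; on the second, the entire header list is a handful of integers, which is where the savings on repetitive headers come from.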

Then there was server push, the feature that was supposed to be HTTP/2’s other headline and instead became its cautionary tale. The idea: a server, knowing a request for index.html will be followed by requests for its CSS and JavaScript, could push those resources before the browser asked. In practice it was hard to use without wasting bandwidth on resources the browser already had cached, and the data was brutal. By the time Chrome moved to remove it, Google reported that over a 28-day window 99.95% of HTTP/2 connections Chrome opened never received a single pushed stream, and of the pushes that did arrive, fewer than 40% were actually used, down from about 63% two years earlier. Chrome 106 disabled push by default in 2022. Firefox 132 removed its support in October 2024. The replacement is the 103 Early Hints status code, which tells the browser what to go fetch itself rather than guessing on its behalf.

HTTP/2 got its own revision in June 2022. RFC 9113 obsoletes RFC 7540, folds in errata and security fixes accumulated over seven years, and formally documents server push as difficult to use effectively. The same 2022 batch produced the semantics reorganization that finally made the version-independence explicit: RFC 9110 defines HTTP semantics shared across all versions, RFC 9111 defines caching, and RFC 9112 defines the HTTP/1.1 wire format specifically. Methods, status codes and header meanings now live in one document that 1.1, 2 and 3 all point to; each version document describes only how those shared semantics get onto its particular wire.

That clean separation is also where a great deal of modern bot detection lives, because the binary wire format leaks far more about a client than text ever did. The order a client lists its HTTP/2 SETTINGS, its initial window size, the order of its pseudo-headers, and its stream-priority behavior together form a fingerprint that anti-bot vendors read on the first request, often before a single byte of application data (see HTTP/2 fingerprinting and the Akamai format). A protocol designed for performance turned out to be a rich identity signal, which nobody specified and nobody can easily turn off.

RFC 9114: HTTP/3 leaves TCP behind

HTTP/2 fixed head-of-line blocking at the HTTP layer and then ran straight into it again one layer down. Multiplexing many streams over a single TCP connection works beautifully until a packet is lost. TCP guarantees in-order delivery of its byte stream, so when one segment goes missing, the operating system holds back every byte that arrived after it, for all streams, until the retransmission fills the gap. The application sees independent streams; the kernel sees one ordered pipe. A lost packet belonging to stream one stalls streams three and five too, even though their data already arrived. HTTP/2 had moved the blocking from the protocol into the transport, where the protocol could not reach it.

The only way out was to stop using TCP. That is what QUIC does. QUIC is a transport built on UDP that implements its own streams, its own loss detection and its own congestion control, with TLS 1.3 wired into the handshake rather than layered on top. Because QUIC understands streams natively, a lost packet only stalls the stream it belonged to; the others keep flowing. The work began as Google’s gQUIC, deployed experimentally much as SPDY had been, and then diverged substantially during IETF standardization. The custom Google handshake was replaced with standard TLS 1.3, the wire format was redesigned, and the protocol was split into modular pieces. The core QUIC documents, RFC 9000 (transport), RFC 9001 (TLS), and RFC 9002 (loss detection and congestion control), were published in May 2021 after the QUIC Working Group, chartered in 2016, spent years and dozens of draft revisions on them.
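The difference between the two transports can be modeled in a few lines. This is a toy simulation with invented timings, not a model of real TCP or QUIC: each packet is assumed to carry data for exactly one stream, and a lost packet is represented simply by a late arrival time (its retransmission).

```python
def tcp_delivery(packets):
    """packets: (stream_id, arrival_ms) in send order.
    One ordered byte stream: nothing is handed up until everything before it has arrived."""
    delivered, latest = [], 0
    for stream_id, arrival in packets:
        latest = max(latest, arrival)
        delivered.append((stream_id, latest))
    return delivered

def quic_delivery(packets):
    """Ordering is enforced per stream only, so a gap stalls just its own stream."""
    delivered, latest = [], {}
    for stream_id, arrival in packets:
        latest[stream_id] = max(latest.get(stream_id, 0), arrival)
        delivered.append((stream_id, latest[stream_id]))
    return delivered

# Hypothetical trace: the third packet (stream 1) is lost and its retransmission lands at 110 ms.
trace = [(1, 10), (3, 11), (1, 110), (3, 13), (5, 14)]
```

Under `tcp_delivery(trace)` the packets for streams 3 and 5 that arrived at 13 ms and 14 ms are not delivered until 110 ms; under `quic_delivery(trace)` they are delivered on arrival and only stream 1 waits.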

[Figure: two protocol stacks side by side. HTTP/2: HTTP/2 (RFC 9113) over TLS 1.3 over TCP over IP, an in-order TCP byte stream where a lost packet stalls all streams. HTTP/3: HTTP/3 (RFC 9114) over QUIC (RFC 9000: streams, TLS 1.3, loss detection) over UDP, with per-stream delivery where a lost packet stalls only its stream.]

*The reason HTTP/3 exists: QUIC folds TLS and stream management into one UDP-based transport, so packet loss no longer blocks every stream the way it does on TCP.*

HTTP/3 itself is the mapping of HTTP semantics onto QUIC, published as RFC 9114 in June 2022. The semantics are identical to HTTP/2 and HTTP/1.1, by design, because they all point at RFC 9110. What changes is the carriage. Header compression needed a redesign because HPACK assumed in-order delivery across the whole connection, which QUIC deliberately abandons, so HTTP/3 uses QPACK (RFC 9204), which compresses headers without forcing streams to wait on each other. The name itself was a late decision: the draft began as “HTTP/2 Semantics Using The QUIC Transport Protocol” and then circulated as “Hypertext Transfer Protocol (HTTP) over QUIC” until October 2018, when Mark Nottingham, who chaired both the HTTP and QUIC working groups, proposed renaming it HTTP/3 to mark it as another binding of HTTP semantics to a wire protocol rather than a variant of HTTP/2.

QUIC buys two things beyond the head-of-line fix. A connection is identified by a connection ID rather than the IP-and-port four-tuple TCP uses, so a connection can survive a network change. A phone moving from Wi-Fi to cellular keeps its QUIC connection alive across the IP change instead of tearing down and reconnecting. And QUIC’s handshake folds the transport and TLS setup together, so a fresh connection completes in one round trip, with 0-RTT resumption for connections to a server seen before, against TCP-plus-TLS that historically needed more. That 0-RTT path comes with a documented replay caveat, which is why it is restricted to requests safe to repeat.

There is a discovery wrinkle worth naming. A browser usually cannot open its first connection to a host in HTTP/3, because it has no way to know in advance that the host speaks QUIC on UDP, and ordinary address records do not say. So the browser connects over HTTP/1.1 or HTTP/2, and the server advertises HTTP/3 availability with an Alt-Svc response header. The browser remembers it and uses QUIC on the next visit. More recent deployments can shortcut this with an HTTPS DNS resource record, but the Alt-Svc upgrade dance is still the common path, and it means a client’s very first request to a host is rarely HTTP/3.
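What the browser remembers from that header is small. Here is a rough sketch of parsing an Alt-Svc value such as `h3=":443"; ma=86400`; `parse_alt_svc` is a hypothetical helper, it splits naively on commas and semicolons (so it assumes no quoted value contains either), and it applies the 24-hour default freshness when `ma` is absent.

```python
DEFAULT_MAX_AGE = 24 * 60 * 60  # seconds; used when no "ma" parameter is given

def parse_alt_svc(value):
    """Return a list of advertised alternatives; an empty list for 'clear'."""
    if value.strip() == "clear":
        return []
    services = []
    for entry in value.split(","):
        parts = [part.strip() for part in entry.split(";")]
        protocol, _, authority = parts[0].partition("=")
        params = dict(part.split("=", 1) for part in parts[1:] if "=" in part)
        host, _, port = authority.strip('"').rpartition(":")
        services.append({
            "protocol": protocol,             # ALPN token, e.g. "h3"
            "host": host,                     # empty means "same host as the origin"
            "port": int(port),
            "max_age": int(params.get("ma", DEFAULT_MAX_AGE)),
        })
    return services
```

A client that finds an `h3` entry can cache it for `max_age` seconds and try QUIC on the advertised port next time, falling back to TCP if UDP is blocked.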

Rapid Reset: when a feature became the largest attack on record

The clearest illustration that protocol design has consequences nobody intends came in August 2023, from the multiplexing feature that was HTTP/2’s whole point. The attack is CVE-2023-44487, called Rapid Reset, and Amazon, Cloudflare and Google disclosed it together in October 2023.

The mechanism is almost embarrassingly simple. HTTP/2 lets a client open a stream and then immediately cancel it by sending a RST_STREAM frame. The protocol permits this, and it is useful: a browser that navigates away should be able to abandon in-flight requests. But opening and instantly resetting a stream costs the attacker almost nothing while still making the server do the work of setting up request processing, and crucially, a reset stream frees up the connection’s concurrency budget immediately, so the attacker can open another one in its place without limit. A client can cycle through requests far faster than the server’s per-connection stream cap was ever meant to allow. The result is a request flood whose volume is decoupled from the number of connections the attacker holds open.

The numbers were records. Google measured a peak of 398 million requests per second. Cloudflare measured 201 million, nearly three times larger than the biggest DDoS it had previously seen. Amazon clocked 155 million. Cloudflare’s writeup noted that its 201-million-rps attack came from a botnet of roughly 20,000 machines, a tiny number for an attack of that scale, which is the whole point: the leverage came from the protocol, not from raw botnet size. The fix was not a new protocol version but server-side accounting, tracking and limiting the rate of stream resets per connection rather than just counting concurrent streams (see the deeper writeups on the HTTP/2 Rapid Reset attack and Layer 7 DDoS). The feature stayed. The lesson was that “client can cheaply cancel a request” and “server does work per request” combine into an amplifier the moment someone looks at them adversarially.
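The accounting fix can be sketched as a sliding-window counter kept per connection. This is an illustration of the general idea only, not any vendor's actual mitigation: `ResetLimiter` is a hypothetical class, and the thresholds are placeholders that a real server would tune.

```python
from collections import deque

class ResetLimiter:
    """Track client-sent RST_STREAM frames on one connection over a sliding time window."""

    def __init__(self, max_resets=100, window_s=1.0):  # placeholder thresholds
        self.max_resets = max_resets
        self.window_s = window_s
        self.events = deque()

    def on_rst_stream(self, now):
        """Record a reset at time `now` (seconds).
        Returns False when the connection has exceeded its budget and should be closed."""
        self.events.append(now)
        while self.events and now - self.events[0] > self.window_s:
            self.events.popleft()          # forget resets that fell out of the window
        return len(self.events) <= self.max_resets
```

A browser abandoning a few requests on navigation stays far under the budget, while a client cycling open-and-reset as fast as it can trips the limit almost immediately, regardless of how few streams are open at any instant.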

Where the protocol stands in 2026

HTTP today is three wire formats sharing one set of meanings. A request for a web page carries the same methods, the same status codes and the same header semantics whether it travels as HTTP/1.1 text over TCP, HTTP/2 binary frames over TLS over TCP, or HTTP/3 frames over QUIC over UDP. That separation, made explicit in the 2022 RFC 9110 reorganization, is what lets the wire format keep changing while thirty years of application behavior keeps working unchanged. HTTP/1.1 has not gone away and will not soon; it is the fallback every client still speaks, the format proxies and old middleboxes understand, and the discovery path HTTP/3 itself relies on. HTTP/2 carries a large share of traffic. HTTP/3 has climbed past a third of websites by most measures in 2025 and 2026 and is the default on the big CDNs, with all major browsers supporting it, though the figures vary by who is counting and what they count as “support.”

The throughline across thirty-five years is that each version solved the previous version’s worst problem and inherited a new one in the trade. HTTP/1.0’s per-request connections gave way to HTTP/1.1’s persistent connections, which exposed head-of-line blocking, which HTTP/2’s multiplexing fixed at the HTTP layer while leaving it intact at the transport, which HTTP/3 finally drained by abandoning TCP for QUIC. Each step also quietly widened the protocol’s attack and fingerprint surface in ways its designers were not optimizing for. Binary framing made clients fingerprintable from their first packet. Cheap stream cancellation became a record-breaking flood. UDP transport moved the whole conversation past the middleboxes that used to inspect it. The protocol that started as a single line of ASCII you could type by hand is now a stack deep enough that no one person holds all of it in their head, and the document trail, from a sub-700-word W3C note to RFC 9114, is the only honest map of how it got that way.

