Memcached amplification and the 1.35 Tbps GitHub attack of 2018

For about nine minutes on the afternoon of 28 February 2018, GitHub was the target of more inbound traffic than almost any service on the internet had ever absorbed. The peak was 1.35 terabits per second, carried by 126.9 million packets per second, and it arrived not from a botnet of compromised machines grinding out small requests, but from tens of thousands of database caches that GitHub’s attacker had never logged into. The caches were doing exactly what they were built to do. They answered a question. The question had just been asked with a forged return address, and the answer was up to tens of thousands of times larger than the question.

That ratio is the whole story. A reflection-amplification attack is a trick of arithmetic before it is anything else: find a server that replies to a small UDP request with a large response, lie to it about who is asking, and let the server’s own bandwidth do the work of flooding the victim. Memcached, a piece of software that has no business being on the public internet at all, turned out to be the best amplifier anyone had found. This post walks through why, using GitHub’s incident as the case study, and ends with the fix, which was almost embarrassingly simple.

The sections below go in order. First, what reflection and amplification mean and why the amplification factor is the number that matters. Then memcached itself: what it is, why it speaks UDP, and the protocol detail that made it dangerous. Then the attack mechanics, step by step, from planting a payload to the flood. Then the GitHub incident with the real numbers and how Akamai absorbed it. Then the cleanup: the kill switch, the version 1.5.6 change, the CVE, and what the exposed-server count did afterward. Finally, what the episode says about an entire class of protocols that were never meant to face the open internet.

Reflection, amplification, and the only number that matters

Two ideas get bundled together in the phrase “reflection-amplification attack,” and they are worth separating because they do different jobs.

Reflection is about hiding. UDP is connectionless. There is no handshake, so when a server receives a UDP datagram it has no cheap way to confirm that the source address in the packet is really where the packet came from. A sender can write any source IP it likes into the header. So the attacker sends requests to a third party (the reflector) with the victim’s IP in the source field, and the reflector dutifully sends its reply to the victim. The victim sees traffic from the reflector, not the attacker. The attacker’s own address never appears. Mitigation gets harder because there is no single source to block. The traffic comes from thousands of innocent, real servers scattered across the world.

Amplification is about leverage. If the reflected reply were the same size as the request, reflection alone would be a wash. The attacker would spend a bit of bandwidth to make the reflector spend the same bit. The attack only becomes worthwhile when the reply is much larger than the request, because then the attacker’s modest upload turns into a flood. The bandwidth amplification factor (BAF) is the ratio of response bytes to request bytes. A BAF of 50 means every byte the attacker sends produces fifty bytes aimed at the victim. There is also a packet amplification factor (PAF), the ratio of response packets to request packets, which matters for attacks that aim to exhaust a router’s packet-processing budget rather than its raw bandwidth.
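
The two ratios are simple enough to write down directly. The sketch below is illustrative only: the function names are my own, and the 60-byte and 3,000-byte sizes are made-up numbers chosen to reproduce the BAF of 50 used in the example above.

```python
def bandwidth_amplification_factor(request_bytes: int, response_bytes: int) -> float:
    """BAF: response bytes delivered to the victim per request byte sent."""
    if request_bytes <= 0:
        raise ValueError("request_bytes must be positive")
    return response_bytes / request_bytes


def packet_amplification_factor(request_packets: int, response_packets: int) -> float:
    """PAF: response packets delivered to the victim per request packet sent."""
    if request_packets <= 0:
        raise ValueError("request_packets must be positive")
    return response_packets / request_packets


# Illustrative numbers only: a 60-byte request answered with 3,000 bytes.
print(bandwidth_amplification_factor(60, 3000))  # 50.0
# One request packet answered by three response packets.
print(packet_amplification_factor(1, 3))  # 3.0
```

A reflector can score high on one ratio and low on the other, which is why the two are tracked separately.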

Amplification is not new and not specific to memcached. DNS resolvers, NTP servers, SSDP, SNMP, and CharGen have all been abused the same way, and the DNS amplification family is the closest relative to what happened here. What set memcached apart was the size of the multiplier. DNS amplification typically runs a BAF in the tens. NTP’s monlist command, the previous favorite, could hit a few hundred. Memcached was measured in the tens of thousands.

Approximate bandwidth amplification factors. DNS and NTP figures are from US-CERT's reflection advisory; the memcached 51,200x figure is Cloudflare's best-case observation and is a ceiling, not a typical result.

That last caveat is important, and most coverage skips it. The 51,200x number is real but it is a peak, not an average. It comes from a single observed case where a 15-byte request returned a 750-kilobyte response. The realistic factor in the GitHub flood was lower, because attackers store a payload and read it back, and the payload size, not a theoretical maximum, sets the multiplier. Hold onto that distinction. The headline number sells the story; the working number is what filled GitHub’s pipes.
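
The ceiling figure can be checked by hand. The sketch below assumes a kilobyte of 1,024 bytes, which is the reading under which 750 kilobytes over 15 bytes comes out to exactly 51,200. The smaller stored-value sizes in the loop are hypothetical, chosen only to show how the multiplier follows the payload.

```python
def amplification(request_bytes: int, response_bytes: int) -> float:
    """Bandwidth amplification: response bytes per request byte."""
    return response_bytes / request_bytes


REQUEST_BYTES = 15  # request size cited in the text

# The single best-case observation: a 750-kilobyte response.
ceiling = amplification(REQUEST_BYTES, 750 * 1024)
print(ceiling)  # 51200.0

# Hypothetical stored-value sizes: the factor scales with what was planted.
for stored_kib in (100, 250, 500):
    factor = amplification(REQUEST_BYTES, stored_kib * 1024)
    print(f"{stored_kib} KiB value -> about {factor:,.0f}x")
```

Nothing about the request changes between the rows. Only the size of the value being read back does.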

There is a second subtlety in how the arithmetic actually plays out on the wire. The reflector’s large response does not arrive as one enormous packet. Memcached itself splits a large UDP response into a series of datagrams of roughly 1,400 bytes each, small enough to fit a typical path MTU, and numbers them with the sequence fields in its frame header. That is why Cloudflare’s capture of a 257 Gbps flow worked out to about 23 million packets per second of 1,400-byte datagrams. So a single planted megabyte-scale value produces a burst of hundreds of datagrams per request. That matters for the defender twice over. The bandwidth amplification fills the victim’s links, and the packet amplification, the sheer count of datagrams per second, taxes every router and firewall on the path that has to make a forwarding decision per packet. GitHub’s 126.9 million packets per second is the figure that actually stresses hardware, more than the terabit headline does, because routers and firewalls pay a fixed cost for every packet they forward, whatever its size.
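
The figures in that paragraph can be cross-checked with a few lines of arithmetic. This is a back-of-the-envelope sketch: it treats 1,400 bytes as the size of every packet and ignores header overhead, so the results are approximations rather than measurements.

```python
import math

PACKET_BYTES = 1400  # approximate size of each response packet, per the text


def gbps(packets_per_second: float, bytes_per_packet: int) -> float:
    """Bandwidth in gigabits per second for a given packet rate and size."""
    return packets_per_second * bytes_per_packet * 8 / 1e9


def mean_packet_bytes(bits_per_second: float, packets_per_second: float) -> float:
    """Average packet size implied by a bandwidth and a packet rate."""
    return bits_per_second / 8 / packets_per_second


def packets_for_value(value_bytes: int, per_packet: int = PACKET_BYTES) -> int:
    """How many packets a stored value of this size is split into."""
    return math.ceil(value_bytes / per_packet)


# Cloudflare's capture: 23 Mpps of 1,400-byte packets.
print(round(gbps(23e6, PACKET_BYTES), 1))  # 257.6

# GitHub's peak: 1.35 Tbps at 126.9 Mpps implies packets of about this size.
print(round(mean_packet_bytes(1.35e12, 126.9e6)))  # 1330

# A one-mebibyte stored value read back once.
print(packets_for_value(1024 * 1024))  # 749
```

The implied GitHub packet size of about 1,330 bytes lines up with a flood made almost entirely of near-full-size memcached responses.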

Why the forged source address is even possible

The reflection half of the attack rests on a property of the internet that has nothing to do with memcached: a host can usually put any source IP it wants into a packet and have it delivered. Source-address validation is not enforced by default at most network boundaries. The fix for that has existed since 2000, written up as BCP 38 (also published as RFC 2827), which specifies ingress filtering: a network operator should drop any packet leaving its edge whose source address does not belong to a prefix that network legitimately originates. If every provider implemented it, a machine inside provider X could not emit a packet claiming to be from GitHub’s address space, and reflection attacks would lose their disguise.
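
The check BCP 38 asks for is small. The sketch below is a conceptual model and not router configuration: the function name is hypothetical, and the prefixes are reserved documentation ranges standing in for a provider's real address space.

```python
import ipaddress


def permits_source(source_ip: str, originated_prefixes: list[str]) -> bool:
    """Return True if the source address falls inside a prefix this network
    legitimately originates; a BCP 38 style filter drops everything else."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(prefix) for prefix in originated_prefixes)


# Hypothetical provider that originates two documentation prefixes.
provider_prefixes = ["192.0.2.0/24", "198.51.100.0/24"]

print(permits_source("192.0.2.17", provider_prefixes))   # True: legitimate customer source
print(permits_source("203.0.113.9", provider_prefixes))  # False: spoofed, would be dropped
```

The second packet is the one a reflection attack depends on: a source address that belongs to someone else's network.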

BCP 38 is twenty-five years old and still only partly deployed. The MANRS initiative and the CAIDA Spoofer project both track how much of the internet still permits spoofed egress, and the answer remains “a meaningful fraction.” The reason is the same incentive gap that explains the open memcached servers. A provider that filters its own egress spends effort and gains nothing directly; the benefit accrues to the strangers who would otherwise be spoofed. The cost of not filtering lands on a victim elsewhere. Until that asymmetry changes, spoofing remains available, and any UDP service that answers strangers with a larger reply is a candidate amplifier. The same connectionless-transport problem drives the entire reflection family, which is why the defensive playbook here overlaps so heavily with rate-limiting and edge filtering covered in rate-limiting algorithms for defense.

What memcached is and why it answers strangers

Memcached is a distributed in-memory key-value cache. Web applications use it to stash the results of expensive operations (a rendered page fragment, a database query result, a session object) so the next request can read the value from RAM instead of recomputing it. It is old, fast, and everywhere. It was written for one job and does it well.

The design assumption baked into memcached is that it lives on a trusted network. It has no authentication in its classic mode, no access control, no notion of a client identity. Anyone who can reach the port can read and write the cache. That assumption is fine when the cache sits on a private backend network behind a firewall, talking only to application servers that operators control. It is a catastrophe when the cache is bound to a public IP, which is precisely what tens of thousands of misconfigured deployments had done.

By default, memcached listened on INADDR_ANY, meaning every interface on the host, and it ran with UDP support enabled. The combination is what mattered. A server bound to all interfaces with a public IP and no firewall rule for port 11211 was reachable from anywhere. And because UDP was on, it would answer reflected, spoofed requests. Operators who deployed memcached on a cloud instance without locking down the security group, or who exposed it to speed up a multi-region setup, handed the internet a free amplifier without realizing it.

Why does a cache even speak UDP? TCP is the obvious transport for a request-response cache: it is reliable, it confirms the peer, and the overhead is modest on a local network. Memcached supports UDP as an optimization. For very high request rates, skipping the TCP handshake and connection state per operation can lower latency and CPU cost. The feature was added years before it became a weapon. Nobody designing it imagined the cache facing the public internet, so nobody designed the UDP path to resist spoofing. There was nothing to resist on a trusted LAN.

The UDP frame and the field that gives the trick away

The memcached UDP protocol prepends a small header to every datagram. Per the project’s own protocol.txt, the frame header is 8 bytes, and all values are 16-bit integers in network byte order, high byte first. The four fields are these.

The memcached UDP frame header. The protocol echoes the Request ID and validates nothing about the source, which is exactly what a reflection attack needs.

Bytes 0 and 1 hold a request ID, supplied by the client, which the server copies verbatim into its response. Bytes 2 and 3 are a sequence number used when a response spans multiple datagrams. Bytes 4 and 5 give the total number of datagrams in the message. Bytes 6 and 7 are reserved and must be zero.
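
For anyone inspecting a packet capture, the header is easy to decode. The following is a minimal parsing sketch based only on the layout described above. The class and function names are my own, and the sample bytes are invented for illustration.

```python
import struct
from typing import NamedTuple

# Four unsigned 16-bit integers in network (big-endian) byte order: 8 bytes.
FRAME_HEADER = struct.Struct("!HHHH")


class FrameHeader(NamedTuple):
    request_id: int       # bytes 0-1, echoed back by the server
    sequence: int         # bytes 2-3, index of this datagram in the response
    total_datagrams: int  # bytes 4-5, datagrams in the whole message
    reserved: int         # bytes 6-7, must be zero


def parse_frame_header(datagram: bytes) -> FrameHeader:
    """Decode the 8-byte memcached UDP frame header at the start of a datagram."""
    if len(datagram) < FRAME_HEADER.size:
        raise ValueError("datagram shorter than the 8-byte frame header")
    return FrameHeader(*FRAME_HEADER.unpack_from(datagram))


# Invented sample: request ID 0x1234, datagram 2 of 5, followed by payload bytes.
sample = bytes([0x12, 0x34, 0x00, 0x02, 0x00, 0x05, 0x00, 0x00]) + b"payload"
print(parse_frame_header(sample))
# FrameHeader(request_id=4660, sequence=2, total_datagrams=5, reserved=0)
```

Nothing in those eight bytes identifies or authenticates the sender, which is the point the rest of this section turns on.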

The protocol does nothing to verify the source. The closest thing to a safety check is a hint for clients: discard datagrams whose request ID you do not recognize, treating them as stale replies to an earlier query. That advice protects a legitimate client from confusion. It does nothing for a victim who never sent a query at all, because the victim is not running a memcached client. It is just an IP address that the server was tricked into shouting at. The request ID is echoed, the IP header source is trusted, and the response is delivered. From the protocol’s point of view this is correct behavior.
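
The client-side hint can be sketched to show what it does and does not cover. This is a hypothetical illustration of the discard rule, not code from any memcached client library.

```python
import struct


def accept_datagram(datagram: bytes, outstanding_ids: set[int]) -> bool:
    """Keep a datagram only if its request ID matches a query we actually sent."""
    if len(datagram) < 8:
        return False  # too short to carry the frame header
    (request_id,) = struct.unpack_from("!H", datagram)  # bytes 0-1, big-endian
    return request_id in outstanding_ids


reply = struct.pack("!HHHH", 7, 0, 1, 0) + b"VALUE ..."

# A real client that sent request 7 accepts the reply.
print(accept_datagram(reply, {7}))    # True
# A victim sent nothing, so it has no outstanding IDs and discards everything.
print(accept_datagram(reply, set()))  # False
```

The discard in the second case happens only after the datagram has already crossed the victim's link. The rule saves the application from confusion, not the network from congestion.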

The attack, step by step

The mechanics are simple enough to describe in full without handing anyone a weapon, because the weapon was the open server, not the technique. Defending against it requires understanding the sequence.

First, the attacker finds reflectors. Search engines for internet-connected devices, like Shodan, index hosts that respond on port 11211. At the time, that returned tens of thousands of openly reachable memcached instances. Cloudflare cited roughly 88,000 open servers visible on Shodan.

Second, the attacker plants a payload. Because an open memcached server accepts writes from anyone, the attacker stores a large value under a known key using a normal set command over UDP or TCP. The value can be as large as memcached’s slab limits allow, on the order of a megabyte per key by default, and the attacker can store several keys. This is the step that turns a modest amplifier into a large one. The reflected response size is the size of the stored value, so the attacker chooses it.

Third, the attacker spoofs the read. It sends a tiny get request for the planted key, with the source IP in the packet header forged to the victim’s address. The request is around 15 bytes on the wire. The server looks up the key, finds the large value, and sends it to the source address it was told about, which is the victim.

Fourth, repeat across the whole reflector set. The attacker sprays spoofed get requests at thousands of open servers at once. Each server independently fires a large response at the victim. The aggregate is the flood. The attacker’s own upload is small; the servers supply the volume, and they supply it from thousands of distinct, legitimate IP addresses, which is what makes simple source-blocking useless.

The reflection loop. Thin grey lines are the attacker's small spoofed requests; thick orange lines are the servers' large responses converging on the victim. The attacker's address never touches the victim.

There was an even crueler variant that needs no set at all. A stats command returns the server’s statistics, and on a busy server that response is substantial on its own. Cloudflare’s testing showed a 15-byte stats-style probe could yield a 134-kilobyte response, roughly a 10,000x factor with zero setup. The planted-payload method just lets the attacker dial the response size higher.

What hit GitHub, and how Akamai swallowed it

The numbers in GitHub’s own incident report are precise, so use them rather than the rounded versions that circulated.

The attack began at 17:21 UTC on 28 February 2018. GitHub’s network monitoring flagged it immediately. The site’s edge had been receiving an ordinary amount of traffic and then, within seconds, was facing 1.35 Tbps at 126.9 million packets per second. The report describes the leverage in plain terms: for each byte the attacker sent, up to 51 KB was directed at GitHub. The flood came from over a thousand different autonomous systems spread across tens of thousands of unique endpoints, which is the reflection property made visible. There was no botnet to take down. There were just a great many open caches, each answering a question it should never have been asked.

GitHub already had a mitigation arrangement with Akamai. At 17:26 UTC, five minutes in, GitHub moved its traffic to Akamai by withdrawing its own BGP routes for the affected prefixes and letting Akamai announce them instead, which pulled all inbound traffic into Akamai’s Prolexic scrubbing infrastructure. Akamai’s edge could filter the memcached traffic, because UDP datagrams sourced from port 11211 are trivially identifiable and a content delivery network has no legitimate need to pass them. The diffuse, spoofed, single-protocol nature of the flood that made it hard to block at GitHub’s own border made it easy to drop at a network built to absorb volume. By 17:30 UTC the attack was fully mitigated, nine minutes after it began. GitHub saw roughly five minutes of complete unavailability and four of intermittent reachability. A second, smaller spike of about 400 Gbps arrived near 18:00 UTC and was handled without incident. At no point was the confidentiality or integrity of customer data at risk; this was a volumetric flood at the network layer, not an intrusion.
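
The reason the flood was easy to drop at a scrubbing network comes down to one predicate. The sketch below is a hypothetical model of that rule, not Akamai's actual filter. The PacketMeta type and its field names are assumptions made for illustration.

```python
from dataclasses import dataclass

UDP = 17                # IANA protocol number for UDP
MEMCACHED_PORT = 11211  # memcached's default port


@dataclass(frozen=True)
class PacketMeta:
    protocol: int  # IP protocol number
    src_port: int
    dst_port: int


def should_drop(pkt: PacketMeta) -> bool:
    """Drop any UDP packet sourced from the memcached port: a web front end
    has no legitimate reason to receive memcached responses from the internet."""
    return pkt.protocol == UDP and pkt.src_port == MEMCACHED_PORT


print(should_drop(PacketMeta(protocol=UDP, src_port=11211, dst_port=443)))  # True
print(should_drop(PacketMeta(protocol=6, src_port=51514, dst_port=443)))    # False (TCP)
```

The rule keys on the source port rather than the source address, so it does not matter how many thousands of reflectors are involved.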

The GitHub timeline from the company's incident report. The clock that matters is five minutes from onset to the BGP shift into Akamai's scrubbing centers.

Two pieces of internet plumbing did the heavy lifting and are worth naming. BGP is how GitHub handed the problem off: withdrawing route announcements and letting Akamai announce the same prefixes is a routing maneuver, not a firewall change, and it redirects traffic at the level of the global routing table. Anycast is how Akamai then survived the volume: the same destination IP is announced from many physical locations, so a flood aimed at one address gets split across dozens of scrubbing centers by the routing fabric itself, and no single facility eats the whole 1.35 Tbps. The mechanics of that dispersion are covered in how CDNs absorb volumetric DDoS and the routing side in anycast routing.

For context on scale, Akamai called this the largest attack it had seen, more than twice the size of the September 2016 floods that introduced the world to the Mirai botnet. And the record did not last long. Within days, Arbor Networks (now part of NETSCOUT) reported a memcached attack against an unnamed US service provider that peaked at 1.7 Tbps, using the same vector. The technique was out, the open servers were still open, and anyone with a list could point them somewhere.

The cleanup: a kill switch, a one-line default, and a CVE

Three things shut this down, roughly in order of how quickly they took effect.

The fastest was a kill switch, found by researchers at Corero Network Security. Because an open memcached server accepts commands from anyone, a defender under attack could send the reflector the very commands that empty it. A flush_all command invalidates every key in the cache without restarting the server, which wipes the attacker’s planted payload, so the reflected get returns nothing. A shutdown command stops the server outright where it is permitted. Corero reported the flush technique 100% effective in testing with no observed collateral damage, and a researcher released a Python tool, Memfixed, that pulled a list of vulnerable servers from Shodan and sent them flush or shutdown commands in bulk. This is a strange and legally fraught remedy: it works by issuing unauthorized commands to other people’s servers, which is exactly the openness the attack relied on, turned around. It was a stopgap, not a policy.

The durable fix was a default change. Memcached 1.5.6 shipped on 27 February 2018, one day before the GitHub peak, and its headline change was to disable the UDP protocol by default. After that release, a fresh install would not answer UDP at all unless an operator explicitly re-enabled it with -U 11211. Operators who could not upgrade had a two-flag workaround that predated the release: start memcached with -U 0 to turn UDP off, and bind it to localhost with --listen 127.0.0.1 (short form -l 127.0.0.1) so it is unreachable from outside the host in the first place. The advice that should have been in every deployment guide all along became the default.
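
An operator can check a startup command against that advice mechanically. The helper below is a hypothetical audit sketch, not part of memcached. It looks only at the -U and listen flags discussed here, assumes a single listen address, and deliberately treats anything other than loopback as worth a second look.

```python
LOOPBACK = {"127.0.0.1", "::1", "localhost"}


def audit_memcached_args(argv: list[str], udp_on_by_default: bool = True) -> list[str]:
    """Return warnings for a memcached argument list.

    udp_on_by_default models the version: True for builds before 1.5.6,
    False for 1.5.6 and later, where UDP is off unless re-enabled.
    """
    udp_port = 11211 if udp_on_by_default else 0
    listen = None  # None means the default: every interface
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg == "-U" and i + 1 < len(argv):
            udp_port = int(argv[i + 1])
            i += 1
        elif arg in ("-l", "--listen") and i + 1 < len(argv):
            listen = argv[i + 1]
            i += 1
        elif arg.startswith("--listen="):
            listen = arg.split("=", 1)[1]
        i += 1

    findings = []
    if udp_port != 0:
        findings.append(f"UDP is enabled on port {udp_port}; pass -U 0 to disable it")
    if listen not in LOOPBACK:
        findings.append("not bound to loopback; confirm the address is not publicly reachable")
    return findings


print(audit_memcached_args(["-U", "0", "--listen", "127.0.0.1"]))  # []
print(len(audit_memcached_args([])))  # 2 (old defaults: UDP on, all interfaces)
```

A cache that legitimately serves other hosts on a private network will trip the second warning. That is intentional, since the warning is a prompt to check the firewall and not a verdict.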

The vulnerability got a name for tracking: CVE-2018-1000115. The classification is CWE-406, insufficient control of network message volume, which is the formal way of saying network amplification. The record lists memcached 1.5.5 as affected, notes the reported 1:50,000 amplification reachable over UDP port 11211, and points to 1.5.6 as the fix. Calling a misconfiguration-plus-protocol-design a CVE was slightly controversial, since the software did exactly what it was told, but a CVE is how downstream Linux distributions pick up and ship the fixed default, and that is what mattered for getting the change onto running servers.

The net effect showed up in the exposed-server counts within weeks. ISPs and cloud providers began filtering UDP source port 11211 at their edges, distributions shipped the disabled-by-default build, and the number of openly reachable memcached instances dropped sharply, by more than half on the measurements that tracked it. The attack surface did not vanish, but it shrank fast enough that memcached stopped being the amplifier of choice. The arithmetic still works on whatever servers remain open; there are simply far fewer of them, and the edges of more networks now drop the protocol on sight.

It is worth being precise about what each of the three fixes does and does not solve, because they operate at different layers and none of them alone is sufficient. The kill switch is reactive and targets the reflector: it works only while an attack is in progress, requires sending unauthorized commands to machines you do not own, and does nothing to prevent the next attack from a server you did not flush. The version default is preventive and targets the software: it stops new and upgraded installs from answering UDP, but it cannot touch the long tail of old servers that never get updated, which is where the residual exposure lives. Edge filtering of source port 11211 is preventive and targets the network path: it is the most durable of the three because it does not depend on any individual operator patching anything, but it relies on enough transit and access networks bothering to install the filter. Notice that none of the three addresses the root enabler, source-address spoofing. They each break the chain at a different link, which is why the count fell so fast: several independent parties each closed their own piece without waiting for the others.

What the episode actually taught

The memcached attack is remembered for its record and its almost comic asymmetry, a 15-byte request returning three quarters of a megabyte, but the more useful lesson is about a category of software, not one program. Memcached was never broken. It did precisely what its protocol specified: it answered a UDP query, echoed the request ID, and sent the response to the address in the packet. Every one of those behaviors is correct on the trusted private network the software assumed it lived on. The failure was the gap between that assumption and reality, and the reality was tens of thousands of caches bound to public IPs with no firewall, because spinning up a cloud instance and skipping the security-group rule is easy and the default bound to every interface.

That pattern is general. A protocol designed for a closed environment, an operator who exposes it to the open internet by accident, and a connectionless transport that lets attackers forge the return address: those three conditions describe NTP’s monlist before memcached and will describe the next amplifier after it. The defenses are unglamorous and well understood. Disable UDP where the request-response semantics do not need it. Refuse to bind cache and management services to public interfaces. And implement source-address validation at the network edge so a packet claiming to come from a customer’s prefix cannot leave a provider that does not host that prefix, which is the BCP 38 ingress-filtering idea that, fully deployed, would make every reflection attack impossible. The reason these attacks persist is not that the fixes are hard. It is that the cost of a misconfigured server falls on the victim it gets pointed at, not on the operator who left it open, and that incentive has not changed.

What did change, concretely, is the number Shodan returns for port 11211. Before 28 February 2018 it was around 88,000. The default flipped, the edges started filtering, and the count fell by more than half inside a month. The 1.35 Tbps record fell within days to the 1.7 Tbps memcached attack that Arbor reported, and later records were set by other vectors. The open-memcached census never recovered to its pre-2018 size, which is the rare case of an internet exposure that got measurably smaller and stayed that way.

Sources & further reading

Cloudflare (2018), Memcrashed — Major amplification attacks from UDP port 11211 — the original technical writeup, with the 51,200x figure, the 88,000 open-server count, and the -U 0 / localhost mitigation advice.
GitHub Engineering (2018), February 28th DDoS incident report — the primary source for the 1.35 Tbps peak, 126.9 Mpps, the UTC timeline, and the BGP shift to Akamai.
memcached project, protocol.txt — the UDP frame header layout (request ID, sequence, total datagrams, reserved) straight from the source.
memcached project (2018), Release notes for 1.5.6 — the release, dated 27 February 2018, that disabled UDP by default.
MITRE / Ubuntu Security (2018), CVE-2018-1000115 — the CVE record: memcached 1.5.5 affected, CWE-406, 1:50,000 amplification over UDP 11211, fixed in 1.5.6.
The Hacker News (2018), Memcached servers abused for massive amplification DDoS attacks — early coverage citing Cloudflare, Arbor, and Qihoo 360 on the 51,200x factor.
The Hacker News (2018), 1.7 Tbps DDoS attack — memcached UDP reflections set new record — the Arbor Networks / NETSCOUT report of the 1.7 Tbps follow-up days later.
The Hacker News (2018), ‘Kill switch’ to mitigate memcached DDoS attacks — flush ‘em all — Corero’s flush_all / shutdown countermeasure and the Memfixed tool.
Qrator Labs (2018), Understanding the facts of memcached amplification attacks — analysis arguing the theoretical amplification figure is a ceiling, not a working average.
SUSE Support (2018), CVE-2018-1000115: memcached UDP server support allows spoofed traffic amplification DoS — distribution-level writeup of the vulnerability and the disable-UDP remediation.