BGP hijacks and route leaks: the history of the internet's trust problem

The Border Gateway Protocol decides where every packet on the internet goes, and it decides this by believing whatever it is told. A router in Pakistan can announce that it owns YouTube’s address space, and routers in Frankfurt and São Paulo will write that down and forward traffic accordingly. There is no signature to check, no proof of ownership demanded at the door. The protocol that glues sixty-thousand-odd autonomous systems into one network was designed in two days on the back of three napkins in 1989, and the design assumed everyone speaking it was a friend.

That assumption has been wrong for at least thirty years. The question this post answers is not whether BGP can be abused. It is how the abuse works at the protocol level, why the same failure keeps recurring with different actors, and whether the cryptographic patch the industry settled on actually fixes anything. The short version: it fixes the easy half and barely touches the hard half.

The route runs through the mechanism first, then the incidents that made the mechanism famous, then the mitigations. We start with how a router picks a path. Then come the 1997 accident that proved a single misconfigured box could black-hole the internet, the 2008 Pakistan hijack, the 2010 China Telecom reroute, the cryptocurrency thefts of 2018 to 2022, the 2019 Verizon leak, and the 2021 Facebook outage as the inverse case where a network deleted itself. We end with the RFC 7908 leak taxonomy, RPKI, Route Origin Validation, BGPsec, and MANRS, and an honest accounting of what each one stops.

How a router decides to believe you

BGP is a path-vector protocol. Each autonomous system, or AS, is a network under one administrative control with its own number: AS15169 is Google, AS13335 is Cloudflare, AS701 is Verizon. An AS announces the IP prefixes it originates, blocks of addresses written as CIDR ranges like 208.65.152.0/22. It sends those announcements to its neighbors over TCP sessions, and each neighbor that accepts the route prepends its own AS number to the path and passes it along. By the time an announcement has crossed the internet, it carries a list of every AS it traversed, the AS_PATH.
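
The prepending step is easy to picture in a few lines of Python. This is a minimal sketch, not a BGP implementation: `Announcement`, `originate`, and `forward` are names invented for illustration, and the two transit AS numbers come from the range reserved for documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Announcement:
    prefix: str      # CIDR block being announced
    as_path: tuple   # leftmost = most recent AS, rightmost = origin


def originate(prefix, origin_as):
    """The originating AS starts the path with only itself."""
    return Announcement(prefix, (origin_as,))


def forward(ann, local_as):
    """An AS that accepts a route prepends its own number before passing it on."""
    return Announcement(ann.prefix, (local_as,) + ann.as_path)


ann = originate("208.65.152.0/22", 36561)  # YouTube's prefix and AS, as cited in this post
ann = forward(ann, 64500)                  # hypothetical first transit AS
ann = forward(ann, 64501)                  # hypothetical second transit AS
print(ann.as_path)                         # (64501, 64500, 36561)
```

Nothing in `forward` checks that the path it receives is true. The receiver simply adds itself to whatever it was handed.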

When a router holds several routes to the same destination, it runs a decision process to pick one. The single most important rule for understanding hijacks comes before any of that: longest prefix match. A router forwards a packet using the most specific prefix that contains the destination address, regardless of AS path length, regardless of anything else. A /24 covers 256 addresses; a /22 covers 1,024. If one router announces 208.65.153.0/24 and another announces the /22 that contains it, traffic for the 256 addresses in that /24 follows the more specific route. The /22 simply does not compete for those addresses. This is not a bug. It is how aggregation is supposed to work, so that a network can carve up the space it owns. The problem is that BGP never checks whether the announcer owns it.

*Longest prefix match: the more specific announcement captures its slice of the address space no matter who sent it.*
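
To make the rule concrete, here is a small sketch of longest prefix match using Python's standard `ipaddress` module. The two-entry table and the next-hop labels are assumptions for the example. A real router uses a trie in hardware, not a linear scan.

```python
import ipaddress


def longest_prefix_match(table, destination):
    """Return the (network, next_hop) entry with the most specific covering prefix."""
    addr = ipaddress.ip_address(destination)
    candidates = [
        (ipaddress.ip_network(prefix), next_hop)
        for prefix, next_hop in table
        if addr in ipaddress.ip_network(prefix)
    ]
    if not candidates:
        return None
    # The largest prefix length is the most specific route.
    return max(candidates, key=lambda entry: entry[0].prefixlen)


# Two overlapping routes; the next-hop labels are made up for the example.
table = [
    ("208.65.152.0/22", "announcer of the /22"),
    ("208.65.153.0/24", "announcer of the /24"),
]

inside = longest_prefix_match(table, "208.65.153.10")
outside = longest_prefix_match(table, "208.65.152.10")
print(inside[1])   # announcer of the /24
print(outside[1])  # announcer of the /22
```

The function never asks who sent either route. Ownership is not an input.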

If the prefixes are equal length, then the tie-breakers matter. A router prefers routes with higher local preference (a policy knob set by the operator), then shorter AS paths, then a cascade of further rules down to the router ID. So a hijacker has two ways to win: announce a more specific prefix, which wins outright, or announce the same prefix with a path that looks better to some subset of the internet, which wins regionally. Both happen in the wild. Neither requires breaking anything. The attacker just speaks BGP, correctly, and lies about what it owns.
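
The tie-breaking order can be sketched as a sort key. This is a deliberately shortened version of the decision process, assuming only the three steps named above. The `Route` class, the default local preference of 100, and the documentation-range AS numbers are illustrative choices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: tuple
    local_pref: int = 100        # a common default; operators override it by policy
    router_id: str = "0.0.0.0"   # ID of the advertising router, the last-resort tie-breaker


def router_id_key(router_id):
    """Compare router IDs numerically, octet by octet."""
    return tuple(int(octet) for octet in router_id.split("."))


def best_path(routes):
    """Pick one route among candidates for the SAME prefix.

    Simplified ordering: highest local preference, then shortest AS path,
    then lowest router ID. Real routers apply more steps in between.
    """
    return min(
        routes,
        key=lambda r: (-r.local_pref, len(r.as_path), router_id_key(r.router_id)),
    )


legit = Route("208.65.153.0/24", (64500, 64501, 36561), router_id="10.0.0.2")
rival = Route("208.65.153.0/24", (3491, 17557), router_id="10.0.0.9")
print(best_path([legit, rival]).as_path)      # (3491, 17557): the shorter path wins

pinned = Route("208.65.153.0/24", (64500, 64501, 36561), local_pref=200, router_id="10.0.0.2")
print(best_path([pinned, rival]).as_path)     # (64500, 64501, 36561): policy outranks length
```

Local preference is set by each operator, so the same pair of routes can resolve differently in different networks. That is why an equal-length hijack tends to win in some regions and lose in others.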

For a deeper walk through the decision process and convergence, the companion post BGP explained covers it. The thing to carry forward here is that acceptance is the default. A router rejects a route only if its operator wrote a filter to reject it, and for most of BGP’s history most operators wrote very few.

1997: the day AS7007 ate the internet

The first time the whole internet noticed this gap, it was an accident, not an attack. On 25 April 1997 a router at a small Virginia ISP, autonomous system 7007, started announcing a large chunk of the global routing table as if it originated there. The cause was mundane. The router de-aggregated routes it had learned, breaking them into /24 pieces, and re-announced those /24s with AS7007 as the origin. Because they were more specific than the real routes, longest prefix match did the rest. Traffic for thousands of networks bent toward a single ISP in Virginia that had no capacity to carry it.

The result was a black hole that spread across the internet through Sprint, AS7007’s upstream, and outward from there. Disconnecting the offending router did not immediately fix it, because the bad routes lingered in other networks’ tables. The outage is remembered as the first demonstration that one box, misconfigured, could disrupt routing globally. It also set the template that every later incident would follow: more-specific announcements, an upstream that accepted them without checking, and global propagation faster than any human could react.

The lesson the industry drew was to filter customer announcements by prefix. If AS7007 announces a route for space it does not own, its provider should drop the announcement at ingress. This works, when it is configured, and that conditional is the whole story of the next twenty-five years. Filtering is correct, well understood, and unevenly deployed. The networks that do it carefully are not the ones that cause incidents.
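
In its simplest form, an ingress filter is an allow-list with a default deny. The sketch below assumes a hypothetical provider that keeps a per-customer list of prefixes and the longest prefix length it will accept for each. The customer AS numbers and prefixes are documentation values, not real assignments.

```python
import ipaddress

# Hypothetical allow-list: customer AS -> (prefix it may announce, longest accepted length).
CUSTOMER_PREFIXES = {
    64500: [("198.51.100.0/24", 24)],
    64501: [("203.0.113.0/24", 24)],
}


def accept_from_customer(customer_as, announced_prefix):
    """Return True only if the announcement falls inside the customer's registered space."""
    announced = ipaddress.ip_network(announced_prefix)
    for allowed_prefix, max_len in CUSTOMER_PREFIXES.get(customer_as, []):
        allowed = ipaddress.ip_network(allowed_prefix)
        if announced.subnet_of(allowed) and announced.prefixlen <= max_len:
            return True
    return False  # default deny: anything unlisted is dropped at ingress


print(accept_from_customer(64500, "198.51.100.0/24"))  # True
print(accept_from_customer(64500, "208.65.153.0/24"))  # False: not this customer's space
print(accept_from_customer(64500, "198.51.100.0/25"))  # False: more specific than allowed
```

The logic is trivial. The hard part is keeping the list accurate for every customer session, which is where deployment has historically fallen short.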

2008: Pakistan, YouTube, and a censorship order that escaped

The 2008 event is the canonical teaching example because the timeline is clean and the motive was not malicious in the global sense. On 24 February 2008, Pakistan Telecom, AS17557, received a government order to block YouTube inside Pakistan. The mechanism their engineers chose was a routing trick: announce a more-specific route for YouTube’s address space internally, so that Pakistani users trying to reach YouTube would be sent to a null destination instead. Take YouTube’s 208.65.152.0/22 and announce 208.65.153.0/24, a more specific piece, pointed at nothing.

The mistake was that the announcement did not stay inside Pakistan. AS17557’s upstream provider, PCCW Global (AS3491), accepted the /24 and propagated it to the rest of the internet without filtering it. From the RIPE NCC’s Routing Information Service, the sequence is precise. At 18:47 UTC, AS17557 began announcing 208.65.153.0/24 and PCCW carried it onward. Because the /24 was more specific than YouTube’s /22, longest prefix match steered a large fraction of the world’s YouTube traffic into a Pakistani null route. YouTube was offline for much of the internet.

*The fix came from below: YouTube fought a more-specific war of its own before PCCW pulled the plug.*

What YouTube did to fight back is instructive. At 20:07 UTC the company began re-announcing the same 208.65.153.0/24 from its own AS36561, splitting the traffic. Then at 20:18 it went more specific than the hijacker, announcing 208.65.153.0/25 and 208.65.153.128/25, two /25s that each beat the /24 under longest prefix match. That reclaimed the traffic for networks close enough to hear YouTube’s announcements. The hijack ended cleanly at 21:01 UTC when PCCW withdrew all of AS17557’s prefixes, roughly two hours and fourteen minutes after it began. The cure was not a filter. It was YouTube winning a more-specific arms race long enough for a human at PCCW to intervene.
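
The arithmetic of the counter-move can be checked with the standard `ipaddress` and `datetime` modules. This sketch reproduces only the prefix split and the duration from the timestamps quoted above, and does not model propagation.

```python
import ipaddress
from datetime import datetime

hijacked = ipaddress.ip_network("208.65.153.0/24")

# Splitting the /24 one bit further yields the two /25s YouTube announced.
halves = [str(net) for net in hijacked.subnets(prefixlen_diff=1)]
print(halves)  # ['208.65.153.0/25', '208.65.153.128/25']

# Every address in the hijacked /24 sits in exactly one of the /25s,
# so the /25s out-match the /24 for the whole block.
covered = all(
    sum(addr in ipaddress.ip_network(half) for half in halves) == 1
    for addr in hijacked
)
print(covered)  # True

# Duration of the incident, from the RIPE timestamps quoted in the text.
start = datetime(2008, 2, 24, 18, 47)
end = datetime(2008, 2, 24, 21, 1)
print(end - start)  # 2:14:00
```

Each round of this escalation doubles the number of routes needed to describe the same 256 addresses.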

Two structural facts come out of this. First, a routing leak does not respect the intent behind it. Pakistan Telecom wanted to censor YouTube in Pakistan and accidentally censored it everywhere, because BGP has no concept of “internal only” that the protocol enforces. Second, the defense of last resort for the victim is to announce even more specifics, which means the global routing table bloats every time someone is under attack. That is a tax everyone pays.

2010: China Telecom and the difference between a leak and an interception

In April 2010, China Telecom (AS23724) announced a large set of prefixes it did not originate, and for a window of roughly eighteen minutes a slice of global traffic, including routes touching US government and military networks, flowed into China Telecom’s network and back out. The widely repeated “15 percent of the internet” figure is misleading and was disputed at the time. It referred to the number of prefixes announced, on the order of tens of thousands, not the share of actual traffic volume, which was far smaller. The accurate claim is narrower and more interesting than the headline.

This incident introduced a distinction worth holding onto. A blunt hijack black-holes traffic: it goes to the hijacker and dies, as in Pakistan/YouTube. An interception is subtler. The hijacker announces a path that pulls traffic in, but still has a working route to the real destination, so packets transit the hijacker’s network and continue on to where they were going. The victim may notice nothing beyond a latency bump. The attacker gets to see, log, or tamper with the traffic in the middle. Whether the 2010 event was a deliberate interception experiment or an honest leak that happened to work as one was never settled publicly, and China Telecom denied intent. The capability it demonstrated is the part that mattered to everyone watching: announce the right paths and you can route a meaningful fraction of someone else’s traffic through your own routers without anyone’s permission.

Interception is the model that makes BGP a security problem rather than just a reliability problem. If you can pull traffic through your network transparently, you can do everything a man in the middle does. You can attempt to strip TLS, serve forged DNS, or simply collect metadata about who is talking to whom. The encryption on top of the connection limits the damage, which is exactly why the next wave of incidents went after the one thing encryption depends on: getting the right certificate to the right place.

2018-2022: hijacking BGP to steal cryptocurrency

By the late 2010s, attackers had worked out that BGP hijacking pairs beautifully with the public certificate system. The recipe: hijack the IP space of a target’s DNS or web server, stand up your own server on that space, obtain a real TLS certificate for the domain through a domain-validated challenge (which checks control of the IP or DNS, both of which you now control), and harvest whatever the victims send. The companion post on how BGP hijacking enabled crypto theft goes deeper; here is the shape of the canonical case.

On 24 April 2018, an attacker hijacked five /24 prefixes belonging to Amazon’s Route 53 DNS service. The announcements originated from eNet (AS10297) in Columbus, Ohio, and propagated through peers including Hurricane Electric. For roughly two hours, DNS queries that should have reached Amazon’s authoritative nameservers for myetherwallet.com were pulled to a rogue server hosted in a Chicago data center, which answered with the IP of a phishing site. In this case the attacker could not get a valid certificate for the wallet domain, so browsers showed a TLS warning, and the visitors who clicked past it handed over their keys. The reported theft was in the low hundreds of thousands of dollars of Ethereum. The damage was not far larger for one blunt reason: several large transit providers were running prefix filters and refused to carry the bad routes.

The pattern recurred. In February 2022 the KLAYswap exchange in South Korea lost around two million dollars when an attacker hijacked the IP space serving a JavaScript file, obtained a valid certificate, and swapped the script for malicious code that rewrote transaction destinations. In August 2022 the Celer Bridge attacker forged BGP announcements and, notably, worked around RPKI origin validation by crafting announcements whose origin AS matched what the signed records expected while still controlling the path. That last detail is the one to remember when we get to mitigations: origin validation checks who originates a prefix, not who sits in the middle of the path, and a sufficiently careful attacker can satisfy the origin check and still intercept.

2019: Verizon, a BGP optimizer, and a leak that took down a chunk of the web

Not every disaster is an attack. On 24 June 2019, around 10:30 UTC, a small ISP in Pennsylvania, DQE Communications (AS33154), was running a “BGP optimizer,” a class of appliance that improves routing inside a network by splitting prefixes into more-specific pieces and steering them down better paths. Those more-specifics are meant to stay internal. DQE leaked them to a customer, Allegheny Technologies (AS396531), which passed them up to its transit provider Verizon (AS701). Verizon, lacking prefix filtering on that session, accepted the more-specifics and announced them to the world.

Because the optimizer had de-aggregated other networks’ prefixes into smaller pieces, those leaked more-specifics beat the legitimate aggregates under longest prefix match. Cloudflare’s 104.20.0.0/20, for instance, was split into 104.20.0.0/21 and 104.20.8.0/21, and traffic followed the smaller pieces straight into a small Pennsylvania steel company’s network, which promptly fell over. Large parts of the internet, Cloudflare and others, were degraded or unreachable for over an hour. This is a textbook RFC 7908 Type 6 leak, accidental propagation of internal more-specifics, amplified by one large network that did not filter.

*The leak needed two failures: an optimizer that de-aggregated, and a tier-1 that did not filter. RPKI would have caught the second.*

Cloudflare’s writeup made the prevention explicit. Cloudflare had published RPKI Route Origin Authorizations for its space with a maximum prefix length, which means a /21 carved out of its /20 fails origin validation as Invalid. Had Verizon been doing RPKI origin validation, it would have dropped the leaked more-specifics automatically. It was not. This incident, more than the crypto thefts, is what pushed RPKI from a standards-track curiosity toward something networks felt obligated to deploy.

2021: Facebook deletes itself from the routing table

The Facebook outage of 4 October 2021 belongs in a BGP history not because anyone attacked anything, but because it shows the other failure mode. A hijack is BGP carrying a route that should not exist. A withdrawal outage is BGP correctly removing routes that should still exist, with consequences that cascade through everything that depends on reachability.

During routine maintenance, a command meant to assess backbone capacity instead disconnected Facebook’s data centers from its backbone. Facebook’s authoritative DNS servers were built to withdraw their own BGP route announcements if they lost their connection to the data centers, a health-check behavior intended to steer traffic away from a nameserver that cannot answer correctly. With the backbone severed, every nameserver concluded it was unhealthy and pulled its routes. From Cloudflare’s vantage, the DNS prefixes 185.89.218.0/23 and 129.134.30.0/23 went from present to “not in table.” At that point no resolver on earth could find a Facebook nameserver, because the IP space the nameservers lived on had been withdrawn from BGP. Facebook, Instagram, and WhatsApp were gone for over six hours.
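
The health-check rule is simple enough to write down, and writing it down shows the failure. This is a sketch of the behavior as described above, not Facebook's code: the fleet layout, site names, and function names are invented, and only the two prefixes come from the text.

```python
def should_announce(backbone_reachable):
    """Health-check rule as described: no path to the data centers, no route."""
    return backbone_reachable


def announced_prefixes(nameservers):
    """Collect the prefixes still announced across a fleet of nameservers."""
    return {
        ns["prefix"]
        for ns in nameservers
        if should_announce(ns["backbone_reachable"])
    }


# Hypothetical fleet: site names are invented; the prefixes are the two named above.
fleet = [
    {"site": "a", "prefix": "185.89.218.0/23", "backbone_reachable": True},
    {"site": "b", "prefix": "129.134.30.0/23", "backbone_reachable": True},
    {"site": "c", "prefix": "185.89.218.0/23", "backbone_reachable": True},
]

# One site loses its backbone link: other sites still announce both prefixes.
fleet[0]["backbone_reachable"] = False
after_one_failure = announced_prefixes(fleet)
print(sorted(after_one_failure))  # ['129.134.30.0/23', '185.89.218.0/23']

# The backbone itself goes away: every site fails the same check at once.
for ns in fleet:
    ns["backbone_reachable"] = False
after_backbone_loss = announced_prefixes(fleet)
print(after_backbone_loss)  # set()
```

The rule is sound for independent failures. It has no answer for a failure that every node observes at the same moment.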

The recovery was slow for reasons that compound the irony. With DNS down, Facebook’s own internal tools and the badge systems for its buildings reportedly failed, which slowed the engineers trying to physically reach the routers. The BGP timeline from Cloudflare shows Facebook ceasing to announce its DNS routes at 15:58 UTC and renewed announcements only around 21:00 UTC. This is the failure that automated, self-protecting routing produces. Each component did exactly what it was told. The nameservers correctly diagnosed themselves as useless and correctly removed themselves from the routing table, and in doing so removed the only way for anyone to recover the system over the network.

It belongs next to the hijacks because it shows the same root property from the opposite direction. BGP is a control plane that propagates statements about reachability with no human in the loop and no brake. When the statement is a lie, you get a hijack. When the statement is an unintended truth, you get Facebook. Either way the protocol does its job at machine speed and the blast radius is global before anyone can type a command. If you want the reachability layer underneath this, the posts on anycast routing and DNS resolution end to end cover how a withdrawn prefix turns into a failed lookup.

RFC 7908: giving the failure a name

Before you can build defenses, you have to define the thing precisely, and the industry took until June 2016 to do that for leaks. RFC 7908, “Problem Definition and Classification of BGP Route Leaks,” defines a route leak as the propagation of routing announcements beyond their intended scope, where an AS passes along a learned route in violation of the intended policies of the receiver, the sender, or some AS earlier in the path. The key phrase is “intended scope.” BGP itself has no field that says how far an announcement is allowed to travel. The intent lives in business relationships, customer versus peer versus provider, that the protocol does not encode.

The RFC enumerates six types. Type 1 is the hairpin turn with a full prefix. Type 2 is a lateral ISP-to-ISP-to-ISP leak. Types 3 and 4 cover a network leaking its transit provider’s prefixes to a peer, or a peer’s prefixes to a transit provider, the classic “valley” violations of the customer-peer-provider hierarchy. Type 5 is prefix re-origination where the data path still reaches the legitimate origin. Type 6 is the accidental leak of internal prefixes and more-specifics, which is the Verizon 2019 case exactly. Naming these did not stop them, but it gave operators and vendors a shared vocabulary, and it set up the work on encoding relationship intent into the protocol so that a leak becomes detectable rather than merely regrettable.
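
The first four leak types reduce to one export rule over business relationships. The sketch below encodes the conventional "valley-free" rule. It is an illustration of the idea, not text from the RFC, and it ignores Types 5 and 6, which are about origination and prefix length rather than relationships.

```python
CUSTOMER, PEER, PROVIDER = "customer", "peer", "provider"


def may_export(learned_from, export_to):
    """Conventional valley-free export rule.

    Routes learned from a customer may be sent to anyone. Routes learned
    from a peer or a provider may be sent only to customers.
    """
    if learned_from == CUSTOMER:
        return True
    return export_to == CUSTOMER


def is_leak(learned_from, export_to):
    return not may_export(learned_from, export_to)


print(is_leak(PROVIDER, PROVIDER))  # True: hairpin turn (Type 1)
print(is_leak(PEER, PEER))          # True: lateral peer-to-peer leak (Type 2)
print(is_leak(PROVIDER, PEER))      # True: provider routes sent to a peer (Type 3)
print(is_leak(PEER, PROVIDER))      # True: peer routes sent to a provider (Type 4)
print(is_leak(CUSTOMER, PROVIDER))  # False: ordinary customer route going upstream
```

In the 2019 chain, Allegheny learned routes from one provider, DQE, and exported them to another, Verizon, which is the first case. The rule needs the relationship as an input, and that is exactly what a BGP UPDATE does not carry.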

RPKI and Route Origin Validation: signing the easy half

The Resource Public Key Infrastructure is the answer the industry actually deployed. Standardized starting with RFC 6480, RPKI is a certificate hierarchy rooted in the five Regional Internet Registries that mirrors the allocation of IP space. If an RIR allocated a block to your network, you hold a resource certificate proving it, and you can use that certificate to sign a Route Origin Authorization, a ROA. A ROA is a signed statement: “AS13335 is authorized to originate 104.16.0.0/13, with prefixes no longer than /13.” That maxLength field is what would have caught the 2019 leak.

Route Origin Validation, specified in RFC 6811, is the consumption side. A router fetching the global set of ROAs can classify every announcement it hears into one of three states. Valid means a ROA covers this prefix and origin AS and the prefix length is within the allowed maximum. Invalid means a ROA covers the prefix but the origin AS is wrong or the prefix is more specific than the maxLength allows, which is the signature of a hijack or a de-aggregation leak. NotFound, the largest bucket by far, means no ROA exists for this space at all. Operators then set policy: most who deploy ROV drop Invalid routes and accept Valid and NotFound.

*Three states, one decision. NotFound is the bucket that swallows most of the table, which is why ROV is only half a fix.*
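
The three-way classification fits in a dozen lines. This is a sketch of the logic described above, not a validator: the `ROA` class and function names are invented, and the single ROA is an assumption for illustration, using the /20 from the 2019 incident with a maxLength of 20.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class ROA:
    prefix: str
    max_length: int
    origin_as: int


def validate_origin(roas, announced_prefix, origin_as):
    """Classify an announcement as 'valid', 'invalid', or 'notfound'."""
    announced = ipaddress.ip_network(announced_prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        if announced.version != roa_net.version or not announced.subnet_of(roa_net):
            continue
        covered = True  # at least one ROA speaks for this address space
        if roa.origin_as == origin_as and announced.prefixlen <= roa.max_length:
            return "valid"
    return "invalid" if covered else "notfound"


def accept(state):
    """The common policy: drop Invalid, keep Valid and NotFound."""
    return state != "invalid"


# Assumed ROA for illustration only.
roas = [ROA("104.20.0.0/20", 20, 13335)]

print(validate_origin(roas, "104.20.0.0/20", 13335))    # valid
print(validate_origin(roas, "104.20.0.0/21", 13335))    # invalid: longer than maxLength
print(validate_origin(roas, "104.20.0.0/20", 64500))    # invalid: wrong origin AS
print(validate_origin(roas, "198.51.100.0/24", 64500))  # notfound: no ROA covers it
```

The last line is the important one. An announcement for unsigned space passes `accept` no matter who sends it.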

The deployment numbers tell an honest story. The NIST RPKI Monitor, which compares ROAs against BGP table dumps from more than twenty RouteViews collectors four times a day, reports roughly 35 percent of IPv4 routes and 34 percent of IPv6 routes as RPKI-valid, with most of the remainder, over 60 percent, sitting in NotFound because no ROA has been published for that space. So a third of the table is signed and protected against origin spoofing. Two-thirds is not, simply because its holders have never created a ROA. The trend is upward and has been for years, helped along by Cloudflare’s isbgpsafeyet.com campaign which named and shamed networks that did not validate, but the long tail of unsigned space is the ceiling RPKI keeps running into.

The deeper limit is in the name. Origin validation. A ROA proves which AS is allowed to originate a prefix. It says nothing about the rest of the AS path. An attacker who announces a victim’s prefix with the victim’s real AS kept as the origin, the last entry in the AS_PATH, while placing its own AS ahead of it in the path, produces an announcement that passes origin validation and still hijacks traffic. This is sometimes called a forged-origin or path-manipulation hijack, and it is the technique the Celer Bridge attacker used to defeat ROV. RPKI closes the door on naive origin spoofing, which covers most accidents and a lot of clumsy attacks. It does not close the door on a competent path attacker.
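
The blind spot is visible once you see how little of the path the origin check reads. This sketch leaves out prefix-length checks and uses documentation-range AS numbers, so the victim and attacker here are hypothetical.

```python
def origin_as(as_path):
    """The origin is the rightmost AS in the path; everything left of it is transit."""
    return as_path[-1]


def passes_origin_check(as_path, authorized_origin):
    """What origin validation effectively asks, with prefix-length checks left out."""
    return origin_as(as_path) == authorized_origin


AUTHORIZED = 64500   # hypothetical victim AS named in the ROA
ATTACKER = 64511     # hypothetical attacker AS

honest_path = (64501, 64502, AUTHORIZED)
forged_path = (ATTACKER, AUTHORIZED)  # attacker claims a direct link to the victim

print(passes_origin_check(honest_path, AUTHORIZED))  # True
print(passes_origin_check(forged_path, AUTHORIZED))  # True: the forged path also passes
print(len(forged_path) < len(honest_path))           # True: and it looks shorter
```

The forged path is not only accepted, it is attractive, because a shorter AS path is one of the tie-breakers routers use.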

BGPsec: validating the hard half, which nobody runs

The hard half is the AS path, and the standard that addresses it is BGPsec, RFC 8205, published in 2017. Where RPKI signs the origin, BGPsec signs the path. Each AS that forwards a BGPsec-protected announcement adds a cryptographic signature over the path so far, so a receiver can verify that the route really traversed exactly the sequence of ASes it claims, with no AS inserted, removed, or forged. In principle this defeats the path-manipulation attacks that origin validation cannot see.
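
The chaining idea can be shown with a toy. The sketch below is not BGPsec: real BGPsec uses per-router ECDSA key pairs certified through the RPKI and a binary wire format, while this uses a keyed hash from the standard library purely as a stand-in for a signature. The keys, AS numbers, and function names are all invented. What it does preserve is the structure, where each hop signs the prefix, the AS it is sending to, and the signature before it.

```python
import hashlib
import hmac

# Stand-in keys. A keyed hash replaces real signatures only so the sketch runs
# on the standard library; it is NOT publicly verifiable the way ECDSA is.
KEYS = {64500: b"key-64500", 64501: b"key-64501", 64502: b"key-64502"}


def sign_hop(signer_as, target_as, prefix, previous_sig):
    """Bind this hop to the prefix, the next AS, and everything signed before it."""
    message = f"{prefix}|{signer_as}|{target_as}|".encode() + previous_sig
    return hmac.new(KEYS[signer_as], message, hashlib.sha256).digest()


def build_chain(prefix, hops):
    """hops is a list of (signer_as, target_as) pairs, origin first."""
    sigs, previous = [], b""
    for signer_as, target_as in hops:
        previous = sign_hop(signer_as, target_as, prefix, previous)
        sigs.append(previous)
    return sigs


def verify_chain(prefix, hops, sigs):
    """Recompute every hop; an inserted, removed, or altered AS breaks the chain."""
    if len(hops) != len(sigs):
        return False
    previous = b""
    for (signer_as, target_as), sig in zip(hops, sigs):
        if signer_as not in KEYS:
            return False
        expected = sign_hop(signer_as, target_as, prefix, previous)
        if not hmac.compare_digest(expected, sig):
            return False
        previous = sig
    return True


prefix = "198.51.100.0/24"
hops = [(64500, 64501), (64501, 64502)]
sigs = build_chain(prefix, hops)

print(verify_chain(prefix, hops, sigs))                   # True
print(verify_chain(prefix, [(64500, 64502)], sigs[:1]))   # False: middle AS removed
print(verify_chain("203.0.113.0/24", hops, sigs))         # False: prefix swapped
```

Because the origin's signature names the AS it sent the route to, a third party cannot lift that signature and claim the origin as its own neighbor.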

In practice nobody runs it, and the reasons are structural rather than political. Every UPDATE has to be cryptographically verified, which is a heavy load on routers built for fast-path forwarding, not signature checking. Signatures accumulate along the path, so longer paths mean bigger messages and more verification work. Worst of all for adoption, BGPsec only helps if every AS along a path participates, which makes the value to any early adopter close to zero until a critical mass exists. That is the deployment trap incremental security protocols fall into, and BGPsec fell straight in. Years after standardization there is no meaningful production deployment. The path-validation problem remains effectively open, and the industry’s working answer is a mix of partial measures rather than the clean cryptographic fix BGPsec promised.

There is newer work aimed at the leak-specific slice of the path problem, encoding the customer-peer-provider relationship into BGP so that a router can detect a valley violation, the kind of relationship breach behind most leaks, without verifying the full cryptographic path. That work targets exactly the RFC 7908 leak types and asks less of routers than BGPsec does, which is why it has a better chance of seeing real deployment. It is not finished, and it is not everywhere.

MANRS: norms where the cryptography runs out

Because the technical fixes are partial and unevenly deployed, a lot of the actual security on the wire comes from operators agreeing to behave. MANRS, Mutually Agreed Norms for Routing Security, is that agreement written down. It launched under the Internet Society and moved to the Global Cyber Alliance in 2024. It asks participating networks to commit to a small set of concrete actions: filter your own and your customers’ announcements so you never propagate something incorrect, deploy source-address validation against spoofing, keep your contact and routing data current in the public registries so others can validate you, and publish that data globally, which in practice means maintaining RPKI ROAs and IRR records.

None of this is cryptographically enforced. MANRS is a norm, and its strength is that the networks causing the worst incidents are usually the ones not filtering, so getting more networks to filter correctly removes a lot of accidents at the source. The reported compliance figures for the compulsory actions among participants run above 97 percent, which is encouraging until you remember that the operators who join MANRS are self-selected for caring. The Verizons of the world that accept an unfiltered leak from a customer are, almost by definition, not the careful ones. A voluntary norm reaches the people who were already going to do the right thing. The hard cases sit outside it.

What thirty years of the same incident teaches

The throughline from AS7007 in 1997 to Celer Bridge in 2022 is that BGP’s security has improved at the origin and barely at the path, and that the protocol’s core property, machine-speed propagation of unverified statements about reachability, is intact. A more-specific announcement still wins. An upstream that does not filter still propagates whatever it is handed. The difference today is that a third of the routing table is signed, so the naive version of the attack now fails against any network running ROV, and the trend line is going the right way. That is real progress and it is also a third of the way through a problem the industry has been working on for a quarter century.

The incidents cluster into a small number of shapes. There is the censorship order that escaped its jurisdiction, the optimizer leak that a tier-1 amplified, the interception that may or may not have been deliberate, the certificate-laundering crypto theft, and the self-inflicted withdrawal. Different actors, different motives, the same protocol behavior underneath each one. RPKI stops the first and second cleanly. It does not stop a competent path attacker, it does nothing about the Facebook-style withdrawal because that route really was being legitimately removed, and it can only protect space whose holder bothered to sign it.

The honest assessment is that the internet still runs on trust, just less of it than it did in 2008. A router in another country can still announce your address space. The new fact is that if you signed a ROA and the networks between you and the attacker run validation, that announcement gets dropped before it spreads, and that improvement happened one ROA and one filter at a time, with no flag day and no central authority forcing it. The remaining two-thirds of unsigned space, and the entirely unsolved path-validation problem, are where the next decade of this story gets written.

Sources & further reading

RIPE NCC (2008), YouTube Hijacking: A RIPE NCC RIS Case Study — the minute-by-minute UTC timeline of the Pakistan Telecom hijack with prefixes and AS numbers.
Sriram et al., IETF (2016), RFC 7908: Problem Definition and Classification of BGP Route Leaks — the formal definition of a route leak and the six-type taxonomy.
Lepinski & Sriram, IETF (2017), RFC 8205: BGPsec Protocol Specification — the path-validation standard that signs the full AS path, still effectively undeployed.
Mohapatra et al., IETF (2013), RFC 6811: BGP Prefix Origin Validation — defines the Valid / Invalid / NotFound classification used by ROV.
Cloudflare (2019), How Verizon and a BGP Optimizer Knocked Large Parts of the Internet Offline — the DQE/Allegheny/Verizon leak with AS numbers and the RPKI prevention argument.
Cloudflare (2021), Understanding How Facebook Disappeared from the Internet — BGP withdrawal timeline and the two DNS prefixes that vanished from the table.
Internet Society (2018), Amazon’s Route 53 BGP Hijack — the MyEtherWallet DNS hijack with eNet AS10297 and the affected /24s.
NANOG / LACNIC (2023), A Brief History of the Internet’s Biggest BGP Incidents — survey from AS7007 through the crypto-bridge thefts.
NIST (2026), NIST RPKI Monitor — live measurement of RPKI-valid, invalid, and NotFound share across the IPv4 and IPv6 tables.
Global Cyber Alliance, Mutually Agreed Norms for Routing Security (MANRS) — the four operator actions and the implementation guidance behind them.
Wikipedia, AS 7007 incident — the April 1997 de-aggregation leak that first demonstrated global BGP fragility.