SSRF and the cloud-metadata endpoint: the attack that breached Capital One
There is an HTTP server inside every EC2 instance that no firewall protects, that requires no password, and that will hand out the instance’s cloud credentials to anyone on the box who asks nicely. It lives at a fixed address, 169.254.169.254, and it has been there since 2009. For most of that history it answered any plain GET request without a second thought. That design is fine right up until your application can be tricked into making a request on an attacker’s behalf. Then the attacker is, effectively, on the box.
That trick has a name: server-side request forgery. In March 2019 a single SSRF bug, sitting in a misconfigured web application firewall in front of Capital One, was walked all the way to that metadata endpoint, used to steal temporary IAM credentials, and turned into the exfiltration of roughly 106 million credit-card applications from S3. This post is a mechanism-level walk through how SSRF reaches the metadata service, what exactly went wrong at Capital One, and how AWS’s IMDSv2 redesign is built specifically to break that chain. It is defensive throughout. The goal is to understand why the system failed, not to hand anyone a recipe.
Here is the route. First, what SSRF actually is and why it is different from the request forgery you already know. Then the link-local metadata endpoint, what it returns, and why credentials live there. Then the Capital One chain itself, reconstructed from the indictment and the regulators’ findings. Then IMDSv2, field by field, and why a PUT request and a one-hop token defeat the overwhelming majority of SSRF. Finally, the layered defenses that hold when any single control fails, and the awkward fact that IMDSv1 is still reachable on a lot of running instances in 2026.
What server-side request forgery actually is
PortSwigger’s definition is the one to anchor on: server-side request forgery is “a web security vulnerability that allows an attacker to cause the server-side application to make requests to an unintended location.” The key word is server-side. The forged request originates from the application’s own backend, from inside the trust boundary, using the server’s network position and often its credentials.
This is what makes SSRF more dangerous than its better-known cousin, cross-site request forgery. CSRF abuses a victim’s browser to make a request the user did not intend, and it is constrained by the same-origin policy and the cookie rules the browser enforces. SSRF has no browser in the loop and no same-origin policy to obey. The server makes the request, so the request inherits everything the server can reach: internal admin panels on RFC 1918 addresses, databases bound to localhost, internal APIs that trust anything from inside the VPC, and the link-local metadata endpoint. The same-origin model that protects browser-side requests is a separate subject, covered in CORS, the same-origin policy, and the long history of cross-origin trust; none of it applies here, because the request never touches a browser.
The vulnerable pattern is almost always the same. An application takes a URL, or something that becomes a URL, from user input and fetches it server-side. A webhook tester. A PDF renderer that loads remote images. An “import from URL” feature. An image-proxy. A link-preview generator. A document converter. The developer expected a public web address and got a request to http://169.254.169.254/ instead.
When the response comes back to the attacker, that is plain SSRF. When it does not, when the application fetches the URL but never reflects the body, that is blind SSRF. PortSwigger puts it this way: blind SSRF arises when “you can cause an application to issue a back-end HTTP request to a supplied URL, but the response from the back-end request is not returned in the application’s front-end response.” Blind variants are harder to exploit and usually detected with out-of-band techniques, watching for a callback to a server the attacker controls, but blind does not mean safe. It means the attacker works without seeing the answers, not that the answers are protected.
There is a further wrinkle that matters for the metadata case. SSRF is often discussed as if it were purely an HTTP-to-HTTP problem, but the protocol of the forged request depends on what the server-side fetcher supports. Where the application uses a library that understands more than HTTP, an attacker may be able to coax it into file://, ftp://, gopher://, or dict:// requests. The gopher:// scheme is the dangerous one, because it lets an attacker write nearly arbitrary bytes to a TCP socket, including the carriage-return-line-feed sequences that frame an HTTP or Redis or SMTP request, which is how blind SSRF sometimes escalates into something closer to remote code execution. OWASP’s prevention guidance reflects this directly: reject dangerous schemes outright and accept only the ones you actually need, normally just http and https. For the metadata endpoint a plain HTTP GET is all that is required, so the scheme question is a sideline here, but it is the reason a denylist that only thinks about HTTP is already losing.
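A scheme allowlist of the kind OWASP describes can be sketched in a few lines. This is a minimal illustration, assuming the application only ever needs web URLs; `ALLOWED_SCHEMES` and `scheme_is_allowed` are names invented for the example.

```python
from urllib.parse import urlsplit

# Assumption: this application only ever needs to fetch web URLs.
ALLOWED_SCHEMES = {"http", "https"}


def scheme_is_allowed(url: str) -> bool:
    """Return True only if the URL's scheme is on the allowlist."""
    # urlsplit lowercases the scheme, so "GOPHER://" and "gopher://" compare equal.
    return urlsplit(url).scheme in ALLOWED_SCHEMES


for candidate in (
    "https://example.com/image.png",
    "gopher://internal.example/",
    "file:///etc/hostname",
    "http://169.254.169.254/latest/meta-data/",
):
    print(candidate, "->", scheme_is_allowed(candidate))
```

Note that the last URL passes. A scheme check removes gopher:// and file:// from the picture, but it says nothing about where an http:// request is going, so it can only ever be one layer.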
The link-local metadata endpoint
169.254.169.254 is not a normal IP. It sits in the 169.254.0.0/16 link-local range, which by definition is not routable beyond the local link. On an EC2 instance, the hypervisor intercepts traffic to that address and answers it locally. Every instance sees the same address; each gets its own answers. There is an IPv6 form too, fd00:ec2::254, on Nitro-based instances in IPv6 subnets.
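Python's standard `ipaddress` module makes the "not a normal IP" point easy to check. The sketch below only classifies the two addresses named above; `is_publicly_routable` is a helper invented for the example, and it leans on the module's own `is_global` classification.

```python
import ipaddress

LINK_LOCAL_V4 = ipaddress.ip_network("169.254.0.0/16")
IMDS_V4 = ipaddress.ip_address("169.254.169.254")
IMDS_V6 = ipaddress.ip_address("fd00:ec2::254")


def is_publicly_routable(address: str) -> bool:
    """True only for addresses the ipaddress module classifies as global."""
    return ipaddress.ip_address(address).is_global


print(IMDS_V4 in LINK_LOCAL_V4)                  # membership in 169.254.0.0/16
print(IMDS_V4.is_link_local)                     # the module's link-local flag
print(is_publicly_routable("169.254.169.254"))   # IPv4 metadata address
print(is_publicly_routable("fd00:ec2::254"))     # IPv6 form (unique local range)
```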
What does it serve? A tree of metadata about the instance, reachable under http://169.254.169.254/latest/meta-data/. The top level alone lists ami-id, hostname, instance-id, instance-type, local-ipv4, mac, public-keys/, security-groups, and a couple dozen more. Most of it is harmless. Knowing your own AMI ID does not help an attacker much.
One branch is not harmless. Under iam/security-credentials/<role-name> the service returns the temporary security credentials for whatever IAM role is attached to the instance. These are real, working AWS credentials: an access key ID, a secret access key, and a session token, with an expiry. They exist so that code on the instance can call AWS APIs without anyone hard-coding a long-lived key into the application. That is genuinely good design. An EC2 instance running an app that needs to read from S3 gets a role, the SDK fetches short-lived credentials from the metadata service, and nobody ever pastes a secret key into a config file.
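For a sense of what that branch returns, here is a sketch that parses a document in the documented shape. Every value in `SAMPLE_DOCUMENT` is a made-up placeholder, and `seconds_until_expiry` is a helper invented for the example.

```python
import json
from datetime import datetime, timezone

# Placeholder document in the shape the credentials path returns.
# Every value here is invented for illustration; none is a real credential.
SAMPLE_DOCUMENT = """
{
  "Code": "Success",
  "LastUpdated": "2030-01-01T12:00:00Z",
  "Type": "AWS-HMAC",
  "AccessKeyId": "ASIAEXAMPLEEXAMPLE",
  "SecretAccessKey": "example-secret-not-real",
  "Token": "example-session-token-not-real",
  "Expiration": "2030-01-01T18:00:00Z"
}
"""


def seconds_until_expiry(document: dict, now: datetime) -> float:
    """How long the temporary credentials remain valid, measured from `now`."""
    expires = datetime.strptime(
        document["Expiration"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return (expires - now).total_seconds()


credentials = json.loads(SAMPLE_DOCUMENT)
print(sorted(credentials))  # the field names an SDK reads from the response
```

The three fields that matter to an attacker are `AccessKeyId`, `SecretAccessKey`, and `Token`. The `Expiration` field is what makes the model better than static keys: the window closes on its own.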
The catch is the threat model. The metadata service trusts the network. Its security assumption is that anything able to make an HTTP request from inside the instance is authorized to act as the instance. For local code, fair enough. For an application that can be coerced into making arbitrary requests, that assumption collapses. SSRF turns “make an HTTP request from inside the instance” into a primitive the attacker controls, and the metadata service hands over the credentials, because from its point of view the request came from the right place.
*Most of the metadata tree is recon at worst. One branch returns live IAM credentials, and the service hands them to any local HTTP request.*
This is not an AWS-only problem. Google Cloud, Azure, and others all expose a metadata service on the same 169.254.169.254 address, with their own paths and their own credential-bearing branches. The specifics differ. Google Cloud’s metadata server has long required a request header, historically Metadata-Flavor: Google, before it returns anything sensitive, which is the same idea IMDSv2 later adopted: demand something a naive URL-fetch SSRF cannot supply. Azure’s instance metadata service similarly expects a Metadata: true header and refuses requests that arrive with an X-Forwarded-For. The pattern, a link-local HTTP endpoint that trusts the network and serves credentials, is industry-wide, and so is the realisation that a single mandatory header raises the bar against the cheapest SSRF. AWS’s hardening, IMDSv2, is the most thorough public response to it, which is why it anchors this post.
It helps to be clear about why credentials sit here at all, because the instinct on first encounter is that this whole thing is a design mistake. It is not. The alternative to instance roles is worse: long-lived AWS access keys pasted into application config, committed to source control, copied between environments, and almost never rotated. Those static keys leak constantly, and when they leak they are valid until someone notices and revokes them. Instance roles replace that with short-lived credentials minted on demand and expired automatically, scoped to a role you define. The metadata service is the delivery mechanism for that better model. The flaw was never that credentials are delivered over HTTP on the instance. The flaw was that the delivery channel trusted the network so completely that an SSRF could read from it, and that is the specific assumption IMDSv2 walks back.
The Capital One chain, reconstructed
On 17 July 2019 Capital One was alerted, via a responsible-disclosure email, that data was sitting in a public GitHub repository. By 29 July the company disclosed the breach. Roughly 106 million people in the US and Canada were affected. The attacker, Paige Thompson, a former AWS engineer who used the handle “erratic”, was arrested within days and ultimately convicted in June 2022 of wire fraud and computer-fraud charges.
The technical chain, as far as the public record describes it, has three moves. The role names and some details are redacted in the charging documents, so where the record is thin I will say so rather than invent specifics.
The entry point was a web application firewall. Brian Krebs reported the WAF was ModSecurity, the open-source engine, running on an EC2 instance in front of Capital One’s application. A WAF is a reverse proxy by nature: it receives a request, and to do its job it can be made to fetch things. The irony is not subtle. The device whose job is to inspect and block malicious requests was itself the thing tricked into making one. How a WAF inspects traffic, and the gaps in that inspection, is its own subject, covered in how a WAF actually works; the relevant point here is that proxy-shaped security devices have an outbound-request capability that is easy to forget about. This particular deployment had an SSRF-capable misconfiguration. The exact request that triggered it is not in the public record at the level of an exploit, and that is fine for understanding it. What matters is that the WAF could be induced to make an outbound request to a URL influenced by the attacker.
Move one, the SSRF. The attacker got the WAF to make a request to http://169.254.169.254/latest/meta-data/iam/security-credentials/. Because the instance still answered plain GET requests, the metadata service returned the name of the IAM role attached to the WAF instance. The charging documents refer to a role whose name ends in -WAF-Role, with the prefix redacted. A second request to that role’s path returned the temporary credentials.
Move two, the over-privileged role. Here is the failure that turned an annoying SSRF into a catastrophic one. The WAF’s role had far more access than a WAF needs. By the public accounts, those credentials could list the contents of Capital One’s S3 buckets and read the objects inside them. A WAF has no business reading customer data out of S3. The principle of least privilege existed to prevent exactly this, and it was not applied to this role.
Move three, exfiltration. With working credentials, the attacker used ordinary AWS tooling. The charging documents describe a “Sync Command” used to copy data out, consistent with the AWS CLI’s aws s3 sync, which recursively mirrors a bucket to a destination. Public reporting puts the haul at around 30 GB across roughly 700 buckets. Nothing exotic. Once you hold the credentials, you are using AWS exactly the way AWS is meant to be used, which is the entire problem.
The aftermath set a price on the failure. In August 2020 the Office of the Comptroller of the Currency assessed an $80 million civil money penalty, citing the bank’s “failure to establish effective risk assessment processes prior to migrating significant information technology operations to the public cloud environment.” The Federal Reserve issued a separate cease-and-desist order without a monetary penalty. In December 2021 Capital One agreed to a $190 million settlement of the consumer class action. The regulators’ language is worth sitting with: the finding was not “you got hacked,” it was “you moved to the cloud without the risk processes that move requires.” The SSRF was the proximate cause. The root cause was governance.
IMDSv2: a PUT request and a one-hop token
AWS announced IMDSv2 in November 2019, months after the breach, and it is engineered against precisely this chain. The redesign does not patch SSRF in anyone’s application, because AWS cannot. It changes the metadata service so that a typical SSRF can no longer talk to it. Two changes do most of the work.
First, IMDSv2 is session-oriented. You cannot just GET a credential path. You first send a PUT to http://169.254.169.254/latest/api/token, including the header X-aws-ec2-metadata-token-ttl-seconds with a value between one second and six hours (21,600 seconds maximum). The service returns a session token. Every subsequent metadata GET must carry that token in the X-aws-ec2-metadata-token header. Miss it, and IMDSv2 returns 401 Unauthorized. The token is instance-specific, never stored by the service, and cannot be reused on another instance.
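From legitimate code on the instance, the two-step session looks roughly like this. It is a sketch using only the standard library, built from the endpoint and header names above; the function names are invented for the example, and `fetch_instance_id` only works when run on an EC2 instance.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254"
TOKEN_TTL_HEADER = "X-aws-ec2-metadata-token-ttl-seconds"
TOKEN_HEADER = "X-aws-ec2-metadata-token"
MAX_TTL_SECONDS = 21600  # six hours


def build_token_request(ttl_seconds: int = MAX_TTL_SECONDS) -> urllib.request.Request:
    """Step one: a PUT to the token endpoint carrying the TTL header."""
    if not 1 <= ttl_seconds <= MAX_TTL_SECONDS:
        raise ValueError("TTL must be between 1 and 21600 seconds")
    return urllib.request.Request(
        f"{IMDS_BASE}/latest/api/token",
        method="PUT",
        headers={TOKEN_TTL_HEADER: str(ttl_seconds)},
    )


def build_metadata_request(path: str, token: str) -> urllib.request.Request:
    """Step two: an ordinary GET that presents the session token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/latest/meta-data/{path.lstrip('/')}",
        headers={TOKEN_HEADER: token},
    )


def fetch_instance_id(timeout: float = 2.0) -> str:
    """Run both steps. Needs a real metadata service, so it is never called here."""
    with urllib.request.urlopen(build_token_request(60), timeout=timeout) as response:
        token = response.read().decode()
    request = build_metadata_request("instance-id", token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode()
```

In practice the AWS SDKs do this for you. The point of spelling it out is to see what the caller must control: the HTTP method and two custom headers.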
Why does this break SSRF? Because the overwhelming majority of SSRF primitives can only make a GET (or sometimes a POST), to a URL, with no control over the HTTP method or custom request headers. The image-proxy fetches a URL. The PDF renderer fetches a URL. They issue GETs. They do not issue PUTs, and they do not let the attacker inject an arbitrary request header. AWS says it plainly: requiring “a PUT request, and then requiring the secret session token in other requests, is always strictly more effective than requiring only a static header.” A plain GET to the metadata root under IMDSv2 gets nothing useful, because there is no token to present and no way for the typical SSRF to obtain one.
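A toy model makes the argument concrete. `ToyMetadataService` below is an invented stand-in, not AWS's implementation: it remembers issued tokens in a set purely for brevity, and it models only the two request shapes discussed here.

```python
import secrets
from dataclasses import dataclass, field

TOKEN_PATH = "/latest/api/token"
TTL_HEADER = "X-aws-ec2-metadata-token-ttl-seconds"
TOKEN_HEADER = "X-aws-ec2-metadata-token"


@dataclass
class ToyMetadataService:
    """Toy stand-in for the metadata service; not AWS's implementation."""

    tokens_required: bool  # True models IMDSv2-only, False models IMDSv1 still on
    _issued: set = field(default_factory=set, init=False, repr=False)

    def handle(self, method: str, path: str, headers: dict = None):
        headers = headers or {}
        if method == "PUT" and path == TOKEN_PATH and TTL_HEADER in headers:
            # Simplification: the toy keeps tokens in a set so it can check them.
            token = secrets.token_hex(16)
            self._issued.add(token)
            return 200, token
        if method == "GET" and path != TOKEN_PATH:
            if headers.get(TOKEN_HEADER) in self._issued:
                return 200, "metadata body"
            if not self.tokens_required:
                return 200, "metadata body"  # IMDSv1: a plain GET is enough
            return 401, ""
        raise ValueError("request shape not modelled in this sketch")


def url_only_ssrf(service: ToyMetadataService, path: str):
    """An SSRF primitive that controls the URL but not the method or headers."""
    return service.handle("GET", path)


PATH = "/latest/meta-data/iam/security-credentials/"
print(url_only_ssrf(ToyMetadataService(tokens_required=False), PATH)[0])
print(url_only_ssrf(ToyMetadataService(tokens_required=True), PATH)[0])
```

The same URL-only primitive gets a 200 from the first service and a 401 from the second. Nothing about the application changed; only the service's expectations did.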
Second, the token response carries a default IP TTL of 1. That is the time-to-live field in the IP header, the hop counter. A TTL of 1 means the packet is decremented to zero and dropped by the first router it hits. The token, and the metadata responses, physically cannot leave the instance. This is aimed at a different SSRF flavour: the open reverse proxy or misconfigured router that would forward metadata traffic off-box. Even if something on the instance fetches the token, it dies at the first hop. AWS layers on one more rule for the same threat: IMDSv2 rejects any PUT that carries an X-Forwarded-For header, because that header is the fingerprint of a request that has passed through a proxy. An open reverse proxy trying to relay a token request announces itself and gets refused.
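Both rules reduce to very small checks. The sketch below models the hop counter and the proxy-header test; the function names are invented for the example, and real routers and the real service are of course more involved.

```python
def reaches_destination(initial_ttl: int, router_hops: int) -> bool:
    """Model the IP hop counter for a packet crossing `router_hops` routers."""
    ttl = initial_ttl
    for _ in range(router_hops):
        ttl -= 1          # each router decrements the TTL...
        if ttl <= 0:
            return False  # ...and discards the packet once it reaches zero
    return True


def token_put_looks_proxied(headers: dict) -> bool:
    """True if the request carries the header a forwarding proxy typically adds."""
    return any(name.lower() == "x-forwarded-for" for name in headers)


print(reaches_destination(initial_ttl=1, router_hops=0))   # consumer on the instance
print(reaches_destination(initial_ttl=1, router_hops=1))   # one router away
print(reaches_destination(initial_ttl=64, router_hops=1))  # a typical default TTL
```

With a TTL of 1 the response reaches a consumer on the instance and nothing beyond the first router. With a typical operating-system default such as 64, the same response would travel onward without complaint.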
It is worth being precise about what IMDSv2 does and does not fix. It does not fix the SSRF bug in the application. It does not stop an attacker who achieves full remote code execution on the instance, because RCE lets you make a PUT and set your own headers like any local process. What it stops is the large and common class of SSRF where the attacker controls a URL but not the method or headers. That class is exactly what breached Capital One. AWS has stated that IMDSv2, analysed retroactively against known SSRF chains, including the one used against Capital One, would have blocked them. The bug would still be there. The blast radius would have been a 401.
Defense in depth, because any one control fails
IMDSv2 is the headline fix, but the lesson of the breach is that no single control should have to hold. Several layers failed at Capital One, and each one, applied, would have changed the outcome.
Start at the application. The cleanest fix for SSRF is to not fetch user-controlled URLs at all, and where you must, to validate the destination against an allowlist of known-good hosts rather than a denylist of bad ones. OWASP is blunt about this: “prefer allow-lists.” Denylists are bypass-prone, and the bypass techniques are well catalogued. Alternate IP encodings turn 127.0.0.1 into a decimal 2130706433 or octal forms a naive filter misses. DNS rebinding points a name that resolved to a public IP during validation at an internal IP by the time the fetch happens. Open redirects let a validated public URL bounce the request to an internal one. Because of rebinding, validation has to happen against the resolved IP at fetch time, not against the hostname string, and the fetcher must refuse to follow redirects into private space. This is fiddly, which is why allowlisting the small set of hosts you actually need to reach is the durable answer.
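The decimal-encoding bypass is easy to reproduce with the standard library. In this sketch `NAIVE_DENYLIST`, `naive_filter_allows`, and `host_as_ip` are all invented for the example; the bare-integer handling mirrors what many resolvers accept, which is an assumption about the fetcher in use.

```python
import ipaddress
from urllib.parse import urlsplit

# A string denylist of the kind described above (hypothetical).
NAIVE_DENYLIST = ("127.0.0.1", "localhost", "169.254.169.254")


def naive_filter_allows(url: str) -> bool:
    """The flawed check: look for known-bad strings anywhere in the URL."""
    return not any(bad in url for bad in NAIVE_DENYLIST)


def host_as_ip(url: str):
    """Normalise the host to an IP object, treating a bare integer as IPv4."""
    host = urlsplit(url).hostname or ""
    try:
        if host.isascii() and host.isdigit():
            return ipaddress.ip_address(int(host))
        return ipaddress.ip_address(host)
    except ValueError:
        return None  # a hostname, which must be resolved before it can be judged


url = "http://2130706433/"
print(naive_filter_allows(url))   # the string filter sees nothing it recognises
print(host_as_ip(url))            # the same host, normalised to dotted form
print(host_as_ip(url).is_loopback)
```

The string filter and the network stack disagree about what the URL means, and the attacker only has to satisfy the first one. Normalising to an address object before judging it closes that particular gap, though not the rebinding or redirect ones.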
Then the IAM role. Least privilege is the control that would have shrunk the Capital One breach from catastrophic to trivial. The WAF’s role could list and read every object across hundreds of S3 buckets. A WAF needs none of that. Had the role carried only the permissions a WAF requires, the stolen credentials would have unlocked nothing worth stealing. SSRF would still have leaked the credentials. The credentials would have been close to useless. This is the difference between a vulnerability and an incident.
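A crude lint for this kind of over-grant fits in a few lines. The actual policy attached to the WAF role is not public, so both policies below are hypothetical, as is the account ID and the `overbroad_s3_statements` helper; real policy analysis (conditions, NotAction, resource policies) is considerably richer.

```python
def _as_list(value):
    """IAM allows a single string or a list for Action and Resource."""
    if value is None:
        return []
    return [value] if isinstance(value, str) else list(value)


def overbroad_s3_statements(policy: dict) -> list:
    """Flag Allow statements that grant S3 access via wildcards."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    flagged = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = [a.lower() for a in _as_list(statement.get("Action"))]
        resources = _as_list(statement.get("Resource"))
        grants_s3 = any(a == "*" or a.startswith("s3:") for a in actions)
        wildcard_action = any(a in ("*", "s3:*") for a in actions)
        wildcard_resource = any(r in ("*", "arn:aws:s3:::*") for r in resources)
        if grants_s3 and (wildcard_action or wildcard_resource):
            flagged.append(statement)
    return flagged


# Hypothetical policies, invented for illustration.
BROAD_ROLE = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["s3:ListBucket", "s3:GetObject"], "Resource": "*"}
    ],
}
NARROW_ROLE = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["logs:CreateLogStream", "logs:PutLogEvents"],
            "Resource": "arn:aws:logs:us-east-1:111122223333:log-group:/hypothetical/waf:*",
        }
    ],
}

print(len(overbroad_s3_statements(BROAD_ROLE)), len(overbroad_s3_statements(NARROW_ROLE)))
```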
Then the network. The metadata service is reachable from any process on the instance, but you can put a host firewall rule in front of it so that only the specific user or process that legitimately needs metadata can reach 169.254.169.254, and the web-facing application cannot. A WAF process making a connection to the metadata address is anomalous and blockable. This is the network-layer counterpart to application-layer validation, and OWASP recommends both precisely because either can fail alone.
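On Linux, one way to express such a rule is iptables with its owner match. The sketch below only builds the command line and does not run it; the account name `www-data` is a placeholder for whatever user the web-facing process runs as, and the helper name is invented for the example.

```python
import shlex

METADATA_ADDRESS = "169.254.169.254"


def metadata_reject_rule(blocked_user: str) -> list:
    """Build (but do not run) an iptables rule rejecting one user's metadata traffic."""
    if not blocked_user or blocked_user.startswith("-"):
        raise ValueError("expected a plain local account name")
    return [
        "iptables", "--append", "OUTPUT",
        "--protocol", "tcp",
        "--destination", METADATA_ADDRESS,
        "--match", "owner", "--uid-owner", blocked_user,
        "--jump", "REJECT",
    ]


# "www-data" is a hypothetical account running the web-facing process.
print(shlex.join(metadata_reject_rule("www-data")))
```

A rule like this has to be applied with root privileges and persisted across reboots, and an attacker with root on the box can remove it. It is a second layer against SSRF, not a defence against full compromise.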
Then detection. The breach was not caught by Capital One’s monitoring. It was caught by an outside party emailing them after spotting the data on GitHub, and the disclosure email arrived more than three months after the access. Unusual API calls from the WAF’s role, a sudden s3 sync of hundreds of buckets, credential use from an unexpected pattern, all of it was visible in CloudTrail and none of it triggered a response in time. The OCC penalty was, at its core, about that gap between what was logged and what was noticed.
There is a specific, high-value detection signal that the metadata-credential-theft pattern produces, and it is worth naming because it generalises beyond this one breach. Credentials minted for an instance role are meant to be used by that instance. When an SSRF or a later compromise exfiltrates them and the attacker uses them from their own machine, the API calls now originate from an IP address that is not the instance’s. AWS surfaces this in GuardDuty as the UnauthorizedAccess:IAMUser/InstanceCredentialExfiltration finding family: instance credentials being used from outside AWS, or from a different AWS account, is anomalous by construction and cheap to alert on. None of that machinery needs to understand the SSRF itself. It only needs to notice that a credential tied to a place is being used from a different place. Capital One’s failure was not a lack of available signal. It was that the signal was not wired to anyone who would act on it inside a window that mattered, and three months is not that window.
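The underlying comparison is simple enough to sketch against CloudTrail-shaped events. This is a toy version of the idea, not how GuardDuty works internally: `used_off_instance` and the `instance_ips` inventory are invented for the example, and a real inventory would also need NAT gateway and VPC endpoint egress addresses.

```python
import ipaddress


def used_off_instance(event: dict, instance_ips: dict) -> bool:
    """True if instance-role credentials were used from an address we do not expect.

    `instance_ips` maps an instance ID to the set of addresses its API calls
    may legitimately come from (a hypothetical inventory).
    """
    identity = event.get("userIdentity", {})
    if identity.get("type") != "AssumedRole":
        return False
    # For instance roles the role session name, after the colon, is the instance ID.
    session_name = identity.get("principalId", "").partition(":")[2]
    if session_name not in instance_ips:
        return False
    try:
        source = ipaddress.ip_address(event.get("sourceIPAddress", ""))
    except ValueError:
        return False  # some events carry a service name rather than an address
    return source not in instance_ips[session_name]


inventory = {"i-0123456789abcdef0": {ipaddress.ip_address("10.0.1.5")}}
suspicious = {
    "eventName": "ListBuckets",
    "sourceIPAddress": "198.51.100.7",
    "userIdentity": {
        "type": "AssumedRole",
        "principalId": "AROAEXAMPLEEXAMPLE:i-0123456789abcdef0",
    },
}
print(used_off_instance(suspicious, inventory))
```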
One more application-layer detail deserves spelling out, because it is the part teams most often get wrong when they try to fix SSRF themselves. The naive fix is to take the user-supplied URL, parse out the hostname, check it against a blocklist of private ranges, and proceed. This fails for a reason that has nothing to do with the blocklist’s completeness: the check and the fetch happen at different moments, and DNS can change between them. An attacker registers a domain, points it at a harmless public IP during the validation window, and re-points it at 169.254.169.254 before the fetch fires. The validated hostname and the fetched address are no longer the same thing. The correct shape, per OWASP, is to resolve the name yourself, validate the resolved IP, and then connect to that exact IP, never letting the HTTP client perform its own second resolution, and never following a redirect into a range you would have rejected. It is more work than a substring check. The substring check is also worthless, which is the trade.
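The resolve-then-validate half of that shape can be sketched as follows. `resolve_and_validate` is a name invented for the example, the `resolver` parameter exists only so the check can be exercised without real DNS, and using `is_global` as the acceptance rule is a deliberately strict assumption rather than OWASP's wording.

```python
import ipaddress
import socket


def resolve_and_validate(host: str, port: int = 443, resolver=socket.getaddrinfo) -> list:
    """Resolve `host` once and return its addresses only if all are public.

    The caller should then connect to one of the returned addresses directly,
    so that no second DNS lookup happens between the check and the fetch.
    """
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in resolver(
        host, port, type=socket.SOCK_STREAM
    ):
        ip = ipaddress.ip_address(sockaddr[0])
        # Rejects loopback, RFC 1918, link-local (169.254.0.0/16) and similar ranges.
        if not ip.is_global:
            raise ValueError(f"{host} resolves to non-public address {ip}")
        addresses.append(ip)
    if not addresses:
        raise ValueError(f"{host} did not resolve")
    return addresses
```

Connecting to the validated address while still presenting the original hostname (for the Host header and TLS verification) is the fiddly part this sketch leaves out, and redirect handling needs the same check applied to every hop.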
The long tail: IMDSv1 in 2026
IMDSv2 has existed since November 2019, and AWS has spent years pushing the default toward it without ever flipping a hard kill switch on v1. From November 2023, console Quick Start launches use IMDSv2-only. From mid-2024, newly released EC2 instance types default to IMDSv2-only. AWS added account-level controls so an organization can require IMDSv2 across the board.
But “newly released instance types default to v2-only” is a narrow promise. An instance launched years ago, on an older instance type, with HttpTokens set to optional, still answers plain IMDSv1 GETs today. The transition for existing instances stays voluntary. AWS gives you the tools, an account default, service control policies, a single modify-instance-metadata-options call to set tokens to required, but it will not break running workloads by forcing the change. So the population of instances that would still leak credentials to a 2019-style GET-only SSRF is smaller than it was, and it is not zero. Cloud security scanners flag HttpTokens: optional as a finding for exactly this reason. Auditing your fleet for it, and setting tokens to required everywhere it does not break something, is the single highest-value action this whole story points to.
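The audit itself is a short walk over the instance inventory. The sketch below assumes its input has the shape of an EC2 `describe_instances` response (for example from boto3, which is not imported here); `instances_allowing_imdsv1` is a name invented for the example, and the sample data is fabricated.

```python
def instances_allowing_imdsv1(describe_instances_response: dict) -> list:
    """Return IDs of instances whose metadata endpoint still answers without a token."""
    findings = []
    for reservation in describe_instances_response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            options = instance.get("MetadataOptions", {})
            endpoint_on = options.get("HttpEndpoint", "enabled") == "enabled"
            # Assumption: treat a missing value as "optional", the conservative reading.
            tokens_optional = options.get("HttpTokens", "optional") != "required"
            if endpoint_on and tokens_optional:
                findings.append(instance.get("InstanceId"))
    return findings


# Fabricated sample in the response shape, for illustration only.
SAMPLE_RESPONSE = {
    "Reservations": [
        {
            "Instances": [
                {
                    "InstanceId": "i-0aaaaaaaaaaaaaaaa",
                    "MetadataOptions": {"HttpTokens": "optional", "HttpEndpoint": "enabled"},
                },
                {
                    "InstanceId": "i-0bbbbbbbbbbbbbbbb",
                    "MetadataOptions": {"HttpTokens": "required", "HttpEndpoint": "enabled"},
                },
                {
                    "InstanceId": "i-0cccccccccccccccc",
                    "MetadataOptions": {"HttpTokens": "optional", "HttpEndpoint": "disabled"},
                },
            ]
        }
    ]
}

print(instances_allowing_imdsv1(SAMPLE_RESPONSE))
```

A real audit would page through every region and account, and would check for software that still depends on IMDSv1 before any instance is switched to required.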
The shape of the Capital One breach is what makes it the canonical SSRF case study. There was no zero-day. ModSecurity was not backdoored. AWS was not compromised. Every component behaved as built: the WAF proxied a request, the metadata service answered a local GET, the IAM role granted the access it was configured to grant, and aws s3 sync copied the files it was pointed at. The breach was the sum of correct components composed into a chain nobody had modelled. That is the uncomfortable thing about SSRF reaching a metadata endpoint. It is not an exotic exploit. It is a few ordinary features, each reasonable in isolation, lined up so that a request the developer never imagined walks from a public input to a private credential. IMDSv2 broke the most common version of that line. The credentials still sit at 169.254.169.254, waiting for the next request that should never have been made.
Sources & further reading
- AWS Security (2019), Add defense in depth against open firewalls, reverse proxies, and SSRF vulnerabilities with enhancements to the EC2 Instance Metadata Service — the primary IMDSv2 design rationale: PUT-then-token sessions, TTL=1, X-Forwarded-For rejection, and the SSRF threat model each addresses.
- AWS (docs), Use the Instance Metadata Service to access instance metadata — exact endpoints, header names, the token PUT, the 401 behaviour, hop-limit default, and the IPv6 fd00:ec2::254 address.
- AWS (2023), Amazon EC2 Instance Metadata Service IMDSv2 by default — the rollout timeline: console Quick Starts in November 2023, new instance types defaulting to v2-only in 2024.
- Brian Krebs (2019), What We Can Learn from the Capital One Hack — early technical reporting identifying ModSecurity, the SSRF-to-metadata chain, and the over-permissive role that could list and read S3.
- OCC (2020), OCC Assesses $80 Million Civil Money Penalty Against Capital One — the regulator’s finding on cloud-migration risk processes, in its own words.
- PortSwigger (Web Security Academy), Server-side request forgery (SSRF) — working definitions of SSRF and blind SSRF, and the bypass techniques that defeat denylist filtering.
- OWASP, Server-Side Request Forgery Prevention Cheat Sheet — allowlist-over-denylist guidance, post-resolution IP validation, and application-vs-network-layer defenses.
- MIT CAMS (2020), A Case Study of the Capital One Data Breach — an academic reconstruction of the incident, the role misconfiguration, and the governance failures behind it.
- Rhino Security Labs (2019), The Capital One Breach & “cloud_breach_s3” CloudGoat Scenario — a defensive lab reconstruction of the SSRF, metadata credential theft, and S3 sync exfiltration.
- Hacking The Cloud, Steal EC2 Metadata Credentials via SSRF — a documentation-style reference on how SSRF reaches IMDS and what IMDSv2 changes.
- Datadog Security Labs (2023), Misconfiguration Spotlight: Securing the EC2 Instance Metadata Service — current-state guidance on enforcing IMDSv2 and the prevalence of HttpTokens: optional.
- AWS (docs), GuardDuty IAM finding types — the InstanceCredentialExfiltration finding family that flags instance credentials used from outside the instance, the detection signal Capital One lacked.