Subresource Integrity and the supply-chain risk of third-party scripts

When a page includes <script src="https://cdn.example.com/lib.js">, the browser does something it does for almost no other trust decision on the web: it runs whatever bytes come back, in the page’s own origin, with full access to the DOM, cookies, and every other script on the page. The same-origin policy that walls off evil.com from bank.com does not apply here. You asked for the script. The browser fetches it. It executes. If the CDN was serving the right file yesterday and a different file today, nothing in the platform notices. The page author trusted the URL, and the URL is a promise about a location, not about content.

Subresource Integrity is the platform’s answer to that gap. You pin the script to a cryptographic hash of the exact bytes you reviewed, the browser computes the hash of what it actually received, and if the two disagree it refuses to run the file. The mechanism is small, it shipped in every major browser years ago, and it would have stopped several of the worst third-party-script breaches on record. It is also barely used, supports only a couple of resource types, and falls apart the moment a resource is allowed to change. This post walks through how the integrity check works at the byte level, what it genuinely protects against, the specific things it cannot do, and why a feature this cheap to deploy still sits on a single-digit fraction of the pages that would benefit from it.

The sections that follow cover the threat the mechanism was built for, the syntax and parsing of the integrity attribute, the CORS requirement that trips people up, the hash-agility design, the gap around dynamic resources, the adoption numbers, the newer Integrity-Policy header, and where SRI sits in the larger supply-chain story.

The trust you can’t see

Every external script tag is a standing decision to execute someone else’s code in your security context. The browser model here is generous to a fault. A script loaded from a third-party host is not sandboxed against the page that loaded it. It can read document.cookie (unless the cookie is marked HttpOnly), rewrite form actions, attach listeners to every input, and exfiltrate whatever a user types. The page’s same-origin protections are about other origins reaching in; they say nothing about a script you yourself invited.

That model held up as long as you could trust the host serving the file. The web spent two decades getting comfortable with that assumption. Pull jQuery from a public CDN, pull your analytics from a vendor, pull a chat widget from a SaaS, and each of those hosts becomes a member of your trusted computing base whether you think of it that way or not. Frederik Braun, who co-edited the SRI spec, framed the original worry plainly: an included JavaScript file can “read and request towards everything that is on this website and all your other websites.” The CDN is not a static file server in the threat sense. It is a code-injection point with your permission.

The breaches that followed were not theoretical. In 2018 a group tracked as Magecart added card-skimming JavaScript to checkout flows at British Airways and Ticketmaster. The BA incident copied payment-card data, including CVV numbers, from the airline’s own payment page to an attacker-controlled domain; the ICO later put the count at roughly 429,612 customers and staff affected and issued a £20 million fine, at the time the largest in its history. The Ticketmaster leg of the same campaign came in through a third-party chat widget: the supplier Inbenta’s script was tampered with, and every site embedding it inherited the skimmer. The Register reported the same skimmer turning up on still more sites through a hacked feedbackembad-min-1.0.js file served from the analytics vendor Feedify. One compromised file, many victim sites, no change visible to any of them.

*The page pins a URL, not content. A CDN that swaps the file behind that URL inherits the page's full origin privileges.*

The shape of the problem is what makes it dangerous: the victim site ships correct HTML, the CDN ships a correct URL, and the only thing that changed is the bytes. SRI exists to make the bytes part of the contract.

What the integrity attribute actually says

The mechanism is one HTML attribute. You add integrity to a <script> or a <link>, and its value is the algorithm name, a hyphen, and the base64-encoded digest of the file:

<script src="https://cdn.example.com/lib-2.3.1.min.js"
        integrity="sha384-H8BRh8j48O9oYatfu5AZzq6A9RINhZO5H16dQZngK7T62em8MUt1FLm52t+eX6xO"
        crossorigin="anonymous"></script>

The spec calls this string integrity metadata. Its grammar is a hash-algorithm token, a -, the base64 digest, and an optional ?-prefixed option expression that no browser currently uses for anything. The valid algorithm token set is ordered: « "sha256", "sha384", "sha512" », corresponding to SHA-256, SHA-384, and SHA-512. The ordering is not cosmetic. The spec states that “the ordering of this set is meaningful, with stronger algorithms appearing later in the set,” and the browser relies on that ordering when it has to choose between several hashes (more on that below). MD5 and SHA-1 are not in the set and never were, which makes this one of the few security features that got its defaults right from day one.
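To make the grammar concrete, here is a minimal Python sketch of splitting an integrity value into its parts. It is an illustration rather than the spec's parsing algorithm: the names `Metadata` and `parse_integrity` are made up for this example, it assumes lowercase algorithm tokens, and it accepts only standard base64.

```python
import base64
from typing import NamedTuple

# Weakest to strongest, mirroring the ordered token set quoted above.
ALGORITHMS = ("sha256", "sha384", "sha512")


class Metadata(NamedTuple):
    algorithm: str
    digest: bytes   # raw digest bytes, decoded from the base64 text
    options: str    # text after "?", carried along but otherwise unused


def parse_integrity(value: str) -> list[Metadata]:
    """Split an integrity attribute into usable entries, skipping the rest."""
    entries = []
    for token in value.split():
        expression, _, options = token.partition("?")
        algorithm, hyphen, encoded = expression.partition("-")
        if not hyphen or algorithm not in ALGORITHMS:
            continue  # e.g. md5-... or sha1-...: not in the token set
        try:
            digest = base64.b64decode(encoded, validate=True)
        except ValueError:
            continue  # not valid standard base64
        entries.append(Metadata(algorithm, digest, options))
    return entries


# A placeholder digest of 48 zero bytes stands in for a real SHA-384 value.
example = "sha384-" + base64.b64encode(bytes(48)).decode("ascii") + "?opt md5-AAAA"
for entry in parse_integrity(example):
    print(entry.algorithm, len(entry.digest), repr(entry.options))
```

The md5 entry is dropped without raising, which is the design choice worth noticing: an entry the parser cannot use is ignored rather than treated as fatal.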

You generate the digest from the exact file you intend to ship. The canonical recipe, straight from MDN, hashes the raw bytes and base64-encodes the binary digest:

cat lib-2.3.1.min.js | openssl dgst -sha384 -binary | openssl base64 -A

Two things about that command matter. It hashes bytes, not a parsed or normalized form, so a single trailing newline difference changes the hash. And it base64-encodes the binary digest, not a hex string, so you cannot paste a hex sha384sum output and expect it to validate. The public generator at srihash.org does the same thing in the browser and exists mostly so people stop pasting hex digests into the attribute.
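The same recipe can be sketched in Python with only the standard library. The function name `sri_hash` and the sample script body are assumptions for illustration. For a real file, read it in binary mode so the bytes are hashed exactly as shipped.

```python
import base64
import hashlib


def sri_hash(data: bytes, algorithm: str = "sha384") -> str:
    """Hash the raw bytes and base64-encode the binary digest."""
    digest = hashlib.new(algorithm, data).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


body = b"console.log('hi');"  # stand-in for the bytes of the file you reviewed

print(sri_hash(body))                    # the value that goes in the attribute
print(sri_hash(body + b"\n"))            # one extra newline, a different value
print(hashlib.sha384(body).hexdigest())  # hex form: not valid in the attribute
```

Comparing the first and third lines of output shows why a pasted hex digest never validates: it encodes the same 48 bytes in a different alphabet and at a different length.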

When the browser fetches the resource, it computes the digest of the response body and compares it against the metadata. On a match, the resource loads normally. On a mismatch, the spec is blunt: the user agent “will refuse to render or execute responses that fail an integrity check, instead returning a network error as defined in Fetch.” The script does not run in degraded mode, there is no warning-and-continue, and there is no fallback to the un-verified file. The request fails the way a dropped connection fails, and the page sees the script as simply absent. That fail-closed behavior is the whole value proposition: a tampered file is indistinguishable from a file that never arrived, and a file that never arrived is a problem you already know how to handle. The page degrades the way it would for any failed resource, which is a far better outcome than running an attacker’s code with your origin’s full privileges.
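The fail-closed behavior can be modeled in a few lines. This is a sketch of the logic, not of any browser's implementation: it handles a single hash, and `NetworkError`, `load_verified`, and the sample payloads are names invented for the example.

```python
import base64
import hashlib

ALLOWED = {"sha256", "sha384", "sha512"}


class NetworkError(Exception):
    """Stand-in for the Fetch network error a failed check produces."""


def sri_hash(data: bytes, algorithm: str = "sha384") -> str:
    digest = hashlib.new(algorithm, data).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def load_verified(body: bytes, integrity: str) -> bytes:
    """Return the body only if it hashes to the pinned value; otherwise raise."""
    algorithm, _, expected = integrity.partition("-")
    if algorithm not in ALLOWED:
        raise ValueError(f"unsupported algorithm: {algorithm!r}")
    actual = sri_hash(body, algorithm).partition("-")[2]
    if actual != expected:
        # No partial result and no fallback: the caller gets nothing to run.
        raise NetworkError("integrity mismatch: resource treated as not loaded")
    return body


reviewed = b"export const version = '2.3.1';"
pinned = sri_hash(reviewed)

load_verified(reviewed, pinned)  # the reviewed bytes pass
try:
    load_verified(reviewed + b"fetch('https://evil.example/')", pinned)
except NetworkError as err:
    print("blocked:", err)
```

Raising instead of returning a flag mirrors the point in the text: the caller has no code path on which the unverified bytes are available.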

*A mismatch produces a Fetch network error. There is no degraded execution and no fallback to the unverified bytes.*

One detail worth internalizing: SRI verifies content, not provenance. It does not check who signed the file, who served it, or whether the hash you pinned is the hash of a good file. If you copy the integrity value from a compromised upstream, you have faithfully pinned malware. The attribute guarantees that the bytes the browser runs are the bytes whose hash you wrote down. Whether those were the right bytes is your problem, decided at the moment you authored the tag.

Why crossorigin is not optional

The most common way to break an SRI deployment is to forget crossorigin. For a cross-origin resource, that single missing attribute turns a working integrity check into a guaranteed failure, and the reason is a side-channel defense rather than an oversight.

By default a cross-origin <script> loads in no-cors mode. The browser will execute the response but deliberately does not let the page read the bytes; this is why an error thrown by a cross-origin script is reported as an opaque “Script error.” with no detail. SRI needs to read the bytes to hash them. If the platform allowed an integrity check on a no-cors response, it would hand the page an oracle: vary the hash, observe pass or fail, and you could probe the content of a cross-origin resource you are not allowed to read, one guess at a time. So the rule is categorical. A cross-origin request that carries an integrity attribute must be made in CORS mode, which means adding crossorigin="anonymous" (or use-credentials) to the tag, and the serving CDN must answer with an Access-Control-Allow-Origin header that permits your origin. The spec puts it without hedging: “Subresource Integrity requires CORS and it is a logical error to attempt to use it without CORS.” Public CDNs that intend to be used this way send Access-Control-Allow-Origin: *, which is why jsDelivr, cdnjs, and the rest work with SRI out of the box. Google Fonts has been a recurring source of SRI friction for a different reason: its stylesheet endpoint tailors the CSS to the requesting browser, so there is no single digest to pin.
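The decision described above can be written as a small truth table. This is a simplified model under stated assumptions, not the Fetch algorithm: origins are compared as plain strings, the function name `integrity_outcome` and the example origins are hypothetical, and any crossorigin value other than use-credentials is treated as anonymous.

```python
from typing import Optional

READABLE = "readable: bytes can be hashed"
OPAQUE = "blocked: no-cors response is opaque"
CORS_FAILED = "blocked: CORS check failed"


def integrity_outcome(
    page_origin: str,
    resource_origin: str,
    crossorigin: Optional[str],    # None means the attribute is absent
    allow_origin: Optional[str],   # Access-Control-Allow-Origin, if any
    allow_credentials: bool = False,
) -> str:
    """Can the browser read the bytes it would need to hash?"""
    if resource_origin == page_origin:
        return READABLE            # same-origin: the page may already read it
    if crossorigin is None:
        return OPAQUE              # cross-origin without CORS mode
    if crossorigin == "use-credentials":
        # Credentialed CORS needs an exact origin match plus the credentials
        # header; the wildcard is not accepted here.
        ok = allow_origin == page_origin and allow_credentials
    else:
        ok = allow_origin in ("*", page_origin)
    return READABLE if ok else CORS_FAILED


page = "https://shop.example"
cdn = "https://cdn.example.com"
print(integrity_outcome(page, cdn, None, "*"))         # forgot crossorigin
print(integrity_outcome(page, cdn, "anonymous", "*"))  # the working setup
print(integrity_outcome(page, cdn, "anonymous", None)) # CDN sends no header
```

Only the second call reaches the hashing step. The other two fail before any digest is computed, which is why a correct hash does not rescue a tag that is missing crossorigin.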

*A cross-origin integrity check in no-cors mode is blocked on purpose. Allowing it would leak cross-origin content one hash guess at a time.*

Same-origin resources do not need crossorigin, because the page is already allowed to read them. The trap is specifically the cross-origin case, which is exactly the case SRI was built for. The result is a deployment that “works on my machine” when the file is local and then fails in production, as if the file were missing, once it moves to a CDN. The fix is one attribute that the failure mode does not point you toward.

Hash agility and the strongest-match rule

SRI lets you put more than one hash in the attribute, space-separated, and the design behind that is worth understanding because it is how the mechanism plans to outlive its own hash functions.

You can supply several digests, and they can use different algorithms:

<script src="/app.js"
        integrity="sha384-oqVu... sha512-Q2j2..."
        crossorigin="anonymous"></script>

When the metadata contains more than one algorithm, the browser does not check all of them. It runs the “get the strongest metadata from set” procedure: it finds the strongest algorithm present (SHA-512 beats SHA-384 beats SHA-256, per the ordered token set) and validates only against the hashes using that algorithm. If you list two SHA-384 hashes and one SHA-256, only the SHA-384 entries are checked, and the resource passes if it matches any of the strongest-algorithm hashes. That “any-of” rule is what makes safe rotation possible: during a deploy you can list the hash of the old file and the new file under the same algorithm, and either version validates while caches drain.
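A short Python sketch of that selection rule follows. It is a simplification of the spec procedure: digests are compared as base64 strings, and the names `strongest_entries` and `matches` and the sample payloads are invented for the example.

```python
import base64
import hashlib

# Position in the ordered token set: a higher number is stronger.
STRENGTH = {"sha256": 0, "sha384": 1, "sha512": 2}


def sri_hash(data: bytes, algorithm: str) -> str:
    digest = hashlib.new(algorithm, data).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def strongest_entries(integrity: str) -> list[tuple[str, str]]:
    """Keep only the entries that use the strongest algorithm present."""
    entries = []
    for token in integrity.split():
        algorithm, _, digest = token.partition("?")[0].partition("-")
        if algorithm in STRENGTH:
            entries.append((algorithm, digest))
    if not entries:
        return []
    best = max(STRENGTH[algorithm] for algorithm, _ in entries)
    return [(a, d) for a, d in entries if STRENGTH[a] == best]


def matches(body: bytes, integrity: str) -> bool:
    """True if the body matches any hash under the strongest algorithm."""
    candidates = strongest_entries(integrity)
    if not candidates:
        return True  # no usable metadata: nothing to enforce
    return any(sri_hash(body, a) == f"{a}-{d}" for a, d in candidates)


old, new = b"app v1", b"app v2"
rotation = f"{sri_hash(old, 'sha384')} {sri_hash(new, 'sha384')}"
print(matches(old, rotation), matches(new, rotation))  # both versions validate

# A weaker hash of some other file, listed next to a stronger one, is ignored.
mixed = f"{sri_hash(new, 'sha512')} {sri_hash(b'other file', 'sha256')}"
print(matches(b"other file", mixed))
```

The second example is the practical consequence of checking only the strongest algorithm: once a SHA-512 entry is present, a SHA-256 entry in the same attribute has no effect on the outcome.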

The reason to support multiple algorithms at all is forward defense. SHA-256 is not broken today, but the platform learned from MD5 and SHA-1 that a hash function’s retirement is a slow-moving certainty, not a surprise. By making the algorithm explicit in every attribute and ordering the token set by strength, the spec gives browsers a clean path: when SHA-256 eventually weakens, a page that already carries a SHA-512 hash alongside it gets upgraded protection for free, because browsers prefer the stronger digest automatically. Pages that pinned only SHA-256 keep working until that algorithm is actively dropped. This is the same hash-agility instinct you see in TLS 1.3 and in certificate transparency, and it is the part of SRI that aged best.

There is a subtlety the strongest-match rule introduces. If an attacker could get a weaker algorithm honored, they would prefer to attack that. SRI closes this by only ever honoring the strongest algorithm you listed, so an attacker cannot downgrade you by appending a SHA-256 hash of their malicious file to your SHA-512 metadata; the SHA-512 entry still wins and their file still fails. The downgrade vector that plagues protocol negotiation does not exist here because the choice is made locally from a fixed priority order, not negotiated with the server.

The gap: dynamic resources

Here is where the mechanism stops. SRI verifies a fixed sequence of bytes against a fixed hash. It has nothing to say about a resource that is supposed to change, and a large share of the third-party code on the modern web is exactly that.

Consider what SRI cannot protect. A tag manager or analytics loader that returns a different bundle per visitor, per experiment, or per release cannot be pinned, because there is no single hash. An advertising script whose payload is assembled at request time has no stable digest. A “latest” URL like cdn.example.com/lib/latest/app.js is by definition a moving target; pinning it would break on the next release. Resources that JavaScript pulls in after page load are covered only when the calling code opts in: fetch() accepts an integrity option, XMLHttpRequest has no equivalent, and a third-party script that loads more code on its own is under no obligation to pin any of it. In markup, SRI applies only to a narrow set of elements: <script>, and <link> where the relation is stylesheet, preload, or modulepreload. It does not cover images, fonts loaded via @font-face, iframes, media, or worker scripts. The declarative verification surface is scripts and stylesheets, and nothing else.

*The verification surface is declared scripts and stylesheets with stable content. The dynamic and runtime-fetched majority sits outside it.*

This is not a bug, it is the boundary of what content hashing can do. A hash is an assertion about specific bytes. A resource that legitimately varies has no specific bytes to assert. The practical consequence is that SRI protects the easy half of the problem (the pinned library you vendored from a CDN) and leaves the hard half (the analytics blob that changes weekly, the tag manager that loads more tags) to other controls. The Polyfill.io attack lands precisely in that hard half, which is part of why it spread so far.
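The point about varying resources is easy to see with a toy example. The loader below is entirely hypothetical (`tailored_bundle` and its User-Agent checks are invented for illustration), but any endpoint that picks its bytes per request behaves the same way with respect to hashing.

```python
import base64
import hashlib


def sri_hash(data: bytes, algorithm: str = "sha384") -> str:
    digest = hashlib.new(algorithm, data).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def tailored_bundle(user_agent: str) -> bytes:
    """Hypothetical endpoint: the bytes depend on who is asking."""
    parts = [b"/* loader core */"]
    if "MSIE" in user_agent or "Trident" in user_agent:
        parts.append(b"/* Promise shim */")
        parts.append(b"/* fetch shim */")
    return b"\n".join(parts)


legacy = sri_hash(tailored_bundle("Mozilla/5.0 (compatible; MSIE 10.0)"))
modern = sri_hash(tailored_bundle("Mozilla/5.0 Chrome/120.0"))

print(legacy)
print(modern)
print("one pin fits both:", legacy == modern)
```

Whichever value the page author pins, visitors who receive the other variant fail the check, so the only workable integrity value for such a URL is none at all.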

Why Polyfill.io got through

In February 2024 a company bought both the polyfill.io domain and the project’s GitHub account from its original maintainer. By June, sites embedding cdn.polyfill.io/v3/polyfill.min.js were serving malware. Sansec, which reported the incident on 25 June 2024, documented a payload that ran only on mobile devices (an isPc() check excluded desktop), skipped execution when it detected admin cookies or analytics vendors that might notice, randomized its activation by time of day, and redirected victims to sports-betting and other scam domains. More than 100,000 sites were embedding the script, with later host counts running higher.

SRI was the obvious defense, and it did not save anyone, for a structural reason. The whole selling point of Polyfill.io was that it returned a different bundle per browser. Send it an old Internet Explorer User-Agent and it shipped a fat polyfill set; send it a modern Chrome and it shipped almost nothing. That per-request tailoring is incompatible with a fixed hash. You could not write a working integrity value for polyfill.min.js because there was no single file behind that URL. The service’s core feature was the same property that made SRI inapplicable. Sites that wanted the convenience had to accept an unpinnable third-party script, and when the domain changed hands, they had no content check to fall back on. The lesson the incident drove home is narrow and uncomfortable: SRI rewards static, versioned dependencies and offers nothing to the dynamic ones, and the dynamic ones are often the more tempting target. The full anatomy of that incident is its own story, covered in the Polyfill.io supply-chain attack.

The adoption problem

If SRI is cheap, standardized, and would have blunted real breaches, the natural question is why so little of the web uses it. The numbers are not flattering. The HTTP Archive’s Web Almanac measured roughly 17.5% of integrity-carrying elements on desktop in its 2021 dataset, with most of those on <script> tags, but the per-page coverage tells the real story: the median page that uses SRI at all protects about 3.3% of its scripts with it. Among sites that bother, the overwhelming majority of their scripts still ship unverified. Of the algorithms in use, SHA-384 dominated at around 66% on mobile, SHA-256 next at about 31%, which tracks the example snippets the big CDNs publish.

That last point is the tell. SRI adoption clusters around hosts whose copy-paste embed code happens to include an integrity value. The Almanac found the protected scripts concentrated on a handful of origins (gstatic.com, cdn.shopify.com, code.jquery.com, cdnjs.cloudflare.com), which is to say developers get SRI when a vendor hands it to them pre-baked and rarely add it themselves. The attribute is not hard to write. It is hard to maintain. Every time the upstream file changes, the hash must change with it or the resource fails closed and breaks the page, and a feature that breaks your site when you forget a step is a feature most teams quietly avoid. The fail-closed behavior that makes SRI a good security control is the same behavior that makes it operationally annoying, and the operational annoyance wins.

There is a second, subtler reason. SRI defends against a server you do not control being compromised. For a long time, teams underweighted that threat relative to ones they could see, and the build-time tooling to generate and update hashes automatically lagged the standard by years. Bundlers like Webpack gained SRI plugins, but they were opt-in and easy to skip. The result is a control that is technically deployed almost everywhere (every browser supports it) and operationally deployed almost nowhere.
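The kind of check that tooling could run at build time is small. The sketch below is one possible shape, not an existing tool: it uses Python's standard `html.parser`, looks only at cross-origin <script> and stylesheet <link> tags, and the names `SriAudit` and `audit` and the sample origins are assumptions.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class SriAudit(HTMLParser):
    """Collect cross-origin scripts and stylesheets that are not pinned."""

    def __init__(self, page_origin: str):
        super().__init__()
        self.page_origin = page_origin
        self.findings: list[tuple[str, str]] = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "script":
            url = attributes.get("src")
        elif tag == "link" and "stylesheet" in (attributes.get("rel") or "").lower().split():
            url = attributes.get("href")
        else:
            return
        if not url:
            return  # inline script or link without a target
        parts = urlsplit(url)
        origin = f"{parts.scheme}://{parts.netloc}" if parts.netloc else self.page_origin
        if origin == self.page_origin:
            return  # this sketch only audits third-party hosts
        if "integrity" not in attributes:
            self.findings.append((url, "missing integrity"))
        elif "crossorigin" not in attributes:
            self.findings.append((url, "integrity without crossorigin"))


def audit(html: str, page_origin: str) -> list[tuple[str, str]]:
    parser = SriAudit(page_origin)
    parser.feed(html)
    return parser.findings


page = """
<script src="/local.js"></script>
<script src="https://cdn.example.com/a.js"></script>
<script src="https://cdn.example.com/b.js" integrity="sha384-abc"></script>
<link rel="stylesheet" href="https://cdn.example.com/c.css"
      integrity="sha384-abc" crossorigin>
"""
for url, problem in audit(page, "https://shop.example"):
    print(problem, url)
```

Run against rendered templates in CI, a check like this turns a forgotten attribute into a failed build instead of an unnoticed gap. It does not verify that the pinned hashes are current, which would require fetching each resource.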

Integrity-Policy: closing the “forgot the attribute” hole

The single biggest weakness in SRI as originally shipped is that it is opt-in per element. Forget the attribute on one tag and that resource is unprotected, with nothing to flag the omission. A new mechanism addresses exactly that gap. The Integrity-Policy response header lets a document declare that certain resource types must carry integrity metadata, turning a missing integrity attribute from a silent gap into an enforced error.

The header is a structured-fields dictionary. A blocking policy looks like:

Integrity-Policy: blocked-destinations=(script), endpoints=(integrity-endpoint)

With that in place, any <script> that lacks integrity metadata, or that would load in no-cors mode, is blocked rather than executed, and a violation report of type integrity-violation is sent to the named reporting endpoint. The reportable fields include the document URL, the blocked URL, the destination type, and whether the policy was in report-only mode. There is a report-only twin, Integrity-Policy-Report-Only, with the same shape, meant to be deployed first so you can watch which scripts would break before you turn on enforcement. The only currently defined blocked-destinations values are script and style, which keeps the policy scoped to the two element types SRI already covers.
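A server-side sketch of the report-only-first rollout might look like the following. The helper name `integrity_policy_headers` and the reporting URL are hypothetical, and pairing the policy with a Reporting-Endpoints header to define the named endpoint is an assumption based on how the Reporting API is generally wired up. Check current documentation before relying on it.

```python
def integrity_policy_headers(
    destinations: list[str],
    endpoint_name: str,
    endpoint_url: str,
    enforce: bool = False,
) -> dict[str, str]:
    """Build response headers for a report-only or enforcing policy."""
    header = "Integrity-Policy" if enforce else "Integrity-Policy-Report-Only"
    # Structured-field inner lists separate their members with spaces.
    policy = (
        f"blocked-destinations=({' '.join(destinations)}), "
        f"endpoints=({endpoint_name})"
    )
    return {
        "Reporting-Endpoints": f'{endpoint_name}="{endpoint_url}"',
        header: policy,
    }


# Stage 1: observe which scripts would be blocked.
for name, value in integrity_policy_headers(
    ["script"], "integrity-endpoint", "https://shop.example/reports/integrity"
).items():
    print(f"{name}: {value}")

# Stage 2: once the reports are quiet, switch to enforcement.
enforced = integrity_policy_headers(
    ["script"], "integrity-endpoint",
    "https://shop.example/reports/integrity", enforce=True,
)
print(enforced["Integrity-Policy"])
```

Keeping both stages behind one flag makes the switch from observing to blocking a one-line change, which suits a control whose main risk is breaking the page.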

This is the same playbook Content Security Policy used: a report-only mode to find the violations, then an enforcing mode once the page is clean. It does not change what SRI verifies; it changes whether you are allowed to forget. The combination matters because the failure mode of plain SRI was never a bad hash, it was a missing one, and an enforced policy is the only thing that catches the tag someone added last week without an integrity value. As of 2026 the header is the newest piece of this puzzle and not yet universal across browsers, so treat it as a hardening layer for engines that support it rather than a guarantee, and check current support before relying on enforcement.

Where SRI sits in the supply-chain story

SRI is one control aimed at one link in a long chain, and its boundaries are most useful to understand by contrast with the links it does not touch. It verifies the bytes a browser executes against a hash you authored. It does nothing about the integrity of the package that produced those bytes, which is the territory of the npm and dependency-confusion incidents, where the malicious code was baked into a dependency long before it ever reached a browser. It does nothing about a build pipeline that was subverted, which is what the XZ Utils backdoor demonstrated at the source level. And it does nothing about a script that is supposed to change, which is the gap Polyfill.io drove a truck through. SRI is a last-mile content check, valuable precisely because it is the final gate before execution and worthless against threats that arrive already inside the gate.

The honest summary is that SRI is a good mechanism with a narrow blast radius and an adoption problem rooted in operations rather than capability. It would have stopped the British Airways and Ticketmaster skimmers, because those were static files swapped under a stable URL, which is the exact case content hashing was built for. It would not have stopped Polyfill.io, because that resource was dynamic by design. The web has had this defense in every browser for the better part of a decade and uses it on a few percent of the scripts that could carry it, mostly where a CDN pre-filled the attribute. The newest piece, an enforcement header that makes a missing hash an error instead of a shrug, is the first part of the design that addresses the actual failure mode, and whether it gets adopted any more widely than the attribute it polices is the open question. The mechanism was never the hard part. Remembering to use it, and keeping the hash current when the file moves, is the whole game, and that is a discipline problem the platform cannot hash its way out of.

Sources & further reading

W3C WebAppSec WG (2023), Subresource Integrity — the living spec: integrity grammar, the ordered hash token set, strongest-metadata selection, CORS requirement, and fail-closed behavior.
MDN (2024), Subresource Integrity — practical reference: the openssl hash recipe, supported elements, crossorigin requirement, and the Integrity-Policy headers.
HTTP Archive (2021), Web Almanac: Security — SRI adoption measurements: element percentages, median per-page coverage, algorithm mix, and the top hosting origins.
Frederik Braun (2015), Subresource Integrity — a co-editor’s account of the spec’s origin, the CDN trust motivation, and the cache-poisoning concern that shaped the design.
Sansec (2024), Polyfill supply chain attack hits 100K+ sites — the technical writeup of the polyfill.io takeover, the mobile-only payload, evasion logic, and redirect domains.
Sonatype (2024), Polyfill.io Supply Chain Attack Explained — timeline of the February 2024 domain change and the scope of affected sites.
The Register (2018), Card-stealing code that pwned British Airways, Ticketmaster pops up on more sites via hacked JS — the Magecart skimmer spreading through tampered third-party scripts including Feedify’s.
Wikipedia (2020), British Airways data breach — the 22-line skimmer, the ~429,612 affected figure, and the £20 million ICO fine.
W3C (2014), Subresource Integrity, First Public Working Draft — the original 2014 draft and editor list, for the standard’s starting point.
MDN (2023), CSP: require-sri-for — the deprecated CSP directive that predated Integrity-Policy as a way to require SRI page-wide.
srihash.org, SRI Hash Generator — the reference tool for producing a correct base64 digest and matching tag.