Magecart and web skimming: how card data gets stolen from the browser

A physical card skimmer is a sliver of plastic glued over a gas-pump reader. It copies the magnetic stripe as the card slides past, and the victim never sees it because the pump still works. Web skimming is the same idea moved into the browser, and it is cleaner. There is no hardware, no physical access, no risk of a technician spotting a foreign part. The attacker adds a few lines of JavaScript to a checkout page, the page keeps working exactly as before, and as the shopper types a card number into the form the script reads each keystroke and ships it to a server the attacker controls. The payment still goes through. The order confirmation still arrives. The only difference is that a copy of the card, the CVV, and the billing address left for somewhere else on the way.

The name attached to this is Magecart, originally a tag for a single group attacking Magento stores and now a loose umbrella for a dozen-plus crews running the same play. The mechanism is almost boringly simple compared to the malware families that need an exploit kit or a kernel bug. It runs in the page’s own security context, with the page’s own permission, doing something the browser was designed to allow. This post walks through how a skimmer actually reads a form, what the British Airways and Newegg code did line for line as far as it was published, the third-party and supply-chain vectors that turn one compromise into thousands, the way modern skimmers hide their traffic inside services every store already trusts, and the defenses that exist: Subresource Integrity, Content Security Policy, and the client-side controls that PCI DSS 4.0 made mandatory in 2025.

The trust the browser hands every script

Start with the thing that makes the whole attack class work. When a page includes a <script> tag, the browser runs whatever bytes come back in the page’s own origin. That script can read document.cookie unless the cookie is HttpOnly, it can attach a listener to every input field, it can rewrite a form’s action, and it can open a network request to any host the page’s Content Security Policy permits. The same-origin policy that stops evil.com from reading bank.com does not apply to a script bank.com itself loaded. You invited the code. It executes with your authority.

A checkout page is the worst possible place for that model to be generous, because a checkout page is where the most valuable plaintext on the web sits in DOM nodes for a few seconds. The card number, the expiry, the CVV, the cardholder name, the billing address: all of it lives in <input> elements that any script on the page can read by id, by name, or by walking the form. The skimmer does not need to break encryption or intercept the network. It reads the values before the browser ever serializes them for the legitimate POST. By the time TLS protects the real payment request, the copy is already gone.

*Both the legitimate payment request and the skimmer read the same input nodes. TLS protects the wire, not the values sitting in the form.*

This is why card skimming sits in a different bucket from the malware families that need a browser bug to get going. There is no memory-corruption exploit and no drive-by download here. The skimmer runs because the page told the browser to run it. The whole attack reduces to one question: how did the attacker’s JavaScript get onto the page in the first place. Everything else, the reading and the exfiltration, follows from the platform working as designed.

What the British Airways skimmer actually did

The British Airways breach in 2018 is the canonical case because RiskIQ published a clear teardown of the code, and the code was short. The attacker did not write a sprawling malware framework. They modified one file that British Airways already served: a copy of the Modernizr JavaScript library, version 2.6.2, loaded from a baggage-claim information page. The malicious logic was appended to the bottom of that file, roughly 22 lines, so the library’s real functions kept working and nothing on the page visibly broke. RiskIQ noted the original Modernizr script dated to December 2012 while the tampered server copy carried a Last-Modified header from August 21, 2018, the day the skimming began.

The logic itself was tuned to British Airways specifically. It bound a handler to two events on the page’s submit button, mouseup and touchend. That pairing is the tell of a competent skimmer: mouseup fires when a desktop user releases the mouse over the button, touchend fires when a mobile user lifts their finger, so binding both makes the same skimmer work across desktop and phone. When either event fired, the handler serialized the contents of the form with id paymentForm plus an element with id personPaying, packed the values into a JSON object, and sent them to a server at baways.com. That domain was chosen to look like British Airways infrastructure at a glance, and the attacker bought it a paid SSL certificate from Comodo in mid-August rather than using a free one, so the exfiltration request ran over valid HTTPS and the host looked more like legitimate infrastructure. The mechanism is captured below in pseudocode; this is a description of the published structure, not a runnable skimmer.

*The skimmer's whole job: catch the button release on either input type, read the form, ship the JSON. Roughly 22 lines bolted onto a library that kept working.*

The fallout is worth stating in numbers because it is what pushed card skimming up the priority list for regulators. British Airways first put the figure around 380,000 transactions. The UK Information Commissioner’s Office, in its 2020 penalty notice, described personal data of approximately 429,612 customers and staff being affected, including the card numbers and CVV codes of around 244,000 payment-card holders. The ICO’s original July 2019 notice of intent proposed a fine of £183.39 million. By the time the final penalty landed in October 2020 it had been cut to £20 million, the reduction reflecting representations from BA, mitigating factors, and an explicit Covid-19 adjustment. Even at the reduced figure it was the largest fine the ICO had issued to that date.

Newegg, and why the same code keeps showing up

Roughly a week after the British Airways code went public, RiskIQ and Volexity found the same skimmer family on Newegg. The reuse was almost lazy. The base code was recognizable from the BA incident; what the attacker changed was the name of the form being serialized and the domain the data went to. This time the exfiltration host was neweggstats.com, registered on August 13, 2018, initially parked and then repointed to a Magecart drop server, with an SSL certificate obtained to match. The first card was stolen on August 14 and the skimming ran until the code was pulled on September 18.

One detail of the Newegg case matters for understanding how these crews think about stealth. The skimmer was placed on the payment-processing step itself, not in a script that ran on every page. A visitor only reached that step after adding an item to a cart and passing a validated address, which means the skimmer only executed for users who were genuinely about to pay. That selectivity has two payoffs. It keeps the malicious code off the home page and category pages where a casual researcher or an automated scanner is more likely to be looking, and it raises the signal-to-noise of the stolen data, since almost every execution captures a real, complete card. Modern loaders formalize this with an explicit check: run the full skimmer only if the URL or the page contents indicate a checkout, and stay dormant otherwise. It is the same instinct that drives sandbox-evasion fingerprinting in other malware, applied to the much simpler problem of “am I on a page worth skimming.”

The reuse pattern across BA, Newegg, Ticketmaster, Feedify, and hundreds of smaller stores is the reason “Magecart” stopped meaning one group. RiskIQ and others ended up numbering distinct groups by infrastructure and tradecraft, and by the time the campaign was mapped out the same researchers counted on the order of 800 e-commerce sites hit through the various vectors. A skimmer kit is cheap to clone and re-theme. The hard part is never the JavaScript. It is getting the JavaScript onto the page, and that is where the two main vectors diverge.

Two ways in: hack the store, or hack a script the store loads

The British Airways and Newegg skimmers both ended up modifying a file on the victim’s own infrastructure, which means the attacker first got write access to something the store served. That is the direct vector. Find a vulnerability in the e-commerce platform, gain enough access to edit a served file or a database-backed content block, and append the skimmer. Magento and Adobe Commerce stores have been the recurring target here because they are common, they hold card flows, and they have had a steady supply of critical bugs.

The 2024 CosmicSting vulnerability, CVE-2024-34102, is the clearest recent example of the direct vector at industrial scale. It is an XML external entity flaw in Adobe Commerce and Magento with a CVSS score of 9.8. The useful chain, as Sansec documented it, is not a single magic request. The XXE lets an attacker read arbitrary files, and the file that matters is app/etc/env.php, which holds the store’s crypt encryption key. With that key the attacker forges a valid JSON Web Token, and the forged token grants unrestricted access to the Magento REST API. From there the attack is pure API calls: enumerate CMS blocks with a search request, then update a block with PUT to append skimmer JavaScript that renders on the checkout. Sansec reported stores being compromised at a rate of five to thirty per hour after exploit details went public, named a string of competing crews working the bug, and put roughly five percent of all Adobe Commerce and Magento stores as having ended up with a checkout skimmer that summer. High-profile victims included Ray-Ban, National Geographic, Segway, and Cisco. The defensive twist that caught many operators out: patching the XXE did not help if the crypt key had already been read, because the forged tokens kept working until the key was rotated.

*The direct vector at scale. The bug is an XXE, but the payload is ordinary API traffic once the encryption key leaks.*
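Because the injection lands in CMS blocks, one cleanup step after rotating the key is to review every block for script tags. The sketch below assumes the blocks have already been exported as a list of dictionaries; the `identifier` and `content` field names and the sample data are assumptions made for illustration, not a confirmed description of the Magento API response.

```python
import re

# Matches any opening <script ...> tag, case-insensitively.
SCRIPT_TAG = re.compile(r"<script\b[^>]*>", re.IGNORECASE)


def blocks_with_scripts(blocks: list[dict]) -> list[str]:
    """Return the identifiers of CMS blocks whose content contains a script tag."""
    return [
        block["identifier"]
        for block in blocks
        if SCRIPT_TAG.search(block.get("content", ""))
    ]


# Hypothetical export of two CMS blocks.
exported_blocks = [
    {"identifier": "shipping-info", "content": "<p>Orders ship in two days.</p>"},
    {"identifier": "checkout-footer", "content": "<p>Thanks!</p><SCRIPT>/* unexpected */</SCRIPT>"},
]

print(blocks_with_scripts(exported_blocks))
```

A hit is only a prompt for manual review. Some stores put legitimate scripts in CMS blocks, so the useful signal is a script tag nobody on the team remembers adding.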

The second vector is the one that makes web skimming a supply-chain problem rather than a per-store one. Instead of compromising the store, compromise something the store loads. The Ticketmaster leg of the original Magecart campaign came in this way. Ticketmaster did not have to be hacked. The attacker tampered with scripts belonging to third-party suppliers that Ticketmaster embedded: the customer-support chatbot from Inbenta, and separately a social-media integration from a firm RiskIQ wrote as SociaPlus. Inbenta’s compromised JavaScript ran on Ticketmaster’s pages with Ticketmaster’s permission, and the skimmer rode in with it. The same campaign hit analytics vendors PushAssist and Annex Cloud and a CMS platform, each of which fanned the skimmer out to every customer site loading the vendor’s script. RiskIQ’s read at the time was that the crews had figured out it is easier to compromise one supplier of scripts than a thousand individual stores, and a single compromised supplier instantly affects everyone downstream.

The Polyfill.io episode in 2024 is the purest illustration of the supply-chain shape, even though its payload was malvertising redirects rather than a card skimmer. A widely embedded open-source CDN, cdn.polyfill.io, changed hands. Sansec reported on June 25, 2024 that the new operator had begun serving malicious JavaScript to sites embedding the script, with the injected code redirecting mobile users toward scam destinations through a typosquatted googie-anaiytics.com lookalike. Estimates put well over 100,000 sites embedding the script at the time. The Polyfill payload was redirect fraud, not skimming, but the delivery mechanism is identical to the Ticketmaster pattern: a trusted third-party script becomes hostile and every embedding site inherits whatever it now does. The post on the Polyfill.io supply-chain attack goes through that incident in detail.

Where the stolen data goes, and how the traffic hides

Reading the form is the easy half. Getting the data out without tripping a monitor is where modern skimmers spend their cleverness, and the trend since 2018 has been to stop sending obvious POST requests to obviously bad domains.

The oldest exfiltration trick still in heavy use is the image beacon. The skimmer builds a new Image() and sets its src to an attacker URL with the stolen data encoded into the query string or the path. The browser dutifully fetches the “image,” the request carries the card data outbound, and to a casual look at the network tab it is just another image load that happens to 404 or return a 1x1 pixel. No fetch, no XMLHttpRequest, nothing that screams data exfiltration. A related move hides the data inside an actual image using steganography, so the bytes leaving look like a legitimate PNG or GIF rather than a base64 card blob. Researchers have also documented skimmers that open a WebSocket and use it both to pull the second-stage payload and to stream stolen fields out over a long-lived TCP connection, which keeps the card data out of discrete, greppable HTTP requests entirely.
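From the defender's side, the image beacon has a recognizable shape: a request to a host the page has no reason to contact, carrying a long, random-looking query value. A minimal triage sketch follows, assuming the request URLs have already been captured (for example from a browser HAR export). The host names, the 64-character length cutoff, and the entropy threshold of 4.0 bits per character are all arbitrary assumptions for illustration.

```python
import math
import string
from collections import Counter
from urllib.parse import parse_qsl, urlsplit


def shannon_entropy(text: str) -> float:
    """Bits of entropy per character, computed from character frequencies."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def suspicious_beacons(urls, allowed_hosts, min_len=64, min_entropy=4.0):
    """Flag URLs to non-allowed hosts that carry a long, high-entropy query value."""
    flagged = []
    for url in urls:
        parts = urlsplit(url)
        if parts.hostname in allowed_hosts:
            continue
        for _, value in parse_qsl(parts.query, keep_blank_values=True):
            if len(value) >= min_len and shannon_entropy(value) >= min_entropy:
                flagged.append(url)
                break
    return flagged


# A stand-in for an encoded blob: 64 distinct characters.
blob = string.ascii_letters + string.digits + "-_"
captured = [
    "https://cdn.shop.example/pixel.gif?v=3",
    "https://cdn.shop.example/track.gif?d=" + blob,
    "https://img.collector.example/p.gif?d=" + blob,
]
print(suspicious_beacons(captured, allowed_hosts={"cdn.shop.example"}))
```

This only inspects query strings. Data packed into the URL path, hidden in an image body, or streamed over a WebSocket would pass straight through it.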

The most pointed recent evolution is exfiltration through services the store has already whitelisted. Sansec documented a campaign in late 2025 where the loader, the skimmer payload, and the stolen cards all moved through two domains every store trusts: Google Tag Manager and Stripe. A legitimate-looking GTM container delivered the loader, which fetched the skimmer payload from a Stripe customer record using a hardcoded key and ran it via new Function(). The skimmer captured the full card, billing address, and order total, XOR-encoded the result, and uploaded each stolen record as a fake customer in the attacker’s own Stripe account, splitting the data across metadata fields. Because the destination is api.stripe.com, a host any payment-handling site already allows, a Content Security Policy that only restricts destinations by domain waves it straight through. The exact field layout of the metadata packing is documented in that write-up; what matters for the defense discussion is the shape, which is that the exfiltration channel was a service the merchant could not block without breaking payments.

That last move is the one defenses struggle with. The image beacon and the raw WebSocket both go to hosts a monitor can flag, but a POST to api.stripe.com from a payment page is indistinguishable from the page doing its job. As CSP and monitoring caught the loud channels, skimmers moved their traffic onto hosts the merchant could not afford to block.

The same evasion instinct shows up in the skimmer’s own code. Payloads are heavily obfuscated with string-array packing, self-executing functions, and base64-wrapped indicators, which is the same toolkit covered in deobfuscating anti-bot JavaScript. Some variants geofence to fire only for visitors in target countries, and at least one documented strain went dormant the moment Chrome DevTools opened, on the theory that an open inspector means a researcher rather than a shopper. The loaders frequently masquerade as the very analytics or tag-manager snippets that legitimately litter a checkout page, so a skimmer disguised as a Google Tag Manager block does not look out of place to an operator skimming their own source.
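Several of the constructs named above leave textual traces even in obfuscated code, which makes a crude static scan a reasonable first pass when auditing what a checkout page loads. The sketch below is a triage heuristic under loud assumptions: the indicator list and labels are chosen here for illustration, plenty of legitimate code uses the same calls, and a packed payload can hide every one of them.

```python
import re

# Indicator patterns drawn from the techniques described in this post.
INDICATORS = {
    "dynamic-code": re.compile(r"\bnew\s+Function\s*\(|\beval\s*\("),
    "base64-decode": re.compile(r"\batob\s*\("),
    "websocket": re.compile(r"\bnew\s+WebSocket\s*\("),
    "image-beacon": re.compile(r"\bnew\s+Image\s*\("),
}


def scan_script(source: str) -> list[str]:
    """Return the sorted labels of every indicator found in the script source."""
    return sorted(
        label for label, pattern in INDICATORS.items() if pattern.search(source)
    )


sample = "var run = new Function(atob(payload)); run();"
print(scan_script(sample))               # ['base64-decode', 'dynamic-code']
print(scan_script("console.log('ok');")) # []
```

A result like this says where to look first, not what is malicious. A script on the payment step that both decodes base64 and builds code at runtime earns a manual read.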

Subresource Integrity, and the gap it leaves

The defense that maps most cleanly onto the third-party-script vector is Subresource Integrity. The idea is to pin a script tag to a cryptographic hash of the exact bytes you reviewed. You add an integrity attribute holding a hash, the browser computes the hash of what it actually downloaded, and if the two disagree it refuses to run the file and returns a network error. The attribute takes a value of the form algorithm-base64hash, with sha256, sha384, or sha512 as the allowed prefixes, and for a cross-origin script you also need crossorigin="anonymous" so the fetch goes through CORS and the bytes are eligible for integrity checking. A typical pin looks like <script src="https://cdn.example.com/lib.js" integrity="sha384-..." crossorigin="anonymous"></script>. If a CDN is compromised and starts serving a tampered file, the hash no longer matches and the browser drops the script instead of executing the skimmer.

That would have stopped the Ticketmaster-style and Polyfill-style attacks cold, because those depended on a known script being silently swapped for a malicious one. SRI’s limitation is structural and it is the reason it does not solve web skimming on its own. A hash pins exactly one version of a file. The moment a resource is meant to change, the pin breaks. Many of the most useful third-party scripts are tag managers, analytics, A/B testing, and personalization tools whose entire purpose is to update server-side without the embedding site redeploying. You cannot pin a hash to a file the vendor changes weekly, and you certainly cannot pin one to a script the vendor generates per request. SRI also does nothing about the direct vector. If the attacker has write access to the store and injects a skimmer inline into the page’s own HTML, there is no external script tag to pin and no hash to fail. The Subresource Integrity deep dive covers the mechanism and its single-digit adoption rate in more detail. For the purposes of skimming, SRI is a strong control for the specific case of a third-party script that is supposed to be stable, and silent on everything else.
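The pinning step is easy to reproduce by hand. The sketch below computes an integrity value from a file's bytes and then mimics the comparison the browser performs; the function names and the sample script contents are made up for illustration.

```python
import base64
import hashlib

ALLOWED = ("sha256", "sha384", "sha512")


def sri_value(content: bytes, algorithm: str = "sha384") -> str:
    """Build an integrity attribute value: '<algorithm>-<base64 of the digest>'."""
    if algorithm not in ALLOWED:
        raise ValueError(f"unsupported SRI algorithm: {algorithm}")
    digest = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def matches_pin(content: bytes, pinned: str) -> bool:
    """Recompute the hash with the pinned algorithm and compare it to the pin."""
    algorithm = pinned.partition("-")[0]
    return sri_value(content, algorithm) == pinned


reviewed = b"window.lib = { version: '1.0.0' };\n"
pin = sri_value(reviewed)
tampered = reviewed + b"/* code appended after review */\n"

print(pin)
print(matches_pin(reviewed, pin))  # True
print(matches_pin(tampered, pin))  # False
```

The tampered case is the British Airways pattern in miniature: the original bytes are all still there, but anything appended changes the hash. The same property is why a vendor's routine update breaks the pin just as surely as an attack does.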

Content Security Policy: domains, nonces, and the trusted-host hole

Content Security Policy attacks the problem from the other side. Rather than verifying the contents of a script, CSP restricts what is allowed to run and where data is allowed to go. Two directives carry most of the weight against skimming. The script-src directive controls which scripts execute, and connect-src controls which hosts the page may open network connections to, which is the directive that governs where a skimmer can send stolen data.

A weak script-src that allows 'unsafe-inline' or whitelists broad domains barely slows a skimmer down. The stronger pattern is nonce-based: the server emits a fresh random nonce on each response and only script elements carrying that nonce execute, so an injected inline skimmer with no matching nonce never runs. Pairing the nonce with 'strict-dynamic' extends that trust to scripts the nonced root script loads, while ignoring host allowlists, which makes the policy both stricter and easier to maintain than enumerating every CDN. On the exfiltration side, a tight connect-src that lists only the destinations the page genuinely needs means a skimmer trying to fetch an attacker domain is blocked by the browser, and a CSP report can fire on the attempt.
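As a rough sketch of the nonce pattern, the code below generates a fresh nonce per response and emits it in both the header and the script tag. It is framework-agnostic on purpose; `render_checkout`, the `/static/checkout.js` path, and the `https://api.payments.example` destination are hypothetical names, and a real policy would carry more directives than the four shown.

```python
import secrets


def build_csp(nonce: str, connect_hosts: list[str]) -> str:
    """Assemble a nonce-based policy with a narrow connect-src."""
    directives = [
        f"script-src 'nonce-{nonce}' 'strict-dynamic'",
        " ".join(["connect-src", "'self'", *connect_hosts]),
        "object-src 'none'",
        "base-uri 'none'",
    ]
    return "; ".join(directives)


def render_checkout() -> tuple[dict[str, str], str]:
    """Return (headers, body) for one response; the nonce is never reused."""
    nonce = secrets.token_urlsafe(16)
    headers = {
        "Content-Security-Policy": build_csp(nonce, ["https://api.payments.example"])
    }
    body = f'<script nonce="{nonce}" src="/static/checkout.js"></script>'
    return headers, body


headers, body = render_checkout()
print(headers["Content-Security-Policy"])
print(body)
```

The nonce has to be unpredictable and regenerated for every response. A nonce cached along with the page, or hardcoded in a template, is one an injected script can simply copy.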

The hole in CSP is the one the Stripe and Google Tag Manager campaigns drove a truck through. CSP restricts by origin, not by intent. If your checkout legitimately talks to api.stripe.com and loads googletagmanager.com, those hosts are in your policy, and a skimmer that exfiltrates through them inherits the permission. A nonce stops an injected inline script, but it does nothing about a skimmer that arrived inside an already-trusted third-party script that carries its own valid nonce or runs under 'strict-dynamic'. CSP and SRI are complementary for this reason: CSP narrows where code and data can flow, SRI verifies that a specific trusted file has not changed, and neither one alone closes the supply-chain case where the trusted thing itself turns hostile. The full treatment of policy construction lives in the Content Security Policy post.
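The trusted-host hole is easy to see with a toy version of the browser's decision. The function below is a deliberately simplified stand-in for connect-src matching, comparing only scheme and exact host and ignoring wildcards, ports, and paths; `drop.example` is a hypothetical attacker host.

```python
from urllib.parse import urlsplit


def host_allowed(url: str, connect_src: list[str]) -> bool:
    """Simplified connect-src check: allow when scheme and host match a listed source."""
    target = urlsplit(url)
    for source in connect_src:
        allowed = urlsplit(source)
        if (allowed.scheme, allowed.hostname) == (target.scheme, target.hostname):
            return True
    return False


# A policy for a checkout that legitimately uses Stripe and Google Tag Manager.
policy = ["https://api.stripe.com", "https://www.googletagmanager.com"]

print(host_allowed("https://drop.example/collect", policy))        # False
print(host_allowed("https://api.stripe.com/v1/customers", policy)) # True
```

The second call returns True whether the request comes from the merchant's own integration or from a skimmer writing to someone else's account. The policy sees a destination, and the destination is allowed.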

PCI DSS 4.0: the requirements that made this mandatory

For years the client-side of the payment page was a regulatory blind spot. PCI DSS, the card-industry standard every merchant handling cards is measured against, focused on the server, the network, and the cardholder-data environment. The skimmer lives in the shopper’s browser, on a page the merchant serves but does not always treat as part of the secured perimeter. Version 4.0 of the standard closed that gap with two requirements aimed squarely at Magecart, and after a transition period both became mandatory on March 31, 2025.

Requirement 6.4.3 governs the scripts on the payment page. Every script that loads and executes in the consumer’s browser on a payment page must be authorized, its integrity must be assured, and the merchant must keep an inventory of every script with a written justification for why each one is needed. In practice that pushes merchants toward exactly the controls above: a managed list of allowed scripts, integrity verification through SRI or an equivalent, and the discipline of knowing what is on the page at all. Requirement 11.6.1 is the detection half. It requires a tamper- and change-detection mechanism that evaluates the HTTP headers and the contents of the payment page, alerts personnel when an unauthorized modification appears, and runs at least every seven days or on a cadence set by the merchant’s own risk analysis. Together they say a merchant must both control which scripts run and notice when the page changes underneath them.

*The two requirements split the job: 6.4.3 controls what runs, 11.6.1 notices when something changed.*
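One way to picture the 6.4.3 inventory is as a small data structure checked against what the page actually loads. This is an illustrative sketch, not a compliance tool: the record fields, the category names, and the example URLs are assumptions, and a missing SRI pin is flagged here only as something to review, since the requirement allows integrity to be assured by other means.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScriptRecord:
    url: str
    justification: str               # written reason the script is needed
    integrity: Optional[str] = None  # SRI value, when the file is stable enough to pin


def inventory_gaps(observed_urls: list[str], inventory: list[ScriptRecord]) -> dict[str, list[str]]:
    """Compare scripts seen on the payment page against the approved inventory."""
    known = {record.url for record in inventory}
    return {
        "unauthorized": sorted(set(observed_urls) - known),
        "unjustified": sorted(r.url for r in inventory if not r.justification.strip()),
        "unpinned": sorted(r.url for r in inventory if r.integrity is None),
    }


inventory = [
    ScriptRecord("https://shop.example/static/checkout.js", "Renders the payment form", "sha384-..."),
    ScriptRecord("https://tags.vendor.example/loader.js", ""),
]
observed = [
    "https://shop.example/static/checkout.js",
    "https://tags.vendor.example/loader.js",
    "https://cdn.unknown.example/widget.js",
]
print(inventory_gaps(observed, inventory))
```

The "unauthorized" bucket is the one that matters most in practice: a script running on the payment page that nobody approved.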

There is a structural reason these requirements exist as detection-and-inventory rather than a single technical mandate. No one control covers every vector. SRI handles a stable third-party script, CSP narrows the blast radius, and neither catches a skimmer injected inline through a server compromise or one exfiltrating through a whitelisted API. A change-detection mechanism that diffs the rendered payment page against a known-good baseline is the backstop that can catch the cases the preventive controls miss, because a new script reference or a modified inline block shows up as a diff regardless of how it got there. The requirement to re-evaluate at least weekly is a direct answer to dwell time. The British Airways skimmer ran for roughly fifteen days. A seven-day detection floor would have cut that window in half at worst.
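The diffing idea can be sketched with the standard library alone. The code below fingerprints every script on a page, external ones by URL and inline ones by a hash of their contents, and reports anything present now that was absent from the baseline. It works on static HTML, which is a simplifying assumption: it would miss scripts added at runtime by other scripts, and it ignores the HTTP headers that 11.6.1 also covers.

```python
import hashlib
from html.parser import HTMLParser
from typing import List, Optional


class ScriptCollector(HTMLParser):
    """Collect a fingerprint for every <script> element in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.scripts: set = set()
        self._inline: Optional[List[str]] = None

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        src = dict(attrs).get("src")
        if src:
            self.scripts.add(f"src:{src}")
        else:
            self._inline = []  # start buffering an inline script body

    def handle_data(self, data):
        if self._inline is not None:
            self._inline.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._inline is not None:
            body = "".join(self._inline).strip()
            digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
            self.scripts.add(f"inline:sha256:{digest}")
            self._inline = None


def script_fingerprints(html: str) -> set:
    collector = ScriptCollector()
    collector.feed(html)
    collector.close()
    return collector.scripts


def unauthorized_changes(baseline_html: str, current_html: str) -> set:
    """Fingerprints present on the current page but not in the baseline."""
    return script_fingerprints(current_html) - script_fingerprints(baseline_html)


baseline = '<body><script src="/static/checkout.js"></script><script>initCheckout();</script></body>'
current = baseline.replace("</body>", "<script>/* injected block */</script></body>")
print(unauthorized_changes(baseline, current))
```

Because inline scripts are tracked by content hash, an edit to an existing inline block surfaces the same way a brand-new one does: as a fingerprint the baseline has never seen.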

What the attack class tells you

Web skimming has stayed near the top of the e-commerce threat list for eight years for a reason that has nothing to do with sophistication. The technique is trivial. A first-year JavaScript developer could write the reading half of a skimmer in an afternoon, because reading form values is what the language is for. The attack persists because the browser’s trust model puts every script the page loads on equal footing with the page itself, and a checkout page necessarily loads scripts from parties the merchant does not fully control. The defenses that work are the ones that treat that trust as something to be earned per-file and per-destination rather than granted by inclusion, and even those leave a gap whenever the trusted party is the one that turns hostile.

The arc from British Airways to CosmicSting to the Stripe-channel skimmers is an arc of the data getting quieter on the wire while the entry point stays the same. The 22 lines bolted onto a Modernizr copy in 2018 sent JSON to a lookalike domain over a freshly bought certificate. The 2025 version reads the same form fields, but it arrives through a tag manager, hides in a service the merchant pays for, and ships the card through a Stripe endpoint the merchant cannot block. The thing being stolen has not changed in any of it. It is the card number, the expiry, and the three digits on the back, read out of a DOM node a half-second before the shopper clicks pay, by code the page was told to run.

Sources & further reading

RiskIQ / Yonathan Klijnsma, via BleepingComputer (2018), British Airways Fell Victim To Card Scraping Attack — the original teardown of the BA skimmer, the Modernizr modification, mouseup/touchend binding, and the baways.com domain.
Dan Schoenbaum (2019), Inside the Breach of British Airways: How 22 Lines of Code Claimed 380,000 Victims — line-by-line description of the BA payload, the paymentForm and personPaying serialization, and the JSON exfiltration.
Security Affairs (2018), Magecart stole customers’ credit cards from Newegg electronics retailer — the Newegg case, neweggstats.com registration, the 15-line insert, and code reuse from British Airways.
Threatpost (2018), Ticketmaster Breach: Just One Part of a Wide-Ranging Campaign — the supply-chain vector via Inbenta and SociaPlus and RiskIQ’s count of around 800 affected sites.
UK Information Commissioner’s Office, summarized by Pinsent Masons (2020), British Airways fined £20m over GDPR breach — the final penalty, the 429,612 affected figure, and the reduction from the £183.39m notice of intent.
Sansec (2024), CosmicSting attack & defense overview — the XXE-to-JWT-to-API chain of CVE-2024-34102, crypt-key theft from env.php, CMS-block injection, and the named exploiting groups.
Sansec (2025), Magecart skimmer turns Stripe into a malware command server — exfiltration through Google Tag Manager and the Stripe API, the new Function() loader, and the metadata packing of stolen cards.
Akamai (2023), Magecart Attack Disguised as Google Tag Manager — an inline loader posing as a GTM snippet, WebSocket exfiltration, base64 obfuscation, and checkout-only triggering.
Sansec, via Qualys (2024), Polyfill.io Supply Chain Attack: What You Need to Know — the cdn.polyfill.io takeover, the googie-anaiytics.com redirect, and the 100,000-plus affected sites.
MDN Web Docs, Subresource Integrity — the integrity attribute syntax, sha256/384/512 algorithms, the crossorigin requirement, and the refuse-to-load behavior on mismatch.
DataDome learning center (2025), PCI Requirements 6.4.3 & 11.6.1 Explained — the client-side script-management and tamper-detection requirements and the March 2025 mandatory date.
OWASP, Content Security Policy Cheat Sheet — nonce-based script-src, strict-dynamic, connect-src, and the limits of host-based policies.