
The HeadlessChrome user-agent token and the long tail of headless signals

Copyright: MIT
Figure: the string HeadlessChrome as a monospace wordmark, with the word Headless struck through in orange.

There is a substring that ends a scraping session before the page has finished parsing. It is HeadlessChrome, eight extra characters that Chrome has been printing in its user-agent header since 2017, and for years it was the single cheapest bot signal anyone could ask for. No JavaScript, no challenge, no round trip. The first line of the first request carries a confession, and a regex against the User-Agent header is enough to act on it. The interesting part is not that the token exists. It is what happened when Chrome’s own engineers decided to take it out, and what that change did and did not fix.

Because removing a string from a header is the easy part. The token was never the real tell; it was the convenient one. Underneath it sits a long tail of subtler differences between a browser that draws to a screen and one that does not: permissions that report contradictory states, a graphics stack that falls back to software, codecs that are missing from the build, a debugging protocol whose side effects ripple back into the JavaScript heap. This post is about that tail. Where the token came from, why --headless=new still leaks, and which of the second-order signals are durable rather than cosmetic.

The route is roughly chronological. First, the origin of the token in the 2017 headless launch and why it was so trivial to match. Then the 2023 rewrite that unified headless and headful Chrome and dropped the substring, and what that did to the classic JavaScript checks. Then the long tail proper: the permissions inconsistency, the WebGL renderer, the codec gaps, the window-geometry tells. Then the CDP side-channel that became the most-discussed headless signal of the decade, and the V8 change in 2025 that quietly disarmed its most famous form. The through-line is that a header is a label, and a label is the first thing an adversary edits.

2017: where the token came from

Headless Chrome was announced in April 2017 and shipped in Chrome 59 for Mac and Linux, with Windows following in Chrome 60. The pitch was modest and entirely legitimate. Run Chrome without the chrome, as the announcement put it, for automated testing and server environments where there is no display to draw to. You launched it with --headless, usually alongside --disable-gpu and --remote-debugging-port=9222, and you drove it over the DevTools protocol or through a wrapper like Puppeteer.

The mode was a separate code path. Old headless was an alternate browser implementation that happened to ship inside the same binary, and it did not share much with the browser that real users ran. That architecture is the reason it leaked. A great many things the full browser does because it has a window, a GPU process, and a human in front of it, headless simply did not do, because it had none of those. And one of the most visible differences was deliberate: the build advertised itself in the user-agent string.

A default old-headless UA on macOS looked like this:

Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36
(KHTML, like Gecko) HeadlessChrome/111.0.5555.0 Safari/537.36

The only difference from a normal Chrome string is the product token. Headful Chrome writes Chrome/111.0.5555.0; headless wrote HeadlessChrome/111.0.5555.0. Everything else matched. So the cheapest possible detector is a substring search on the User-Agent request header, evaluated at the edge before any application code runs, with zero client cooperation required. No vendor needed a JavaScript tag to catch this. It was free.
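A minimal sketch of that edge check, assuming request headers arrive as a plain object with lowercased keys. The names HEADLESS_TOKEN and hasHeadlessToken are illustrative, not taken from any particular vendor or framework.

```javascript
// Matches the product token old headless Chrome writes in place of "Chrome/".
// No global flag, so repeated .test() calls carry no state between requests.
const HEADLESS_TOKEN = /\bHeadlessChrome\/[\d.]+/;

// headers: plain object of request headers with lowercased keys
// (a hypothetical shape; adapt to whatever the edge runtime provides).
function hasHeadlessToken(headers) {
  const ua = headers["user-agent"] || "";
  return HEADLESS_TOKEN.test(ua);
}
```

The check needs nothing from the client beyond the first request, which is the whole appeal. It says nothing about a client that has rewritten the header.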

Figure: the two user-agent strings side by side.

headful Chrome: ...like Gecko) Chrome/111.0.5555.0 Safari/537.36
old headless: ...like Gecko) HeadlessChrome/111.0.5555.0 Safari/537.36

Eight characters. Matched at the edge, before any application code runs. The cheapest bot signal in the stack, and the first one any operator strips.

*The entire difference is the product token. A header is a label, and a label is editable.*

The catch, and the reason the token was never as decisive as it looked, is that the header is a string the client controls. Puppeteer’s page.setUserAgent, Playwright’s context option, a Chrome launch flag, all of them rewrite it before the first request leaves. So any operator past the absolute beginner level was already sending a clean UA by 2018. The token caught the lazy and the careless and nobody else. What it was genuinely good for was correlation: a clean Chrome/ token in the header next to navigator.webdriver === true in the body is a contradiction, and a forged UA that does not match the rest of the environment is a louder signal than no forgery at all. The token’s value migrated from standalone evidence to corroborating evidence almost immediately.

That migration matters for everything that follows. Once detection moves from the header into the JavaScript environment, the question stops being what does it claim and becomes does the claim survive contact with the runtime. The headless detection catalog walks the full inventory of those runtime checks; this post stays on the narrower thread of the token itself and the tells that outlived it.

2018: the first JavaScript inventory

Before the token went away, researchers had already built out the JavaScript layer that would matter more. The canonical early write-ups, Antoine Vastel’s 2017 and 2018 headless-detection posts and Intoli’s rebuttal arguing it was not actually possible to block headless cleanly, established the menu that anti-bot vendors still draw from. Most of those checks targeted the same root cause as the token: old headless was a different build doing different things.

navigator.webdriver was the honest one. The WebDriver specification defines it, and Chrome sets it to true when the browser is under automation, so reading it is a one-line check that needs no cleverness. It is also the first thing any stealth setup disables, historically with --disable-blink-features=AutomationControlled, so by itself it is weak. As a cross-check against a clean UA, it is the same corroboration pattern as the token.
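The corroboration pattern can be sketched as a pure function over two observations. The function name and the way the two values reach it are assumptions for illustration; in a page they would come from navigator.userAgent and navigator.webdriver.

```javascript
// Returns true when the UA claims an ordinary Chrome product token while the
// automation flag says the browser is being driven.
// The \b keeps "HeadlessChrome/" from matching, since "s" and "C" are both
// word characters and there is no boundary between them.
function uaContradictsWebdriver(userAgent, webdriverFlag) {
  const claimsOrdinaryChrome = /\bChrome\/[\d.]+/.test(userAgent);
  return claimsOrdinaryChrome && webdriverFlag === true;
}

// In a page (not run here):
//   uaContradictsWebdriver(navigator.userAgent, navigator.webdriver)
```

The function deliberately returns false for an honest HeadlessChrome string: that case is already caught by the header match, and what scores here is the disagreement, not the automation itself.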

navigator.plugins.length === 0 worked because old headless shipped no PDF plugin entries where a real desktop Chrome reported a small fixed set. window.chrome being absent or stripped of its runtime member worked because the object that headful Chrome populates was not fully present. navigator.languages coming back empty worked for the same structural reason. None of these were clever. They were all the same observation from different angles: this build is not the build a person runs.
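Those three checks fit in a few lines. The sketch below takes a window-like object as a parameter so it can be exercised outside a browser; the function name and the tell labels are made up for illustration, and the checks describe old headless only.

```javascript
// Collects the 2018-era structural tells from a window-like object.
// In a page this would be called as legacyHeadlessTells(window).
function legacyHeadlessTells(win) {
  const nav = win.navigator || {};
  const tells = [];
  // Old headless exposed no plugin entries.
  if (!nav.plugins || nav.plugins.length === 0) tells.push("no-plugins");
  // window.chrome absent, or present without its runtime member.
  if (!win.chrome || !win.chrome.runtime) tells.push("chrome-object-stripped");
  // navigator.languages came back empty.
  if (!nav.languages || nav.languages.length === 0) tells.push("no-languages");
  return tells;
}
```

Returning a list of labels rather than a boolean keeps each observation available for scoring alongside other evidence.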

The most elegant of the lot did not test a missing feature. It tested an internal contradiction. Old headless could not actually surface a permission prompt, because there was no window to draw one in, so it short-circuited. Query the Notifications permission two ways and the two answers disagreed.

Specifically, Notification.permission, the legacy DOM property, returned "denied", while navigator.permissions.query for the same subject reported a state of "prompt". A genuine browser keeps those two views consistent. Headless did not, and the mismatch was effectively impossible to forge without re-implementing the permissions plumbing. This is the template for every durable headless signal: not a feature that is present or absent, but two observations that a real environment would keep in agreement and a fake one cannot.
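A sketch of the two-view comparison, split into a pure comparison and a small gatherer. Both function names are hypothetical, and the env parameter stands in for the page's global object. The one mapping assumed here is that a legacy value of "default" corresponds to a Permissions API state of "prompt".

```javascript
// Pure comparison of the two views of the Notifications permission.
// The legacy property uses "default" where the Permissions API says "prompt";
// after that normalization, a consistent browser reports the same value twice.
function notificationViewsDisagree(legacyPermission, queriedState) {
  const expected = legacyPermission === "default" ? "prompt" : legacyPermission;
  return expected !== queriedState;
}

// Gatherer: env must expose Notification and navigator.permissions,
// as a browser window does. In a page: probeNotificationPermission(window).
async function probeNotificationPermission(env) {
  const status = await env.navigator.permissions.query({ name: "notifications" });
  return notificationViewsDisagree(env.Notification.permission, status.state);
}
```

Comparing the normalized pair, rather than hardcoding "denied" against "prompt", keeps the check meaningful if the specific contradictory values differ from build to build.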

2023: the rewrite that took the token out

In Chrome 112, shipped in early 2023, Chrome’s engineers replaced old headless with a new implementation. The change had been telegraphed since late 2022 behind a flag that started life as --headless=chrome and was renamed to --headless=new in Chrome 109. Selenium deprecated its setHeadless() convenience method in 4.8.0, removed it in 4.10.0, and told users to pass --headless=new as an argument instead, precisely because there were now two modes to choose between and the library should not pick for you.

The architecture was the point. Where old headless was a separate browser implementation that shared little with headful Chrome, new headless is the same Chrome. It creates platform windows, it just never displays them. Chrome now has unified headless and headful modes running the same code, which means the great majority of the structural differences the 2018 checks relied on simply stopped existing. And the user-agent followed the architecture. New headless drops the HeadlessChrome token and reports the ordinary Chrome/ product string. The confession in the header is gone by default.

Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36
(KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36

With the same codebase came the same JavaScript surface. navigator.plugins now returns the realistic set. window.chrome is present and populated. navigator.languages is no longer empty. The 2018 checklist, the one built on old headless being a different build, mostly went dark in a single release. A vendor still leaning on plugin counts or the absence of window.chrome was, from Chrome 112 onward, looking at a browser that passed those tests because it really was Chrome.

This is the moment the field’s center of gravity moved. When the easy structural tells evaporate, you are left with the residue: the handful of things that still differ even when the build is identical, because they depend not on the build but on the environment the build is running in. A browser with no display, no GPU, and no human is still a browser with no display, no GPU, and no human, however faithfully its code matches the headful path.

Figure: timeline of the token.

2017: Chrome 59, token born
2018: JS inventory, permissions tell
2023: Chrome 112, --headless=new, token dropped
2025: Chrome 132, old mode split into a binary

Old headless did not vanish in 2023. Since Chrome 132 it ships separately as chrome-headless-shell, which still prints the token by default. Two modes now coexist; only one of them confesses in the header.

*The token did not disappear so much as split. New headless dropped it; the old mode lives on as a standalone binary that still carries it.*

One wrinkle keeps the token relevant. Old headless was not deleted. Since Chrome 132.0.6793.0 it ships as a standalone binary called chrome-headless-shell, which Puppeteer exposes through headless: 'shell'. People still reach for it because it is lighter and faster than driving the full browser. And it still prints HeadlessChrome by default. So the cheapest signal in the stack is not dead. It is narrower: it now catches the subset of automation that deliberately chose the lightweight old mode and did not bother to overwrite the header. That is a real population, and a free header match is still worth running, but it is no longer the front line.

The long tail, part one: things a browser without a screen cannot fully fake

Strip the token and the structural JavaScript checks and you reach the signals that survive the rewrite. They survive because they do not test the browser’s code. They test its situation.

Start with the permissions inconsistency, because it outlived old headless in a slightly different form. New headless still cannot present a real permission prompt, and the reporting can land in a state where the Permissions API and the legacy DOM property describe the Notifications permission differently for the same origin. The exact behaviour drifts across Chrome versions and platforms, so this is not a stable single-bit check the way it was in 2018, and the contradictory-pair value depends on the build. But the underlying reason is durable: a browser with no UI surface for a prompt is in an unusual state with respect to permissions, and unusual states have a way of leaving observable seams. Treat the specific values as version-dependent and verify them against the build in front of you rather than trusting a hardcoded pair.

The graphics stack is the sturdier of the two. By default, headless Chrome runs with no GPU, so WebGL falls back to a software rasterizer. The renderer string a page can read back through the WEBGL_debug_renderer_info extension, via the UNMASKED_RENDERER_WEBGL and UNMASKED_VENDOR_WEBGL parameters, then describes software rendering rather than real hardware. On Windows, normal Chrome reports an ANGLE-over-Direct3D string naming the actual GPU, something on the order of ANGLE (Intel, Intel(R) UHD Graphics ... Direct3D11 ..., D3D11). A software-rendered instance instead reports ANGLE over SwiftShader, a string containing SwiftShader and the marker 0x0000C0DE. On Linux the equivalent fallback names Mesa or llvmpipe.

Figure: the renderer string read back in each mode.

UNMASKED_RENDERER_WEBGL, headful Chrome with a real GPU: ANGLE (Intel, Intel(R) UHD Graphics Direct3D11 vs_5_0 ps_5_0, D3D11)
UNMASKED_RENDERER_WEBGL, headless with software fallback: ANGLE (Google, Vulkan ... (SwiftShader Device ... (0x0000C0DE)), ...)

A software renderer on hardware that claims a gaming desktop is the contradiction.

*The renderer the page reads back betrays software rendering. The signal is not "SwiftShader is a bot" but "this renderer disagrees with everything else the device claims to be."*

The renderer string on its own is not proof of automation. Plenty of real users run on virtual machines, remote desktops, or hardware where Chrome falls back to software for legitimate reasons, and an operator can pass flags to force a hardware-like path or spoof the string outright. The value is again in the cross-check. A device whose user-agent claims a current consumer laptop, whose screen reports a 2560-pixel display, and whose WebGL renderer is a software rasterizer is internally inconsistent, and the inconsistency is what scores. This is the same logic the WebGL and device-fingerprinting layer runs on: a single value is weak, a value that contradicts its neighbors is strong.
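A rough sketch of reading the renderer and scoring it only as a contradiction. The function names and the claims.expectsHardwareGpu field are hypothetical; how a defender decides that a device "should" have a hardware GPU is outside this sketch. The pattern matches SwiftShader and llvmpipe but not a bare "Mesa", on the assumption that hardware drivers on Linux can also carry that name.

```javascript
// Substrings that indicate a software rasterizer in the unmasked renderer.
const SOFTWARE_RENDERER = /SwiftShader|llvmpipe/i;

// Reads the unmasked renderer from a canvas-like object, or returns null
// when WebGL or the debug extension is unavailable.
function readUnmaskedRenderer(canvas) {
  const gl = canvas.getContext("webgl");
  if (!gl) return null;
  const ext = gl.getExtension("WEBGL_debug_renderer_info");
  if (!ext) return null;
  return gl.getParameter(ext.UNMASKED_RENDERER_WEBGL);
}

// Flags only the disagreement: software rendering on a device whose other
// claims imply real graphics hardware.
function rendererContradictsClaims(renderer, claims) {
  if (!renderer) return false;
  return SOFTWARE_RENDERER.test(renderer) && claims.expectsHardwareGpu === true;
}

// In a page (not run here):
//   readUnmaskedRenderer(document.createElement("canvas"))
```

A missing extension returns null and scores as nothing, since the absence of the string is not itself evidence of software rendering.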

Codecs are a quieter member of the same family. Open-source Chromium builds, the kind a lot of automation runs on, omit the proprietary codecs that the official Google Chrome binary licenses, because H.264, AAC, MP3, and MP4 demuxing sit behind the proprietary_codecs and ffmpeg_branding="Chrome" build flags. Ask such a build whether it can play video/mp4; codecs="avc1.42E01E" or AAC audio, and HTMLMediaElement.canPlayType answers with the empty string where official Chrome answers "probably", while MediaSource.isTypeSupported returns false where official Chrome returns true. A browser that sends an official Chrome/ user-agent but cannot claim the codecs that build ships with is contradicting its own header. This catches the Chromium-versus-Chrome mismatch specifically, not headlessness as such, but in practice the two overlap heavily because so much automation runs on the open-source build.
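As a sketch, the codec check reduces to one canPlayType call compared against what the header claims. The function name is illustrative, and the media element is passed in so the logic can run against a stand-in object; in a page it would be a video element.

```javascript
// H.264 Baseline in an MP4 container, the MIME string used in the text above.
const H264_MP4 = 'video/mp4; codecs="avc1.42E01E"';

// Returns true when the UA carries a Chrome/ product token but the build
// reports no H.264 support. canPlayType returns "" for unsupported types.
function codecContradiction(userAgent, mediaElement) {
  const claimsChrome = /\bChrome\/[\d.]+/.test(userAgent);
  const answer = mediaElement.canPlayType(H264_MP4);
  return claimsChrome && answer === "";
}

// In a page (not run here):
//   codecContradiction(navigator.userAgent, document.createElement("video"))
```

Because open-source Chromium also sends a Chrome/ token, a hit here identifies the build rather than proving automation, so it belongs in a score rather than a block rule.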

Window and display geometry round out this part of the tail. A headless browser has no real screen, so the properties describing one are easy to leave at defaults or set to values that do not hang together. window.outerWidth and window.outerHeight matching the inner dimensions exactly implies no browser chrome, which a real windowed browser never shows. screen.availWidth equal to screen.width with no taskbar or dock carved out, devicePixelRatio pinned to a flat 1 on hardware that should report fractional scaling, a viewport that never resizes across a session: none of these is decisive, all of them are cheap, and the operator has to get every one of them internally consistent with every other claim. The defense is not a single killer check. It is a budget of small consistency constraints, each trivial to satisfy alone and collectively expensive to satisfy all at once.
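Two of those geometry constraints, sketched as a function over a window-like object. The function name and flag labels are invented for illustration, and either flag alone can be true for a real user, for example in fullscreen, which is why the function reports observations rather than a verdict.

```javascript
// Collects cheap geometry observations from a window-like object.
// In a page this would be called as geometryFlags(window).
function geometryFlags(win) {
  const flags = [];
  // Outer size equal to inner size implies no browser chrome around the page.
  if (win.outerWidth === win.innerWidth && win.outerHeight === win.innerHeight) {
    flags.push("outer-equals-inner");
  }
  // Available area equal to the full screen implies nothing reserved for a
  // taskbar, dock, or menu bar.
  const s = win.screen;
  if (s.availWidth === s.width && s.availHeight === s.height) {
    flags.push("no-space-reserved-for-os-ui");
  }
  return flags;
}
```

The cost to an operator comes from keeping these values consistent with the user-agent, the renderer, and each other for an entire session, not from clearing any single one.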

The long tail, part two: the CDP side channel

The most-discussed headless signal of the last few years has nothing to do with what the browser looks like and everything to do with how it is being driven. Puppeteer, Playwright, Selenium over CDP, and most of the modern automation stack speak the Chrome DevTools Protocol to the browser. And the act of speaking that protocol can leak.

The classic form rode on a single CDP command. To execute script in a page and read console output, an automation client typically calls Runtime.enable, which switches on the protocol’s Runtime domain. Once that domain is enabled, the browser starts emitting Runtime.consoleAPICalled events and, more importantly, changes how it handles certain objects when they would be serialized for the inspector. The detection exploited a getter side effect. A page creates an Error object, defines a getter on its stack property, and arranges for that error to be logged. When the Runtime domain is enabled, Chrome serializes the error for the would-be inspector, the serialization reads .stack, and the getter fires. The getter firing is the tell. No real user has the Runtime domain enabled, so in an ordinary browser the getter never runs.

Figure: the getter side channel.

1. The page builds an Error with a getter on .stack.
2. The page logs it: console.debug(err).
3. No CDP: the getter stays silent.
4. Runtime domain enabled (automation present): the inspector serializes the error, reads .stack, and the getter fires.

The page never touches CDP. It only observes whether a side effect that requires an enabled Runtime domain happens to occur. The driver convicts itself.

*The page cannot see CDP directly. It plants a getter and watches whether the inspector's serialization trips it. If the getter fires, something is inspecting, and ordinary browsing never inspects.*

What made this signal beloved is that the page never has to touch the protocol. It cannot, from inside the sandbox. It only has to observe a side effect that the protocol, when active, produces in the page’s own object graph. The driver convicts itself by being attached. That is a categorically different kind of tell from a missing plugin or a software renderer, and it caught a large amount of stock Puppeteer and Playwright traffic, because both call Runtime.enable by default to wire up execution contexts and console relay.
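The classic form of the probe is short enough to sketch. The function name is hypothetical, and the logging call is passed in as a parameter so that the mechanism can be shown without a browser; in a page the argument would wrap console.debug. Treat it as a description of the historical technique rather than a working detector, given the V8 changes discussed below.

```javascript
// Plants a getter on an Error's stack property, hands the error to a logging
// function, and reports whether anything read the property.
function stackGetterFires(log) {
  let fired = false;
  const err = new Error("probe");
  Object.defineProperty(err, "stack", {
    configurable: true,
    get() {
      fired = true; // something serialized the error and touched .stack
      return "";
    },
  });
  log(err);
  return fired;
}

// In a page (not run here):
//   stackGetterFires((e) => console.debug(e))
```

Outside a browser the result depends entirely on the logger supplied: a console implementation that formats errors eagerly will read .stack by itself, so the probe only means something in an environment where ordinary logging leaves the object untouched.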

The evasion conversation that grew around this signal is well documented and turns on not enabling the Runtime domain in the obvious way: creating execution contexts through Page.createIsolatedWorld, or enabling and immediately disabling the domain to capture Runtime.executionContextCreated without leaving it on, and injecting page scripts through Page.addScriptToEvaluateOnNewDocument. The point worth keeping is structural rather than tactical. The leak existed because a debugging protocol designed for developers was never designed to be invisible to the page being debugged, and the automation libraries enabled the noisiest part of it by default.

Then, in May 2025, the most famous form of the signal stopped working, and almost nobody noticed at the time. Two V8 changes, one landing on 7 May 2025 to avoid error side effects in DevTools and one on 9 May 2025 to apply a getter guard throughout error preview, changed how the inspector previews error objects. The fix lives in V8’s inspector value-mirror code and makes the preview skip user-defined getters, the ones with a real script id, rather than invoking them. After that, the inspector can serialize an error for preview without firing the page’s getter on .stack, .name, or .message. The classic console-getter trick goes quiet. Notably, the change was not motivated as an anti-detection measure; it was a correctness fix for side effects during inspection, and disarming the bot signal was a side effect of removing a side effect.

That sequence is the whole story of headless detection in one move. A signal that everyone in the field knew, that worked reliably for years, was neutralized not by an evasion vendor but by the browser’s own engineers fixing something unrelated. Detection built on incidental engine behaviour inherits the lifespan of that behaviour, and the engine team owes the detection nothing. The JavaScript-runtime fingerprinting thread covers the wider family of these error-object and inspector side channels and how V8’s 2025 changes reshaped them; the headless-specific lesson is narrower and sharper. The CDP getter trick was the HeadlessChrome token of its generation: famous, effective, and load-bearing right up until the moment the platform moved the floor.

What is actually durable

Pull back from the catalog and a shape appears. The signals that lasted are not the ones that found a missing feature. They are the ones that found a contradiction the environment could not resolve. The permissions inconsistency lasted across two headless architectures because a browser without a UI is in a genuinely odd state about prompts, and odd states leak. The WebGL renderer keeps biting because a software rasterizer on a device claiming gaming hardware is a claim at war with itself. The codec gap works because an official user-agent on a build that lacks the official codecs is a forgery the build cannot back up. Geometry, codecs, permissions, renderer: each is weak alone and each constrains the others, and the cost to an operator is not defeating any one check but keeping every claim consistent with every other claim across a whole session.
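One way to picture "each is weak alone and each constrains the others" is a weighted sum over boolean observations. Everything in the sketch below is illustrative: the signal names, the weights, and the idea of a plain additive score are assumptions made for the example, not a description of how any vendor scores traffic.

```javascript
// Illustrative weights only. Integers keep the arithmetic exact.
const ILLUSTRATIVE_WEIGHTS = {
  headlessToken: 5,
  uaContradictsWebdriver: 4,
  permissionViewsDisagree: 3,
  softwareRendererOnClaimedHardware: 3,
  missingOfficialCodecs: 2,
  geometryInconsistent: 1,
};

// Sums the weights of every signal observed as strictly true.
// Unknown keys and missing signals contribute nothing.
function consistencyScore(observed) {
  let score = 0;
  for (const [name, weight] of Object.entries(ILLUSTRATIVE_WEIGHTS)) {
    if (observed[name] === true) score += weight;
  }
  return score;
}
```

A real system would tune the weights against labelled traffic and retire a signal when the engine behaviour behind it changes; the additive form just makes that retirement a one-line edit.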

The token taught that lesson first, by being the opposite of durable. It was a single editable string, decisive against nobody who edited it, and the moment Chrome’s own architecture made the string unnecessary, it was gone from the default build entirely. The structural JavaScript checks of 2018 followed the same arc, killed in a single release when the build behind them merged with headful Chrome. The CDP getter trick was the most sophisticated of the bunch and still died to an unrelated correctness fix. Anything that rests on one incidental fact has the half-life of that fact, and in a codebase shipping every four weeks the half-lives are short.

What this leaves for a defender is unglamorous and correct. There is no clean detection of --headless=new, in the sense of a single property that flips. There is a stack of consistency constraints, individually cheap and collectively demanding, scored together with network-layer and behavioral evidence that a headless build cannot address from inside JavaScript at all. And there is the permanent footnote that the most reliable thing about every signal in this post is that one of them will be obsolete by the next time you check, retired not by an adversary but by a browser engineer fixing a bug you were quietly depending on. The HeadlessChrome token still rides in the header of chrome-headless-shell today, nine years after it shipped, catching exactly the operators who never read this far. That is the one thing about it that has not changed.

