Traces how a single browser-automation stealth patch moves through its life: a signal is found, the patch hides it, the patch itself becomes a fingerprint, and a new signal replaces the old one. With real examples and the economics of the treadmill.
A decision framework for choosing between a headless browser and a plain HTTP client at extraction scale: JS-dependence, per-page cost, fingerprint surface, brittleness, and the hybrid path most large crawlers actually take.
Traces the real resource cost of driving headless Chrome at scale: per-instance RAM, the multi-process tax, container failure modes, concurrency math, and the cost gap that pushes teams back to HTTP clients.
How to pull JavaScript-rendered data without launching a browser: finding the backend JSON, XHR, and GraphQL endpoints a page calls, replaying them, handling tokens and request signatures, and where the approach stops working.
A reference to the navigator object's fingerprinting surface: userAgent, platform, languages, hardwareConcurrency, deviceMemory, vendor, productSub, and webdriver, plus the cross-property consistency checks that catch a spoof.
Traces Selenium from Jason Huggins's 2004 JavaScriptTestRunner through Selenium RC's proxy hack, the 2009 WebDriver merger, and WebDriver becoming a W3C Recommendation in 2018.
Traces Puppeteer from the April 2017 headless-Chrome announcement through its CDP foundation, the stealth-plugin arms race, the team's departure to build Playwright, and the long shadow it cast over scraping.
How anti-detect browsers grew out of carding forums and affiliate multi-accounting into a commercial tool category, and why the work moved from JavaScript spoofing into the browser engine itself.