Q: Why does porting a reCAPTCHA score threshold to hCaptcha block all legitimate users?

Both systems return a score field on siteverify using a 0.0-to-1.0 range, but they define it in opposite directions. reCAPTCHA's score is a legitimacy score where 1.0 means likely human, while hCaptcha Enterprise's score is a risk score where 1.0 means a confirmed threat. A rule that blocks below 0.5, ported straight across, does not just shift the threshold; it inverts the logic, blocking the low-risk scores that legitimate users receive and passing the high-risk scores assigned to confirmed threats. The comparator has to be reversed.

Q: What compute mechanism does hCaptcha use that reCAPTCHA lacks?

hCaptcha adds a proof-of-work layer, described in its Pro documentation as dynamic proofs of work that run whether or not a visible challenge appears. The browser is handed a small cryptographic problem, solves it, and returns the result as part of the token exchange. The cost is invisible for a human loading one page but compounds for an operation minting thousands of tokens a minute. reCAPTCHA has no equivalent public layer because Google prefers to correlate requests against its identity graph instead.

Q: Why did Cloudflare move from reCAPTCHA to hCaptcha in 2020?

Cloudflare gave three reasons. Google moved reCAPTCHA toward a paid model in early 2020, charging above roughly a million calls a month, which at Cloudflare's volume meant millions of dollars a year to serve free users. Its customers were also uneasy about sending more data to a Google ad-adjacent product. And Google's services are intermittently blocked in China, around 25 percent of internet users, so a reCAPTCHA challenge could fail to load and lock those users out, where hCaptcha worked.

hCaptcha vs reCAPTCHA: a technical comparison of detection and challenge design

Q: How many distinct score values does the free reCAPTCHA v3 tier actually return?

On the free tier the v3 engine only ever returns four discrete levels: 0.1, 0.3, 0.7, and 0.9. The finer eleven-level scale and the reason codes that explain a verdict unlock only when you attach a billing account and move to the Enterprise assessment API. So a free-tier rule that blocks below 0.5 is really making a binary 0.3-versus-0.7 decision: 0.1 and 0.3 always land on the blocked side, 0.7 and 0.9 always land on the allowed side, and nothing is ever returned in between. The granularity people assume they have is a billing feature.

Q: Can automated solvers beat hCaptcha's image grid challenges?

Yes. A 2021 IEEE WOOT paper by Hossen and Hei built a solver that cleared hCaptcha's image challenges at 95.93 percent accuracy, averaging 18.76 seconds per challenge while running in a Docker container with 2 GB of RAM, three CPUs, and no GPU. Tested against 270 live challenges, it showed the visual puzzle alone is not the defence. Neither vendor relies on the grid as the real filter; the decision is mostly made before it loads, from behaviour, environment, and reputation.

Two CAPTCHA systems sit in front of a large share of the public web, and to a casual reader they look interchangeable. Both put a checkbox or an image grid on a login page. Both hand the browser a token. Both have a server endpoint you POST that token to, and both answer with a JSON blob that says pass or fail. Swap one for the other and most of the integration code does not change. hCaptcha was built that way on purpose.

Underneath the matching API surface they disagree about almost everything that matters. They disagree about what a score means, and the two scales run in opposite directions so that a naive port flips your allow logic into a block-everyone logic. They disagree about who the customer is and what the labeling work is for. They disagree about whether a user’s Google identity should feed the risk decision. This post is about those disagreements, not the marketing ones. It walks through how each system actually decides, where the public documentation stops and inference begins, and why a privacy argument and a pricing change pushed millions of sites from one to the other in a single year.

The sections below start with the shared API contract that makes the two look like twins, then split into how reCAPTCHA scores, how hCaptcha scores, and the inverted scales that trip up migrations. After that: the image-challenge design and why the two pick different puzzle types, the proof-of-work layer hCaptcha added that reCAPTCHA never had, the privacy and data-economics split, and the 2020 migration wave that reset the market. A short synthesis closes it out.

The shared contract that makes them look identical

The reason a comparison is even interesting is that the integration surface is nearly the same. On the page you drop in a script, a container element carrying a public sitekey, and a callback. reCAPTCHA loads https://www.google.com/recaptcha/api.js and renders into an element with the g-recaptcha class; hCaptcha loads https://js.hcaptcha.com/1/api.js and renders into h-captcha. When a user clears the widget, reCAPTCHA writes its token into a hidden form field named g-recaptcha-response and hCaptcha writes one named h-captcha-response. The sitekey is public and the secret key stays on your server. None of this is an accident of convergent design. hCaptcha’s own switch guide tells integrators they do not have to rewrite their callbacks or tag attributes because it is API-compatible with reCAPTCHA, and the migration is mostly a find-and-replace.
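To make the find-and-replace claim concrete, here is a minimal Python sketch that swaps the three client-side identifiers named above in a template string. The helper name `port_markup` and the sample markup are hypothetical, and the sketch deliberately ignores the parts that do need manual work: a new sitekey, a new server-side secret, and the verification endpoint.

```python
# Client-side identifiers named in the text: script URL, hidden response
# field, and container class. Python dicts keep insertion order, so the
# replacements run top to bottom.
RECAPTCHA_TO_HCAPTCHA = {
    "https://www.google.com/recaptcha/api.js": "https://js.hcaptcha.com/1/api.js",
    "g-recaptcha-response": "h-captcha-response",
    "g-recaptcha": "h-captcha",
}


def port_markup(template: str) -> str:
    """Swap reCAPTCHA identifiers for their hCaptcha counterparts.

    Hypothetical helper for illustration; it is a plain string replace,
    not an HTML-aware rewrite.
    """
    for old, new in RECAPTCHA_TO_HCAPTCHA.items():
        template = template.replace(old, new)
    return template


page = (
    '<script src="https://www.google.com/recaptcha/api.js" async defer></script>\n'
    '<div class="g-recaptcha" data-sitekey="YOUR_SITEKEY"></div>'
)
print(port_markup(page))
```

A string replace like this is only safe because the identifiers are distinctive; anything that touches scoring logic needs a human to read it, for reasons the later sections cover.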

The server side mirrors that. You take the token your frontend collected and POST it with your secret to a verification endpoint. reCAPTCHA’s is https://www.google.com/recaptcha/api/siteverify; hCaptcha’s is https://api.hcaptcha.com/siteverify. Both reply with JSON. Both carry a boolean success, an ISO-8601 challenge_ts, a hostname, and an error-codes array when something is wrong. If all you do is check success, the two systems are drop-in equivalents and you can stop reading.
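The server-side symmetry can be sketched with nothing but the Python standard library. The two endpoint URLs and the shared response fields come from the paragraph above; the function names, the `provider` switch, and the 5-second timeout are assumptions made for the example. Both endpoints are assumed here to take the secret and token as form fields named `secret` and `response`.

```python
import json
import urllib.parse
import urllib.request

VERIFY_URLS = {
    "recaptcha": "https://www.google.com/recaptcha/api/siteverify",
    "hcaptcha": "https://api.hcaptcha.com/siteverify",
}


def build_verify_request(provider: str, secret: str, token: str) -> urllib.request.Request:
    """Build the form-encoded POST for the chosen provider."""
    body = urllib.parse.urlencode({"secret": secret, "response": token}).encode()
    return urllib.request.Request(VERIFY_URLS[provider], data=body, method="POST")


def parse_verify_response(raw: bytes) -> dict:
    """Extract only the fields the two responses have in common."""
    data = json.loads(raw)
    return {
        "success": bool(data.get("success", False)),
        "challenge_ts": data.get("challenge_ts"),
        "hostname": data.get("hostname"),
        "error_codes": data.get("error-codes", []),
    }


def verify_token(provider: str, secret: str, token: str, timeout: float = 5.0) -> dict:
    """Perform the network call. Defined for completeness, not invoked here."""
    request = build_verify_request(provider, secret, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_verify_response(response.read())
```

Splitting request building from response parsing keeps the provider-specific part down to one dictionary lookup, which is roughly the point the paragraph makes: if `success` is all you read, the provider is a configuration value.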

*The integration surface is intentionally near-identical, which is what made the 2020 migrations a find-and-replace for most sites. The disagreements live below this line.*

The interesting part is everything the success boolean hides. The version you choose changes what that boolean is reacting to, and the enterprise tiers add a numeric score that the two vendors define in opposite directions. That is where the comparison earns its keep.

How reCAPTCHA decides: the score and the eleven levels

reCAPTCHA has shipped in three generations that still run side by side. v2 is the checkbox, launched in 2014 as “No CAPTCHA reCAPTCHA,” where clicking “I’m not a robot” runs an advanced risk analysis pass and only escalates to an image grid when that pass is unsure. v3, shipped in 2018, dropped the checkbox entirely. It runs in the background, scores every page action, and per Google’s own developer docs it will never interrupt the user. Enterprise is v3’s engine sold through Google Cloud with extra signals and the full set of reason codes. For a comparison against hCaptcha, v3 and Enterprise are the interesting half, because that is where scoring replaces a binary challenge.

The v3 model is a risk-analysis engine that watches interactions and returns a score from 0.0 to 1.0, where 1.0 is very likely a good interaction and 0.0 is very likely a bot. You call grecaptcha.execute() with an action name at the moment that matters, a login or a checkout, and it returns a token tied to that action. Tokens expire after two minutes, so the call has to happen at the point of submission rather than at page load. Server-side, siteverify returns the score, the action you named (which you must check matches, or an attacker replays a token minted on a low-stakes page against a high-stakes one), the challenge_ts, and the hostname.
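The action check is the step most often skipped, so here is a hedged sketch of what validating a parsed v3 siteverify result might look like. The field names `success`, `action`, `hostname`, and `challenge_ts` are the ones listed above; the function name, the returned list of problem strings, and the local age check are assumptions of this example. The age check is belt and braces on top of whatever expiry the verification endpoint already enforces.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Optional

TOKEN_LIFETIME = timedelta(minutes=2)


def check_v3_result(
    result: dict,
    expected_action: str,
    expected_hostname: str,
    now: Optional[datetime] = None,
) -> List[str]:
    """Return a list of problems; an empty list means the basic checks passed."""
    problems = []
    if not result.get("success"):
        problems.append("verification failed")
    # A token minted for a low-stakes action must not be accepted elsewhere.
    if result.get("action") != expected_action:
        problems.append("action mismatch")
    if result.get("hostname") != expected_hostname:
        problems.append("hostname mismatch")

    timestamp = result.get("challenge_ts")
    if not timestamp:
        problems.append("missing challenge_ts")
        return problems

    # Older Python versions do not parse a trailing "Z", so normalise it.
    issued = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    if issued.tzinfo is None:
        issued = issued.replace(tzinfo=timezone.utc)
    current = now or datetime.now(timezone.utc)
    if current - issued > TOKEN_LIFETIME:
        problems.append("token older than two minutes")
    return problems
```

Passing `now` explicitly makes the function easy to exercise with fixed timestamps; in production you would leave it out and let it read the clock.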

The score is not continuous in the way the 0.0-to-1.0 range suggests. On the free tier the engine only ever returns four discrete levels: 0.1, 0.3, 0.7, and 0.9. The finer-grained eleven-level scale, and the reason codes that explain a verdict, unlock only when you attach a billing account and move to the Enterprise assessment API. That detail matters for anyone reverse-engineering thresholds: a free-tier integration that “blocks below 0.5” is really making a binary 0.3-versus-0.7 decision, because 0.1 and 0.3 always land on the blocked side, 0.7 and 0.9 always land on the allowed side, and nothing is ever returned in between. The granularity people assume they have is a billing feature.
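A few lines of Python show how little a free-tier threshold actually decides. The four levels are the ones quoted above; the `partition` helper and the sweep over candidate thresholds are illustrative assumptions, not anything Google ships.

```python
FREE_TIER_LEVELS = (0.1, 0.3, 0.7, 0.9)


def partition(threshold: float, levels=FREE_TIER_LEVELS):
    """Split the levels into (blocked, allowed) for a 'block below threshold' rule."""
    blocked = tuple(score for score in levels if score < threshold)
    allowed = tuple(score for score in levels if score >= threshold)
    return blocked, allowed


# Three thresholds that look different on paper but behave identically.
for threshold in (0.4, 0.5, 0.6):
    print(threshold, partition(threshold))

# Sweep 0.00 .. 1.00 in steps of 0.01 and count the distinct outcomes.
distinct = {partition(step / 100) for step in range(101)}
print(len(distinct), "distinct behaviours across 101 candidate thresholds")
```

Every threshold above 0.3 and up to 0.7 produces the same split, so tuning within that band on the free tier changes nothing observable.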

Enterprise adds machine-readable reasons alongside the number. Google documents a small vocabulary of them, and they map cleanly onto the signal classes the engine cares about. AUTOMATION means the interaction matched the behaviour of an automated agent. UNEXPECTED_ENVIRONMENT means the request came from an environment the engine considers illegitimate for that site. TOO_MUCH_TRAFFIC flags an abnormal volume from the traffic source. UNEXPECTED_USAGE_PATTERNS covers behaviour that diverges from the site’s learned baseline. LOW_CONFIDENCE_SCORE is the honest one: not enough traffic on this site yet to score with confidence. The assessment object also exposes extended_verdict_reasons and a challenge field whose values include NOCAPTCHA, PASSED, and FAILED. The dedicated reCAPTCHA Enterprise post walks the reason codes and the additional signals over the free tier in more detail; here the point is narrower. reCAPTCHA’s number measures how human you look, and 1.0 is the good end.
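As a rough sketch of how those reason codes might be surfaced in logs, the snippet below maps the five codes named above to the descriptions given in this section. The assessment shape used here, a `riskAnalysis` object holding `score` and `reasons`, is an assumption about the JSON layout and should be checked against the Enterprise API reference; the `explain` helper is hypothetical.

```python
from typing import List

# Descriptions paraphrase the text above, not the vendor's exact wording.
REASONS = {
    "AUTOMATION": "interaction matched the behaviour of an automated agent",
    "UNEXPECTED_ENVIRONMENT": "environment considered illegitimate for this site",
    "TOO_MUCH_TRAFFIC": "abnormal volume from the traffic source",
    "UNEXPECTED_USAGE_PATTERNS": "behaviour diverges from the site's learned baseline",
    "LOW_CONFIDENCE_SCORE": "not enough traffic on this site to score with confidence",
}


def explain(assessment: dict) -> List[str]:
    """Turn an assessment into log-friendly lines (assumed JSON shape)."""
    risk = assessment.get("riskAnalysis", {})
    lines = [f"score={risk.get('score')}"]
    for code in risk.get("reasons", []):
        # Unknown codes are reported rather than dropped, since the
        # vocabulary can grow.
        lines.append(f"{code}: {REASONS.get(code, 'not in the list above')}")
    return lines


sample = {"riskAnalysis": {"score": 0.1, "reasons": ["AUTOMATION", "TOO_MUCH_TRAFFIC"]}}
for line in explain(sample):
    print(line)
```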

*On the free tier the 0.0-to-1.0 range collapses to four buckets; the eleven-level granularity and the reason codes are a billing-account feature.*

What feeds the score is the part Google does not publish. The docs describe it as adaptive risk analysis over user interactions, and outside research fills in the rest. The 2016 Black Hat Asia paper “I’m Not a Human: Breaking the Google reCAPTCHA,” by Suphannee Sivakorn, Iasonas Polakis, and Angelos Keromytis, probed the v2 advanced risk analysis directly and found that browser environment, the age and history of the requesting Google cookie, and the presence of a logged-in Google account all moved the checkbox decision. A request carrying a well-aged Google session cookie sailed through where a fresh one drew a challenge. That finding is old now, and Google has rebalanced the signals since, but the shape holds: reCAPTCHA’s edge has always been that Google can correlate the request against an identity and a browsing history it already owns. The reCAPTCHA v3 scoring post goes deeper on the modern signal mix.

How hCaptcha decides: the inverted scale and the reason vector

hCaptcha has a three-mode story of its own. There is the visible image-grid challenge anyone has seen. There is a passive mode the company markets as “99.9% passive,” which by its own Pro docs tries to keep visible challenges below 0.1 percent of real users by evaluating an interaction across thousands of factors before deciding whether to show a puzzle. And there is Enterprise, which exposes a numeric risk score and the reasons behind it.

Here is the trap. hCaptcha Enterprise also returns a score field on siteverify, also on a 0.0-to-1.0 range, but it means the opposite of reCAPTCHA’s. hCaptcha’s score is a risk score: 0.0 is no risk, 1.0 is a confirmed threat. reCAPTCHA’s score is a legitimacy score: 1.0 is the good end. hCaptcha’s own migration documentation calls this out explicitly, warning that for v3 and Enterprise the score logic inverts and the conditional in your mitigation code has to be reversed. A team that ports a “block if score below 0.5” rule straight across does not get a slightly-off threshold. They get a rule that blocks every legitimate user and waves through every confirmed threat. Same field name, same range, opposite polarity.
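One way to defuse the trap is to convert both vendors' numbers onto a single polarity before any threshold touches them. The sketch below is an illustration under stated assumptions: the function names, the `provider` strings, and the 0.5 cut-off are all hypothetical, and only the two polarity definitions come from the text.

```python
def risk_from_score(provider: str, score: float) -> float:
    """Map either vendor's score onto one scale where 1.0 is the bad end."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score out of range: {score}")
    if provider == "recaptcha":
        # Legitimacy score: 1.0 is very likely human, so invert it.
        return 1.0 - score
    if provider == "hcaptcha":
        # Already a risk score: 1.0 is a confirmed threat.
        return score
    raise ValueError(f"unknown provider: {provider}")


def should_block(provider: str, score: float, max_risk: float = 0.5) -> bool:
    """Block when normalised risk exceeds the tolerance."""
    return risk_from_score(provider, score) > max_risk


def naive_port(score: float) -> bool:
    """The reCAPTCHA rule 'block below 0.5' copied across unchanged."""
    return score < 0.5


for hcaptcha_score in (0.1, 0.9):
    print(hcaptcha_score, "naive:", naive_port(hcaptcha_score),
          "normalised:", should_block("hcaptcha", hcaptcha_score))
```

On hCaptcha scores the naive rule and the normalised rule disagree at both ends, which is the inversion described above. Keeping the provider name next to the score all the way to the decision point makes that class of bug hard to write.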

*Same field name, same numeric range, opposite meaning. Porting a threshold across without flipping the comparator inverts your allow logic.*

Alongside the number, hCaptcha Enterprise returns score_reason, an array of reason codes that explain what drove the risk score, the same interpretability idea as reCAPTCHA’s reason codes but pointed the other way: these say why something looks malicious rather than why it looks fine. hCaptcha does not publish a complete enumerated list of those reason strings the way Google publishes its five, so anything claiming to be the full vocabulary is reverse-engineered from observed traffic, not vendor-documented. The signal families the company does describe in prose include automation behaviour, environment anomalies, and network reputation such as VPN or proxy usage, which lines up with the same broad classes reCAPTCHA names. The enterprise tier also accepts an rqdata parameter, an additional signed payload the integrator passes through to bind the challenge to a specific user or action context; the precise field layout inside rqdata is not public, and what circulates is inferred from traffic rather than documented. The hCaptcha challenge pipeline post traces the sitekey-to-passcode flow and the proof-of-work layer in full.

The structural difference from reCAPTCHA is what hCaptcha cannot see. It has no Google account to correlate against, no decade of search and Gmail and Chrome telemetry tied to the same cookie. Intuition Machines, the company behind hCaptcha, built the scoring on what it can measure at the edge plus its own cross-site network, and it leans harder on two things reCAPTCHA treats as secondary: the image challenge as an active test, and a proof-of-work tax that runs whether or not a puzzle is shown.

The image challenge: why the two pick different puzzles

When a challenge does appear, the two systems ask different questions, and the difference is downstream of why each was built. reCAPTCHA’s image grids historically asked users to label street furniture: traffic lights, crosswalks, buses, fire hydrants, storefronts. That is not a coincidence. reCAPTCHA’s labeling work fed Google’s own products, and in 2014 the Street View and reCAPTCHA teams reported an algorithm that read the hardest distorted-text reCAPTCHA at 99.8 percent accuracy. When your own neural net can read your text CAPTCHA almost perfectly, the text CAPTCHA is finished as a bot filter, and Google said as much at the time, moving reCAPTCHA toward behavioural signals and away from the puzzle. The grid that remained was doing double duty as a Street View and Maps labeling tool.

hCaptcha’s puzzles look similar on the surface, an image grid asking you to pick every square containing some object, but the labeling economy behind them is the whole business rather than a side effect. Intuition Machines started as a machine-learning shop that needed enormous volumes of human-labeled data for clients, became a heavy buyer of annotation labour, and built hCaptcha as the mechanism: the customer who needs a dataset labeled pays, the publisher who shows the challenge gets paid, and the user solving the grid does the labeling. That is why hCaptcha’s image categories drift toward whatever its customers are paying to have annotated rather than toward street scenes. The anti-bot vendor economics post digs into how that two-sided market prices out.

The bad news for both is that the image grid stopped being a hard problem for machines years ago. The 2021 paper “A Low-Cost Attack against the hCaptcha System,” by Md Imran Hossen and Xiali Hei and presented at the IEEE WOOT workshop, built an automated solver that cleared hCaptcha’s image challenges at 95.93 percent accuracy, taking 18.76 seconds on average per challenge, running in a Docker container with 2 GB of RAM, three CPUs, and no GPU. Tested against 270 live challenges, it showed that the visual puzzle alone is not the defence. The 2023 USENIX Security study “An Empirical Study & Evaluation of Modern CAPTCHAs,” in which 1,400 participants solved 14,000 CAPTCHAs, made the same point from the human side: the puzzles are slow and annoying for people and weak against machines, a bad trade in both directions. Which is exactly why neither vendor relies on the grid as the actual filter anymore. The click is theatre; the decision was mostly made before the grid loaded, from behaviour, environment, and reputation. This is the same lesson the history of CAPTCHA traces across the whole field: every visual scheme has a roughly four-to-six-year shelf life before machine vision catches it.

The proof-of-work layer hCaptcha added and reCAPTCHA never had

The one mechanism hCaptcha has that reCAPTCHA does not is a compute tax. hCaptcha’s Pro documentation describes “advanced dynamic proofs of work” that run on requests whether or not a visible challenge appears, with the stated purpose of making automated bypass more expensive rather than impossible. The browser is handed a small cryptographic problem, solves it, and returns the result as part of the token exchange. For a human loading one page the cost is invisible. For an operation minting thousands of tokens a minute, the cost compounds into real CPU time and real money.

The exact parameters are not published by hCaptcha, and what the research and reverse-engineering community describes, often under the label HSW for the WebAssembly worker that computes it, is inferred from observed client behaviour rather than vendor docs. The general shape is a hash-based puzzle whose difficulty the server can dial up per request, computed in WebAssembly so it runs fast in a real browser and resists trivial reimplementation in a scripted client. The economic logic is the part worth stating plainly: proof of work does not try to tell humans from bots. It taxes everyone equally and bets that the attacker’s per-token margin is thinner than the defender’s tolerance for a few milliseconds of client CPU. The proof-of-work anti-bot post covers how hCaptcha, Kasada, and Anubis all reach for compute as a tax, and why the idea came back after a decade out of fashion.
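Since the real parameters are not public, the following is a generic hashcash-style sketch of the idea and explicitly not hCaptcha's algorithm. It assumes SHA-256, an 8-byte nonce, and a difficulty measured in leading zero bits, all of which are choices made for this example. What it does show is the shape described above: a puzzle that is cheap to verify, costly to solve, and tunable by the server.

```python
import hashlib
import itertools


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the front of a digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def _digest(challenge: bytes, nonce: int) -> bytes:
    return hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()


def solve(challenge: bytes, difficulty_bits: int) -> int:
    """Client side: brute-force a nonce. Expected work doubles per extra bit."""
    for nonce in itertools.count():
        if leading_zero_bits(_digest(challenge, nonce)) >= difficulty_bits:
            return nonce


def verify_pow(challenge: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Server side: a single hash, regardless of how long solving took."""
    return leading_zero_bits(_digest(challenge, nonce)) >= difficulty_bits


challenge = b"per-request-server-challenge"  # hypothetical value
nonce = solve(challenge, difficulty_bits=12)
print("nonce found:", nonce, "valid:", verify_pow(challenge, nonce, 12))
```

At 12 bits the solver needs about 4,096 hashes on average and each additional bit doubles that, which is the dial the server turns. Verification stays at one hash, so the cost lands on whoever is minting tokens.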

reCAPTCHA has no equivalent public layer. Google’s bet is different and it is a bet only Google can place: it would rather correlate the request against the identity graph and the global view of traffic it already has than charge the client CPU cycles. When you can see the same cookie across a quarter of the web’s properties, you do not need to make the client prove it spent compute. You already know whether you have seen it behave like a person elsewhere. That asymmetry, identity-and-scale on one side, compute-tax-plus-edge-signals on the other, is the real architectural fork between the two systems, and it falls straight out of who owns what data.

*The fork falls out of data ownership: Google correlates against an identity graph it already has; hCaptcha taxes client compute and leans on its own edge signals.*

The privacy split and the data economics

The cleanest difference between the two is not technical at all, and it is the one that moved the market. reCAPTCHA is a Google product, and Google’s business is advertising. The signals it collects to score a request, the cookie history, the cross-property correlation, the browser and behaviour telemetry, are the same kind of data Google monetizes elsewhere, and a site embedding reCAPTCHA is sending its visitors’ interaction data into that system. Privacy regulators in the EU have circled this for years, and the GDPR posture of reCAPTCHA is a recurring headache for European sites precisely because the data flows to Google.

hCaptcha built its pitch on the inverse. The company states it retains no personally identifiable information, does not sell personal data, and is not in the advertising business, positioning the product as the privacy-preserving option. That is a marketing claim with a real structural basis: Intuition Machines makes money from the labeling marketplace and from enterprise contracts, not from profiling the people who solve challenges. Whether any third-party CAPTCHA is truly privacy-neutral is a longer argument, since both systems necessarily fingerprint the browser to score it. But the directional difference is real. One vendor’s incentive is to know more about the user; the other’s is to label an image and forget them.

That incentive split is the same one that produced the systems’ technical forks. Google can lean on identity because it already collects identity. hCaptcha leans on proof of work and a labeling economy because it deliberately does not. The privacy argument and the architecture are the same argument viewed from two angles.

The 2020 migration wave that reset the market

For most of the 2010s reCAPTCHA had no serious rival, and then in a single year a chunk of the web moved. The trigger was money and the amplifier was Cloudflare. In early 2020 Google moved reCAPTCHA toward a paid model, charging customers above roughly a million calls a month. Cloudflare, which had been running reCAPTCHA across its free tier at enormous volume, calculated that continuing would add millions of dollars a year just to serve free users. On 8 April 2020, Matthew Prince and Sergi Isasi published Cloudflare’s move from reCAPTCHA to hCaptcha.

The post laid out three reasons and only one of them was cost. There was the pricing change. There was privacy: Cloudflare’s customers were uneasy about feeding more data to Google, and Cloudflare’s own commitments sat awkwardly next to embedding a Google ad-adjacent product on every challenge page. And there was reach. Google’s services are intermittently blocked in China, which the post noted accounts for around 25 percent of all internet users, so a reCAPTCHA challenge could simply fail to load for a large population, locking those users out of Cloudflare-protected sites entirely. hCaptcha worked where Google did not. In a detail that says a lot about the economics, Cloudflare proposed that rather than hCaptcha paying Cloudflare for the traffic, as the labeling model would normally have it, Cloudflare would pay hCaptcha.

*The migration window: a 2020 pricing change at Google, amplified by Cloudflare's volume, put hCaptcha on millions of sites almost overnight.*

The effect was immediate. hCaptcha went from a small alternative to a system fronting a meaningful slice of the web, because Cloudflare’s footprint carried it there in one deployment. The interesting coda is that Cloudflare did not stay. In 2022 it launched Turnstile, its own CAPTCHA-alternative built to run without the visible image grid at all, and began moving its challenge traffic onto it. The same logic that made reCAPTCHA expensive and awkward for Cloudflare eventually applied to depending on any third party, so Cloudflare built its own. hCaptcha kept the millions of independent sites that had followed Cloudflare’s lead and built out its Enterprise tier in the years since. The Cloudflare Turnstile internals post covers what replaced both.

What the comparison comes down to

Strip away the matching API and the two systems answer the same question with opposite resources. reCAPTCHA scores a request by correlating it against an identity and a global traffic view that only Google possesses, returns a number where high means human, and never charges the client anything because it does not have to. hCaptcha scores a request from edge signals and its own cross-site network, returns a number where high means threat, taxes the client with proof of work because it cannot fall back on a Google-scale identity graph, and funds itself by turning the challenge into paid labeling. The inverted score scale is the small, sharp symbol of a deep difference: the same float on the same range, defined backwards, because the two companies are measuring from opposite ends of what they can see.

For anyone integrating, the practical warnings are concrete. The success boolean is portable; the score field is a trap, and porting a threshold without flipping the comparator turns an allow-list into a block-list. The free reCAPTCHA score is four buckets wearing an eleven-level costume, and the granularity is a billing feature. The image grid is not the defence in either system and has not been for years; the decision is made before the puzzle renders, from behaviour and environment and reputation, and the grid is mostly there to do labeling work or to slow a borderline case down. And the privacy difference is structural, not cosmetic: it falls directly out of how each company makes its money, which is also why their detection architectures diverged in the first place.

The clearest measure of how far the puzzle has fallen as a filter is the 2021 result that a solver cleared hCaptcha’s image grids at nearly 96 percent accuracy on a 2 GB container with no GPU. When a commodity machine reads the test better than a tired human at a fraction of the cost, the grid is no longer the wall. It is the turnstile in front of the wall, and the wall is the scoring engine you never see.

Sources & further reading

Prince, M. and Isasi, S. (2020), Moving from reCAPTCHA to hCaptcha — Cloudflare’s own account of the April 2020 migration, with the cost, privacy, and China-blocking reasoning and the pay-them-instead arrangement.
Google for Developers (2024), reCAPTCHA v3 — the v3 developer guide: the 0.0-to-1.0 score, actions, the two-minute token, and the siteverify response fields.
Google Cloud (2025), Interpret assessments for websites — the Enterprise assessment doc with the eleven levels, the four free-tier buckets, and the reason codes including AUTOMATION and UNEXPECTED_ENVIRONMENT.
hCaptcha (2024), Developer Guide — the integration surface: data-sitekey, the h-captcha-response token, the api.hcaptcha.com/siteverify endpoint, and the response fields.
hCaptcha (2024), Switch from reCAPTCHA to hCaptcha — the drop-in compatibility guide that documents the inverted Enterprise score and the reversed mitigation logic.
hCaptcha (2024), Pro Features — the “99.9% passive” mode description and the dynamic proof-of-work language.
Sivakorn, S., Polakis, I. and Keromytis, A. (2016), I’m Not a Human: Breaking the Google reCAPTCHA — Black Hat Asia paper probing how cookies, Google identity, and browser environment moved the v2 risk decision.
Hossen, M. I. and Hei, X. (2021), A Low-Cost Attack against the hCaptcha System — IEEE WOOT paper reporting a 95.93 percent solver on a 2 GB, no-GPU container.
Searles, A. et al. (2023), An Empirical Study & Evaluation of Modern CAPTCHAs — USENIX Security study, 1,400 participants solving 14,000 CAPTCHAs, on the human cost versus the machine weakness.
Google Online Security Blog (2014), Street View and reCAPTCHA technology just got smarter — Google’s note that its own model read the hardest reCAPTCHA text at 99.8 percent, and the shift toward behavioural signals.

Further reading

reCAPTCHA v3 scoring: how the 0.0 to 1.0 score is computed and what feeds it

reCAPTCHA v2's bframe challenge: image grids, risk analysis, and the token lifecycle

reCAPTCHA Enterprise: the additional signals and reason codes over the free tier