reCAPTCHA Enterprise: the additional signals and reason codes over the free tier

Free reCAPTCHA v3 hands your backend one number between 0.0 and 1.0 and stops there. You get a score, you pick a threshold, you guess at why a given request looked bad. reCAPTCHA Enterprise hands you the same number plus a structured explanation of how it was reached, a finer-grained version of the score itself, and a set of account-security verdicts that the free tier never computes. The score is the part everyone talks about. The reason codes and the account models are the part you actually pay for.

The question this post answers is narrow and specific: what does the Enterprise tier add over free v3, concretely, at the level of field names in the assessment response? Not “it has more features” in a marketing sense. Which JSON fields appear that free v3 never returns, what do their enum values mean, where does the extra signal come from, and what does it cost per call? We will walk through the assessment API as the primary interface, the reason-code and score-granularity differences, Account Defender and its annotation feedback loop, the MFA and password-leak-detection add-ons (including the cryptography that lets Google check a leaked password without seeing it), and the per-assessment pricing that changed sharply in 2024. We close on the 2026 rebrand to Cloud Fraud Defense, because the product you integrate today is no longer called what the docs called it last year.

The assessment is the unit of everything

Everything in the Enterprise tier hangs off one server-side call: projects.assessments.create. The browser runs grecaptcha.enterprise.execute(), gets a token, ships it to your backend, and your backend POSTs an assessment request to the reCAPTCHA Enterprise API. The response is an Assessment resource. Free v3, by contrast, verifies the token against siteverify and returns a flat JSON blob with a score and a couple of fields. The Enterprise Assessment is a much larger object, and the shape of that object is where the tier difference lives.

The request carries an event object. The documented fields include token (the response token from execute()), siteKey, expectedAction, userIpAddress, userAgent, and optionally ja3 and ja4 TLS client fingerprints that you compute at your edge and pass in. That last pair matters. Free v3 never lets you feed network-layer fingerprints into the scoring; Enterprise does, and the docs name both the legacy salesforce/ja3 format and the newer FoxIO-LLC/ja4. If you want the background on why those fingerprints carry signal, we wrote it up in TLS fingerprinting: from ClientHello bytes to JA4.
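To make the request shape concrete, here is a minimal Python sketch that builds the URL and JSON body for a create call from the event fields listed above. The function name, the project ID, and the choice to omit unset optional fields are assumptions of the sketch; authentication (a service-account bearer token) and the actual HTTP POST are deliberately left out.

```python
import json

# REST root for the reCAPTCHA Enterprise API (v1).
API_ROOT = "https://recaptchaenterprise.googleapis.com/v1"


def build_assessment_request(project_id, token, site_key, expected_action,
                             user_ip=None, user_agent=None, ja3=None, ja4=None):
    """Return (url, json_body) for a projects.assessments.create call.

    Hypothetical helper: it only assembles the payload, it does not send it.
    """
    event = {
        "token": token,                    # response token from execute()
        "siteKey": site_key,
        "expectedAction": expected_action,
    }
    # Server-side context: include only what the edge actually observed.
    optional = {
        "userIpAddress": user_ip,
        "userAgent": user_agent,
        "ja3": ja3,
        "ja4": ja4,
    }
    event.update({k: v for k, v in optional.items() if v is not None})
    url = f"{API_ROOT}/projects/{project_id}/assessments"
    return url, json.dumps({"event": event})
```

Calling build_assessment_request("my-project", token, site_key, "login", user_ip="203.0.113.7") yields a body whose event carries only the fields that were supplied, which keeps a missing JA4 from being sent as an empty string.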

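The response splits into several sub-objects, and not all of them are populated on every call. The ones that matter for the tier comparison are tokenProperties, riskAnalysis, accountDefenderAssessment, privatePasswordLeakVerification, accountVerification, fraudPreventionAssessment, and firewallPolicyAssessment: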

*The Assessment response. Orange branches are populated only on the Enterprise tier; the white ones exist on Standard too. The free v3 siteverify response carries none of this structure.*

Two fields are worth noting before the score discussion. tokenProperties.valid tells you whether the token parsed at all, and if it did not, invalidReason carries one of MALFORMED, EXPIRED, DUPE, MISSING, BROWSER_ERROR, UNEXPECTED_ACTION, KEY_MISMATCH, or DOMAIN_MISMATCH. A DUPE means the token was already redeemed, which is the single most common reason a perfectly good integration starts failing in production: a token is single-use, and a retry or a double-submit burns it. UNEXPECTED_ACTION fires when the action baked into the token does not match the expectedAction you passed, which is the documented check against an attacker replaying a token minted on a low-stakes page against a high-stakes endpoint. The decision about where that check runs, edge or origin, is its own topic, covered in server-side vs client-side bot detection.

The token itself is short-lived and bound to context. The two-minute expiry that free v3 enforces carries over: a token minted on page load and submitted by a user who took a coffee break comes back EXPIRED. KEY_MISMATCH and DOMAIN_MISMATCH are the two that catch misconfiguration rather than abuse. The first means you minted the token under one site key and verified it against an assessment scoped to a different one, which happens when a team rotates keys without updating both ends. The second means the page that ran execute() was served from a hostname not registered to the key. None of these are bot signals; they are the API telling you the plumbing is wrong, and a production integration should branch on valid and invalidReason before it ever looks at the score. A surprising number of “reCAPTCHA is blocking my users” reports turn out to be a DUPE storm from a frontend that fires execute() twice, or an EXPIRED rate that tracks how long a form sits open.
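The branch-before-score rule can be sketched in a few lines of Python. The assessment is treated as the parsed JSON response; the grouping of reasons into misconfiguration, retryable, and reject, and the returned labels, are assumptions of this sketch rather than anything the API prescribes.

```python
# Reasons that indicate broken plumbing rather than abuse.
CONFIG_ERRORS = {"KEY_MISMATCH", "DOMAIN_MISMATCH"}
# Reasons a well-behaved client can usually fix by minting a new token.
CLIENT_RETRYABLE = {"EXPIRED", "DUPE", "BROWSER_ERROR"}


def triage_token(assessment):
    """Decide what to do with a token before looking at the score."""
    props = assessment.get("tokenProperties", {})
    if props.get("valid"):
        return "score"                      # safe to read riskAnalysis now
    reason = props.get("invalidReason", "MISSING")
    if reason in CONFIG_ERRORS:
        return "alert_ops"                  # key or hostname is misconfigured
    if reason in CLIENT_RETRYABLE:
        return "ask_client_for_fresh_token"
    return "reject"                         # MALFORMED, MISSING, UNEXPECTED_ACTION
```

Counting how often each branch fires is a cheap diagnostic: a spike in the retryable branch points at a double-firing frontend or a long-lived form, not at bots.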

One structural point separates this from siteverify. The free verify endpoint is a single call, authenticated only by the site's shared secret key, that returns a verdict; the Enterprise assessment is an authenticated Google Cloud API call, which means it runs under a service account, shows up in Cloud audit logs, and is subject to IAM. That is a different operational posture. It also means the assessment can be enriched with server-side context the browser never had: the IP and user-agent as your edge actually saw them, and the JA3 or JA4 you computed, rather than whatever the client claimed.

Eleven levels instead of four

The score lives at riskAnalysis.score, a float from 0.0 to 1.0 where higher is more human. That is identical to free v3 on the surface. The difference is resolution.

reCAPTCHA computes the score on an internal eleven-level scale. The free and unbilled tiers do not expose all eleven. Without a Google Cloud billing account attached to the project, the API quantises the score down to four discrete buckets: 0.1, 0.3, 0.7, and 0.9. Attach billing and the full eleven levels unlock. This is documented behaviour, not a rumour: the assessment-interpretation page states plainly that only those four levels are available before you add a billing account, and that adding one enables all eleven.

*The same underlying score, two resolutions. Unbilled projects see four snapped values; a billing account unlocks the eleven-step ladder, which is what makes a threshold like 0.5 meaningful instead of nonexistent.*

Why does this matter beyond cosmetics? A threshold strategy needs gradations to work. If your only available scores are 0.1, 0.3, 0.7, and 0.9, then a decision boundary at 0.5 is just a relabelled version of “is it 0.3 or 0.7,” and you can never tune it finer. The eleven-level scale gives you the room to set a boundary at, say, 0.6 and have it mean something different from 0.7. For a site that wants to challenge a thin slice of borderline traffic rather than hard-blocking it, that resolution is the difference between a usable policy and a coin flip. The detailed mechanics of how the underlying number gets computed are the same engine free v3 uses, which we covered in reCAPTCHA v3 scoring; Enterprise changes what you are allowed to see, not how it is measured.
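The resolution argument can be checked mechanically. The sketch below works in tenths to avoid float comparisons and assumes the eleven levels are evenly spaced from 0.0 to 1.0 and that the policy is a simple block-below-threshold rule; both are assumptions of the example.

```python
FOUR_LEVELS = [1, 3, 7, 9]         # tenths: 0.1, 0.3, 0.7, 0.9 (no billing account)
ELEVEN_LEVELS = list(range(11))    # tenths: 0.0, 0.1, ... 1.0 (billing attached)


def blocked(levels, threshold_tenths):
    """Scores (in tenths) that a 'block below threshold' rule would reject."""
    return [s for s in levels if s < threshold_tenths]


def distinct_policies(levels):
    """How many different outcomes are reachable by moving the threshold."""
    return len({tuple(blocked(levels, t)) for t in range(0, 12)})
```

On the four-bucket scale, thresholds of 0.5, 0.6, and 0.7 all reject exactly the same two buckets, so distinct_policies(FOUR_LEVELS) is 5. On the eleven-level scale every tenth moves the boundary, and distinct_policies(ELEVEN_LEVELS) is 12.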

Reason codes: the why behind the number

Free v3 gives you a score and, through siteverify, almost no explanation. Enterprise populates riskAnalysis.reasons, an array of ClassificationReason enum values that name the factors that pushed the verdict. The documented values, with the meanings Google gives them:

AUTOMATION means the interaction matched the behaviour of an automated agent. UNEXPECTED_ENVIRONMENT means the event came from an environment Google considers illegitimate. TOO_MUCH_TRAFFIC means the traffic volume from this source ran higher than normal. UNEXPECTED_USAGE_PATTERNS means the interaction diverged significantly from what the site usually sees. LOW_CONFIDENCE_SCORE is the honest one: too little traffic reached this site for reCAPTCHA to produce quality risk analysis, so the score is a low-confidence guess. There are also SUSPECTED_CARDING and SUSPECTED_CHARGEBACK reasons that surface payment-fraud signal on the risk verdict itself.

LOW_CONFIDENCE_SCORE deserves a moment. It tells you the model is uncertain because it lacks data, not because the request looked bad. A new site key that has not accumulated traffic will throw this constantly, which is why the docs recommend waiting roughly 48 hours after wiring up a key before you act on its scores. Treating a LOW_CONFIDENCE_SCORE 0.3 the same as a high-confidence 0.3 is a classic integration mistake. The free tier never tells you which one you are looking at.

Beyond the basic reasons array, the billed tier exposes extendedVerdictReasons, a richer set of explainability strings that the docs describe as advanced and gate behind a billing account and Enterprise subscription. The tier comparison page draws the line cleanly: Essentials gets no explainability reasons, Standard gets basic reasons, and Enterprise gets advanced ones. So the reason codes are not one feature, they are a graduated feature, and the depth of explanation scales with what you pay.

The reason codes change how you write the policy, not just how you debug it. With free v3 you have a number and a threshold, and every decision is “block below X.” With reasons in hand you can route differently per cause. A low score driven by AUTOMATION is a candidate for a hard block; the same low score driven by LOW_CONFIDENCE_SCORE is a candidate for a soft challenge or a pass, because the model is admitting it does not know. TOO_MUCH_TRAFFIC on an otherwise unremarkable request points at rate-based abuse and might be better handled by a rate limiter than a block. UNEXPECTED_ENVIRONMENT is the one most likely to fire on a legitimate user running an unusual setup, a privacy browser, a hardened configuration, a corporate device, and treating it as automatically hostile is how you lock out exactly the security-conscious users you would rather keep. The point of the reasons is to let a policy be specific instead of a single dumb threshold, and a policy that ignores the reasons is leaving the most valuable part of the tier on the table.
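A per-cause policy of the kind described above might look like the following Python sketch. The outcome labels, the 0.5 threshold, and the precedence order among reasons are illustrative assumptions, not a recommendation from the docs.

```python
def route(score, reasons, threshold=0.5):
    """Pick an action from the score and the riskAnalysis.reasons values."""
    reasons = set(reasons)
    if score >= threshold:
        # A healthy score with a volume flag is a rate-limiting problem.
        return "rate_limit" if "TOO_MUCH_TRAFFIC" in reasons else "allow"
    if "AUTOMATION" in reasons:
        return "block"                 # low score with a confident cause
    if "LOW_CONFIDENCE_SCORE" in reasons:
        return "soft_challenge"        # the model is admitting it lacks data
    if "UNEXPECTED_ENVIRONMENT" in reasons:
        return "soft_challenge"        # may be a hardened or privacy browser
    if "TOO_MUCH_TRAFFIC" in reasons:
        return "rate_limit"
    return "challenge"                 # low score, no named cause
```

The ordering is the design decision: AUTOMATION is checked first so that a confident automation verdict is not softened by an accompanying low-confidence flag. A site with different risk tolerance would reorder the branches.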

It is worth being clear about what the reasons are not. They are a classification of why the score landed where it did, not a transcript of the raw signals. reCAPTCHA does not hand you the underlying telemetry, the mouse paths, the timing distributions, the fingerprint entropy. It hands you a verdict and a short, enumerated set of contributing factors. The signal collection that produces those verdicts overlaps heavily with what every other anti-bot stack collects, and the runtime side of it is the subject of how anti-bot systems fingerprint the JavaScript runtime. Enterprise simply chooses to summarise the conclusion in named codes rather than leaving you to infer it from a bare number.

*The escalation in what the assessment response carries as you move up tiers. The account-security and fraud verdicts only appear on Enterprise; explainability deepens from none to basic to advanced.*

Account Defender: a per-site model, not a per-request score

The single biggest thing Enterprise adds is accountDefenderAssessment. This is not a bot score. It is a verdict about an account, and producing it requires reCAPTCHA to build a model of your users over time rather than scoring each request in isolation.

The mechanism is documented in plain terms. You load the reCAPTCHA script across your site to collect what Google calls horizontal telemetry passively as users move through login, signup, and account pages. You report important actions through execute(). Then, critically, you pass a stable account identifier on the assessment so reCAPTCHA can tie events to the same user across sessions. The current parameter for this is accountId; the older hashedAccountId still works but is deprecated, and the docs note you could hash identifiers with SHA-256 HMAC under a salt of your choice if you did not want to send a raw identifier.
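For teams that prefer not to send a raw identifier, the keyed-hash approach mentioned above can be sketched with Python's standard library. The function name and the base64 output encoding are assumptions of the sketch; the essential property is that the same user always maps to the same opaque value.

```python
import base64
import hashlib
import hmac


def opaque_account_id(account_id: str, salt: bytes) -> str:
    """Derive a stable, opaque identifier with HMAC-SHA256 under a secret salt."""
    digest = hmac.new(salt, account_id.encode("utf-8"), hashlib.sha256).digest()
    # Base64 makes the 32-byte digest safe to embed in a JSON request body.
    return base64.b64encode(digest).decode("ascii")
```

The salt has to stay constant for the lifetime of the integration. Rotating it changes every identifier at once, which resets whatever history was attached to the old values.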

With that identifier in place, accountDefenderAssessment.labels returns one or more of four AccountDefenderLabel values. SUSPICIOUS_LOGIN_ACTIVITY means the request carries a high risk of credential stuffing or account takeover. SUSPICIOUS_ACCOUNT_CREATION means a high risk of abusive signup. RELATED_ACCOUNTS_NUMBER_HIGH means the request is linked to an unusually large cluster of related accounts, the kind of signal that catches one actor running a hundred sock-puppet registrations. And PROFILE_MATCH is the positive one: the attributes of this request match attributes reCAPTCHA has seen before for this particular user, so it is probably really them. As of mid-2026 the docs also describe an account-takeover risk score with its own risk and trust reasons, an Enterprise-only numeric companion to the binary labels.
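One way to consume the labels is a small lookup from label to follow-up action. The action names below are hypothetical, and letting any risk label override PROFILE_MATCH is a conservative choice made for this sketch.

```python
# Hypothetical follow-up actions for the three risk labels.
RISK_ACTIONS = {
    "SUSPICIOUS_LOGIN_ACTIVITY": "step_up_mfa",
    "SUSPICIOUS_ACCOUNT_CREATION": "hold_signup",
    "RELATED_ACCOUNTS_NUMBER_HIGH": "manual_review",
}


def account_actions(labels):
    """Map accountDefenderAssessment.labels to a sorted list of actions."""
    actions = sorted({RISK_ACTIONS[label] for label in labels if label in RISK_ACTIONS})
    if actions:
        return actions                  # any risk label wins over a match
    if "PROFILE_MATCH" in labels:
        return ["allow"]                # known-good profile, nothing suspicious
    return ["default_policy"]           # no labels: fall back to the bot score
```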

*Account Defender is a closed loop. The assessment verdict is only as good as the annotations you feed back. Confirm an MFA pass with PASSED_TWO_FACTOR and reCAPTCHA marks that profile trusted, which is what later produces PROFILE_MATCH.*

The feedback loop is the part people skip and then wonder why the labels are useless. After you know the outcome of an event, you call projects.assessments.annotate with an AnnotateAssessmentRequest. That request carries an annotation of LEGITIMATE or FRAUDULENT, and a reasons array with real-time event details like CORRECT_PASSWORD, INCORRECT_PASSWORD, PASSED_TWO_FACTOR, or FAILED_TWO_FACTOR. The docs are explicit that timing matters: send those reason annotations within seconds or minutes of the event, because they feed real-time detection. When you annotate a verified login as legitimate after a successful MFA, reCAPTCHA marks that user’s profile as trusted, and that trust is precisely what later produces a PROFILE_MATCH for the same person. No annotations, no trusted profiles, no profile matches. The model starves.
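The annotation call is small enough to sketch in full. The helper below assembles the URL and body from the assessment's resource name (the name field returned by create); the helper name and the example resource name are assumptions, and authentication and the POST itself are omitted.

```python
import json

API_ROOT = "https://recaptchaenterprise.googleapis.com/v1"


def build_annotation(assessment_name, legitimate, reasons=()):
    """Return (url, json_body) for a projects.assessments.annotate call.

    assessment_name is the full resource name, for example
    "projects/my-project/assessments/abc123" (hypothetical value).
    """
    body = {"annotation": "LEGITIMATE" if legitimate else "FRAUDULENT"}
    if reasons:
        # Real-time event details such as PASSED_TWO_FACTOR.
        body["reasons"] = list(reasons)
    return f"{API_ROOT}/{assessment_name}:annotate", json.dumps(body)
```

In practice this runs in whichever service learns the outcome, so the assessment name has to travel with the login session from the service that created the assessment.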

There is a cost-of-ownership consequence buried in this design that is easy to miss at integration time. Account Defender needs the site-wide passive telemetry to be actually deployed across the workflow, not just on the login form, or it has nothing to model. It needs a stable accountId that survives across sessions and devices for the same user, which forces a decision about what identifier you are comfortable sending. And it needs the annotation calls wired into the part of your backend that knows the eventual outcome of a login or a signup, which is often a different service from the one that created the assessment. A team that ships only the script tag and the assessment, and skips the telemetry rollout and the annotations, will get back accountDefenderAssessment objects that are mostly empty or low-signal and will conclude the feature does not work. It works; it was not fed.

RELATED_ACCOUNTS_NUMBER_HIGH is the one label that captures something a per-request score structurally cannot. It is a verdict about the graph, not the request: this identity sits in a dense cluster of accounts that look related, which is the shape of one operator running many accounts. The Enterprise-only Related Accounts API exposes that grouping directly. A single request from such an account can look completely clean on the bot score, because the individual session is a real browser driven by a real-ish flow, and still be obviously part of a fraud ring once you look at the neighbourhood. That is the kind of signal that only exists because reCAPTCHA is holding state across accounts, and it is one of the clearest demonstrations of why the account model is a different product from the score.

This is a genuinely different product from a bot score, and it is the line that separates Enterprise from the free tier most clearly. Free v3 has no concept of an account at all. It cannot, because it never receives a stable identifier and never builds per-user history. The comparison between this account-centric design and how a captcha-only competitor approaches the same problem is worth reading alongside hCaptcha vs reCAPTCHA.

Password-leak detection without seeing the password

The feature that earns the most engineering respect is privatePasswordLeakVerification. It tells you whether the username and password pair a user just submitted appears in Google’s corpus of known-breached credentials. It does this without Google ever learning the password, and without your servers learning Google’s breach list. The cryptography is the interesting part, and it is worth getting right rather than hand-waving.

This went generally available on June 14, 2022, announced by Badr Salmi and Aaron Malenfant on the Google Cloud blog. The blog itself only says the feature uses a privacy-preserving API that hides the credential details and the result from Google’s backend. The mechanism underneath is private set intersection with commutative encryption, and the implementation details live in the reference docs and helper libraries.

Here is the flow as the documentation describes it. Client-side helper libraries (Java or TypeScript) take the username and password and compute two things. First, lookupHashPrefix, the first 26 bits of a SHA-256 hash of the username, used purely to shard the breach database into a manageable bucket. Second, encryptedUserCredentialsHash, a Scrypt hash of the credentials that the client has encrypted under a key only the client holds. Your backend sends both to Google in the assessment request. Google looks up the 26-bit prefix bucket, takes your encrypted credential hash, re-encrypts it under Google’s own key, and returns reencryptedUserCredentialsHash along with encryptedLeakMatchPrefixes, the list of encrypted breach-database hashes that fall in the same prefix bucket. Because the encryption commutes, your client can now decrypt the re-encrypted value with its own key and is left with a value encrypted only under Google’s key, in the same space as the entries in encryptedLeakMatchPrefixes. The client checks for membership. A match means the credential pair is breached.
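The handshake is easier to follow as running code. The toy below reproduces the shape of the flow in Python, with loud caveats: it uses modular exponentiation over a Mersenne prime as the commutative cipher and plain SHA-256 in place of Scrypt, it compares full encrypted values rather than prefixes, and it runs both parties in one function. None of that matches the real helper libraries, whose group and parameters are not assumed here. It needs Python 3.8 or later for the modular inverse.

```python
import hashlib
import math
import secrets

P = 2**127 - 1  # Mersenne prime; toy group, NOT what the real helpers use


def lookup_hash_prefix(username: str) -> int:
    """First 26 bits of SHA-256(username), as an integer bucket id."""
    digest = hashlib.sha256(username.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") >> 6  # keep 26 of the first 32 bits


def hash_to_group(username: str, password: str) -> int:
    """Map a credential pair to a nonzero value mod P (stand-in for Scrypt)."""
    digest = hashlib.sha256(f"{username}\x00{password}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def keygen() -> int:
    """Random exponent invertible mod P-1, so its encryption can be undone."""
    while True:
        k = secrets.randbelow(P - 3) + 2
        if math.gcd(k, P - 1) == 1:
            return k


def encrypt(x: int, key: int) -> int:
    return pow(x, key, P)          # x^key mod P; exponentiation commutes


def decrypt(x: int, key: int) -> int:
    return pow(x, pow(key, -1, P - 1), P)


def is_leaked(username, password, server_key, breach_list):
    client_key = keygen()
    # Client -> server: bucket id plus the credential hash under the client key.
    prefix = lookup_hash_prefix(username)
    sent = encrypt(hash_to_group(username, password), client_key)
    # Server: re-encrypt what it received and return its own bucket entries.
    reencrypted = encrypt(sent, server_key)
    bucket = {
        encrypt(hash_to_group(u, p), server_key)
        for u, p in breach_list
        if lookup_hash_prefix(u) == prefix
    }
    # Client: strip its own layer, leaving a value under the server key only.
    return decrypt(reencrypted, client_key) in bucket
```

The server only ever sees the 26-bit prefix and a value blinded by a key it does not hold. The client only ever sees breach entries blinded by the server's key. The membership test works because removing the client's exponent from the doubly encrypted value leaves exactly what the server would have computed for that credential.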

*The commutative-encryption handshake. The username prefix is the only cleartext-derived value that crosses the wire, and 26 bits is coarse enough that a prefix bucket holds many usernames, which is the k-anonymity that protects the lookup.*

The lineage here is worth stating because it is not invented for marketing. The same private-set-intersection design with k-anonymity and blinding underpins Google’s consumer Password Checkup, published in February 2019 by Jennifer Pullman, Kurt Thomas, and Elie Bursztein, who tied it to a corpus of roughly four billion leaked credentials. The academic treatment of the tradeoffs between hash-prefix bucketing and full PSI was laid out the same year in “Protocols for Checking Compromised Credentials.” And a closely related compromised-credential PSI method is covered by US patent 11,366,892, granted June 21, 2022 to Shape Security, the bot-defense firm F5 acquired, which is a reminder that this corner of the field has overlapping patents from more than one vendor.
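The k-anonymity claim can be put in rough numbers. The sketch below assumes, purely for illustration, that a corpus of about four billion entries is spread uniformly across the 26-bit prefix space; real username distributions are not uniform, and the size of the corpus behind the Enterprise feature is not stated here.

```python
PREFIX_BITS = 26
BUCKETS = 2 ** PREFIX_BITS   # 67,108,864 possible prefixes


def average_bucket_size(corpus_size, prefix_bits=PREFIX_BITS):
    """Mean entries per prefix bucket under a uniform-spread assumption."""
    return corpus_size / 2 ** prefix_bits
```

Under that assumption average_bucket_size(4_000_000_000) comes out just under 60, so a single prefix is shared by dozens of entries and does not by itself identify a user.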

The honest caveat: the exact byte layout of encryptedUserCredentialsHash, the specific elliptic curve, and the Scrypt parameters are not fully pinned down in the public web docs. The 26-bit prefix and the field names are documented; the rest is implemented inside the helper libraries. Where this post says “commutative encryption over a prefix bucket,” that is the documented shape, and the precise group and parameters are an implementation detail Google ships as a library rather than a spec.

MFA you do not have to build

Enterprise can run a step-up multi-factor challenge for you. The accountVerification object on the assessment lets you hand reCAPTCHA an endpoint to verify, and as of the current docs that endpoint is email; the docs note phone and the mobile SDKs do not support this flow yet. You set an endpoints array with the address, and the response comes back with a requestToken, an encrypted string valid for 15 minutes that you send back to trigger the actual challenge.

The result of a verification surfaces in latestVerificationResult, an enum whose values read like an operations runbook: SUCCESS_USER_VERIFIED on success, ERROR_USER_NOT_VERIFIED when the user fails the challenge, plus a spread of error states for incomplete onboarding, disallowed recipients during the testing phase, exhausted per-recipient or per-customer quota, and internal errors. The accountVerification response also carries lastVerificationTime, the timestamp of the last successful verification on that device, which is what lets you decide not to re-challenge a device that passed MFA an hour ago.
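Two decisions fall out of those fields: whether to challenge at all, and how to treat the result. The Python sketch below assumes the timestamp arrives as an RFC 3339 string without sub-microsecond precision, and the twelve-hour window and outcome labels are arbitrary choices for illustration.

```python
from datetime import datetime, timedelta, timezone


def parse_timestamp(ts: str) -> datetime:
    """Parse an RFC 3339 UTC timestamp such as 2026-01-01T12:00:00Z."""
    # Assumption: no nanosecond fraction, which older Pythons cannot parse.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def should_challenge(last_verification_time, now, max_age=timedelta(hours=12)):
    """Skip the challenge if the last successful verification is recent."""
    if last_verification_time is None:
        return True
    return now - parse_timestamp(last_verification_time) > max_age


def mfa_outcome(latest_verification_result: str) -> str:
    """Collapse the latestVerificationResult enum into three branches."""
    if latest_verification_result == "SUCCESS_USER_VERIFIED":
        return "verified"
    if latest_verification_result == "ERROR_USER_NOT_VERIFIED":
        return "failed"
    # Quota, onboarding, and internal errors: fall back to another MFA path.
    return "fallback"
```

Treating the error states as a fallback rather than a failure matters, because an exhausted quota says nothing about the user.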

The point of building this into reCAPTCHA rather than your own auth stack is the loop from the Account Defender section. A PASSED_TWO_FACTOR annotation, or an MFA verification reCAPTCHA itself ran, is the strongest legitimacy signal you can feed the per-site model. The MFA result and the account model are the same system viewed from two angles: one challenges the user, the other remembers that they passed.

What it costs, and the 2024 reset

reCAPTCHA Enterprise bills per assessment. Not per page view, not per seat. Each create assessment call is the billable unit. This is the deepest structural break from free v3, which was a free product with no per-call meter, and the pricing behind that meter changed sharply in 2024.

Google announced new pricing in January 2024, fully in effect from August 1, 2024. The free allowance dropped from one million assessments a month to ten thousand. Above that, the Standard tier charges a flat eight dollars a month covering usage from 10,001 up to 100,000 assessments. Cross 100,000 and you move into Enterprise per-assessment pricing at $0.001 each, which is one dollar per thousand. Very high-volume users negotiate a subscription with volume discounts through sales, and the Enterprise tier proper carries a twelve-month commitment. The shape of that pricing, and how it compares to the rest of the anti-bot market, is the subject of the economics of anti-bot vendors.
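The tiers above translate into a short cost function. This Python sketch assumes the eight-dollar flat fee still applies once usage passes 100,000 and that the $0.001 rate covers only the assessments above that mark; that reading is an assumption, and negotiated subscriptions are out of scope.

```python
from decimal import Decimal

FREE_LIMIT = 10_000                 # assessments per month at no charge
STANDARD_LIMIT = 100_000            # upper bound of the flat Standard tier
STANDARD_FLAT = Decimal("8")        # dollars per month
PER_ASSESSMENT = Decimal("0.001")   # dollars per assessment above 100,000


def monthly_cost(assessments: int) -> Decimal:
    """Estimated monthly bill in dollars under the assumptions stated above."""
    if assessments <= FREE_LIMIT:
        return Decimal("0")
    if assessments <= STANDARD_LIMIT:
        return STANDARD_FLAT
    overage = assessments - STANDARD_LIMIT
    return STANDARD_FLAT + overage * PER_ASSESSMENT
```

A site making one million assessments a month, which was free before the change, lands at 8 + 900,000 × 0.001 = 908 dollars under this reading.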

*The pricing reset that pushed a lot of free-tier integrations into a billing decision. The hundred-fold cut in the free allowance is the reason this product showed up on so many engineering radars in 2024.*

What is not cleanly itemised in the public billing docs is whether each add-on feature carries its own per-assessment surcharge on top of the base meter. The billing page describes the assessment-based model and the free allowance, but it does not publish a separate line item for Account Defender versus fraud prevention versus MFA. The tier-comparison page is the better guide there: it tells you which features exist at which tier rather than what each adds to the bill. The practical reading is that the assessment is the meter, the tier is the gate, and the exact per-feature accounting is something you confirm on an invoice or with sales rather than from a public rate card.

Transaction fraud as a separate verdict

fraudPreventionAssessment is the part of the response that has the least to do with bots and the most to do with money. It is a separate verdict, with its own score, aimed at payment fraud rather than automation. The free tier has no equivalent because the free tier never sees a transaction; it sees a page interaction and stops there.

The sub-object carries a transactionRisk score plus a set of structured verdicts: stolenInstrumentVerdict for a payment instrument that looks stolen, cardTestingVerdict for the pattern where an attacker runs many small authorisations to validate stolen card numbers, and behavioralTrustVerdict for whether the buyer’s behaviour matches a trustworthy pattern. Alongside those sit riskReasons, an array of explainability codes specific to fraud. The documented values include HIGH_TRANSACTION_VELOCITY (transactions arriving faster than normal), EXCESSIVE_ENUMERATION_PATTERN (the signature of someone walking through a space of values, such as testing card numbers), SHORT_IDENTITY_HISTORY (an account or identity with almost no prior history, which is what a fresh fraud account looks like), GEOLOCATION_DISCREPANCY (the geography does not line up with what is expected), and ASSOCIATED_WITH_FRAUD_CLUSTER (the request ties to a known cluster of fraudulent activity).
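A consumer of this sub-object might combine the score with the reason codes as in the Python sketch below. It assumes a higher transactionRisk means a riskier transaction, accepts reasons either as bare strings or as objects with a reason key because the exact wire shape is not pinned down here, and uses thresholds and a hard/soft grouping invented for the example.

```python
# Illustrative grouping: reasons treated as decisive versus merely suspicious.
HARD = {"EXCESSIVE_ENUMERATION_PATTERN", "ASSOCIATED_WITH_FRAUD_CLUSTER"}
SOFT = {"HIGH_TRANSACTION_VELOCITY", "SHORT_IDENTITY_HISTORY", "GEOLOCATION_DISCREPANCY"}


def reason_codes(risk_reasons):
    """Normalise riskReasons given as strings or as {"reason": ...} objects."""
    return {r["reason"] if isinstance(r, dict) else r for r in risk_reasons}


def review_transaction(fraud, decline_at=0.9, review_at=0.5):
    """Return 'decline', 'manual_review', or 'approve' for a fraud assessment."""
    risk = fraud.get("transactionRisk", 0.0)
    reasons = reason_codes(fraud.get("riskReasons", []))
    if risk >= decline_at or reasons & HARD:
        return "decline"
    if risk >= review_at or reasons & SOFT:
        return "manual_review"
    return "approve"
```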

To get any of this you have to feed reCAPTCHA the transaction. That means passing transaction details on the assessment: cart value, payment method, shipping, the buyer and seller identities. This is a meaningfully larger integration than dropping a script tag, and it is API-only, which the tier-comparison page lists as an Enterprise-exclusive. The reason carding shows up both here, as cardTestingVerdict, and back in the basic risk reasons, as SUSPECTED_CARDING, is that the same underlying behaviour leaves a footprint at two layers: the bot layer notices the automation, the fraud layer notices the financial pattern. A request can score fine on automation and still trip cardTestingVerdict, which is exactly the case the free tier would wave through.

The firewall hook, briefly

One Enterprise feature blurs the line between scoring and enforcement. firewallPolicyAssessment lets reCAPTCHA return not just a verdict but an action to apply at your edge, expressed as a FirewallPolicy. The action types in the API are AllowAction, BlockAction (serve an HTTP error before the request reaches your backend), IncludeRecaptchaScriptAction (inject the reCAPTCHA script), RedirectAction (a 307 to a reCAPTCHA interstitial), SubstituteAction (transparently serve a different page than the one requested), and SetHeaderAction (set a header and forward to the backend). This is how reCAPTCHA plugs into Google Cloud Armor and similar edge layers, turning a score into an enforced policy without your application code making the decision per request. It moves part of the decision from your origin to the edge, which is the architectural choice covered in more depth in server-side vs client-side bot detection.
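What an edge layer does with those actions can be sketched as a small interpreter. In the Python below each action is represented as a (type name, parameters) tuple, which is a simplification made for this sketch and not the API's wire format; the 403 for a block and the rule that allow, block, and redirect end processing are likewise assumptions.

```python
def apply_firewall_actions(actions, path):
    """Fold a list of (type_name, params) actions into one edge outcome.

    status None means: forward the request to the backend.
    """
    outcome = {"status": None, "path": path, "headers": {}, "inject_script": False}
    for kind, params in actions:
        if kind == "AllowAction":
            break                                   # forward as-is
        if kind == "BlockAction":
            outcome["status"] = 403                 # HTTP error, never reaches origin
            break
        if kind == "RedirectAction":
            outcome["status"] = 307                 # send to the interstitial
            break
        if kind == "SubstituteAction":
            outcome["path"] = params["path"]        # serve a different page
        elif kind == "SetHeaderAction":
            outcome["headers"][params["key"]] = params["value"]
        elif kind == "IncludeRecaptchaScriptAction":
            outcome["inject_script"] = True
    return outcome
```

A policy that tags suspicious traffic instead of blocking it would come through as a SetHeaderAction followed by an AllowAction, leaving the final decision to the origin.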

What you are actually buying, and what it is called now

Strip the marketing away and the Enterprise tier is three things stacked on the free v3 engine. It is a finer score, eleven levels instead of four. It is an explanation, the reason codes and extended verdict reasons that tell you why. And it is a set of account-aware verdicts, Account Defender with its annotation loop, MFA, fraud prevention, and password-leak detection, none of which the free tier can compute because none of them are per-request bot scores. The free tier scores a request. Enterprise scores a request, explains the score, and then keeps a model of the human behind it.

The naming has moved underneath all of this. On May 16, 2026 Google announced Cloud Fraud Defense as the umbrella product, with reCAPTCHA positioned as its bot-defense pillar. Existing customers became Fraud Defense customers automatically, no migration, same site keys, same assessment API. So the integration you write against projects.assessments.create today is unchanged; the product page it lives under is the one that renamed itself. The fields in this post are the durable part. What the dashboard calls the product is not.

If there is one number to keep from all of this, it is 26. Twenty-six bits of username prefix is the entire privacy budget that lets a company check a user’s password against four billion leaked credentials while the company learns nothing about the breach list and Google learns nothing about the password. The rest of the Enterprise feature set is competent engineering you could reason your way to. That one is the piece that is genuinely hard to build yourself, and it is the clearest answer to the question of what the tier is for.

Sources & further reading

Google Cloud (2026), REST Resource: projects.assessments — the Assessment resource schema, with riskAnalysis, accountDefenderAssessment, tokenProperties, and the password-leak fields.
Google Cloud (2026), Detect and prevent account-related fraudulent activities — Account Defender mechanism, the four labels, accountId, and the annotation feedback loop with reason codes.
Google Cloud (2026), Interpret assessments for websites — the eleven score levels, the four available without billing (0.1/0.3/0.7/0.9), and the ClassificationReason meanings.
Google Cloud (2026), Compare features between reCAPTCHA tiers — which features sit at Essentials, Standard, and Enterprise, including explainability depth and account/fraud defenses.
Google Cloud (2026), Detect password leaks and breached credentials — the lookupHashPrefix, the 26-bit prefix, the commutative-encryption verification flow.
Salmi, B. and Malenfant, A. (2022), Announcing reCAPTCHA Enterprise password leak detection in GA — the June 14, 2022 GA announcement and the privacy-preserving-API claim.
Pullman, J., Thomas, K. and Bursztein, E. (2019), Protect your accounts from data breaches with Password Checkup — the consumer PSI design, k-anonymity, blinding, and the ~4 billion-credential corpus.
Li, L. et al. (2019), Protocols for Checking Compromised Credentials — academic analysis of hash-prefix bucketing versus PSI and their bandwidth and leakage tradeoffs (ACM CCS 2019).
Zhao, Y., Jiang, J. and Liu, R. (2022), US Patent 11,366,892: Detecting compromised credentials by improved private set intersection — filed 2019, granted June 21, 2022, assigned to Shape Security; a related compromised-credential PSI method.
Google Cloud (2026), Configure multi-factor authentication — the accountVerification endpoints, the 15-minute requestToken, and the latestVerificationResult enum.
Google Cloud (2024), Billing information — the assessment-based meter, the 10,000-per-month free allowance, and the tiered pricing after the 2024 change.
InfoQ (2026), Google Introduces Cloud Fraud Defense as Successor to reCAPTCHA — the May 16, 2026 rebrand, with reCAPTCHA as the bot-defense pillar and no required migration.