
Sticky sessions vs rotating IPs: when each makes or breaks a scrape

Copyright: MIT
[Figure: One IP held steady across a session versus a fresh IP per request, with a consistency check between them.]

There is a single decision sitting under every proxy-backed scrape, and most people make it by accident. Do you hold one exit IP for the length of a session, or do you hand each request a fresh one? The defaults push you toward rotation, because rotation is what proxy marketing sells and what spreads load off any single address. But a lot of targets are built on the assumption that one human sits behind one IP for the duration of a visit, and the moment your traffic violates that assumption, you stop looking like a noisy user and start looking like exactly what you are.

The wrong choice fails in two opposite directions. Rotate when the target tracks session continuity, and you trip a consistency check: the cookie you earned on IP A arrives from IP B, and the whole session is thrown out. Stay sticky when you should rotate, and you concentrate a thousand requests on one address until rate limits or reputation scoring shut it down. Neither failure announces itself clearly. Both look like “the scrape stopped working,” and the fix is never the code you were staring at. This post is about making that decision on purpose.

The roadmap. First, what “session” and “sticky” actually mean once you stop trusting the marketing, including the load-balancer sense of the words that collides with the scraping sense. Then the real axis: which targets bind state to an IP and which do not, with the cookie mechanics that decide it. Then the four anti-bot consistency checks that turn a bad rotation into a block. Then the cases where rotation genuinely wins, where stickiness is mandatory, and the awkward middle where you want both at once. We close on the cost, because every IP you hold and every IP you burn shows up on an invoice.

What sticky and rotating actually mean

A web session is two halves glued across a stateless protocol. The client holds a token, almost always a cookie, and the server keeps state keyed by that token. RFC 6265 defines the mechanism: Set-Cookie sends a value out, the Cookie header carries it back, and that round trip is the only reason a server remembers anything about you between requests. None of the standard cookie attributes (Secure, HttpOnly, SameSite) say anything about a network address. A plain session cookie is a bearer token. Whoever presents it is treated as the party it was issued to, no matter where they present it from. Every IP-binding scheme later in this post exists because that default offers no location guarantee at all.
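The bearer-token point is easy to check for yourself. The sketch below uses Python's standard `http.cookies` parser on a made-up Set-Cookie value (the cookie name and value are hypothetical) and lists the attributes that were actually set. It is an illustration of the attribute vocabulary, not a statement about any particular site.

```python
from http.cookies import SimpleCookie


def cookie_attributes(set_cookie_value: str) -> dict[str, dict[str, str]]:
    """Parse one Set-Cookie value and return the attributes actually set on each cookie."""
    jar = SimpleCookie()
    jar.load(set_cookie_value)
    return {
        name: {key: str(val) for key, val in morsel.items() if val}
        for name, morsel in jar.items()
    }


# Hypothetical header value, for illustration only.
attrs = cookie_attributes("sid=abc123; Path=/; Secure; HttpOnly; SameSite=Lax")
print(sorted(attrs["sid"]))
```

Every attribute in that vocabulary describes scope, transport, or lifetime. None of them names a network address, so any IP binding has to live in server-side logic.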

“Sticky” is overloaded, and the overload causes real confusion. In the load-balancer world it means server affinity, not exit-IP stability. HAProxy draws the line precisely: affinity uses information below the application layer, typically the source IP, to keep a client pinned to one backend, while persistence uses application-layer information, typically a cookie, to do the same. Their phrasing is that persistence makes them “100% sure that a user will get redirected to a single server,” whereas affinity means the user “may be redirected to the same server.” Affinity is a hint. Persistence is a contract. That distinction matters here because a target doing source-IP affinity behind its CDN can route a rotating client to a different backend on almost any request, scattering whatever per-backend state it was building.

When a proxy provider sells you a sticky session, they mean the other end of the pipe. The provider holds your exit IP stable for a configured window so every request in that window leaves from the same address. That is the feature you buy to keep a session coherent. Rotation is the opposite default: a new exit IP per request, drawn from the pool, with no attempt to keep two requests on the same address.

[Figure: Same five requests, two routing policies. Sticky session: all exit 203.0.113.7; one identity the target can trust as continuous. Rotating per request: five distinct exits, five geolocations; five strangers, or one user teleporting.]

*The same workload under the two policies. The question is never which is "better" in the abstract; it is what the target keys identity on.*

How long is sticky? It depends on the provider, and the spread is wide. IPRoyal exposes the controls directly: a _session- tag takes an 8-character alphanumeric string to name the session, and a _lifetime- tag sets the hold, from one second to seven days. Decodo defaults to a 10-minute sticky window and caps it at 1440 minutes (24 hours), set with a sessionduration field in the username. ScrapeOps holds a session for 10 minutes from the last request, an inactivity timeout rather than a hard clock, so a session stays alive as long as you keep using it. The numbers vary, but the shape is the same: you name a session, you set a hold, and the provider does its best to keep one IP under it.

That “best” is load-bearing. Residential and mobile exit IPs are real devices on real networks, and they go offline mid-session whether you like it or not. IPRoyal’s default is to silently swap in a replacement when the held IP drops, unless you set _killswitch-1, which makes the request return HTTP 410 instead of quietly rotating under you. That switch is the difference between “I lost the IP and the provider hid it” and “I lost the IP and got told.” For a stateful flow, the second is what you want, because a silent swap mid-checkout looks identical to a hijack from the target’s side.
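A minimal sketch of how the three tags named above might be assembled and how a 410 could be handled. The tag names come from the text. Everything else is an assumption for illustration: the order of the tags, which credential field the suffix is appended to, and the syntax of the lifetime value all belong to the provider's documentation, and `sticky_tags`, `check_exit`, and `ExitIpLost` are hypothetical helper names.

```python
import re


class ExitIpLost(Exception):
    """Raised when the provider reports that the held exit IP is gone."""


def sticky_tags(session_id: str, lifetime: str, killswitch: bool = True) -> str:
    """Build a tag suffix for a named sticky session.

    `lifetime` is passed through untouched: its unit syntax is provider-specific
    and is NOT asserted here. Tag order is also an assumption.
    """
    if not re.fullmatch(r"[A-Za-z0-9]{8}", session_id):
        raise ValueError("session id must be 8 alphanumeric characters")
    tags = f"_session-{session_id}_lifetime-{lifetime}"
    if killswitch:
        tags += "_killswitch-1"
    return tags


def check_exit(status_code: int) -> None:
    """With the kill switch on, HTTP 410 means the held IP dropped: stop the flow."""
    if status_code == 410:
        raise ExitIpLost("held exit IP dropped; restart the session from step one")
```

The design choice is to turn the 410 into an exception rather than a retry. On a stateful flow, retrying on a new IP is exactly the silent swap the kill switch exists to prevent.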

The real axis: does the target bind state to an IP?

Forget the proxy vocabulary for a moment. The decision turns on one property of the target: does it tie the state it issues you to the address that earned it? If yes, rotation under that state is self-sabotage. If no, rotation is free and stickiness is wasted IP-hours. Everything else is detail.

Plenty of targets do not bind. A public product listing, an open search-results page, a price grid behind no login: these hand out a bearer cookie at most, and they validate it as a bearer token. You can present it from anywhere. For these, rotation is the correct default and stickiness buys nothing but slower IP cycling. The economics there is straightforward, and we cover the wider cost model in the economics of a scraping operation.

The binding targets are where the decision bites. Anything with a login binds, because a session that survives an IP change from a residential address in one country to a datacenter in another is exactly the session a stolen-credential attacker would carry, and the defence against that attacker is the same defence that catches your scraper. Carts and checkout flows bind, because the server stitches cart state, CSRF tokens, and the originating address together to stop cart-stuffing and fraud. Multi-step forms bind when a token minted on step one is checked against the address on step three.

Then there are the anti-bot vendors, who bind on purpose and document part of it. Cloudflare’s cf_clearance cookie is the clearest published case. Cloudflare says the cookie is “securely tied to the specific visitor and device it was issued to,” and the open-source tooling built around it is blunter than the docs: the long-standing advice, repeated across the projects that scrape these cookies, is to “make sure you use the same IP and UA as when you got it.” Present a valid cf_clearance from a different exit IP and you are challenged again as if you had never solved anything. The cookie itself is fine; the context it arrives in is wrong, and the context is half the check. We trace that cookie’s full lifecycle in Cloudflare’s cf_clearance cookie.

DataDome’s published docs are quieter on the IP binding specifically. They describe the datadome cookie as 128 bytes, encrypted, carrying no PII, with a one-year expiry, used for “both server-side and client-side detection.” The docs do not spell out an IP lock. What practitioners observe in traffic, and what the reverse-engineering writeups converge on, is that a datadome token earned on one IP behaves like a fresh, unknown client when replayed from another: no stored fingerprint, no passed-challenge memory, a clean re-evaluation. The exact server-side rule is not public. What follows from the observed behaviour is the same operational constraint either way, which is that you keep the cookie and the IP together or you keep neither. The cookie internals are covered in the DataDome cookie lifecycle.

Akamai’s _abck is the third instructive case, and here the binding extends past the IP. The cookie is set after a sensor_data payload, collected by Akamai’s obfuscated client script, is posted and accepted. Replaying that cookie cleanly requires the IP, the user agent, and the TLS fingerprint to match what was present when the sensor data was generated. A single mismatch against the observed request fingerprint invalidates it. That is three axes of consistency, not one, and rotating the IP breaks the first of them before the other two even get evaluated. The payload mechanics are in Akamai’s _abck cookie.
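One way to respect a multi-axis binding on the client side is to store each token together with the context it was earned under, and refuse to replay it from any other context. The sketch below assumes you already compute a TLS fingerprint string somewhere else; the class and function names are hypothetical, and the three axes are the ones described above, not a vendor specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestContext:
    exit_ip: str
    user_agent: str
    tls_fingerprint: str  # e.g. a JA3/JA4 string computed elsewhere


@dataclass(frozen=True)
class EarnedToken:
    name: str
    value: str
    earned_under: RequestContext


def mismatched_axes(token: EarnedToken, now: RequestContext) -> list[str]:
    """Return the axes that differ between the earning context and the current one."""
    axes = ("exit_ip", "user_agent", "tls_fingerprint")
    return [a for a in axes if getattr(token.earned_under, a) != getattr(now, a)]


def safe_to_replay(token: EarnedToken, now: RequestContext) -> bool:
    """A bound token is only worth sending when every axis still matches."""
    return not mismatched_axes(token, now)
```

Returning the list of mismatched axes, rather than a bare boolean, makes the eventual log line say which part of the story changed.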

[Figure: How tightly the token is bound to where it came from. A scale running from "bearer, ignores IP" to "bound to IP + UA + TLS", with markers in order: public listing, login session, datadome (observed), cf_clearance, _abck. Left of the dividing line, rotate freely; the token travels. Right of it, every rotation under a live token is a fresh block, because the token is partly a claim about the network it came from. Published bindings: cf_clearance (Cloudflare docs), _abck (IP+UA+TLS, observed). The datadome IP lock is inferred from traffic, not documented.]

*Position is approximate and the right two are partly inferred from observed behaviour, not vendor spec. The operational rule is the same across the bound end: hold the IP or lose the token.*

The four consistency checks that punish the wrong call

When a target says no to your rotation, it is usually one of four mechanisms doing the talking. None of them is exotic. All of them are cheap to run server-side, which is why they are everywhere.

The first is direct cookie-to-IP binding, the mechanism in the section above. The server records the address that earned a token and refuses the token from any other address. This is the bluntest check and the easiest to reason about. It fires on the first request that presents a bound token from the wrong IP. The defence is not clever: you keep the token and the IP married for the token’s life, which means a sticky window at least as long as the token you are reusing. For cf_clearance, that token defaults to a 30-minute clearance window, so a sticky session that runs well past the cookie’s life is wasted hold time, and a sticky session that dies before the cookie does is a self-inflicted block.
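The window-versus-token rule reduces to one comparison. A small sketch, with a hypothetical `slack_s` margin standing in for however much overshoot you are willing to pay for:

```python
def sticky_window_verdict(window_s: int, token_ttl_s: int, slack_s: int = 60) -> str:
    """Compare a sticky hold against the life of the token it has to carry.

    `slack_s` is an illustrative tolerance, not a recommended value.
    """
    if window_s < token_ttl_s:
        return "too short"  # the IP is released while the token is still live
    if window_s > token_ttl_s + slack_s:
        return "too long"   # IP-time is held after the token has expired
    return "ok"
```

For a 30-minute token, `sticky_window_verdict(1800, 1800)` passes, a 10-minute hold is flagged as too short, and a two-hour hold is flagged as waste.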

The second is impossible travel. Borrowed wholesale from account-security tooling, it compares consecutive requests on one session and computes the speed a real human would need to get between the two IPs’ geolocations. Log in from Frankfurt, then make the next request from São Paulo four seconds later, and no commercial flight covers that; the speed exceeds the few-hundred-miles-per-hour ceiling that security teams set, and the session is flagged. This is precisely the signal a naive rotating fleet generates: a single cookie presented from a dozen cities in a minute is the most impossible traveller imaginable. The same false-positive that plagues legitimate VPN and roaming users is the true-positive that catches a rotated session.

[Figure: One cookie, two cities, four seconds apart. Frankfurt at t = 0s, São Paulo at t = 4s, ~5,800 mi between them. The implied speed exceeds any aircraft, so the session is flagged; the same logic that catches stolen-credential logins catches a rotated cookie.]

*Impossible travel is account-security tech repurposed for bot defence. A rotating fleet presenting one cookie is its ideal target.*
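The check itself is a distance, a time delta, and a threshold. The sketch below uses the haversine formula on approximate city coordinates; the 600 mph ceiling is a hypothetical value chosen for illustration, and real systems work from IP geolocation data that is far noisier than two hand-typed coordinates.

```python
import math

EARTH_RADIUS_MI = 3958.8  # mean Earth radius in miles


def great_circle_miles(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Haversine distance between two (lat, lon) points, in miles."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MI * math.asin(math.sqrt(a))


def implied_speed_mph(prev: tuple[float, float], curr: tuple[float, float],
                      seconds_apart: float) -> float:
    """Speed a traveller would need to get from prev to curr in the given time."""
    miles = great_circle_miles(*prev, *curr)
    if seconds_apart <= 0:
        return math.inf if miles > 0 else 0.0
    return miles / (seconds_apart / 3600)


def is_impossible_travel(prev, curr, seconds_apart, ceiling_mph: float = 600.0) -> bool:
    """Flag a pair of requests whose implied speed exceeds the ceiling."""
    return implied_speed_mph(prev, curr, seconds_apart) > ceiling_mph


FRANKFURT = (50.11, 8.68)     # approximate coordinates
SAO_PAULO = (-23.55, -46.63)  # approximate coordinates
print(is_impossible_travel(FRANKFURT, SAO_PAULO, seconds_apart=4))
```

Run against your own request log, with your proxy provider's reported exit locations in place of the two constants, this tells you what the target sees before the target tells you.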

The third is ASN and IP-type incoherence. A session that begins on a residential ISP and continues on a datacenter range has not just moved, it has changed the kind of network it lives on, and that transition almost never happens to a real user mid-session. Anti-bot scoring weights IP type heavily before it even looks at behaviour: carrier and residential addresses carry high trust because they are shared by real people, while datacenter ranges sit near the bottom. The reputation gap is large. Industry trust-score buckets put mobile-carrier IPs in the high-80s to high-90s, residential in the rough middle, and datacenter near the floor, which is why a rotation that crosses those buckets inside one session reads as a tell rather than as movement. How vendors actually assign that reputation is the subject of how anti-bot vendors detect residential proxies and ASN reputation.
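If your proxy layer records the network type of each exit it used, spotting a cross-bucket transition inside one session is a one-liner. The type labels below are assumed to come from whatever IP-intelligence lookup you already have; the function does not classify addresses itself.

```python
def type_transitions(ip_types: list[str]) -> list[tuple[int, str, str]]:
    """Return (request_index, from_type, to_type) for each change of network type.

    `ip_types` is the per-request sequence for ONE session, in order,
    e.g. ["residential", "residential", "datacenter"].
    """
    return [
        (i, prev, curr)
        for i, (prev, curr) in enumerate(zip(ip_types, ip_types[1:]), start=1)
        if prev != curr
    ]
```

An empty list is what a real user's session looks like. Anything else is a session that changed the kind of network it lives on.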

The fourth is fingerprint consistency above the IP layer, and it is the one that catches the over-correction. Rotate the IP perfectly and you still ship the same TLS ClientHello on every request, because the JA3 or JA4 fingerprint is a property of your client library, not your network. A target that links sessions by TLS fingerprint sees one client behind a dozen IPs and joins them right back up. A 2026 study on detecting bad bots through TLS handshakes leans on exactly this: the JA4 fingerprint stays consistent through IP rotation, which is what makes it useful as a bot signal in the first place. And the inverse is just as damning. Reuse one IP while varying the TLS fingerprint, which is what happens when you spread requests across mismatched HTTP libraries, and you have a single address presenting as several different clients, which is a stronger bot signal than either alone. The fingerprint and the IP have to tell the same story. The full mechanism is in TLS fingerprinting: from ClientHello bytes to JA4.
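Both directions of that check can be run over your own outbound traffic. A sketch, assuming each request has been logged as an (exit IP, TLS fingerprint) pair; the fingerprint strings in the example are placeholders, not real JA3 or JA4 values.

```python
from collections import defaultdict
from typing import Iterable


def link_by_fingerprint(
    requests: Iterable[tuple[str, str]],
) -> tuple[dict[str, set[str]], dict[str, set[str]]]:
    """Group (exit_ip, tls_fingerprint) pairs in both directions.

    Returns:
      one_client_many_ips: fingerprint -> IPs, where one fingerprint used several IPs
      one_ip_many_clients: IP -> fingerprints, where one IP showed several fingerprints
    """
    ips_per_fp: dict[str, set[str]] = defaultdict(set)
    fps_per_ip: dict[str, set[str]] = defaultdict(set)
    for ip, fp in requests:
        ips_per_fp[fp].add(ip)
        fps_per_ip[ip].add(fp)
    one_client_many_ips = {fp: ips for fp, ips in ips_per_fp.items() if len(ips) > 1}
    one_ip_many_clients = {ip: fps for ip, fps in fps_per_ip.items() if len(fps) > 1}
    return one_client_many_ips, one_ip_many_clients


# Placeholder data: a rotated fleet with one client library behind it.
rotated = [("203.0.113.7", "fp-A"), ("198.51.100.9", "fp-A"), ("192.0.2.44", "fp-A")]
many_ips, many_clients = link_by_fingerprint(rotated)
```

On a rotating read of unbound pages the first dictionary being non-empty is expected. On anything session-bound, either dictionary being non-empty is the inconsistency described above.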

The through-line across all four is that anti-bot systems do not score a request in isolation. They score the sequence, and consistency across that sequence is the thing being measured. Rotation, done wrong, is a machine for manufacturing inconsistency.

When rotation actually wins

None of this is an argument against rotation. Rotation is the right default for the largest category of scraping, which is high-volume reads of stateless pages, and for that category stickiness is a liability.

The clearest case for rotation is rate-limit avoidance on unbound targets. A single IP making a thousand requests a minute against a product catalogue is the easiest thing in the world to throttle. Spread that across a pool and no single address carries enough volume to cross a per-IP threshold. The page hands out a bearer cookie at most, so there is no session to break. You parallelise wide, you cycle IPs fast, and the cap you respect is the global one, not a per-address one. The discipline that keeps this from looking abusive even when distributed is the subject of rate limiting yourself.
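The arithmetic behind spreading load is worth writing down, because it tells you how large a pool you actually need. A sketch with hypothetical helper names; the per-IP threshold is something you estimate for a given target, not something this code can discover.

```python
import math
from itertools import cycle


def per_ip_rate(global_rate_per_min: float, pool_size: int) -> float:
    """Requests per minute each exit carries when load is spread evenly."""
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    return global_rate_per_min / pool_size


def min_pool_size(global_rate_per_min: float, per_ip_threshold_per_min: float) -> int:
    """Smallest pool that keeps every exit at or under the per-IP threshold."""
    return math.ceil(global_rate_per_min / per_ip_threshold_per_min)


def assign_round_robin(urls: list[str], exits: list[str]) -> list[tuple[str, str]]:
    """Pair each URL with the next exit in the pool, wrapping around."""
    return list(zip(urls, cycle(exits)))
```

A thousand requests a minute over fifty exits is twenty per exit. The global figure stays the number you cap, since the target's total load is unchanged by how you distribute it.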

Rotation also wins when the target’s defence is primarily IP-reputation-based and per-IP. If the block is “this address has made too many requests” rather than “this session is incoherent,” then a fresh address resets the counter and you pay nothing for the reset because there is no session state to carry. This is the regime where a large rotating residential pool is straightforwardly the right tool, and where the cost question is just price-per-gigabyte against success rate.

And rotation is mandatory when you are deliberately presenting as many independent visitors. Scraping a thousand public profiles where each should look like a separate first-time visitor with no history is a rotation job by definition. Stickiness there would do the opposite of what you want: it would tie a thousand unrelated reads to one identity and one address, concentrating exactly the pattern you were trying to disperse. The mistake people make is reaching for rotation’s strength (dispersal) on a target that punishes dispersal (a bound session). The tool is not wrong. The match is.

When stickiness is not optional

On the other side sit the flows where rotation is not a tuning choice but a guaranteed failure. Any authenticated flow needs a sticky session for the duration of the authenticated work, because the login token is bound and impossible-travel scoring is watching. You log in on one IP, you do everything that session requires on that same IP, and you let the IP go only when the session is done. The sticky window has to be at least as long as the work, which is why the kill-switch behaviour matters: a silent mid-session IP swap on an authenticated flow is indistinguishable from a session hijack, and the target treats it as one.

Cart and checkout flows are the same story with money attached. The server stitches cart contents, CSRF tokens, and the originating address into one fraud-detection picture, and an IP change between adding to the cart and checking out reads as the cart being driven by someone other than the person who filled it. Multi-step forms that mint a token on one step and validate it on a later step fail the same way: rotate between steps and the token is presented from the wrong address, and CSRF validation, on targets where it pins to the originating IP, rejects it.

The harder cases are the ones where the binding is invisible until it bites. A search flow that issues a query token on the first page and validates it, IP-bound, on the pagination requests will work perfectly for one page and then fail silently on page two if you rotated underneath it. The failure does not say “you rotated.” It says “invalid token” or, worse, it returns a plausible-looking empty result set, and you spend an afternoon debugging your parser when the bug was in your proxy policy. This is the most expensive class of mistake, because it is the quietest, and the only defence is to assume binding and use stickiness wherever a flow carries a token across requests.

[Figure: A short decision path. Does a token cross requests? No: rotate per request, and cap the global rate, not the per-IP one. Yes: sticky for the flow, with window > token life, and keep IP, UA, and TLS constant together.]

*The whole decision compressed. "Does a token cross requests" is a better question than "is this site protected," because it predicts the failure mode directly.*
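The same decision path fits in a few lines of code, which is a reasonable place to keep it so the choice is made per flow and on purpose. The names here are hypothetical, and the function encodes only the rule stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProxyPolicy:
    mode: str                       # "rotate" or "sticky"
    min_window_s: int               # minimum sticky hold; 0 when rotating
    hold_constant: tuple[str, ...]  # axes that must not change mid-session


def choose_policy(token_crosses_requests: bool, token_ttl_s: int = 0) -> ProxyPolicy:
    """Pick a routing policy from the one question that predicts the failure mode."""
    if not token_crosses_requests:
        # Nothing to keep coherent: rotate, and cap the global rate instead.
        return ProxyPolicy("rotate", 0, ())
    # A token travels between requests: hold the IP at least as long as the token lives.
    return ProxyPolicy("sticky", token_ttl_s, ("exit_ip", "user_agent", "tls_fingerprint"))
```

For a flow carrying a 30-minute token, `choose_policy(True, token_ttl_s=1800)` returns a sticky policy with a 1800-second minimum hold.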

The middle: sticky pools and per-account pinning

Most real operations are not all-rotate or all-sticky. They are a fleet of sticky sessions, run in parallel, each one coherent on its own. You want the dispersal of a pool and the coherence of stickiness at the same time, and the way you get both is to make the session, not the request, the unit of rotation.

The pattern is one sticky exit IP per logical identity. Each account, or each independent crawl context, gets its own named session and holds one IP for the length of its work. The fleet rotates at the session boundary, not inside a session. Twenty accounts means twenty sticky sessions on twenty IPs, each internally consistent, the fleet as a whole spread across twenty addresses. This is the shape that satisfies both constraints: no single IP carries enough volume to trip a rate limit, and no single session ever presents from two addresses. The cookie and identity plumbing that makes this hold together across a fleet is its own subject, covered in session and cookie management across a proxy fleet.
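Making the session the unit of rotation mostly comes down to naming. The sketch below derives a stable 8-character session name from an identity and a run counter, so the name holds for the whole run and changes only at the session boundary. The hashing scheme and the `run` counter are assumptions for illustration; any stable, collision-resistant naming works.

```python
import hashlib


def session_name(identity: str, run: int) -> str:
    """Deterministic 8-character alphanumeric session name for one identity and one run."""
    digest = hashlib.sha256(f"{identity}:{run}".encode()).hexdigest()
    return digest[:8]


def fleet_sessions(identities: list[str], run: int) -> dict[str, str]:
    """One named sticky session per identity; the fleet rotates only when `run` changes."""
    return {identity: session_name(identity, run) for identity in identities}


accounts = [f"acct-{n}" for n in range(20)]  # hypothetical account labels
sessions = fleet_sessions(accounts, run=1)
```

Twenty identities produce twenty names, each of which the provider maps to its own held exit. Truncating the hash to eight characters leaves a small chance of two identities sharing a name, so a production version would check for duplicates.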

The constraint that makes pinning non-trivial is geo-coherence. If an account’s IPs are allowed to wander across countries between sessions, you reintroduce the impossible-travel and ASN-incoherence signals at the account level instead of the request level. An account that logs in from Germany on Monday and Brazil on Tuesday and Japan on Wednesday looks like a compromised account even if each individual session was perfectly sticky. So the pool for one identity should hold geography roughly constant, ideally same country and same region, across sessions and not just within them. The trust the account builds is partly trust in its location stability, and rotating geography throws that away.
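A simple guard for the account-level version of that constraint is to remember where each identity was first seen and refuse to start a session for it anywhere else. This is a sketch with a hypothetical class name, pinning at country granularity only; region-level pinning would follow the same shape.

```python
class GeoPin:
    """Remember the first country each identity was seen in and hold it there."""

    def __init__(self) -> None:
        self._home: dict[str, str] = {}

    def admit(self, identity: str, exit_country: str) -> bool:
        """Return True if this exit country matches the identity's pinned country.

        The first call for an identity pins it; later calls are checked against the pin.
        """
        home = self._home.setdefault(identity, exit_country)
        return home == exit_country
```

Called before a session is opened, `admit` turns a Monday-Germany, Tuesday-Brazil drift into a refusal on your side instead of a flag on the target's.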

Mobile proxies complicate the picture in a way that cuts both directions. Carrier-grade NAT puts hundreds or thousands of real subscribers behind a single public IPv4 address, which is why mobile IPs carry such high trust: a target cannot block one without blocking a crowd of paying customers. Cloudflare’s own analysis of CGNAT found that “a single IPv4 address may represent hundreds or even thousands of users,” with some regions showing far higher user-to-IP ratios than others. For a scraper, that shared-IP property means the per-IP reputation signal is weak by construction, so the target leans harder on session-level consistency and fingerprinting to tell users apart on a shared address. The IP stops being a useful identity, so everything above the IP carries more of the load. Stickiness on a mobile IP buys you crowd cover; it does not buy you a free pass on the consistency checks, because those are exactly what the target falls back on when the address is shared.

What it costs to choose wrong

Every IP you hold and every IP you burn is a line on an invoice, and the sticky-versus-rotating decision is a cost decision before it is a detection one. Sticky sessions on residential or mobile pools are billed by bandwidth and by hold time, and a flow that needs a sticky window of 30 minutes or more to cover a 30-minute clearance cookie is paying for those minutes whether or not it sends traffic in all of them. Rotation spends differently: you burn through addresses, and on a clean unbound target that burn is cheap because every fresh IP works. On a bound target the same burn is ruinous, because every rotation throws away a session you paid to establish, and you re-pay the establishment cost (the challenge solve, the sensor payload, the login) every time. The solve cost dominates the proxy cost the moment a target binds, which is the whole argument of the economics of a scraping operation.
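The re-payment effect is easiest to see as a per-page cost. In the sketch below every number is hypothetical and chosen only to show the shape: an establishment cost paid once per session, amortised over however many pages that session survives to fetch.

```python
def cost_per_page(pages_per_session: int, establish_cost: float,
                  proxy_cost_per_page: float) -> float:
    """Average cost of one page when the establishment cost is spread over a session."""
    if pages_per_session < 1:
        raise ValueError("a session must fetch at least one page")
    return proxy_cost_per_page + establish_cost / pages_per_session


# Hypothetical prices, for illustration only.
ESTABLISH = 0.02  # one challenge solve, sensor payload, or login
PROXY = 0.001     # proxy spend attributable to one page

rotating_on_bound_target = cost_per_page(1, ESTABLISH, PROXY)   # re-establish every request
sticky_on_bound_target = cost_per_page(50, ESTABLISH, PROXY)    # establish once, reuse
```

With these made-up prices the rotating policy pays the establishment cost on every page and the sticky one pays a fiftieth of it, which is the gap the paragraph above describes.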

The most expensive failure is the silent one. A rotation policy that quietly breaks a bound session does not throw an error you can alert on. It returns a 200, an empty result, a re-challenge page that your parser reads as “no data,” and your dashboards stay green while your data goes hollow. The fix is to instrument for it: track the rate at which sessions survive their full intended length, watch for the cookie-issued-on-A-presented-from-B pattern in your own logs, and treat a sudden drop in per-session request depth as the same kind of alarm as a spike in 403s. Without that, the cost of a wrong rotation policy is not a block you can see. It is a slow corruption of the data you are paying to collect, discovered weeks later when someone notices the numbers are wrong. The instrumentation that catches it is the subject of scraping observability.
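Two of those signals can be computed from a plain request log. The sketch assumes each log row records a session identifier, the exit IP the request left from, and the response status; the row shape and function names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class LogRow:
    session_id: str
    exit_ip: str
    status: int


def split_sessions(rows: Iterable[LogRow]) -> dict[str, set[str]]:
    """Sessions that were presented from more than one exit IP (issued on A, sent from B)."""
    ips: dict[str, set[str]] = defaultdict(set)
    for row in rows:
        ips[row.session_id].add(row.exit_ip)
    return {sid: seen for sid, seen in ips.items() if len(seen) > 1}


def survival_rate(rows: Iterable[LogRow], intended_depth: int) -> float:
    """Fraction of sessions that reached their intended number of requests."""
    depth: dict[str, int] = defaultdict(int)
    for row in rows:
        depth[row.session_id] += 1
    if not depth:
        return 0.0
    survived = sum(1 for d in depth.values() if d >= intended_depth)
    return survived / len(depth)
```

Note that neither function looks at the status code to decide anything. The point of the paragraph above is that the status can be 200 throughout while both of these numbers go bad.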

The decision itself is small. Hold the IP when a token crosses requests; let it go when nothing does; keep the IP, the user agent, and the TLS fingerprint telling one story for as long as the session is alive. What makes it hard is that the targets do not tell you which kind they are, and the penalty for guessing wrong arrives quietly, dressed up as a parser bug or a flaky page. The operators who get this right are not the ones with the biggest pools. They are the ones who decided, per flow, what the target keys identity on, before they sent the first request.

