How proxy networks source IPs: SDKs, residential peers, and the ethics question
Buy a gigabyte of residential proxy traffic from any of the big vendors and you get the same promise: the request leaves from a real home connection, on a real consumer ISP, with a real residential IP that no anti-bot system will flag as a datacenter range. That is the entire product. The IP has to belong to an actual house, an actual phone, an actual smart TV in someone’s living room. Which raises the question the marketing pages tend to skip. Whose house? Whose phone? And did they agree to this?
That question has a long and uncomfortable answer. The same residential IP can be sourced four very different ways, sitting on a spectrum from a clearly consented opt-in screen at one end to silent malware on a pirated game at the other. The vendors at the clean end and the criminals at the dirty end sell into the same market, route through the same kinds of devices, and are nearly indistinguishable from the buyer’s seat. This post is about that supply chain: where the IPs come from, the mechanisms that put them there, and why “ethically sourced” has become the most contested phrase in the industry.
The plan: start with the peer-payout apps that pay you cents to share bandwidth, then the internet-sharing SDKs that bundle the same thing invisibly into other apps, then the free-VPN model that turned its own users into the product, and finally the malware end, where there is no consent at all. Along the way, the recurring fault line is consent: what was disclosed, what the user actually understood, and how far the real behaviour drifts from the dialog box they tapped through.
The product is a real person’s connection
Start with what the buyer is paying for, because it explains everything downstream. Anti-bot systems weight the network layer heavily. A request from a known datacenter ASN, an Amazon or Hetzner or OVH range, carries a reputation penalty before the first byte of the application payload is even parsed. (We cover the network-layer side of this in how anti-bot vendors detect residential proxies and ASN reputation.) A request from Comcast, Jio, or Deutsche Telekom does not. The IP belongs to the pool an ISP hands out to paying household customers, so it looks like exactly what most legitimate traffic is: a person at home.
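A minimal sketch of that network-layer prior, in Python. The AS numbers are the publicly registered ones for the operators named above; the `consumer` and `datacenter` labels, the penalty values, and the function name `network_prior` are assumptions made up for illustration, not any anti-bot vendor's real weights.

```python
# Illustrative ASN table. The labels are assumptions for this sketch.
ASN_KIND = {
    16509: ("Amazon", "datacenter"),
    24940: ("Hetzner", "datacenter"),
    16276: ("OVH", "datacenter"),
    7922: ("Comcast", "consumer"),
    55836: ("Reliance Jio", "consumer"),
    3320: ("Deutsche Telekom", "consumer"),
}

# Hypothetical starting penalties; real systems do not publish theirs.
PENALTY = {"datacenter": 0.6, "consumer": 0.0, "unknown": 0.2}


def network_prior(asn: int) -> float:
    """Return a made-up starting suspicion score in [0, 1] for an ASN.

    The score is assigned before any application-layer signal is looked at,
    which is the point the paragraph above makes.
    """
    _name, kind = ASN_KIND.get(asn, ("?", "unknown"))
    return PENALTY[kind]


if __name__ == "__main__":
    for asn in (16509, 7922, 64512):  # 64512 is a private-use ASN, so "unknown"
        print(asn, network_prior(asn))
```

The only thing the sketch is meant to show is ordering: the datacenter range starts with a handicap and the consumer range does not, whatever the request itself looks like.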
That reputational difference is the whole value. It is why residential and mobile bandwidth costs orders of magnitude more than datacenter bandwidth, and why a market exists to acquire as many real consumer IPs as possible. The catch is structural. An IP that belongs to a real household is, by definition, attached to a real household’s device and a real household’s internet bill. You cannot rent that IP without the cooperation, or the compromise, of whoever owns the line. Every sourcing model below is a different answer to the same problem: how do you get traffic to exit from a stranger’s home connection?
The FBI put the spectrum plainly in a 2026 public alert, splitting residential proxy enrollment into two buckets. Either “the owner of the device provides consent,” or “the owner of the device does not provide consent and is unaware their IP address is being used.” Everything else is detail on top of that binary.
Peer-payout apps: the honest end of the market
The cleanest version is the one where the device owner is also the seller. Install an app, let it run in the background, get paid a small amount for the bandwidth that flows through your connection. Honeygain launched this model in 2019 as a standalone passive-income app. IPRoyal’s Pawns, Peer2Profit, PacketStream, EarnFM, and a long tail of similar apps followed. The pitch is straightforward and, on its face, consensual: your internet sits idle most of the day, so rent it out.
*The peer-payout flow. The device owner installs the app knowingly and gets paid; the buyer's traffic exits from the owner's residential IP.*
The economics tell you who is actually getting paid. Proxyway’s market research put SDK-style payouts at roughly $0.001 to $0.03 per user per day, or framed by the rate cards, somewhere around $100 to $400 per thousand daily active users per month in the better regions. ScrapeOps walked through the other side of the same trade and the asymmetry is stark: a provider that pays an app owner about five cents per monthly active user can push ten to twenty gigabytes through that user’s line and resell it for a hundred to several hundred dollars. The bandwidth seller captures a sliver. The aggregator captures the rest. That gap is the reason the whole supply chain exists.
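The arithmetic behind those figures is easy to check. The sketch below scales the per-user daily payout range to a thousand daily active users and computes what fraction of resale value comes back as payout. The 30-day month, the $300 stand-in for “several hundred dollars,” and the function names are assumptions for illustration; the other inputs are the numbers quoted above.

```python
def monthly_per_thousand_dau(per_user_per_day: float, days: int = 30) -> float:
    """Scale a per-user daily payout to 1,000 daily active users for a month."""
    return per_user_per_day * 1000 * days


def payout_share(payout_per_user: float, resale_per_user: float) -> float:
    """Fraction of per-user resale revenue that flows back as payout."""
    return payout_per_user / resale_per_user


# Proxyway's per-user daily range, scaled up (30-day month assumed).
low = monthly_per_thousand_dau(0.001)
high = monthly_per_thousand_dau(0.03)

# ScrapeOps' example: about $0.05 paid per monthly active user, against
# $100 resold at the low end and an assumed $300 at the high end.
share_vs_100 = payout_share(0.05, 100.0)
share_vs_300 = payout_share(0.05, 300.0)

if __name__ == "__main__":
    print(f"per 1,000 DAU per month: ${low:.0f} to ${high:.0f}")
    print(f"payout as share of resale: {share_vs_100:.3%} down to {share_vs_300:.3%}")
```

On those assumptions the daily range works out to roughly $30 to $900 per thousand daily active users per month, which brackets the $100 to $400 rate-card figure, and the payout is around five hundredths of one percent of the resale value or less.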
Even at the honest end there are two problems worth naming. The first is that the device owner has no control over what their IP is used for. The traffic is “anonymized, encrypted, and used by businesses for tasks like SEO monitoring, ad fraud detection, competitive analysis,” in the vendors’ own description, but the seller cannot inspect it and cannot decline a particular job. If a buyer uses the exit node for credential stuffing or to hit a site that later files an abuse complaint, the complaint lands on the household’s IP, not the vendor’s. The second is that “idle bandwidth” is a softer sell than the reality. The device is an open relay for third-party traffic for as long as the app runs.
Internet-sharing SDKs: the same thing, made invisible
The peer-payout app at least puts the deal in front of the person taking it. The SDK model takes the identical bandwidth-sharing engine and embeds it inside someone else’s app, where the end user came for a flashlight, a video player, or a mobile game and never sought out a proxy product at all.
Mechanically it is the same component. Honeygain, after launching as a standalone app, “later rolled out an SDK, allowing developers to embed the same monetization layer into their own apps.” Bright Data ships Bright SDK. Infatica runs an invitation-only SDK aimed at high-DAU VPN and utility apps. The developer drops the library in “just like any other monetization or analytics component,” and from then on a share of their users’ devices become exit nodes. The developer gets paid per active user or per thousand IPs; the proxy vendor gets inventory; the user gets, in the best case, an opt-in screen during onboarding and, in the worst case, a line buried in a terms-of-service document they never read.
The consent question lives entirely in that screen, and the screen is controlled by the app publisher, not the proxy vendor. This is the structural weakness. Bright Data’s own sourcing requirements insist the “SDK is enabled only after the user opted-in” through a “clear consent screen,” that partner apps carry a “native UI with accessible settings for opt-out,” and that participation is documented in the app’s terms. On paper that is a real consent process. In practice the wording, prominence, and honesty of the dialog are delegated to whoever shipped the app, and their incentive is to maximize enrollment, not comprehension.
How wide that gap can get became a live story in June 2026, when an independent researcher working with a security firm reverse-engineered Bright Data’s iOS SDK as it appeared in a set of smart-TV apps. The technical finding was about scale versus disclosure. The SDK could route up to 200 gigabytes of traffic per month per device, with the cap “far higher” in some countries, while at least one Roku app’s consent screen, for a service called Petflix, told users it would use the device and connection only “occasionally.” Two hundred gigabytes a month is not “occasionally.” The smart-TV angle made it worse: a television is a device nobody thinks of as a computer, left on for hours, rarely scrutinized, and the apps named in the writeup were the kind of free streaming utilities that users install without a second thought.
*The disclosure said "occasionally"; the SDK's monthly ceiling was 200 GB per device. The vendor's claimed average was around 50 MB a day, which still leaves a wide gap to the cap a user never sees.*
Bright Data pushed back on the characterization. Its position is that the opt-in screen names the company, links its privacy policy and license, and lets users decline in two steps while keeping the app, and that the SDK reaches only approved domains, collects no browsing history or personal data, uses only the device’s IP, and averages around 50 megabytes a day on Wi-Fi rather than running flat-out at the cap. Both things can be true at once. The vendor’s controls can be real and the user’s understanding can still be near zero, because the number that governs the worst case, the 200 GB ceiling, is not the number on the screen. That is the core of the SDK problem. Consent is collected against a description that does not bound the behaviour.
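To put the two numbers on the same scale, here is a small back-of-the-envelope calculation. It assumes a 30-day month and decimal units (1 GB = 1000 MB = 10^9 bytes); the function names are invented for the sketch, and the 50 MB and 200 GB inputs are the vendor's claimed average and the reported cap from above.

```python
MB_PER_GB = 1000          # decimal units assumed
SECONDS_PER_DAY = 86_400


def monthly_gb(avg_mb_per_day: float, days: int = 30) -> float:
    """Convert an average daily volume in MB to a monthly volume in GB."""
    return avg_mb_per_day * days / MB_PER_GB


def sustained_mbit_per_s(gb_per_month: float, days: int = 30) -> float:
    """Constant throughput needed to move gb_per_month over the whole month."""
    bits = gb_per_month * 1e9 * 8
    return bits / (days * SECONDS_PER_DAY) / 1e6


claimed_monthly = monthly_gb(50)         # vendor's claimed average, per month
cap_gb = 200                             # reported per-device monthly ceiling
headroom = cap_gb / claimed_monthly      # how many times the average fits in the cap
cap_rate = sustained_mbit_per_s(cap_gb)  # round-the-clock rate at the cap

if __name__ == "__main__":
    print(f"claimed average: {claimed_monthly:.1f} GB/month")
    print(f"cap is {headroom:.0f}x the claimed average")
    print(f"cap sustained: {cap_rate:.2f} Mbit/s")
```

Under those assumptions the claimed average is about 1.5 GB a month, so the ceiling sits roughly 130 times above it, and reaching the ceiling would mean sustaining around 0.6 Mbit/s all day, every day.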
Free VPNs: when the users are the inventory
The free-VPN model is where sourcing and irony meet. A VPN is a product people install specifically to protect their network privacy. Several of the largest residential proxy pools were built by turning that exact product into the harvesting mechanism.
The canonical case is Hola, and it is worth telling because it set the template. Hola Networks shipped a free peer-to-peer VPN at the end of 2012. By late 2014 the company had a network of roughly nine million IPs and started selling access to it through a sister brand, Luminati, which years later renamed itself Bright Data. The architecture routed users’ traffic through each other to defeat region blocks, which meant every Hola user was simultaneously an exit node for every other user, and, once Luminati existed, for paying outside customers. The scheme surfaced in May 2015 when the operator of the imageboard 8chan, Fredrick Brennan, traced a flood of POST-request attack traffic against his site back to the Luminati network. A spammer had simply bought access to Hola’s users and pointed them at the target.
Hola’s defense was that the bandwidth arrangement had always been in the terms of service. Pressed on whether users actually knew, the company’s founder conceded the obvious. Asked whether 100 percent of users understood they were part of a peer-to-peer network, he said the answer “is no.” The same month, security researchers publishing under the “Adios, Hola” banner went further, and a Vectra analysis showed the client carried a built-in console that stayed active even when the user was not browsing, capable of listing and killing processes and pushing more software to the machine. The bandwidth sale was the business model; the remote-control surface was a vulnerability sitting on top of it. Either way the user signed up for free region unblocking and became sellable infrastructure.
That pattern, a free network tool whose real product is its users’ connections, has repeated at every scale since. At the criminal end it stops being a gray-area business model and becomes a federal case.
The malware end: no consent, by design
Strip away the consent screen entirely and you arrive at the largest residential proxy networks ever measured, built from devices whose owners never agreed to anything and never knew.
The reference case is 911 S5, dismantled by a US-led international operation in May 2024 and described by the Department of Justice as “likely the world’s largest botnet ever.” It was associated with more than 19 million unique IP addresses across some 190 countries, including 613,841 IP addresses in the United States alone. The administrator, a 35-year-old Chinese national named YunHe Wang, was arrested in Singapore on 24 May 2024 and had earned roughly $99 million renting out those hijacked IPs between 2018 and July 2022.
The sourcing mechanism is the part that matters here, because it is the free-VPN model with the consent removed. 911 S5 spread its proxy backdoor inside free VPN applications: MaskVPN, DewVPN, PaladinVPN, ProxyGate, ShieldVPN, and ShineVPN. Those VPNs were themselves bundled into pirated games and cracked software distributed through a pay-per-install affiliate program. As Brian Krebs documented when he first exposed the operation in 2022, the VPN “performed largely as advertised for the user,” letting them browse anonymously, while it “quietly turned the user’s computer into a traffic relay for paying 911 S5 customers.” The user installed a cracked game. The game installed a free VPN. The VPN installed a proxy backdoor. At no point was there a dialog asking permission to relay strangers’ traffic, and the IPs sold downstream were used for, among other things, billions of dollars in pandemic-relief fraud.
*The 911 S5 chain. Consent is collected for the top-level product and quietly inherited downward to things the user never sees.*
911 S5 was not a one-off. In January 2026 Google’s Threat Intelligence Group described disrupting IPIDEA, which it called “one of the largest residential proxy networks in the world,” and the report reads like a field catalog of every sourcing method at once. IPIDEA and a dozen affiliated proxy and VPN brands, among them 360 Proxy, 922 Proxy, PIA S5 Proxy, and several “VPN” apps, drew their pool from four named internet-sharing SDKs (Castar, Earn, Hex, and Packet), from more than 600 trojanized Android applications and over 3,000 distinct malicious Windows binaries masquerading as things like “OneDriveSync” and “Windows Update,” and from cheap off-brand Android set-top boxes shipped with the proxy payload already inside. The SDK side was marketed exactly like the legitimate ones: developers embed the kit, “they are then paid by IPIDEA usually on a per-download basis.” The overlap with known malware was direct; Google tied the EarnSDK enrollment domains to the BadBox 2.0 botnet, and noted IPIDEA exit nodes overlapping with the Aisuru and Kimwolf families.
What that report makes unavoidable is that the SDK channel and the malware channel are not separate industries. They are the same pool, fed from both ends, sold through the same storefronts. Over a single seven-day window in January 2026, Google counted more than 550 distinct tracked threat groups, including state-linked actors from China, North Korea, Iran, and Russia, routing through IPIDEA exit nodes for credential spraying and access to victim environments. The buyer paying for “residential IPs” to do legitimate price aggregation and the state actor paying for the same pool to launder an intrusion are drawing from one inventory.
Why the buyer cannot tell the difference
Here is the structural problem that makes the ethics question hard rather than academic. From the position of someone buying proxy bandwidth, a cleanly-consented Honeygain peer and a malware-conscripted 911 S5 victim present identically. Both are residential IPs on consumer ISPs. Both carry good reputation with anti-bot systems precisely because they are real households. The buyer’s tooling cannot see the consent screen that did or did not appear on the far device, and the vendor between them has every incentive to describe the whole pool as “ethically sourced” regardless of how any individual IP got there.
The phrase “ethically sourced” has become a compliance artifact for exactly this reason. Vendors publish trust-center pages describing opt-in flows, KYC on partners, GDPR and CCPA posture, and two-click opt-out. Those documents describe the front door of the supply chain, the SDK partnerships the vendor controls directly. They say much less about the resale and aggregation layers where pools get mixed, where one vendor buys inventory from another, and where a clean-sourced node and a conscripted one end up in the same rotation. The honest reading is that “ethically sourced” describes a vendor’s intended sourcing policy, not a verifiable property of every IP they sell.
For anyone building crawling infrastructure, this is also a practical risk, not only a moral one. Buying residential bandwidth means routing your traffic through devices you do not control, whose owners may not know, and whose IPs may already be burned by the last buyer or flagged by an abuse complaint. The same node that gives you clean residential reputation today can be a known-bad exit node tomorrow, which is why proxy reputation is a moving target and why pool hygiene matters as much as raw pool size. (The operational side of that, rotation and burn-rate, is its own discipline; see proxy pool management and the broader residential vs datacenter vs mobile tradeoffs.) The cleaner your sourcing, the smaller and more expensive your pool, and the market rewards the opposite.
The detection arms race runs the other way too
The flip side of all this sourcing effort is that anti-bot vendors have spent years learning to undo it. If residential IPs are valuable because they look like real households, the defensive move is to figure out which residential-looking IPs are actually proxy exit nodes. ASN reputation, the density of distinct accounts behind a single IP, and the mismatch between an IP’s claimed geography and the latency or TLS characteristics of the connection all feed models that try to re-flag laundered residential traffic. A residential IP running 200 gigabytes of third-party scraping traffic a month behaves nothing like the household it belongs to, and that behavioural signature is detectable even when the IP reputation is clean. The sourcing arms race and the detection arms race are the same race seen from opposite ends, and they share a cadence with the rest of the anti-automation field, where a working technique has a shelf life measured in months. (We trace that cycle in the lifecycle of a stealth patch.)
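As a rough illustration of how signals like these could be combined, here is a toy scorer over three of them: ASN class, distinct accounts per IP, and a geography-versus-latency check. Every name, threshold, and weight in it is hypothetical and chosen only to make the example run. The one physical fact it leans on is that light in fiber covers roughly 200 km per millisecond, which puts a floor under the round-trip time to any claimed location.

```python
from dataclasses import dataclass

FIBER_KM_PER_MS = 200.0  # approximate speed of light in optical fiber


@dataclass
class Observation:
    consumer_asn: bool       # IP sits in an ISP's household range
    accounts_seen_24h: int   # distinct logged-in accounts behind this IP
    geo_distance_km: float   # server to the IP's geolocated position
    observed_rtt_ms: float   # round-trip time measured on the connection


def rtt_floor_ms(distance_km: float) -> float:
    """Physical lower bound on round-trip time over fiber."""
    return 2 * distance_km / FIBER_KM_PER_MS


def proxy_suspicion(obs: Observation) -> float:
    """Toy score in [0, 1]; weights and thresholds are invented."""
    score = 0.0
    if not obs.consumer_asn:
        score += 0.5                      # datacenter-style network prior
    if obs.accounts_seen_24h > 20:
        score += 0.3                      # too many identities for one household
    floor = rtt_floor_ms(obs.geo_distance_km)
    too_fast = obs.observed_rtt_ms < floor
    too_slow = obs.observed_rtt_ms > 5 * floor + 150
    if too_fast or too_slow:
        score += 0.3                      # latency does not fit claimed geography
    return min(score, 1.0)


household = Observation(True, 2, 1000.0, 25.0)
exit_node = Observation(True, 60, 1000.0, 400.0)

if __name__ == "__main__":
    print(proxy_suspicion(household), proxy_suspicion(exit_node))
```

Both observations sit on a consumer ASN, so the network prior is identical; the second one only stands out on behaviour, which is the point of the paragraph above.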
The exact features each vendor weights are not public, and the ones that are documented change often enough that any specific list dates quickly. What does not change is the underlying tension. A proxy network’s value is proportional to how convincingly its IPs pass as ordinary people, and a detection vendor’s value is proportional to how reliably it can tell that they are not. Both sides are optimizing against a population of real households who, in the cleanest case, agreed to a vaguely-worded screen and, in the worst case, agreed to nothing at all.
What the supply chain actually looks like
Step back and the four sourcing models form a single gradient with consent as the axis. At one end, a person installs Pawns or Honeygain, reads roughly what they are signing up for, and gets paid a few cents to relay traffic. A step over, an SDK puts the same engine inside an app the user came to for something else, behind a consent screen the app publisher wrote and the user skimmed, with a true traffic ceiling that the screen never mentions. A step further, a free VPN markets privacy while selling its users as exit nodes, disclosed only in a terms-of-service document nobody opened. And at the far end, malware bundled into pirated software conscripts millions of machines whose owners are not told and not paid, feeding the same market through the same brands.
The uncomfortable conclusion is that these are not four separate markets with the criminal one quarantined off to the side. The IPIDEA disruption showed one operator running consented SDKs and trojanized binaries and malware-laden set-top boxes into one pool, sold under thirteen brand names. The honest peer-payout app and the 19-million-device botnet are points on a continuum, not opposite categories, and the vendors in the middle have a commercial reason to keep the distinction blurry. “Where do residential proxy IPs come from” has a clean-sounding answer on every vendor’s trust page and a much messier one in the takedown reports, and the gap between those two answers is the entire ethics question. The number to remember is the one from the smart-TV SDK case: a consent screen that said “occasionally,” sitting on top of a device that could relay 200 gigabytes a month. That distance, between what the user was told and what the device actually did, is where this whole industry lives.
Sources & further reading
- Google Threat Intelligence Group (2026), Disrupting the world’s largest residential proxy network — technical takedown report on IPIDEA naming its SDKs, trojanized apps, set-top-box payloads, and C2 architecture.
- The Hacker News (2026), Free apps are quietly turning smart TVs into web-scraping proxies for AI — the June 2026 reverse-engineering of Bright Data’s iOS SDK, the 200 GB cap, and the “occasionally” consent screen.
- U.S. Department of Justice (2024), 911 S5 botnet dismantled and its administrator arrested in coordinated international operation — the 19-million-IP figure, the VPN app names, and the YunHe Wang arrest.
- Krebs on Security (2024), Treasury sanctions creators of 911 S5 proxy botnet — how the free VPNs bundled the proxy backdoor and the operation’s history through Cloud Router.
- The Hacker News (2015), Hola — a widely popular free VPN service used as a giant botnet — the original Hola/Luminati bandwidth-resale disclosure.
- Vectra AI (2015), Technical analysis of Hola vulnerabilities enabling cyber attacks — the built-in console and remote-control surface inside the Hola client.
- Proxyway (2026), Internet sharing SDKs: a closer look at the emerging app monetization method — how SDK monetization works, the named providers, supported platforms, and payout economics.
- ScrapeOps (2026), The crazy economics of residential and mobile proxies — the per-MAU payout versus resale asymmetry that funds the supply chain.
- Trend Micro (2023), Hijacking your bandwidth: how proxyware apps open you up to risk — the security risks of proxyware and how it gets bundled silently with other software.
- FBI (2026), Evading residential proxy networks: protecting your devices from becoming a tool for criminals — the consent/no-consent split and the SDK-partnership enrollment mechanism.
- Bright Data (2026), Ethically sourcing residential proxies — a vendor’s own description of opt-in requirements, partner KYC, and opt-out controls, useful as the “front door” view of the supply chain.
Further reading
Proxy pool management: rotation, health checks, and burn-rate economics
Traces how a working proxy pool is operated: rotation strategies, the difference between a banned IP and a dead one, health-check state machines, sticky versus rotating sessions, and the per-GB cost model that decides whether a crawl is profitable.
Residential vs datacenter vs mobile proxies: detection, cost, and use cases
A vendor-neutral comparison of the three proxy types: how each is sourced, how each gets detected at the ASN and reputation layer, what a gigabyte actually costs, and which job each one fits.
Session and cookie management across a proxy fleet
How identity stays coherent when a crawler rotates IPs: binding cookies and sessions to exit nodes, what breaks when a session leaks across IPs, and the signals anti-bot systems use to catch the mismatch.