
The geolocation-vs-latency check: catching proxies with round-trip time

Copyright: MIT
Geolocation vs latency: the speed-of-light floor that catches proxies

A residential proxy gives a scraper something a datacenter IP never can: an address that a geolocation database swears belongs to a Comcast subscriber in New Jersey, an AS that reads as consumer broadband, a reputation score that opens doors. The IP is the disguise. It is also a claim, and claims can be checked against physics.

Here is the claim under pressure. The IP says New Jersey. The server is in Virginia. Light in fiber covers that distance in a few milliseconds round trip. If the SYN-ACK comes back and the client’s ACK lands forty milliseconds later, the packets travelled too far for the story to hold. Something is sitting between the address and the machine that actually answered, adding distance that the geolocation lookup never sees. A proxy can spoof an IP. It cannot spoof the time it takes a photon to cross an ocean, and it can only ever make that time longer, never shorter. That asymmetry is the whole game.

This post works through how that check is built. We start with the speed-of-light floor and why latency has a hard lower bound but no useful upper bound. Then the TCP three-way handshake as the cleanest place to measure round-trip time, the cross-layer trick that compares the TCP handshake against the TLS handshake to catch a proxy without ever geolocating anything, and JA4L, the latency member of the JA4+ suite that bakes this into a fingerprint. We close on what the technique can and cannot prove, because a latency anomaly is evidence, not a verdict.

The floor that cannot be lowered

Geolocation of an IP address is mostly a lookup. A database maps the address to a city, an ISP, a latitude and longitude, and the server trusts it. Those databases are built from registry data, ISP disclosures, and a layer of measurement, and for most consumer IPs they land within city-level accuracy. A study of geolocation databases cited in the SNITCH paper found more than 80 percent of analysed IPs had an error under 100 km. Good enough to place an address on a map. Useless against an adversary who picked that address precisely because of where the map puts it.

Latency is different. Latency is not a record someone wrote down; it is a measurement the server takes itself, in real time, on the connection in front of it. And it sits on top of a constant nobody gets to negotiate. Signals in optical fiber propagate at somewhere between half and two-thirds the speed of light in vacuum. The JA4L documentation pins the working figure at 0.128 miles per microsecond, about 0.206 km per microsecond, in fiber. That gives a hard relationship between distance and time: a given round-trip time corresponds to a maximum possible distance, because the signal cannot have travelled faster than light to cover it.
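As a worked form of that relationship, using the JA4L figure quoted above (the symbols $v$ and $d_{\max}$ are my own notation, not the documentation’s):

```latex
d_{\max} = v \cdot \frac{\mathrm{RTT}}{2},
\qquad v \approx 0.206\ \text{km}/\mu\text{s}

% Example: a 40 ms round trip (40{,}000 \mu s)
d_{\max} = 0.206\ \tfrac{\text{km}}{\mu\text{s}} \times 20{,}000\ \mu\text{s} = 4{,}120\ \text{km}

% Example: a 4 ms round trip (4{,}000 \mu s)
d_{\max} = 0.206\ \tfrac{\text{km}}{\mu\text{s}} \times 2{,}000\ \mu\text{s} = 412\ \text{km}
```

The far endpoint can be closer than these figures, since real paths are slower than bare fiber, but it cannot be farther.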

The direction of that inequality is what matters. Real networks are slower than the fiber floor. They route around obstacles, queue in buffers, cross router hops that each add processing delay, and almost never follow the great-circle path between two points. So measured latency is always at or above the speed-of-light minimum for the true distance, never below it. An intermediary, a proxy or a VPN, can only add to that. It inserts an extra leg of network into the path, and every extra leg costs time. There is no operation a proxy can perform that subtracts latency. You can make a connection look slower; you cannot make it look closer than the physics allows.

So the check has a clean shape. Take the IP’s claimed location. Compute the speed-of-light minimum RTT from there to the server. Measure the actual RTT. If the measured value falls below the floor for the claimed location, the client is physically closer than the address says, and the claim is impossible. If it is wildly larger than the floor, either the address is not where it says it is or there is something in the path that the address does not account for. The SNITCH work, presented at the MADWeb 2025 workshop by a team at Fujitsu Research of Europe, builds exactly this into a server-side VPN detector. Their phrasing of the limit is the one to remember: a user trying to fake a location can only introduce delay, which raises the uncertainty of where they are, but cannot reduce latency below the physical bound.
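A minimal sketch of that check, assuming a great-circle distance model and the 0.206 km per microsecond fiber figure quoted earlier. The function names, the `slack_factor`, and the `slack_ms` tolerance are hypothetical placeholders for illustration, not values from SNITCH or any other source.

```python
import math

FIBER_KM_PER_US = 0.206   # propagation speed in fiber (JA4L figure quoted in the text)
EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of the haversine model


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def floor_rtt_ms(distance_km):
    """Smallest physically possible round trip over fiber for a one-way distance."""
    return 2 * distance_km / FIBER_KM_PER_US / 1000.0


def check_claim(claimed, server, measured_rtt_ms, slack_factor=3.0, slack_ms=30.0):
    """Compare a measured RTT with the floor for the claimed (lat, lon).

    slack_factor and slack_ms are hypothetical tolerances for routing
    detours and queuing; a real deployment would calibrate them.
    """
    floor = floor_rtt_ms(great_circle_km(*claimed, *server))
    if measured_rtt_ms < floor:
        return "impossible"   # client is closer than the claimed location allows
    if measured_rtt_ms > floor * slack_factor + slack_ms:
        return "suspicious"   # far more delay than the claim explains
    return "consistent"
```

For example, `check_claim((40.0, -74.0), (39.0, -77.5), 300.0)` compares a 300 ms round trip against a claimed location a few hundred kilometres from the server. The "suspicious" result is evidence for a score, not a verdict on its own.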

[Figure: a circle around the server marks the speed-of-light reach for the measured RTT. The claimed IP location, per the geolocation DB, sits outside the circle the timing allows: the measured RTT is too small for the claimed distance, which indicates geolocation spoofing.]

*A direct connection whose measured RTT places the real client well inside the radius its claimed IP would require. The address says one thing; the timing draws a smaller circle, and the claim falls outside it.*

The reverse case is the more common one in practice. A claimed location of New Jersey, a server in Virginia, and a measured RTT that corresponds to a transatlantic path. The floor is not violated here; the proxy added delay rather than subtracting it. But the delay is wrong for the claim. Three hundred milliseconds of round trip from an IP that geolocation insists is two states away is the signature of traffic that originated somewhere else and was tunneled in. One residential-proxy detection writeup states the heuristic plainly: an IP that belongs to a residential ISP in New York with a TCP RTT consistently above 300 ms is most likely tunneling traffic from overseas.

Measuring RTT where the machine cannot stall

To use any of this you need a clean number for round-trip time, and not every measurement gives you one. Application-layer timing is noisy. If you measure how long a full HTTP request takes, you fold in server think-time, TLS computation, and whatever the application did before it answered. The packet you want is one the remote machine has no choice but to send immediately, with no application logic between receiving and replying.

The TCP three-way handshake is that place. When a client opens a connection it sends a SYN. The server replies SYN-ACK. The client completes with an ACK. Those packets are generated by the kernel’s TCP stack, not by any application, and the stack answers a SYN with a SYN-ACK as fast as it can schedule the packet. There is essentially no think-time to hide latency. JA4L makes the same argument for using the earliest packets: they are low-level and machine-generated, so there is nearly zero processing delay in creating and sending them. The interval between them is almost pure network time.

From the server’s vantage point the useful measurement is the gap between the SYN-ACK it sent and the ACK it received back. That round trip, halved, is a one-way latency estimate to whatever completed the handshake, and the fiber constant turns it into a bound on distance. SNITCH measures precisely this interval, SYN-ACK out to client ACK in, as its TCP-based RTT. The same handshake also leaks the operating-system identity through the initial TTL and the TCP options, which is a separate fingerprint the TCP/IP stack fingerprinting and p0f passive OS posts cover, but the timing alone is what concerns us here.

[Figure: sequence diagram of SYN, SYN-ACK, and ACK between client and server, with the TCP RTT marked from SYN-ACK out to ACK in.]

*The server times the interval between the SYN-ACK it sends and the ACK that comes back. The kernel has no application logic to run between the two, so the gap is close to pure network round-trip time.*

Two refinements make this sturdier in production. First, you take more than one sample. The fiber floor is a minimum, so the smallest RTT you observe across a few measurements is the cleanest estimate; jitter only ever pushes a sample higher. SNITCH measures seven TCP and TLS round trips per client and keeps the minima, discarding connections whose jitter, an RTT standard deviation above 40 ms, marks them as too unstable to judge. Second, you watch the relationship between the minimum RTT and the smoothed RTT the kernel maintains. The Aroma project leans entirely on this ratio, dividing the Linux kernel’s tcpi_min_rtt by its smoothed tcpi_rtt and reading a persistently low ratio as a proxy tell. The intuition is that a single direct path keeps minimum and smoothed RTT close together, while an intermediary that has to maintain its own separate connection onward introduces a divergence the two values expose.
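Both refinements can be sketched in a few lines. This is an illustration under stated assumptions, not SNITCH’s or Aroma’s code: the function names are hypothetical, the 40 ms jitter cut-off is the figure quoted above, and the choice of population standard deviation is mine.

```python
import statistics


def clean_min_rtt(samples_ms, max_jitter_ms=40.0):
    """Return the smallest RTT sample, or None if the samples are too unstable.

    Jitter only pushes samples upward, so the minimum is the cleanest
    estimate. Using pstdev as the jitter measure is an assumption.
    """
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples to judge jitter")
    if statistics.pstdev(samples_ms) > max_jitter_ms:
        return None  # too noisy to judge
    return min(samples_ms)


def min_to_smoothed_ratio(min_rtt_us, smoothed_rtt_us):
    """Ratio of minimum RTT to smoothed RTT (e.g. tcpi_min_rtt / tcpi_rtt).

    How the two kernel values are obtained is out of scope here; the
    inputs are plain numbers in the same unit.
    """
    if smoothed_rtt_us <= 0:
        raise ValueError("smoothed RTT must be positive")
    return min_rtt_us / smoothed_rtt_us
```

A stable set of seven samples yields its minimum; a set that swings between tens and hundreds of milliseconds yields `None` and should be left unjudged rather than scored.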

The cross-layer trick: TCP versus TLS

The speed-of-light check needs a geolocation database, a server location, and a propagation model. There is a sharper method that needs none of those. It catches a particular and very common kind of proxy using nothing but two timings taken on the same connection, comparing them against each other.

The insight belongs to BADPASS, work by Elisa Chiapponi and colleagues presented at ISPEC 2022, built around how residential-proxy services actually move packets. A residential-proxy network does not give the customer a clean tunnel to the exit node. The customer connects to a gateway the provider operates. That gateway terminates the TCP connection. It then relays the application data through its own infrastructure out to the residential device that owns the exit IP, which makes the final hop to your server. The TCP handshake and the TLS handshake therefore travel different distances.

Walk the packets. The TCP three-way handshake completes against the provider’s gateway, because that is the machine terminating TCP. So the TCP-handshake RTT measures the distance to the gateway, which the provider keeps close to the server for performance. The TLS handshake that follows is application data. It does not stop at the gateway; it is forwarded through the proxy infrastructure to the residential exit node and back. So the TLS-handshake RTT measures the full, longer path. On a direct connection the two handshakes traverse the same path and their RTTs come out close together. On a residential-proxy connection the TLS RTT is meaningfully larger than the TCP RTT, and that gap is the signature. BADPASS reads a connection as proxied when the TLS RTT exceeds the TCP RTT, and direct when the two are similar.
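The comparison itself is small. The sketch below assumes the server has recorded four timestamps on one connection; the `HandshakeTimes` fields, the choice of TLS flight to time, and the `min_gap_ms` threshold are all hypothetical and are not taken from BADPASS.

```python
from dataclasses import dataclass


@dataclass
class HandshakeTimes:
    """Server-side timestamps in seconds for one connection (hypothetical fields)."""
    synack_sent: float            # server sends SYN-ACK
    ack_received: float           # client's ACK completes the TCP handshake
    server_flight_sent: float     # server sends its TLS handshake flight
    client_reply_received: float  # client's next TLS handshake message arrives


def tcp_rtt_ms(t):
    """Round trip measured on the TCP handshake."""
    return (t.ack_received - t.synack_sent) * 1000.0


def tls_rtt_ms(t):
    """Round trip measured on the TLS handshake."""
    return (t.client_reply_received - t.server_flight_sent) * 1000.0


def looks_proxied(t, min_gap_ms=20.0):
    """True when the TLS round trip exceeds the TCP round trip by min_gap_ms.

    min_gap_ms is a placeholder; a real detector would calibrate it.
    """
    return tls_rtt_ms(t) - tcp_rtt_ms(t) > min_gap_ms
```

With `HandshakeTimes(0.000, 0.012, 0.013, 0.026)` the two round trips are about 12 ms and 13 ms and the connection reads as direct. With `HandshakeTimes(0.000, 0.008, 0.009, 0.129)` the TLS round trip is about 120 ms against 8 ms for TCP, and the gap stands out.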

[Figure: two panels. Direct: client and server exchange the TCP and TLS handshakes over one path, so TCP RTT ≈ TLS RTT. Residential proxy: the TCP handshake ends at a gateway near the server, while the TLS handshake is forwarded the long way to the exit node (the claimed IP), so TLS RTT ≫ TCP RTT.]

*On a direct connection both handshakes follow one path and time out together. Through a residential proxy the TCP handshake stops at a gateway kept close to the server, while the TLS handshake is relayed out to the distant exit node and back, so the TLS round trip runs much longer than the TCP round trip.*

What makes this elegant is that it never asks where anyone is. There is no database lookup, no propagation model, no assumption about the server’s coordinates. It compares a connection against itself. The two handshakes are a built-in control and treatment. That self-referential quality also dodges the largest source of error in the absolute-latency approach, which is that geolocation databases are only city-accurate to begin with. SNITCH adopts BADPASS as its first-pass proxy detector for exactly this reason, then layers its landmark-based geolocation check on top to catch the harder case, a full-tunnel VPN where the whole connection including TCP is encapsulated, so the TCP and TLS handshakes no longer split.

That harder case is worth naming because it bounds the trick. The TCP-versus-TLS gap exists only when something terminates TCP early and forwards a higher layer. A full-tunnel VPN does not do that. It encapsulates the entire IP flow, so the client’s TCP handshake rides inside the tunnel all the way to the real destination, and TCP and TLS RTT stay close together just as they would on a direct connection. SNITCH notes that VPNs maintaining a continuous TCP connection to the destination defeat the handshake-difference method, which is the gap their landmark approach was built to close. A SOCKS or HTTP proxy that breaks the TCP connection is caught by the cross-layer split; a VPN that tunnels it is not, and you fall back to the absolute speed-of-light check against geolocation.

CalcuLatency and the cross-layer threshold in numbers

The TCP-versus-TLS idea generalises. Any pair of measurements where one rides the application layer end-to-end and the other only reaches an intermediary gives you the same lever. CalcuLatency, presented at USENIX Security 2024 by a University of Michigan group, builds a detector on exactly that generalisation, combining several timing techniques against one another rather than betting on a single signal.

The reasoning the paper states is the clean version of the BADPASS intuition. Application-layer latency, browser to server, is end-to-end by construction; the request has to reach the actual server for the application to answer. A network-layer measurement, by contrast, may only reach the proxy. So the difference between an application-layer RTT and a network-layer RTT is a reliable indicator of an intermediary in the path. CalcuLatency pulls its measurements from WebSocket RTT, the TCP handshake, an ICMP ping, and a modified version of 0trace that does hop enumeration from inside an already-established connection so stateful firewalls and NATs do not simply drop the probe.

The numbers give the technique a concrete shape. CalcuLatency settles on a 50-millisecond RTT difference as the threshold for labelling a connection a remote VPN or proxy. In 98 percent of all direct measurements across both their testbed and crowdsourced datasets, the RTT difference fell below that 50 ms line. On the other side, 89.1 percent of testbed VPN measurements and 63.9 percent of the crowdsourced VPN measurements showed an RTT difference above 50 ms. Under a more conservative configuration, restricted to measurements where the ICMP ping succeeded and the modified traceroute reached the client’s network, they report a false negative rate of 2.9 percent and a false positive rate of 0.95 percent. The crowdsourced dataset spanned 144 autonomous systems across 37 countries on all six inhabited continents, which matters because a latency detector that only works on a clean lab network is not a detector.

The honest part of the CalcuLatency result is where it admits its blind spot. Roughly two-thirds of the VPN connections that slipped under the 50 ms threshold did so because the VPN server happened to sit close to the user, within about 650 miles. The detector cannot tell a nearby proxy from a direct connection, because a nearby proxy adds little latency, and added latency is the only thing the method can see. The 650-mile figure is not a flaw in the implementation; it is the distance at which the added round trip drops below the noise. Push the proxy further and the signal returns. This is the same wall every latency method hits, and it is why latency is one input to a score rather than the score itself.
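A rough calculation shows why that distance sits under the threshold. This is my arithmetic, not a figure from the paper. It uses the 0.128 miles per microsecond fiber constant and assumes the RTT difference is roughly the round trip between user and proxy.

```latex
% Fiber-floor round trip between a user and a proxy 650 miles away
\mathrm{RTT}_{\text{floor}}
  = \frac{2 \times 650\ \text{mi}}{0.128\ \text{mi}/\mu\text{s}}
  \approx 10{,}156\ \mu\text{s} \approx 10.2\ \text{ms}

% One-way distance at which the floor alone reaches the 50 ms threshold
d_{50\,\text{ms}}
  = \frac{50{,}000\ \mu\text{s}}{2} \times 0.128\ \text{mi}/\mu\text{s}
  = 3{,}200\ \text{mi}
```

Real paths run slower than the floor, so a 650-mile detour costs more than 10 ms in practice, but it still has a wide margin before it reaches a 50 ms line.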

JA4L: latency as a fingerprint

The JA4+ suite, released by John Althouse at FoxIO in 2023, took a pile of network-fingerprinting ideas and gave them a consistent shape and naming. JA4 is the TLS client fingerprint, the successor to JA3, covered in the TLS fingerprinting post. JA4H fingerprints HTTP headers. JA4L is the latency member, and it encodes the speed-of-light reasoning above into a measured value you can attach to a session.

JA4L measures the latency between the first few packets of a connection, in microseconds, for the same reason the handshake measurement is clean: those packets are machine-generated with near-zero processing delay. It is split into a client-side and a server-side value. For TCP, the published formulas label the three handshake packets A, B, and C and compute the server measurement as half the interval (B − A) and the client measurement as half the interval (C − B), each carrying the relevant TTL. For QUIC the suite uses the equivalent early packets of that handshake. Each half-interval is a one-way latency estimate, and through the fiber constant it converts to a distance: D = jc/p, with j the measured latency, c the speed of light in fiber, and p a propagation delay factor that allows for indirect routing. With p set to 1 the result is the absolute ceiling. The documentation’s statement of the bound is the one that matters for detection. The far endpoint may be closer than the computed distance, but it is physically impossible for it to be farther, because the speed of light is constant.
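The arithmetic described above can be sketched as follows. This is not FoxIO’s implementation: the function names are hypothetical, A, B, and C are assumed to be the SYN, SYN-ACK, and ACK timestamps at the capture point, and `p` is treated as a propagation delay factor, with `p = 1` giving the hard ceiling.

```python
FIBER_MILES_PER_US = 0.128  # speed of light in fiber, per the JA4L figure in the text


def ja4l_latencies_us(a_us, b_us, c_us):
    """One-way latency estimates from the three handshake timestamps (microseconds).

    a_us, b_us, c_us are assumed to be the SYN, SYN-ACK, and ACK times.
    Returns (server_latency_us, client_latency_us).
    """
    if not a_us <= b_us <= c_us:
        raise ValueError("timestamps must be in handshake order")
    server_latency = (b_us - a_us) / 2  # half of (B - A)
    client_latency = (c_us - b_us) / 2  # half of (C - B)
    return server_latency, client_latency


def max_distance_miles(latency_us, p=1.0):
    """D = j * c / p. With p = 1 this is the farthest the endpoint can be."""
    if p <= 0:
        raise ValueError("p must be positive")
    return latency_us * FIBER_MILES_PER_US / p
```

For timestamps of 0, 3,000, and 25,000 microseconds, the client-side latency is 11,000 microseconds, which puts the client no more than about 1,408 miles from the capture point.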

That last sentence is the entire detection primitive. JA4L does not tell you where a client is. It tells you the farthest a client can possibly be, and that ceiling is enough. If a session’s other fingerprints, its TLS JA4, its HTTP JA4H, its claimed geolocation, say the client is in one place, and the JA4L distance ceiling cannot reach that place, the story is inconsistent. Althouse frames the canonical use as session integrity: a session token that suddenly changes location, operating system per the JA4L TTL, and application per JA4 and JA4H is a session worth revoking, because a legitimate user does not teleport.

[Figure: the JA4+ suite: JA4 (TLS ClientHello), JA4H (HTTP headers), JA4L (latency, µs), JA4X (X.509 cert). JA4L feeds D = jc/p to give a maximum distance; the client cannot be farther than this ceiling.]

*JA4L is the latency member of the JA4+ suite. It converts the early-packet timing into a one-way latency and, through the fiber constant, into a maximum possible distance. The output is a ceiling, not a position.*

JA4L being a member of a suite is the point worth dwelling on. A latency value in isolation answers a narrow question. Combined with the TLS and HTTP fingerprints on the same connection it becomes a cross-check, and the cross-check is where the strength is. The HTTP/2 fingerprinting and Accept-header triad posts cover the application-layer signals that ride alongside; JA4L is the network-distance signal that those cannot lie about without breaking physics. A scraper can forge a Chrome-shaped TLS ClientHello and a perfect header order, and still get caught because the packets took too long to arrive for the IP it presented.

What the check proves, and what it only suggests

Latency-based proxy detection is strong in one direction and weak in another, and treating it as symmetric is how false positives happen. The strength is the floor. A client cannot be farther away than the measured RTT allows at the speed of light, so a claimed location that the timing cannot reach is a hard inconsistency, not a probabilistic one. When the geometry says impossible, it is impossible. That half of the test carries real weight.

The other half is softer, and every careful source on this says so. A latency that is merely too large for a claim has many innocent explanations. A satellite link adds hundreds of milliseconds with no proxy in sight. Carrier-grade NAT, an enterprise firewall, a congested peering point, a bad Wi-Fi hop, a mobile connection mid-handoff, all inflate RTT and all jitter it. The residential-proxy writeup that gave us the 300 ms heuristic immediately qualifies it: the signal belongs in a weighted risk score, not a binary block, because NATs and enterprise firewalls can legitimately alter packets and timing. CalcuLatency draws the same line with its 650-mile blind spot, and SNITCH builds an explicit jitter filter and error margin into its detector precisely because raw latency is too noisy to convict on alone. The honest version is that high latency raises the location’s uncertainty rather than fixing it, which is the same thing the SNITCH authors say about an adversary’s own ability to manipulate it.

So in a real detection stack latency is a weight, not a gate. It feeds a score alongside the ASN reputation of the IP, the TLS and HTTP fingerprints, the OS-versus-user-agent consistency check, and whatever behavioral signals the client surfaces. None of those is conclusive alone. Together they make a proxy expensive to hide. The latency signal’s specific contribution is that it is the one input grounded in a physical constant rather than a learned heuristic or a database someone has to maintain. ASN reputation can be poisoned with fresh IP ranges. Fingerprints can be mimicked. The speed of light cannot be lobbied. A proxy operator can move an exit node closer to shrink the added delay, and the better residential networks do exactly that, but closer is the only direction available, and there is a floor under how close a tunneled connection can pretend to be while still terminating somewhere else. The defender does not need to know where the client is. Knowing the client is farther than it claims is already enough to stop trusting the claim.
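The weight-not-a-gate idea can be illustrated with a toy score. Everything here is hypothetical: the signal names, the weights, and the idea of a linear sum are placeholders for illustration, not a description of any production system.

```python
# Hypothetical weights that sum to 1.0; real values would be tuned on labelled traffic.
WEIGHTS = {
    "latency_anomaly": 0.30,
    "asn_reputation": 0.25,
    "tls_http_fingerprint": 0.25,
    "os_ua_mismatch": 0.10,
    "behavior": 0.10,
}


def risk_score(signals):
    """Weighted sum of signal strengths, each in [0, 1].

    Signals that are absent count as zero, so no single input can
    push the score to its maximum on its own.
    """
    score = 0.0
    for name, value in signals.items():
        if name not in WEIGHTS:
            raise KeyError(f"unknown signal: {name}")
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1]")
        score += WEIGHTS[name] * value
    return score
```

With these placeholder weights a maximal latency anomaly alone scores 0.30, well short of any sensible block threshold. It takes agreement from the other signals to push a connection over the line.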

