
Akamai's queueing and rate control: waiting rooms at the CDN edge

Copyright: MIT

A waiting room is a strange thing to build into a content delivery network. The whole point of a CDN is to make requests faster, to absorb load, to push bytes as close to the user as physics allows. A waiting room does the opposite. It deliberately holds people back, parks them on a holding page, and lets them through a few at a time. Akamai sells both, and it sells them through the same edge platform, which raises a question worth answering carefully: what does it actually mean to queue people at the CDN edge, and how is that different from bolting a dedicated queue vendor onto the front of your site?

The answer turns on where the decision gets made. Akamai has two distinct ways to queue traffic, and they sit at opposite ends of a spectrum. One is a managed cloudlet that does crude percentage-based admission with almost no per-visitor state. The other is general-purpose edge compute that can run a full queue connector, validate a cryptographically signed token on every request, and never touch your origin to do it. Both run before the request reaches your servers. Neither is a queue in the strict ordered-line sense that the word implies. That gap, between what “waiting room” suggests and what the edge actually does, is the thing this post is about.

Here is the route. First, why the edge is the right place to queue at all, and what Akamai’s edge gives you to work with. Then the Visitor Prioritization cloudlet, the older managed answer, and exactly how its percentage admission works. Then EdgeWorkers, the programmable runtime, including its execution limits and event model, because those limits shape what a queue can and cannot do there. Then the connector pattern, where a vendor like Queue-it or CrowdHandler ships code that runs inside an EdgeWorker and validates a token on every hit. Finally, an honest comparison: edge-native admission control versus a dedicated queue vendor, and where each one breaks.

Why queue at the edge

When traffic spikes past what an origin can serve, you have three places to intervene. You can scale the origin, which is slow, expensive, and bounded by your slowest dependency (usually a payment processor or an inventory database that does not scale horizontally). You can shed load at the origin with a 503, which protects the database but gives every rejected user a broken page and no sense of order. Or you can intercept requests before they reach the origin and decide, per request, whether this one proceeds or waits.

The edge is the natural place for that third option because it already sits in the request path. Akamai’s network terminates the TLS connection, inspects the request, checks its cache, and only then talks to your origin. A waiting room slots into that flow as one more decision between “request arrives” and “request goes to origin.” Crucially, the visitors who get held never reach your servers at all. The holding page is served from the edge or from Akamai’s NetStorage, so a million people sitting in a queue cost your origin nothing. That is the entire economic argument for edge queueing, and it is a strong one.

There is a second argument, and it is about integrity. If the queue logic runs in the browser (a JavaScript snippet that redirects you to a holding page and lets you back after a countdown), then anyone who can read JavaScript can skip it. Open the network tab, find the redirect, request the protected URL directly. Akamai’s own marketing for the EdgeWorkers connector pattern says this plainly: running the check at the edge puts it “beyond the reach of tech-savvy visitors who might manipulate client-side code to skip the online queue.” The check happens on a server you do not control, against a secret you never see. That is the difference between a speed bump and a gate. The deeper mechanics of why client-side and even server-side queues still leak are covered in why waiting rooms leak; here the point is narrower, that the edge is simply a better place to put the gate than the browser.

[Figure: a visitor's request reaches the Akamai edge, which decides admit or hold. Admitted requests go to origin; held requests get a waiting page served from the edge.]
*The admission decision happens at the edge, before origin. Held visitors are served a static page and never cost the origin anything.*

The Visitor Prioritization cloudlet

Akamai’s oldest answer to this problem is a cloudlet. Cloudlets are small, single-purpose applications that run inside the Akamai delivery configuration, each one bought separately and wired up through Property Manager, the same console where you configure caching and routing. Visitor Prioritization is the one built for high-demand traffic. Akamai describes it as a “front-end shock absorber” for cases where peak volume overwhelms an origin that has to do real per-transaction work: checkout, donation pages, registration forms, anything that hits a database that will not scale on demand.

The mechanism is blunt, and that bluntness is the point. Visitor Prioritization does not maintain an ordered line. It does admission by probability. You configure what fraction of new visitors get let through to origin and what fraction get sent to a waiting room, and the cloudlet rolls the dice per visitor. Akamai’s documentation calls this “probability and percentage based prioritization for entry into the waiting room and origin.” Set the origin allowance to 30 percent and roughly seven in ten new arrivals see the holding page; the rest pass. There is no position number, no estimated wait, no fairness guarantee about who arrived first. It is a valve, not a queue.
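The per-visitor dice roll is easy to model. What follows is a minimal Python sketch of the idea, not Akamai's implementation (the cloudlet is configured, not coded); the function name `admit_new_visitor` and the seeded generator are illustrative assumptions.

```python
import random


def admit_new_visitor(admit_fraction: float, rng: random.Random) -> bool:
    """Return True if this new arrival goes to origin, False if it is held."""
    if not 0.0 <= admit_fraction <= 1.0:
        raise ValueError("admit_fraction must be between 0 and 1")
    # One independent draw per visitor: no ordering, no memory of arrival time.
    return rng.random() < admit_fraction


rng = random.Random(42)  # seeded only so the sketch is repeatable
arrivals = 100_000
admitted = sum(admit_new_visitor(0.30, rng) for _ in range(arrivals))
held = arrivals - admitted
print(f"admitted {admitted / arrivals:.1%}, held {held / arrivals:.1%}")
```

Over a large sample the split lands close to 30/70, but nothing in the function knows or cares who arrived first, which is the sense in which this is a valve rather than a queue.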

What keeps it from churning people in and out of the waiting room is a cookie. The behavior in Property Manager (its JSON option name is visitorPrioritization) supports an allowed-user cookie: once a visitor wins the dice roll and is let through, the cloudlet sets a cookie so that subsequent requests from the same person skip the lottery and go straight to origin. The relevant options are allowedUserCookieEnabled, allowedUserCookieLabel, and allowedUserCookieDuration, the last of which is bounded to a window measured in seconds (the documented range tops out at 600). There is a parallel waiting-room cookie, configured through waitingRoomCookieEnabled and waitingRoomCookieLabel, that pins a held visitor to the waiting room so they do not get a fresh coin flip on every reload. The holding page itself is not dynamic. You download a template named vpwaitingroom.html, customize it, and drop it into a vp directory in NetStorage; the cloudlet serves that static asset with a configurable waitingRoomStatusCode and points held visitors at the right waitingRoomDirectory.

Visitor identity, the thing the cookie attaches to, can be derived several ways. The behavior exposes userIdentificationByCookie with a userIdentificationKeyCookie naming which cookie value identifies the user, userIdentificationByHeaders with userIdentificationKeyHeaders, userIdentificationByParams, and userIdentificationByIp. That flexibility matters because the cloudlet has no login system of its own; it borrows whatever signal you already have to tell two visitors apart.

[Figure: flowchart. A request arrives at the edge. If the allowed-user cookie is present, it goes to origin. If not, a dice roll against the admit percentage either passes (set allowed cookie, go to origin) or holds (set waiting cookie, serve vpwaitingroom.html).]
*Visitor Prioritization is a valve. A cookie short-circuits the lottery for already-admitted visitors; everyone else faces a per-request probability.*
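The cookie-then-lottery flow described above can be sketched as a single decision function. This is a Python model under stated assumptions: the cookie names `vp_allowed` and `vp_waiting` are hypothetical stand-ins for whatever you set in allowedUserCookieLabel and waitingRoomCookieLabel, and cookie expiry is not modeled.

```python
import random
from dataclasses import dataclass
from typing import Optional

ALLOWED_COOKIE = "vp_allowed"  # hypothetical label, not an Akamai default
WAITING_COOKIE = "vp_waiting"  # hypothetical label, not an Akamai default


@dataclass
class Decision:
    target: str                       # "origin" or "waiting_room"
    set_cookie: Optional[str] = None  # cookie to set on this response, if any


def decide(cookies: dict, admit_fraction: float, rng: random.Random) -> Decision:
    # Already-admitted visitors skip the lottery entirely.
    if ALLOWED_COOKIE in cookies:
        return Decision("origin")
    # Held visitors stay held instead of getting a fresh coin flip per reload.
    if WAITING_COOKIE in cookies:
        return Decision("waiting_room")
    # Everyone else faces the per-visitor probability.
    if rng.random() < admit_fraction:
        return Decision("origin", set_cookie=ALLOWED_COOKIE)
    return Decision("waiting_room", set_cookie=WAITING_COOKIE)


rng = random.Random(7)
print(decide({ALLOWED_COOKIE: "1"}, 0.30, rng))
print(decide({}, 0.30, rng))
```

The order of the checks is the design choice worth noticing: both cookies are consulted before any randomness, so the dice only ever decide for visitors the edge has not seen recently.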

The honest assessment of Visitor Prioritization is that it solves overload, not fairness. Percentage admission throttles origin load, and the cookie keeps an admitted shopper from being kicked back out mid-checkout. What it does not do is order anyone. A visitor who arrived an hour ago and a visitor who arrived a second ago face the same coin flip. There is no “you are number 4,812 in line,” because there is no line. For a great many sites that is fine: a donation page during a disaster appeal does not need strict ordering; it needs to not fall over. For a hyped product drop where customers expect first-come-first-served, percentage admission feels arbitrary and generates support tickets. That gap is exactly what the connector vendors sell into.

It helps to be precise about what the percentage actually controls, because it is easy to misread. The admit fraction does not cap concurrent users at origin and it does not set a request rate. It sets the odds that any given new arrival, one without an allowed-user cookie, gets waved through on that request. The effective load reaching origin is the arrival rate of cookieless visitors multiplied by that fraction, plus all the traffic from already-admitted visitors who carry the cookie and bypass the lottery entirely. So the knob behaves differently under a slow ramp than under a thundering-herd spike. When ten thousand people hit the page in the same second, a 30 percent setting admits roughly three thousand of them at once, which may still be more than the origin can take. The cloudlet is a probabilistic throttle, not a hard rate limiter, and operators who treat the percentage as a concurrency ceiling get surprised. Tuning it during an event is reactive: watch origin health, dial the fraction down if the database strains, dial it up as headroom appears. There is no closed-loop controller doing this for you in the base cloudlet; the number is what you set it to.
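The load formula in the paragraph above can be written out directly. This is a small Python sketch; the function and parameter names are illustrative, and the traffic figures for the slow-ramp case are made-up inputs chosen only to contrast with the spike.

```python
def origin_load(cookieless_arrivals_per_s: float,
                admit_fraction: float,
                admitted_requests_per_s: float) -> float:
    """Requests per second reaching origin under percentage admission."""
    # New arrivals face the lottery; cookie holders bypass it entirely.
    return cookieless_arrivals_per_s * admit_fraction + admitted_requests_per_s


# Thundering herd: 10,000 cookieless arrivals in one second, nobody admitted yet.
spike = origin_load(10_000, 0.30, 0)

# Slow ramp (hypothetical numbers): 200 new arrivals/s plus 500 req/s
# from visitors who already carry the allowed-user cookie.
ramp = origin_load(200, 0.30, 500)

print(f"spike: {spike:.0f} req/s, ramp: {ramp:.0f} req/s")
```

The same 30 percent setting yields very different origin load in the two cases, which is why the fraction has to be read as odds per arrival and not as a ceiling.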

The allowed-user cookie’s short duration is the other detail people misjudge. With a documented ceiling of 600 seconds, an admitted visitor’s pass expires in at most ten minutes, after which a fresh request faces the lottery again unless they are mid-session on a flow that keeps refreshing the cookie. That is deliberate. A long-lived admission cookie would let the admitted population accumulate without bound across a multi-hour event, slowly starving new arrivals and defeating the throttle. A short window keeps the admitted set roughly proportional to recent activity. The cost is that a visitor who steps away for fifteen minutes can come back to find themselves re-queued, which is the kind of edge behavior worth explaining on the waiting-room page rather than leaving as a surprise.
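To see why the short window bounds the admitted population, here is a rough Python simulation. It assumes a constant admission rate, one pass per admitted visitor, and no cookie refresh; all three are simplifications for illustration, and `live_passes` is a hypothetical helper.

```python
from collections import deque


def live_passes(admits_per_second: int, cookie_ttl_s: int, event_length_s: int) -> int:
    """Count unexpired allowed-user passes at the end of an event."""
    expiries = deque()  # expiry times, oldest first
    for now in range(event_length_s):
        # Drop passes whose lifetime has run out.
        while expiries and expiries[0] <= now:
            expiries.popleft()
        # Issue this second's new passes.
        expiries.extend([now + cookie_ttl_s] * admits_per_second)
    return len(expiries)


two_hours = 2 * 60 * 60
short_ttl = live_passes(admits_per_second=50, cookie_ttl_s=600, event_length_s=two_hours)
long_ttl = live_passes(admits_per_second=50, cookie_ttl_s=two_hours, event_length_s=two_hours)
print(short_ttl, long_ttl)
```

Under these assumptions the short-lived cookie levels off at roughly the admission rate times the cookie lifetime, while the long-lived one keeps growing for the whole event.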

EdgeWorkers and what its limits allow

The newer and more capable path is EdgeWorkers, Akamai’s programmable edge runtime. It runs JavaScript on V8, the same engine as Node and Chrome, with a chunk of the platform features stripped out and hard execution limits bolted on. You write a small module, bind it to a property and a set of paths in Property Manager, and your code runs on Akamai’s edge servers in the request path. This is the same runtime that hosts Akamai’s bot-detection hooks; the bot side is covered in Akamai Bot Manager scoring, and the queueing use of it is a close cousin: both intercept the request before origin and decide what happens next.

The execution limits are not a footnote. They define what a queue can do inside an EdgeWorker, so they are worth stating exactly. EdgeWorkers come in resource tiers, and the per-event-handler ceilings differ sharply between them. On Dynamic Compute, the default tier, each event handler gets a maximum of 20 milliseconds of CPU time, up to 2.5 MB of memory, and an initialization wall-time cap of 500 milliseconds. Basic Compute is tighter, at 10 ms of CPU and 1.5 MB of memory. Enterprise Compute is the generous tier, at 70 ms of CPU and 4 MB of memory per handler. These are CPU-time budgets, not wall-clock, so waiting on a sub-request does not burn the whole allowance, but the raw compute window is small. You cannot run a heavy cryptographic workload or hold a large in-memory data structure. You can validate a token, check a cookie, look up a small value, and emit a redirect, which happens to be exactly what a queue connector needs.

[Figure: bar chart of CPU time budget per event handler (ms): Basic 10, Dynamic 20, Enterprise 70.]
*Per-handler CPU ceilings across the three tiers. A token-validating queue connector lives comfortably inside even the smallest, which is why it can run on every request.*

The event model matters just as much as the limits. An EdgeWorker hooks into the request lifecycle through named handlers, and where you put your code determines what you can see and change. The handler that fires for every incoming request, before the cache is even consulted, is onClientRequest. It can read and modify request headers and cookies. The companion handler onClientResponse fires just before the response goes back to the client and can read and modify response headers and cookies, which is where you set a session cookie. There are origin-facing handlers too, onOriginRequest and onOriginResponse, that only fire on a cache miss. And there is responseProvider, the one handler that can synthesize a response from scratch via createResponse, turning the EdgeWorker into a surrogate origin.

For a queue, the interesting pair is onClientRequest and onClientResponse. The decision (does this visitor have a valid pass, or do they go to the waiting room) belongs in onClientRequest, because it runs before cache and before origin on every single hit. The cookie housekeeping (extending a session’s validity) belongs in onClientResponse. This is precisely the split the queue vendors use.
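The split between the two handlers can be modeled outside the runtime. Real EdgeWorkers are JavaScript modules; the Python sketch below only mirrors the division of labor, and everything in it (the `Request` and `Response` classes, the cookie name, the waiting-room URL, the 20-minute lifetime, the `"valid"` stand-in for real pass validation) is a hypothetical simplification, not the EdgeWorkers API.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

SESSION_COOKIE = "queue_session"                     # hypothetical name
WAITING_ROOM_URL = "https://queue.example.com/wait"  # hypothetical URL
SESSION_TTL_S = 1200                                 # hypothetical lifetime


@dataclass
class Request:
    path: str
    cookies: dict = field(default_factory=dict)


@dataclass
class Response:
    status: int = 200
    headers: dict = field(default_factory=dict)
    set_cookies: dict = field(default_factory=dict)


def on_client_request(req: Request) -> Optional[Response]:
    """Decision step: runs first on every hit, before cache and origin."""
    if req.cookies.get(SESSION_COOKIE) == "valid":  # stand-in for a real check
        return None  # let the request continue
    return Response(status=302, headers={"Location": WAITING_ROOM_URL})


def on_client_response(req: Request, resp: Response) -> Response:
    """Housekeeping step: extend the session just before the response leaves."""
    resp.set_cookies[SESSION_COOKIE] = ("valid", SESSION_TTL_S)
    return resp


def handle(req: Request, origin: Callable[[Request], Response]) -> Response:
    early = on_client_request(req)
    if early is not None:
        return early  # held visitors never reach origin
    return on_client_response(req, origin(req))


def origin(req: Request) -> Response:
    return Response(status=200)


held = handle(Request("/checkout"), origin)
passed = handle(Request("/checkout", {SESSION_COOKIE: "valid"}), origin)
print(held.status, held.headers.get("Location"))
print(passed.status, passed.set_cookies)
```

In this model the redirect path skips the response-side step for brevity; how the real runtime sequences its handlers after an early response is not something the sketch tries to reproduce.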

One more piece completes the picture. An EdgeWorker can keep state across requests using EdgeKV, Akamai’s distributed key-value store. EdgeKV is eventually consistent by design: it propagates writes globally and, by Akamai’s own published target, converges within 10 seconds or less for at least 80 percent of operations. That consistency model is fine for configuration and reference data, and it is workable for coarse counters, but it is the reason an edge-native queue cannot easily maintain a single authoritative ordered line. If two edge regions each admit visitors against a shared counter that takes seconds to converge, they can over-admit. The standard way around this is to keep the authoritative queue state somewhere central and use the edge only to validate a token that the central system already issued, which brings us to the connector pattern.
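The over-admission risk is easy to reproduce in a toy model. The Python sketch below does not use EdgeKV; it assumes, purely for illustration, that every region reads the same stale counter value before any write has propagated, and compares that with a counter every region sees instantly.

```python
def admitted_with_stale_counter(regions: int, capacity: int,
                                arrivals_per_region: int) -> int:
    """Total admitted when each region decides against the same stale value."""
    stale_view = 0  # what every region reads before writes have converged
    total = 0
    for _ in range(regions):
        room = max(capacity - stale_view, 0)
        total += min(room, arrivals_per_region)
    return total


def admitted_with_central_counter(regions: int, capacity: int,
                                  arrivals_per_region: int) -> int:
    """Total admitted when each region sees the latest count immediately."""
    total = 0
    for _ in range(regions):
        room = max(capacity - total, 0)
        total += min(room, arrivals_per_region)
    return total


stale = admitted_with_stale_counter(regions=2, capacity=100, arrivals_per_region=100)
central = admitted_with_central_counter(regions=2, capacity=100, arrivals_per_region=100)
print(f"stale view admits {stale}, central view admits {central}")
```

Two regions that each believe the full capacity is free will together admit double it, and the overshoot grows with the number of regions deciding inside the convergence window.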

The connector pattern: a vendor’s queue running inside an EdgeWorker

The most powerful edge-queueing setup on Akamai today is not Akamai’s own cloudlet. It is a third-party queue vendor shipping a connector that runs as an EdgeWorker. Queue-it announced its Akamai EdgeWorkers connector as part of an expanded partnership with Akamai on 10 November 2021. CrowdHandler shipped a comparable EdgeWorkers integration. The architecture is the same in both cases, and it is worth walking through because it shows how the edge and a central queue split the work.

The central queue lives at the vendor. That is where the ordered line actually exists, where positions are assigned, where the throughput knob (“admit N visitors per minute”) is turned. When a visitor reaches the front and is released, the vendor’s system issues them a signed token and redirects them back to the protected site with that token attached as a query-string parameter. From that moment on, the edge connector takes over. Its job is not to run the queue. Its job is to check, on every request, whether the visitor is carrying a valid pass.
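The central half of that split, an ordered line with a per-minute release knob, can be sketched in a few lines of Python. This is a generic FIFO model, not any vendor's implementation; the class and method names are hypothetical, and token issuance on release is left out.

```python
import uuid
from collections import deque


class CentralQueue:
    """An ordered line that releases at most N visitors per tick."""

    def __init__(self, admit_per_minute: int):
        self.admit_per_minute = admit_per_minute
        self._line = deque()

    def join(self) -> str:
        queue_id = str(uuid.uuid4())  # the visitor's unique place in line
        self._line.append(queue_id)
        return queue_id

    def position(self, queue_id: str) -> int:
        """1-based position; raises ValueError if the ID is not waiting."""
        return self._line.index(queue_id) + 1

    def release_tick(self) -> list:
        """Call once per minute: release the front of the line, in order."""
        released = []
        while self._line and len(released) < self.admit_per_minute:
            released.append(self._line.popleft())
        return released


queue = CentralQueue(admit_per_minute=2)
ids = [queue.join() for _ in range(5)]
first_wave = queue.release_tick()
print(first_wave == ids[:2], queue.position(ids[2]))
```

Because there is exactly one deque, there is exactly one answer to "who is next", which is the property the edge cannot cheaply provide on its own.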

That check is what runs in onClientRequest. Queue-it’s connector validates the HTTP request context (URL, cookies, headers) entirely at the edge, and Queue-it is explicit that “this validation occurs locally inside the edge worker and there is no communication with the Queue-it backend during this process.” That last clause is the whole performance story. The edge does not phone home on every request. It validates the token cryptographically against a shared secret it already holds, decides locally, and only the visitors who lack a valid token get redirected, via an HTTP 302, to the waiting room. CrowdHandler quotes its edge check as adding an average of about 20 ms to page load, precisely because it keeps the decision local rather than making an API call per request.

[Figure: sequence diagram between Visitor, EdgeWorker (onClientRequest), and Queue backend. A request with no token gets a 302 to the waiting room; the visitor waits in line and is redirected back with a signed token; the request carrying queueittoken has its HMAC verified locally, a session cookie is set, and it proceeds to origin.]
*The central queue owns ordering and throughput. The EdgeWorker only validates the token it was handed, with no call back to the backend on the hot path.*

What is in that token, and how does the edge trust it without calling home? The exact internal layout used by the current private connector releases is not public: Queue-it hosts its in-depth connector docs for v4 and above in private repositories specifically to keep them away from people trying to defeat them. But the older v3 KnownUser libraries are open source, and the token format they validate is well documented. The query-string parameter is queueittoken. Inside it, fields are carried as short prefixed segments: q_ for the queue ID (a GUID that is the visitor’s unique place), e_ for the event ID, ts_ for the timestamp, ce_ for a cookie-extendable flag, rt_ for the redirect type, and h_ for the hash. The hash is an HMAC-SHA256 computed over the entire token minus the trailing hash segment, keyed with a secret shared between the vendor and the connector (Queue-it’s self-service platform issues a 72-character secret key). The connector recomputes the HMAC over the received token and compares; if it matches and the timestamp is still valid, the visitor is genuine and gets a session cookie. If not, back to the waiting room.

This is why the edge does not need to call the backend. The token is self-authenticating. Possession of a token signed with the shared secret is proof that the central queue released this visitor, and the edge can verify that proof with one HMAC computation, well inside even the smallest tier’s 10-millisecond CPU budget. The token’s timestamp bounds how long it is good for, so a captured token is not a permanent skeleton key, and the session cookie set afterward keeps a legitimate visitor from re-queuing on every click. The same token-plus-cookie shape appears across queue vendors; the mechanics of position, token, and the cookie that follows are laid out in general terms in how virtual waiting rooms work, and Queue-it’s specific safety-net behavior and token handling in Queue-it’s architecture. What is specific to Akamai here is only the host: the validation code runs as an EdgeWorker, in onClientRequest, against paths you bound it to in Property Manager.

Two deployment details are worth noting because they affect operability. The connector’s integration configuration (which paths to protect, which to ignore, what the secret is) can be propagated either inline in the EdgeWorker code or pulled dynamically, so config changes do not always require a full code redeploy. And because the connector is real code rather than a fixed cloudlet, it can do things the Visitor Prioritization cloudlet cannot: protect only specific paths, vary behavior by request attributes, and integrate the queue’s notion of identity rather than a borrowed cookie.

The token’s short life is what keeps this safe, and it is worth dwelling on why. An HMAC-signed token proves the central queue released this session, but a signature alone says nothing about when. Without an expiry, a token captured off the wire or pulled from a shared link would be a permanent pass, and the first person through the queue could mint access for a thousand friends. The ts_ timestamp closes that. The connector rejects a token whose validity window has passed, so a leaked token is useful only for a short window and only until the visitor exchanges it for a session cookie. The session cookie then carries the visitor for the rest of their visit, which is why the URL token is stripped after the first successful validation: leaving it in the address bar would invite exactly the link-sharing the timestamp is meant to limit. None of this makes the system unbreakable; the classes of failure (token reuse inside the validity window, race conditions around issuance) are real and are dissected in why waiting rooms leak. The point here is narrower. The edge’s entire trust model is one HMAC and one timestamp, and both have to be right for the gate to mean anything.
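Stripping the token from the URL after validation is a small piece of string handling. Here is one way to do it with Python's standard library, as an illustration of the idea only: the function name and example URL are hypothetical, and a real connector would do the equivalent in JavaScript before redirecting to the clean address.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_token(url: str, param: str = "queueittoken") -> str:
    """Return the URL with the token parameter removed, other params kept."""
    parts = urlsplit(url)
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key != param]
    return urlunsplit(parts._replace(query=urlencode(kept)))


clean = strip_token("https://shop.example.com/drop?sku=42&queueittoken=q_abc~h_def")
print(clean)
```

Only the named parameter is dropped, so campaign tags and product identifiers in the query string survive the cleanup.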

There is also a subtle interaction with caching that the connector has to respect. Akamai’s edge caches responses, and a protected page must never be served from cache to a visitor who has not been validated, or the queue is moot. This is why the decision lives in onClientRequest, which fires before the cache lookup, rather than in a later handler. A request without a valid token is redirected to the waiting room before the edge ever checks whether it has a cached copy of the protected page. Requests that do carry a valid token proceed past the check and can then be served from cache or origin as normal. The connector also has to be careful not to let its own redirects or token-bearing URLs become cache keys that pollute the cache for other users. Getting this wrong produces the worst failure mode of all, a cached protected page that leaks to everyone, which is a queue that quietly does nothing.

Edge-native admission versus a dedicated queue vendor

Set the two approaches side by side and the trade is clear. Visitor Prioritization is admission control without ordering. The EdgeWorkers-plus-connector pattern is a real ordered queue whose enforcement happens at the edge but whose ordering happens centrally. The first is simpler to stand up and has no external dependency. The second gives you the thing customers actually want during a hyped sale, a fair line with a visible position, at the cost of running a vendor’s central queue and shipping their code to your edge.

|              | VP cloudlet       | EdgeWorker connector |
| ------------ | ----------------- | -------------------- |
| ordering     | none (percentage) | fair line + position |
| queue state  | cookie only       | central queue        |
| origin calls | none              | none on hot path     |
| trust anchor | edge cookie       | HMAC-signed token    |

*The cloudlet caps load. The connector orders the line. Both keep the held traffic off your origin.*

Where does the edge-native side break? The consistency problem is real. An ordered queue needs a single authoritative view of who is next, and an eventually-consistent edge store cannot give you that across regions without over-admission. The vendors sidestep this by keeping ordering central and using the edge only for token validation, which works precisely because validation is stateless: one HMAC, no shared counter. The moment you try to push the ordering itself onto the edge, you are fighting the CAP theorem with a 10-second convergence window, and you will either over-admit or serialize everything through one region and lose the edge’s latency benefit. This is the same wall every “queue at the edge” project hits, and it is why the mature designs are central-queue, edge-enforcement rather than edge-everything. The general scaling lessons are drawn out in designing a fair queue at scale.

Where does the cloudlet break? It cannot give a customer a position, and it cannot promise first-come-first-served. For load-shedding on a checkout that just needs to not melt, that is acceptable and even elegant in its simplicity. For a sneaker drop or a concert on-sale where the perception of fairness is the product, percentage admission reads as a slot machine, and the support load reflects it.

There is also the matter of what the edge cannot stop. A waiting room controls how many requests reach origin; it does not, by itself, tell a human from a script. A bot with a valid token is admitted exactly like a person with a valid token, because the token only proves “the central queue released this session,” not “this session is human.” That is a separate layer, and on Akamai it is a separate product, Bot Manager, which scores requests on telemetry the queue connector never looks at. Queue and bot defense are complementary: one rations capacity, the other judges intent. A scripted client that can sit in the queue patiently and carry the token correctly defeats the queue while doing nothing the queue was designed to catch. The Akamai-specific bot-side machinery (the _abck cookie and sensor payload) is its own subject, covered in Akamai Bot Manager’s _abck cookie.

What the edge actually buys you

Strip the marketing away and the value of queueing at Akamai’s edge comes down to two concrete things. Held traffic never touches your origin, so a flash crowd of a million people costs your database nothing; it costs Akamai some static-page serving and your EdgeWorker a few milliseconds of CPU per request. And the enforcement point lives on a server the visitor cannot rewrite, validating a secret the visitor never sees, which is what makes the gate a gate instead of a suggestion. Everything else (the ordering, the position numbers, the throughput control) lives in a central queue whose decisions the edge merely enforces.

That division is the durable lesson. The edge is excellent at the stateless half of queueing (validate this token, set this cookie, redirect or admit) and poor at the stateful half (who is next, how many have I let in this minute) because its storage is built for availability over consistency. Akamai’s own cloudlet sidesteps the stateful half entirely by replacing ordering with a coin flip. The connector vendors keep the stateful half at home and rent only the stateless half from the edge. Both are honest answers to the same constraint, and the constraint is not Akamai’s; it is the physics of coordinating state across a planet-sized network in real time.

The piece most teams underweight is the token. A waiting room at the edge is only as trustworthy as the signature on the pass it checks, and the entire integrity argument (the reason an edge queue beats a browser queue) rests on an HMAC the visitor cannot forge and a timestamp that keeps a leaked token from living forever. Get the secret management and the token lifetime right and the edge enforces a fair line for the price of one hash per request. Get them wrong and you have built a very fast, very distributed way of letting the wrong people in.

