WAF evasion concepts: encoding, fragmentation, and why blocklists fail
A signature-based web application firewall makes a promise that sounds reasonable until you state it precisely. The promise is: I will read the bytes of this request, recognize the ones that look like an attack, and block them. The trouble is hidden in the word “recognize.” The firewall does not see the request the way the application sees it. It sees a copy, decoded and normalized through its own pipeline, matched against patterns its authors wrote in advance. The application that finally runs the request decodes it through a different pipeline. Any place where those two pipelines disagree is a place where a payload can be malicious to the application and invisible to the firewall at the same time.
That gap is not a bug in one product. It is the structural reason signature-based blocking is bypassable in concept, and it is why the people who study these systems keep finding new bypasses years after the old ones were patched. This post is about the mechanism, not the payloads. We will walk through how encoding and normalization create blind spots, how fragmentation splits a signature across a boundary the matcher never crosses, how parser differentials let the firewall and the backend read two different requests out of the same bytes, and why the alternative, a positive security model that allows only what it recognizes, trades one hard problem for a different one. If you want the ground floor first, the companion post how a WAF actually works covers signatures, anomaly scoring, and the Core Rule Set in detail; this one assumes you know that a WAF inspects a transformed copy of the request and asks what goes wrong with that copy.
The shape of the problem: two parsers, one byte stream
Start with the thing a WAF actually inspects. It is never the raw request as the application sees it. ModSecurity, the reference open-source engine, runs every input through a transformation pipeline before matching. A rule names an ordered list of transformation functions, and the documentation is blunt that the order matters: the engine copies the data, applies each function in sequence, and runs the pattern against the result. Decode the URL escapes, fold to lowercase, strip nulls, collapse whitespace, then match. The raw bytes are left untouched; only the copy is transformed.
This is the right design. You cannot write a signature for SQL injection that survives every encoding of the same characters, so you normalize first and write the signature against the normalized form. The problem is that “normalize” is not one operation. It is a stack of guesses about how the downstream application will interpret the bytes, and every guess can be wrong.
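The copy-transform-match flow described above can be sketched in a few lines. This is a toy model, not ModSecurity's implementation: the function names (`url_decode`, `lowercase`, `remove_nulls`, `compress_whitespace`, `inspect`) and the sample value are assumptions chosen to mirror the steps named in the text.

```python
import re
from urllib.parse import unquote

# Hypothetical stand-ins for engine transformations; names are illustrative only.
def url_decode(s):
    return unquote(s)

def lowercase(s):
    return s.lower()

def remove_nulls(s):
    return s.replace("\x00", "")

def compress_whitespace(s):
    return re.sub(r"\s+", " ", s)

def inspect(raw, transforms, pattern):
    """Apply each transformation in order to a copy, then match the result."""
    copy = raw  # the raw value itself is never modified
    for transform in transforms:
        copy = transform(copy)
    return re.search(pattern, copy) is not None

PIPELINE = [url_decode, lowercase, remove_nulls, compress_whitespace]
raw = "id=1%20UNION%00%20%20SELECT"

matched_with_pipeline = inspect(raw, PIPELINE, r"union select")
matched_without = inspect(raw, [], r"union select")
```

With the full pipeline the pattern matches; with an empty pipeline it does not, even though the wire bytes are identical in both cases.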
*The WAF and the application each decode the same wire bytes through their own pipeline. Every disagreement between the two boxes is a candidate bypass.*

The rest of this post is a tour of where those two boxes diverge.
Encoding: when the WAF and the app disagree on what a byte means
The first and oldest class of evasion is encoding. The payload arrives wrapped in some reversible transformation, and the question is whether the firewall unwraps it the same way and the same number of times as the application.
ModSecurity ships a long list of decode transformations precisely because the web stacks they protect speak many encodings. urlDecode reverses percent-encoding. urlDecodeUni does the same but additionally understands the %uXXXX form that old Microsoft IIS servers accepted, where %u002f means a slash; the function has special handling for full-width ASCII codepoints, where a higher byte signals a wide character and gets folded back to its ASCII equivalent. htmlEntityDecode turns &lt; and its hex and decimal cousins back into single bytes. There is jsDecode for JavaScript escape sequences, cssDecode for CSS 2.x escapes, and base64Decode for the obvious. Each one exists because some downstream parser will perform that decode, so the firewall has to perform it first to see what the parser will see.
Two failure modes follow directly from that list.
The first is the decode the firewall skips. If the application performs an HTML-entity decode on a value but the relevant WAF rule did not include htmlEntityDecode in its transformation pipeline, then a payload encoded as entities reaches the application as live characters while the firewall only ever saw the harmless entity text. The signature was written against the decoded form; the firewall never produced the decoded form; no match. This is not hypothetical sloppiness. Getting the pipeline complete and correctly ordered for every variable is genuinely hard, and the ModSecurity manual’s own examples are deliberately elaborate because a slightly different ordering opens a hole.
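A minimal sketch of the skipped decode, assuming a toy signature and using Python's `html.unescape` as a stand-in for both the application's entity decode and the firewall's htmlEntityDecode step. The helper `waf_view` and the sample value are hypothetical.

```python
import html
import re

SIGNATURE = re.compile(r"<script", re.IGNORECASE)  # toy pattern, written for the decoded form

def waf_view(value, transforms):
    """Return the string the matcher would see after the rule's transformations."""
    for transform in transforms:
        value = transform(value)
    return value

wire_value = "&lt;script&gt;alert(1)&lt;/script&gt;"

seen_without_decode = waf_view(wire_value, [])
seen_with_decode = waf_view(wire_value, [html.unescape])
app_view = html.unescape(wire_value)  # what an entity-decoding application ends up with

hit_without_decode = SIGNATURE.search(seen_without_decode) is not None
hit_with_decode = SIGNATURE.search(seen_with_decode) is not None
```

The rule without the decode step never produces the string its own signature was written for, while the application does.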
The second is the decode the firewall performs one time too few or one time too many. Double encoding is the canonical case. Encode a payload once, then percent-encode the percent signs of that encoding, and you have a string that needs two URL-decode passes to reveal the attack. A firewall that decodes once sees a still-encoded, harmless-looking string. An application stack that happens to decode twice, perhaps because a proxy decodes once and the framework decodes again, sees the payload. The mismatch is in the count of decode operations, and neither side is obviously wrong; they simply made different choices about how many layers to peel.
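The decode-count mismatch is easy to see with the standard library. The sketch below uses `urllib.parse.quote` and `unquote`; the sample value is an arbitrary placeholder, and which side decodes once or twice is an assumption about a particular deployment.

```python
from urllib.parse import quote, unquote

value = "<script>"
encoded_once = quote(value, safe="")          # percent-encode every reserved character
encoded_twice = quote(encoded_once, safe="")  # now the percent signs themselves are encoded

after_one_decode = unquote(encoded_twice)     # a firewall that peels one layer
after_two_decodes = unquote(after_one_decode) # a stack that peels two
```

After one pass the string is still percent-encoded and contains no angle bracket; only the second pass reveals the original value.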
*Same value, two peels. The firewall stops after one decode and sees nothing; a stack that decodes twice reaches the live payload.*

Charset and internationalization make this worse, because now the bytes themselves are ambiguous before any percent-decoding even starts. A request can declare one character set, contain bytes valid in another, and be decoded into different strings depending on which side trusts which declaration. Full-width Unicode forms of ASCII characters, the codepoints urlDecodeUni exists to fold, are a standing example: a wide-form character that a normalizer collapses to an ASCII metacharacter is dangerous after normalization and innocent before it. If the firewall normalizes Unicode and the application does too but with a different normalization form, or if one side applies case-folding that maps a non-ASCII letter onto an ASCII keyword, the keyword the signature hunts for can materialize only on one side of the pair. The general lesson is that “what string is this” is not a fact about the bytes. It is the output of a decoder, and there is more than one decoder in the path.
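Both Unicode effects can be reproduced with Python's `unicodedata` module and `str.lower`. This is only an illustration of the mechanism; which normalization form a given firewall or framework applies is an assumption that has to be checked per deployment.

```python
import unicodedata

# Full-width angle brackets (U+FF1C, U+FF1E) are not ASCII metacharacters on the wire.
wide = "\uff1cscript\uff1e"

nfc = unicodedata.normalize("NFC", wide)    # canonical form: leaves wide characters alone
nfkc = unicodedata.normalize("NFKC", wide)  # compatibility form: folds them to ASCII

# Case folding can map a non-ASCII letter onto an ASCII one.
# U+212A is KELVIN SIGN, which lowercases to a plain "k".
keyword_like = "LI\u212aE"
folded = keyword_like.lower()
```

A side that applies NFKC sees ASCII angle brackets where a side that applies NFC sees none, and a side that lowercases sees an ASCII keyword that was never present in the original codepoints.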
Case variation and inline comments are the low-tech members of this family. A matcher that is case-sensitive against a SQL keyword can be dodged by mixed case; the fix is to fold case in the pipeline, which CRS does. Comment sequences and stray whitespace inserted inside a keyword break a naive substring match while the SQL parser, which ignores comments, reads the keyword intact. These are old and mostly handled by mature rule sets, but they illustrate the same principle as the exotic encodings: the signature is matched against a normalized form, and anything the normalizer fails to canonicalize is a place where the application’s view and the firewall’s view come apart.
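A small sketch of the case-and-comment gap, assuming a toy substring signature. The normalizer here replaces each C-style comment with a single space, on the assumption that the downstream SQL parser treats a comment like whitespace; `naive_match` and `normalized_match` are hypothetical names.

```python
import re

SIGNATURE = "union select"  # toy signature, written against the normalized form

def naive_match(value):
    """Case-sensitive substring match against the raw value."""
    return SIGNATURE in value

def normalized_match(value):
    """Fold case, replace comments with a space, collapse whitespace, then match."""
    value = value.lower()
    value = re.sub(r"/\*.*?\*/", " ", value, flags=re.DOTALL)
    value = re.sub(r"\s+", " ", value)
    return SIGNATURE in value

sample = "1 UnIoN/**/SeLeCt 2"
naive_hit = naive_match(sample)
normalized_hit = normalized_match(sample)
```

The naive matcher misses the sample; the normalizing one catches it, but only because its idea of canonical form happens to agree with the parser behind it.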
Pipeline order is part of the signature
It is easy to think of the transformation list as preprocessing, a tidy-up step before the real work of matching. It is not separable from the match. The same payload run through the same set of transformation functions in two different orders produces two different strings, and the signature matches at most one of them. ModSecurity’s manual makes this point with deliberately long example pipelines, and the reason they are long is defensive: each function in the chain closes off an encoding that, left undecoded, would let the payload reach the operator in a form the pattern does not recognize.
Consider what order buys you. If you lowercase before you URL-decode, any uppercase letter that exists only as a percent-escape, say %53 for S, is still an escape when the fold runs, so it emerges from the decode in its original case and the matched string carries a capital the pattern was not written for. If you remove nulls after a decode that can introduce a null, you catch it; if you remove them before, the decode reintroduces one and it survives into the matched string. None of these orderings is universally correct. The correct order is the one that produces the same final bytes your backend produces, and since the backend’s decode order is itself a property of its framework and runtime, the firewall is once again being asked to predict a specific downstream behavior. A rule author who gets the order subtly wrong has not written a weaker signature. They have written a signature for a string the application will never construct.
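Two concrete orderings, sketched with `urllib.parse.unquote` and plain string methods. The sample values are placeholders; the point is only that the same set of operations yields different strings in different orders.

```python
from urllib.parse import unquote

def remove_nulls(s):
    return s.replace("\x00", "")

# Lowercase and URL-decode, in both orders.
raw_case = "%53ELECT"
lower_then_decode = unquote(raw_case.lower())  # the escape is still an escape when folded
decode_then_lower = unquote(raw_case).lower()

# Null removal and URL-decode, in both orders.
raw_null = "sel%00ect"
strip_then_decode = unquote(remove_nulls(raw_null))  # decode reintroduces the null
decode_then_strip = remove_nulls(unquote(raw_null))
```

A signature for the lowercase keyword matches `decode_then_lower` and `decode_then_strip` but neither of the other two results.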
This is why “just add the missing decode” understates the difficulty. Adding a decode changes the order, and changing the order can open a different hole while closing the one you aimed at. The pipeline is a small program, and like any small program operating on adversarial input, its correctness is a function of the exact sequence of operations and not merely the set of them.
Fragmentation: splitting the signature across a boundary
Encoding hides a payload by changing what its bytes mean. Fragmentation hides it by changing where its bytes are. If a signature matches a contiguous string, then any technique that splits that string across a boundary the matcher does not reassemble will defeat it, even though no individual fragment is encoded at all.
The clearest version lives in HTTP itself. A request body can be sent in chunks, with Transfer-Encoding: chunked telling the receiver to reassemble a sequence of length-prefixed pieces into one body. A multipart form splits a body into parts separated by a boundary delimiter. HTTP/1.1 historically even allowed a single header’s value to be folded across multiple lines. Every one of these is a legitimate feature, and every one creates a question: does the firewall reassemble the pieces into the same final byte stream the application will, before it matches? If the firewall matches each chunk independently, a signature split across a chunk boundary never appears as a contiguous string to the matcher, but the application concatenates the chunks and runs the whole thing.
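The chunk-boundary case can be sketched with a deliberately minimal chunked-body decoder. `dechunk` is a hypothetical helper that handles only well-formed input (no trailers, no error handling), and the signature is a toy substring.

```python
SIGNATURE = b"union select"  # toy signature

def dechunk(wire):
    """Split a well-formed chunked body into its chunk payloads."""
    chunks = []
    pos = 0
    while True:
        eol = wire.index(b"\r\n", pos)
        # The size line is hex, optionally followed by ";extensions".
        size = int(wire[pos:eol].split(b";")[0].decode("ascii"), 16)
        if size == 0:
            break
        start = eol + 2
        chunks.append(wire[start:start + size])
        pos = start + size + 2  # skip the CRLF that ends the chunk data
    return chunks

wire = b"9\r\nq=1 union\r\n9\r\n select 2\r\n0\r\n\r\n"
chunks = dechunk(wire)

per_chunk_hit = any(SIGNATURE in chunk for chunk in chunks)  # streaming matcher
reassembled_hit = SIGNATURE in b"".join(chunks)              # match after reassembly
```

Neither chunk contains the signature, but the concatenated body does.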
HTTP parameter pollution is the same idea at the parameter level. Send a parameter twice. Now ask which value each side uses. Some stacks take the first occurrence, some take the last, some concatenate both, some build an array. If the firewall inspects the first occurrence and judges it benign while the application reads the last, a benign-looking duplicate shields a malicious one. Nothing here is encoded. The attack is entirely in the disagreement about how to resolve a duplicate, and that disagreement is documented behavior, not a flaw in any single component.
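The duplicate-resolution choices can be laid side by side with `urllib.parse.parse_qs`, which returns every occurrence as a list. Which policy a particular firewall or framework applies is an assumption here; the sketch only shows that the policies disagree.

```python
from urllib.parse import parse_qs

query = "item=shoes&item=1%20union%20select"
values = parse_qs(query)["item"]  # all occurrences, in order

first_wins = values[0]        # a component that keeps the first occurrence
last_wins = values[-1]        # a component that keeps the last
concatenated = ",".join(values)  # a component that joins duplicates
as_array = values             # a component that builds an array
```

If the inspecting side applies `first_wins` and the executing side applies `last_wins`, they are evaluating two different values from one query string.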
*No fragment matches a signature on its own. The complete pattern exists only in the reassembled body, and only one of the two parsers reassembled it.*

There is a subtler reason fragmentation is hard to defend against than the encoding gaps that precede it. With encoding, the firewall at least sees all the bytes; it just decodes them wrong. With fragmentation, the firewall may make a deliberate choice not to look at all the bytes as a unit, because reassembling everything is expensive. Buffering an entire chunked body before inspecting it costs memory and latency, and a streaming firewall that inspects chunks as they arrive trades that cost for the exact blind spot fragmentation exploits. The performance-correctness tension is real: the cheapest place to inspect is per-piece as it streams past, and the only safe place to inspect is after reassembly. Anything in between is a partial reassembly with a partial blind spot.
Multipart bodies sharpen this further because the boundary delimiter is attacker-influenced. The client declares the boundary string in the Content-Type header, and both the firewall and the backend must use that declaration to find where each part starts and ends. If the two implementations differ on what counts as a valid boundary, what to do with a malformed one, or how much surrounding whitespace and line-ending sloppiness to tolerate, then they will carve the body into parts differently. One of them may see a part the other treats as noise inside a neighbor. The payload lives in whichever part the backend reads and the firewall skips.
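How two implementations can carve the same body differently is easier to see with two toy splitters. Both functions below are hypothetical and far simpler than any real multipart parser: one insists on CRLF before each delimiter, the other also tolerates a bare LF.

```python
def split_strict(body, boundary):
    """Recognize a delimiter only when it is preceded by CRLF."""
    pieces = (b"\r\n" + body).split(b"\r\n--" + boundary)
    # Drop the preamble and the piece that follows the closing "--".
    return [p for p in pieces[1:] if not p.startswith(b"--")]

def split_lenient(body, boundary):
    """Treat bare LF like CRLF before looking for delimiters."""
    normalized = body.replace(b"\r\n", b"\n")
    pieces = (b"\n" + normalized).split(b"\n--" + boundary)
    return [p for p in pieces[1:] if not p.startswith(b"--")]

body = (
    b"--X\r\n"
    b'Content-Disposition: form-data; name="a"\r\n\r\n'
    b"hello\n"  # bare LF instead of CRLF before the next delimiter
    b"--X\r\n"
    b'Content-Disposition: form-data; name="b"\r\n\r\n'
    b"payload\r\n"
    b"--X--\r\n"
)

strict_parts = split_strict(body, b"X")
lenient_parts = split_lenient(body, b"X")
```

The strict splitter finds one part and treats field `b` as trailing content of field `a`; the lenient splitter finds two parts. Whichever side is strict never sees `b` as a field of its own.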
The defense against fragmentation is conceptually simple and operationally hard: reassemble first, match second, and make sure your reassembly is byte-for-byte what the backend will do. That is exactly the requirement that turns out to be unsatisfiable in general, because there is more than one backend, and they do not agree with each other either.
Parser differentials: the deep version of the problem
Encoding gaps and fragmentation are special cases of a single underlying condition: the firewall and the application parse the request differently. The general name for this is a parser differential, and it is where the most durable bypasses live, because it does not depend on the WAF forgetting a decode step. It depends only on the WAF and the backend being two different programs that each parse HTTP, which they always are.
The cleanest recent demonstration is the WAFFLED study (Akhavani and colleagues, ACSAC 2025), which fuzzed the structural parsing of content types rather than the payloads inside them. The idea is to keep the malicious code completely intact and instead mutate the container: the boundary in a multipart/form-data body, the wrapper fields in JSON, the DOCTYPE and namespace handling in XML. A WAF that parses the container slightly differently from the backend will either fail to find the part that holds the payload or fail to associate it with the field the backend reads. The payload sails through unmodified, so no signature on the payload itself ever gets a chance to fire, because the firewall never located the payload as inspectable content.
The numbers are worth stating because they show this is not a corner case. The study reported 1,207 distinct bypasses across five widely deployed WAFs: AWS, Azure, Google Cloud Armor, Cloudflare, and ModSecurity. The bypasses concentrated in the three structured content types, with the JSON, multipart, and XML categories each contributing hundreds. The multipart family included things like altering the boundary delimiter or removing the \r\n before a boundary so that one parser recognizes a part and the other does not, and disrupting the Content-Disposition header so the firewall fails to attribute a part to a field. A field study in the same work found that more than 90% of websites accepted application/x-www-form-urlencoded and multipart/form-data interchangeably, which means an attacker can usually choose the container that maximizes the parser disagreement. Vendors treated these as real: Google rated the issue at its top severity tier and paid a bounty, and the other affected vendors acknowledged and began remediating.
A second, narrower example shows how small the differential can be. ModSecurity historically set its REQUEST_FILENAME variable after URL-decoding the request target. So when a request path contained an encoded question mark, the engine decoded it into a real ?, treated everything after it as the query string, and excluded that tail from the filename it inspected. The backend, parsing the raw target, never saw a query delimiter there and processed the whole thing as path. The Core Rule Set leans on REQUEST_FILENAME heavily, so a value placed in that decoded-away tail slipped past a large swath of rules. This was assigned CVE-2024-1019 and fixed in ModSecurity 3.0.12; the equivalent behavior in the older v2 line was reported as not fixed at the time of the writeup. The whole bypass turns on a single ordering decision, decode-then-split versus split-then-decode, and that one decision was enough.
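The ordering decision can be modeled in two one-line functions. This is a toy model of the behavior described above, not ModSecurity's code; the function names and the request target are placeholders.

```python
from urllib.parse import unquote

def filename_decode_then_split(target):
    """Decode the whole target first, then cut at the first question mark."""
    return unquote(target).split("?", 1)[0]

def filename_split_then_decode(target):
    """Cut at the first raw question mark, then decode the path portion."""
    return unquote(target.split("?", 1)[0])

target = "/app/view%3Ftail/segment"

inspected_filename = filename_decode_then_split(target)
backend_path = filename_split_then_decode(target)
```

Under decode-then-split, everything after the encoded question mark drops out of the inspected filename, while split-then-decode keeps it as part of the path.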
Parser differentials are not unique to WAFs. The same disagreement, applied to where one request ends and the next begins, is HTTP request smuggling: two servers in a chain disagree about message boundaries and an attacker injects a request the front end never saw as a request. If that topic interests you, the desync mechanics are covered in HTTP request smuggling and the protocol-downgrade variant in HTTP/2 downgrade smuggling. A WAF sitting in front of a smuggling-vulnerable pair has an even harder job, because the request it should be inspecting may not exist as a discrete request from its vantage point at all.
Why this is structural and not a patching problem
It is tempting to read the previous sections as a list of bugs. Add the missing decode, reassemble the chunks, fix the filename ordering, and the holes close. Each individual hole does close that way. The reason the category never closes is that the WAF is solving a problem it cannot fully solve: it must predict exactly how an arbitrary downstream stack will interpret bytes, using a parser that is not that stack.
Two forces make this permanent.
The first is parser diversity. A WAF protects many applications, and those applications run on different servers, frameworks, and language runtimes, each with its own quirks for charset selection, duplicate-parameter resolution, multipart boundary handling, and Unicode normalization. To match the backend’s interpretation exactly, the firewall would have to know which backend it is in front of and emulate that backend’s parser bug-for-bug. WAFFLED’s premise is exactly this: it pairs WAFs against specific frameworks and finds disagreements per pair. A fix that aligns the firewall with one framework’s parsing can misalign it with another’s. There is no single canonical HTTP parse to converge on, because the deployed web does not have one.
The second is that signatures describe attacks the authors already knew about. A blocklist, the negative security model, allows everything except recognized-bad patterns. That means it cannot, even in principle, block an attack shape no one has written a pattern for yet, and it inherits a maintenance treadmill where each new bypass technique needs a new or revised rule. Mature rule sets manage this with layered defenses and scoring rather than single brittle signatures. The Core Rule Set, for instance, does not block on one rule match; it assigns each matching rule an anomaly score and blocks when the summed inbound score crosses a threshold (5 by default), with a separate outbound threshold (4 by default) for responses, and a paranoia-level dial that trades coverage against false positives, from PL1’s near-zero-false-alarm baseline up to PL4’s aggressive rules that flag a great deal of legitimate traffic. Anomaly scoring is a real improvement over single-signature blocking because it makes a payload pay for multiple suspicious traits at once, so a partial evasion that dodges one rule may still accumulate enough score from others. It raises the cost of evasion. It does not change the fundamental thing, which is that the score is still computed over the firewall’s normalized copy, and a parser differential corrupts that copy at the source. If the firewall never located the malicious bytes as inspectable content, no rule scores them and the threshold is never approached.
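The scoring arithmetic can be sketched as follows. The inbound threshold of 5 comes from the text above; the per-severity weights (critical 5, error 4, warning 3, notice 2) are the CRS defaults as I understand them and should be treated as an assumption, as should the function names.

```python
# Assumed per-severity weights; verify against your CRS version's configuration.
SEVERITY_SCORE = {"critical": 5, "error": 4, "warning": 3, "notice": 2}
INBOUND_THRESHOLD = 5  # default inbound threshold cited in the text

def inbound_score(matched_severities):
    """Sum the score contributed by every rule that matched."""
    return sum(SEVERITY_SCORE[s] for s in matched_severities)

def is_blocked(matched_severities, threshold=INBOUND_THRESHOLD):
    """Block once the summed score reaches the threshold."""
    return inbound_score(matched_severities) >= threshold

single_warning = is_blocked(["warning"])            # 3: below threshold
two_weak_signals = is_blocked(["warning", "notice"])  # 3 + 2 = 5: blocked
nothing_located = is_blocked([])                    # 0: no rule ever fired
```

The last case is the parser-differential one: if no rule ever sees the content, the sum stays at zero regardless of how the threshold is tuned.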
There is a third force worth naming, because it is the one defenders most often underestimate: the firewall and the application age at different rates. A WAF rule set is updated on the vendor’s cadence; the application behind it is updated on the team’s. A framework upgrade that changes how the backend resolves duplicate parameters, or which charset it defaults to, or how strictly it parses a multipart boundary, silently shifts the differential without touching the firewall at all. The rule that matched the right string yesterday matches a string the new backend no longer constructs. Nobody introduced a bug. The two parsers simply drifted, and a parser differential is a relationship between two programs, so either program changing is enough to open or close one. This makes WAF coverage a property of a pairing rather than of the firewall in isolation, which is uncomfortable, because it means a green test result against last quarter’s backend is not evidence about this quarter’s.
This is the same arms-race structure that runs through anti-bot systems generally, where each detection heuristic invites a matching evasion and the defender’s real advantage is breadth and layering rather than any single perfect check. The parallel is exact enough that the integrity-check arms race post about anti-bot JavaScript reads as a sibling to this one: in both cases the defender inspects a representation of the client, and the attacker works on the gap between that representation and the truth.
The case for positive security, and its bill
If a blocklist can never enumerate all the bad inputs, the obvious inversion is to enumerate the good ones. This is the positive security model, sometimes called default-deny or allowlisting. Instead of describing what an attack looks like, you describe what a valid request looks like, this endpoint, these parameters, these types and lengths and value ranges, and you reject everything else by default. The appeal is structural rather than incremental. An allowlist is not trying to predict attack shapes, so a novel attack that does not fit the allowed shape is rejected for the same reason a typo is, without anyone having written a rule about that specific attack. In the literature this is the standing argument for positive security’s protection against unknown and zero-day inputs.
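A default-deny check can be sketched in a few lines. Everything here is hypothetical: the `SPEC` table, the endpoint, and the parameter patterns are illustrative, and a real allowlist would also need to cover headers, bodies, and content types.

```python
import re

# Hypothetical specification: one endpoint, two parameters with allowed shapes.
SPEC = {
    ("GET", "/orders"): {
        "id": re.compile(r"[0-9]{1,10}"),
        "sort": re.compile(r"asc|desc"),
    },
}

def is_allowed(method, path, params):
    """Reject anything that is not explicitly described by the specification."""
    rules = SPEC.get((method, path))
    if rules is None:
        return False  # unknown endpoint
    for name, value in params.items():
        pattern = rules.get(name)
        if pattern is None or pattern.fullmatch(value) is None:
            return False  # unknown parameter or value outside the allowed shape
    return True

valid = is_allowed("GET", "/orders", {"id": "42", "sort": "asc"})
odd_value = is_allowed("GET", "/orders", {"id": "42 or 1=1"})
unknown_param = is_allowed("GET", "/orders", {"id": "42", "debug": "1"})
unknown_endpoint = is_allowed("GET", "/admin", {})
```

No rule here mentions any attack. The odd value is rejected for the same reason an unknown parameter is: it does not fit the declared shape.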
The reason most deployments are not pure allowlists is that the model moves the hard work, it does not remove it. To allow only valid requests you must first specify, completely and correctly, what valid means for every endpoint of a real application, and keep that specification in sync as the application changes. Get the spec too tight and you block legitimate users; the false-positive cost that a paranoid blocklist pays occasionally, a misconfigured allowlist pays constantly. Get it too loose and the allowed shape is wide enough to carry an attack, at which point you have a blocklist again with extra steps. Sensitive applications with stable, well-understood interfaces, banking and healthcare being the usual examples, are where the upfront specification cost is worth paying. For a large, fast-changing application the specification is a moving target, and the maintenance burden is the mirror image of the blocklist’s signature treadmill.
So the honest summary is not that positive security solves WAF evasion. It is that positive security relocates the problem from “can I enumerate every attack” to “can I specify every valid request,” and the second problem is tractable exactly when the application’s interface is small and stable. That is why real deployments are hybrids: a positive model where the interface is well-defined, a negative model with anomaly scoring everywhere else, and an acceptance that the seam between them is itself something to watch. A positive model still parses the request to decide whether it fits the allowed shape, which means it still has a parser, which means it can still differ from the backend’s parser. Allowlisting narrows the differential’s blast radius. It does not abolish the differential.
Closing: the bypass is in the gap, not the bytes
The thread running through every technique here is that none of them attack the signature directly. Encoding does not fool the matcher; it fools the normalizer that feeds the matcher. Fragmentation does not hide the pattern; it hides the boundary at which the pattern becomes contiguous. Parser differentials do not defeat any rule; they make sure the rule is matched against a different request than the one the application runs. In each case the malicious bytes are right there in the request, fully present, and the firewall fails not because its rules are weak but because the request it inspected and the request that executed were not the same request.
That reframes what a defender can actually buy. You cannot buy a signature set complete enough to make a signature-based WAF unbypassable, because the gap being exploited is not in the signatures. What you can buy is a smaller gap: a normalization pipeline that matches your specific backend’s decoding, a reassembly step that produces the exact bytes your backend will run, an allowlist tight enough that unspecified shapes are simply rejected, and the operational discipline to keep all of that aligned as the application moves underneath it. The WAFFLED result is the useful one to keep in mind, because it found 1,207 ways for five mature, well-funded WAFs to read a different request than the backend, by touching only the container and never the payload. The firewall was working correctly the whole time. It was just reading a different request.
Sources & further reading
- Akhavani, Jabiyev, Kallus, Topcuoglu, Bratus, Kirda (2025), WAFFLED: Exploiting Parsing Discrepancies to Bypass Web Application Firewalls — ACSAC 2025 paper that found 1,207 bypasses across AWS, Azure, Cloud Armor, Cloudflare, and ModSecurity by mutating the container rather than the payload.
- OWASP ModSecurity (2024), Reference Manual: Transformation Functions (v2.x) — the canonical list of decode and normalize functions (urlDecodeUni, htmlEntityDecode, base64Decode, normalizePath) and why pipeline order matters.
- OWASP ModSecurity (2024), Reference Manual (v3.x) — the v3 engine reference covering variables like REQUEST_FILENAME and the transformation model.
- sicuranext (2024), ModSecurity: Path Confusion and really easy bypass on v2 and v3 — the writeup of CVE-2024-1019, where decode-then-split on REQUEST_FILENAME excluded a region from inspection.
- OWASP CRS (2024), Anomaly Scoring — how the Core Rule Set sums per-rule scores against inbound (5) and outbound (4) thresholds instead of blocking on a single match.
- OWASP CRS (2024), Paranoia Levels — the PL1 to PL4 dial that trades attack coverage against false-positive rate.
- SpiderLabs / LevelBlue (Trustwave) (archived), WAF Normalization and I18N — on charset, Unicode normalization, and the multibyte and full-width encoding gaps that break naive matching.
- Het Mehta (2025), Advanced Techniques for Bypassing Modern Web Application Firewalls — a survey of the conceptual bypass families: encoding, case and comment variation, HTTP parameter pollution, and ML-WAF evasion.
- F5 DevCentral (2024), Positive Security vs. Negative Security — the default-deny versus default-allow trade-off using F5’s portfolio as the worked example.
- Indusface (2024), What is the Positive Security Model and Why It Matters — the case for allowlisting and its zero-day coverage, with the configuration and maintenance cost stated plainly.
- Cloudflare (2024), What is a WAF? Web Application Firewall explained — vendor overview of blocklist versus allowlist models and where hybrids land.