How a WAF actually works: signatures, anomaly scoring, and the ModSecurity CRS

Copyright: MIT

A web application firewall is supposed to do something that sounds impossible when you say it out loud. It reads the HTTP traffic going to your application, decides which requests are attacks, and drops those before your code ever runs, all without knowing anything about what your application actually does. It does not parse your routes. It does not understand your business logic. It sees bytes on the wire, and from those bytes it has to guess intent.

That guess is the whole game. Get it too loose and a SQL injection slides through into a login form. Get it too tight and a customer with an apostrophe in their surname gets a 403 at checkout. Most of what makes a WAF interesting, and most of what makes it frustrating to operate, comes down to how it manages that one tradeoff. This post walks through the machinery: the two security models a WAF can be built on, how signatures and dedicated parsers actually match a payload, how the OWASP Core Rule Set turns individual matches into a single anomaly score, what its paranoia levels really control, where the firewall sits in the request path, and the unglamorous work of tuning false positives once it is live.

Two ways to be wrong: positive and negative security models

Every filtering system in security sits somewhere on a line between two extremes. At one end is the positive security model, sometimes called allowlisting or default-deny. You enumerate exactly what valid traffic looks like, and you reject everything else. At the other end is the negative security model, blocklisting or default-allow. You enumerate what attacks look like, and you let through everything else.

The positive model is, in theory, the stronger of the two. If you have correctly described every legitimate request your application accepts, then by definition nothing malicious gets through, including attacks nobody has invented yet. A WAF operating this way blocks all traffic by default and admits only what is explicitly approved. That property is why people reach for it to protect the things they care about most. The catch is in the words “correctly described every legitimate request.” Real applications have hundreds of endpoints, each accepting parameters whose valid shapes drift as the product changes. Building and maintaining a complete positive model by hand is the kind of work that sounds tractable until you try it, and allowlist WAFs tend to break down precisely when you cannot anticipate every valid traffic type up front. This is the same difficulty that makes a hand-written allowlist so leaky in other parts of an anti-bot stack, a theme that runs through why blocklists fail as a defensive strategy.

The negative model is the opposite trade. It is easy to deploy because you start from a library of known-bad patterns and switch it on. It produces fewer false positives, because it only fires when it sees something that looks like a documented attack. And it is reactive by construction. A blocklist can only catch what someone has already described to it, so the day after a new injection technique is published, your blocklist is blind to it until the signatures catch up.

In practice almost nobody runs a pure version of either. The dominant pattern is a hybrid: a negative-model signature engine doing the heavy lifting against the OWASP Top 10 classics, wrapped in positive-model constraints where the application’s valid traffic is narrow enough to describe. A field that should only ever contain a numeric ID can be locked to digits. A path that should never receive a request body can reject one outright. Everything fuzzier falls back to the blocklist. The OWASP Core Rule Set, which is where most of this post lives, is fundamentally a negative-model engine with a scoring layer bolted on top to soften the reactivity problem.

Figure: Two security models.
Positive (allowlist), default DENY: admit only what is explicitly described. Pro: catches zero-days. Con: hard to keep complete.
Negative (blocklist), default ALLOW: block only what matches a known-bad pattern. Pro: easy to deploy, fewer false positives. Con: blind to the unknown.
*Positive default-deny is stronger in theory but expensive to keep complete; negative default-allow is easy to deploy but reactive. Production WAFs blend the two.*

A short history, because the engine you run was shaped by it

The reference implementation for most of this is ModSecurity, and its history explains some odd corners of how WAFs behave today. Ivan Ristić released the first version in November 2002 as an Apache module. It read requests inside the web server’s own process, which made it fast and gave it deep access to the parsed request, at the cost of being welded to Apache internals. Version 2.0 landed in October 2006, and the SecRules language it introduced is still, in spirit, the language people write WAF rules in two decades later.

The Apache coupling became the problem. Porting ModSecurity to Nginx or IIS meant fighting the host server’s architecture each time, so a ground-up rewrite began in December 2015. The result, libmodsecurity (ModSecurity 3.0), pulled the engine out into a standalone library that talks to the web server over a thin connector API. It was announced for public use in January 2018. That rewrite never fully caught up with the 2.x feature set, which matters for a reason we will get to.

Then the ownership story turned. Trustwave, which had maintained ModSecurity since acquiring Breach Security in 2010, announced end-of-sale for its commercial support effective August 1, 2021, and end-of-life effective July 1, 2024. In February 2024 the project was transferred to OWASP, which now stewards it as an open-source community effort. The current stable release line sits at 3.0.14 (February 2025).

The CRS project, which maintains the rules rather than the engine, has been blunt about the engine’s condition. By their account ModSecurity “failed to attract a community of active developers,” version 2 stayed “heavily linked to Apache internals,” and version 3 was “incomplete, less performant, and only runs on NGINX,” with around a dozen implementation gaps that quietly break CRS rules. Their conclusion was that “the intelligence is in the rule set, not in the engine.” Rather than fork a C/C++ codebase, they backed Coraza, a clean SecLang implementation written in Go by Juan Pablo Tosso. Coraza passes 100% of the CRS v4 test suite, runs on Caddy and (via proxy-wasm) Envoy, and gets memory safety for free from Go. So the modern picture is an engine-neutral rule set that runs on at least three different engines, and the part you actually configure, the rules, is the part that has stayed stable.

Figure: ModSecurity timeline. 2002: first release (Apache module). 2006: v2.0 / SecRules. 2018: libmodsecurity 3.0. Feb 2024: to OWASP + CRS v4.0.0. Jul 2024: Trustwave EOL. Now: Coraza (Go).
*The engine changed hands and languages; the SecRules dialect from 2006 and the rule set are what carried through.*

How a single rule reads a request

Before the scoring, there is the matching. A WAF rule, in the SecRules dialect, is a small machine with three moving parts: a set of variables to look at, an operator to test them with, and a list of actions to take if the test passes. The variables are the places an attack can hide, named in the engine’s own vocabulary: ARGS for the parsed query-string and body parameters, REQUEST_HEADERS, REQUEST_COOKIES, REQUEST_URI, REQUEST_BODY, and so on. The operator is the test. Most often that is a regular expression via the @rx operator, but it can also be a string comparison, a numeric range check, or a call into a specialized detector.

A rough shape of one rule, with the real directive names but a deliberately toy pattern:

# Toy signature: deny any request whose arguments contain "union select".
SecRule ARGS "@rx (?i:union\s+select)" \
    "id:9001,phase:2,deny,msg:'SQLi keyword in argument',severity:CRITICAL"

That says: in phase 2 (after the request body is parsed), look at every argument, and if any of them matches the case-insensitive pattern for a UNION SELECT, deny the request, log the given message, and treat the finding as CRITICAL. This is the negative model in miniature. Somebody saw the attack, wrote the pattern, and now the engine carries it.
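The three moving parts can be modeled in a few lines of Python. This is a minimal sketch of the idea, not how any real engine is implemented: the `Rule` class, the `evaluate` function, and the dictionary shape of the request are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rule:
    rule_id: int
    variables: list[str]              # collections to inspect, e.g. ["ARGS"]
    operator: Callable[[str], bool]   # the test run against each value
    action: str                       # what to do on a match, e.g. "deny"
    msg: str


def evaluate(rule: Rule, request: dict) -> Optional[tuple[str, str]]:
    """Return (action, log message) for the first matching value, else None."""
    for variable in rule.variables:
        for name, value in request.get(variable, {}).items():
            if rule.operator(value):
                return rule.action, f"{rule.msg} [{variable}:{name}]"
    return None


# Same toy pattern as the SecRule above, case-insensitive.
UNION_SELECT = re.compile(r"union\s+select", re.IGNORECASE)

toy_rule = Rule(
    rule_id=9001,
    variables=["ARGS"],
    operator=lambda value: UNION_SELECT.search(value) is not None,
    action="deny",
    msg="SQLi keyword in argument",
)

attack = {"ARGS": {"id": "1 UNION SELECT password FROM users"}}
# The pattern appears here too, but in a header the rule never looks at.
benign = {"ARGS": {"id": "42"}, "REQUEST_HEADERS": {"User-Agent": "union select"}}

print(evaluate(toy_rule, attack))  # ('deny', 'SQLi keyword in argument [ARGS:id]')
print(evaluate(toy_rule, benign))  # None
```

The second request shows why the variable list matters as much as the pattern: the same string in a collection the rule does not inspect produces no match.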

The problem with naive regex signatures is that they are brittle against the thing attackers do best, which is to express the same payload in a thousand surface forms. A signature for union select does not match union/**/select or UNiOn SeLeCt or a version where the keyword is split across two parameters and reassembled server-side. The CRS handles part of this with transformations: before a rule’s operator runs, the engine can apply a chain of normalizations to the input, lowercasing it, URL-decoding it, stripping comments, collapsing whitespace, so that the regex sees a canonical form rather than the raw bytes. The catalogue of tricks a payload uses to dodge a literal pattern, from mixed encoding to fragmentation, is its own subject, covered in encoding, fragmentation, and why blocklists fail, and the obfuscation that hides intent inside the payload body overlaps with control-flow flattening and string encryption.
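A rough Python sketch of why normalization helps. The `normalize` function below is a hypothetical stand-in for a transformation chain (URL-decode, lowercase, replace comments, collapse whitespace); it is not the engine's implementation, and it does nothing about payloads split across parameters.

```python
import re
from urllib.parse import unquote_plus

# Deliberately literal, lowercase signature to show how brittle it is alone.
SIGNATURE = re.compile(r"union\s+select")


def normalize(raw: str) -> str:
    """Apply a small chain of normalizations before matching."""
    value = unquote_plus(raw)                                   # URL-decode
    value = value.lower()                                       # lowercase
    value = re.sub(r"/\*.*?\*/", " ", value, flags=re.DOTALL)   # comment -> one space
    value = re.sub(r"\s+", " ", value)                          # collapse whitespace
    return value.strip()


def matches_raw(raw: str) -> bool:
    return SIGNATURE.search(raw) is not None


def matches_normalized(raw: str) -> bool:
    return SIGNATURE.search(normalize(raw)) is not None


payloads = [
    "UNiOn SeLeCt 1",           # mixed case
    "union/**/select 1",        # comment instead of whitespace
    "union%0A%09select 1",      # URL-encoded newline and tab
    "UNION%2F*x*%2FSELECT 1",   # encoded comment plus upper case
]

for payload in payloads:
    print(payload, matches_raw(payload), matches_normalized(payload))
```

Every payload in the list evades the raw signature and is caught once it has been reduced to a canonical form.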

For the two attack classes where pure regex loses the arms race fastest, SQL injection and cross-site scripting, the CRS does not rely on regex alone. It calls into libinjection, a dedicated parser that tokenizes the input the way a SQL or HTML parser would and decides whether the token stream forms an injection, rather than whether it matches a string. Rule 942100 is “SQL Injection Attack Detected via libinjection” and rule 941100 is the XSS equivalent. A tokenizing parser is far harder to fool with whitespace games and comment insertion than a regex, because it is reasoning about structure instead of spelling. It is not perfect: parsers have their own false positives, and the libinjection XSS rule has historically misfired on legitimate rich-text editors. But it changes the nature of the contest.

From matches to a score: the anomaly-scoring model

Here is the design decision that defines the modern CRS, and it is worth stating precisely because it is easy to get backwards. In the CRS, an individual rule does not block anything. A rule that matches only adds points to a running total called the anomaly score. The blocking decision is made once, later, by a separate rule that compares the accumulated score against a threshold.

This is “collaborative detection,” and it exists to solve the false-positive problem that plagues per-rule blocking. With one-rule-one-block, every rule has to be confident enough on its own to justify a 403, which forces the rules to be conservative, which lets attacks through. With scoring, individual rules are allowed to be a little trigger-happy, because no single weak signal is fatal. A request that trips one borderline rule gets a few points and sails on. A request that trips five rules across SQLi, XSS, and protocol-violation categories accumulates enough to cross the line and gets blocked. The score is a way of saying “any one of these alone is ambiguous, but all of them together is not.”

Each rule declares a severity, and severity maps to a fixed number of points. The CRS defaults are:

Severity to score
CRITICAL: +5
ERROR: +4
WARNING: +3
NOTICE: +2

Default thresholds
Inbound (request): 5
Outbound (response): 4

One CRITICAL match (5) alone crosses the inbound threshold. Set by rules 900100 / 900110. Score accumulates via setvar across all matching rules; one blocking rule per direction checks the total.
*The four severities and their point values, and the default thresholds. At threshold 5, a single CRITICAL hit blocks; lower-severity rules have to gang up.*

A CRITICAL finding adds 5, ERROR adds 4, WARNING adds 3, NOTICE adds 2. The default inbound (request) anomaly score threshold is 5, and the default outbound (response) threshold is 4. Those numbers are set by rules 900100 (the severity-to-score mapping) and 900110 (the thresholds), which live in the CRS setup file where you are meant to override them. The accumulation itself happens through ModSecurity’s setvar action, with each matching rule incrementing a transaction variable.

Notice what the default of 5 implies. Because a single CRITICAL rule already adds 5, the out-of-the-box configuration blocks any request that trips even one critical-severity rule. That is an aggressive default, and it is deliberate: the CRS documentation describes a healthy install as running at threshold 5 with a small set of tuned exclusions. If you want the scoring to behave more like a true collaborative vote, where no single signal is decisive, you raise the threshold, and you accept that some individual attacks now need a corroborating second signal before they are stopped.

The blocking itself is concentrated into two files. REQUEST-949-BLOCKING-EVALUATION.conf runs after every request-phase rule has had its say, compares the inbound score to the threshold, and denies if it is greater than or equal. RESPONSE-959-BLOCKING-EVALUATION.conf does the same for the response. This two-checkpoint structure is the entire control flow: every detection rule is a pure scorer, and one blocking rule per direction is the decider.
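The arithmetic is small enough to sketch directly. The Python below models only the inbound side, using the default severity scores and threshold described above; the function name `blocking_decision` is an illustrative assumption, not a CRS identifier.

```python
# Default CRS severity-to-score mapping and inbound threshold.
SEVERITY_SCORES = {"CRITICAL": 5, "ERROR": 4, "WARNING": 3, "NOTICE": 2}
INBOUND_THRESHOLD = 5


def blocking_decision(matched_severities, threshold=INBOUND_THRESHOLD):
    """Sum the points of every matched rule, then decide once at the end."""
    score = sum(SEVERITY_SCORES[severity] for severity in matched_severities)
    decision = "deny" if score >= threshold else "pass"
    return decision, score


print(blocking_decision(["CRITICAL"]))                # ('deny', 5)
print(blocking_decision(["WARNING"]))                 # ('pass', 3)
print(blocking_decision(["WARNING", "NOTICE"]))       # ('deny', 5)
print(blocking_decision(["CRITICAL"], threshold=10))  # ('pass', 5)
```

The last call shows the effect of raising the threshold: a single CRITICAL match no longer blocks on its own and needs a corroborating signal.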

Figure: A request through the scoring engine. client → phase 1 (request line + headers) → phase 2 (request body) → rule 949110: if score >= 5, DENY; else pass to app. Example matches: score += 5 (SQLi via libinjection), score += 3 (protocol violation). Detection rules only add points. Blocking is one checkpoint per direction (949 inbound, 959 outbound). With early blocking on, the phase-1 total is checked before phase 2 ever runs.
*Detection rules score; two blocking rules decide. The numbers shown are illustrative of how a couple of matches push a request over threshold.*

CRS v4, the stable release that arrived on February 14, 2024, added an option that changes this control flow slightly. With early blocking enabled, by uncommenting rule 900120 to set tx.early_blocking, the engine checks the accumulated score at the end of phase 1, after the request line and headers but before the body is read. If a request has already crossed the threshold on its headers alone, it is denied without ever parsing the body or running phase 2. That saves work on obviously hostile requests. The CRS team flags a subtle operational cost: a false positive you tune away in phase 1 can unmask a second phase-2 false positive that was previously hidden behind the early block, so the two have to be tuned together. The same release added a plugin architecture and HTTP/3 support, and it renumbered a number of rules, so exclusions that reference v3 rule IDs need to be rechecked during migration.
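The unmasking effect is easier to see in a small model. The sketch below is an assumption-laden simplification: `evaluate_request` is a hypothetical name, and each phase is reduced to a list of matched severities.

```python
SEVERITY_SCORES = {"CRITICAL": 5, "ERROR": 4, "WARNING": 3, "NOTICE": 2}


def evaluate_request(phase1, phase2, threshold=5, early_blocking=False):
    """Return (decision, phase in which the decision was made)."""
    score = sum(SEVERITY_SCORES[s] for s in phase1)
    if early_blocking and score >= threshold:
        return "deny", 1              # body is never parsed, phase 2 never runs
    score += sum(SEVERITY_SCORES[s] for s in phase2)
    return ("deny" if score >= threshold else "pass"), 2


# A legitimate request that trips one false positive in each phase.
print(evaluate_request(["CRITICAL"], ["CRITICAL"], early_blocking=True))  # ('deny', 1)

# After the phase-1 false positive is tuned away, the phase-2 one surfaces.
print(evaluate_request([], ["CRITICAL"], early_blocking=True))            # ('deny', 2)
```

The request is still blocked after the first fix, only later in the pipeline, which is why the two false positives have to be tuned together.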

Paranoia levels: how many rules are in the room

The single most misunderstood CRS concept is the paranoia level, and the misunderstanding is always the same: people think it controls how aggressively the WAF blocks. It does not. The paranoia level controls which rules are even loaded. The anomaly threshold controls blocking. They are independent knobs that people constantly conflate.

The CRS ships its rules tagged with a paranoia level from 1 to 4. A rule tagged PL2 only participates if you have set the paranoia level to 2 or higher. So raising the level does not change the score needed to block; it enlarges the set of rules that can contribute score in the first place. More rules in the room means more chances to catch a real attack, and more chances to misread a legitimate request. The documentation’s own descriptions capture the trade with unusual candor:

Paranoia level 1 is “baseline security with a minimal need to tune away false positives. This is CRS for everybody running an HTTP server on the internet.” Level 2 covers “rules that are adequate when real user data is involved. Perhaps an off-the-shelf online shop. Expect to encounter false positives and learn how to tune them away.” Level 3 is “online banking level security with lots of false positives,” where false positives “are accepted and expected.” Level 4 holds “rules that are so strong (or paranoid) they’re adequate to protect the ‘crown jewels’,” to be used at one’s own risk against “a large number of false positives.” The variable is tx.paranoia_level in CRS v3 (tx.blocking_paranoia_level in v4), set in the setup file.

Figure: Paranoia levels: more rules, more noise. PL1 (baseline), PL2 (real user data), PL3 (online banking), PL4 (crown jewels). False positives and rules loaded (and detection coverage) both increase from left to right.
*Each step up loads more rules and catches more, at a steeply rising false-positive cost. The level is orthogonal to the blocking threshold.*

CRS v4 names the two halves of this explicitly, which solves a real operational headache. The blocking paranoia level (tx.blocking_paranoia_level) is the set of rules whose matches actually count toward the score and can get you blocked. The executing paranoia level, which v4 calls the detection paranoia level (tx.detection_paranoia_level), is the set of rules that run at all. By default they are equal. But you can set the executing level higher than the blocking level, which means the stricter rules run, log their matches, and reveal exactly which legitimate requests they would have flagged, without those matches ever counting toward a block. You watch the logs, tune away the false positives the higher level would have caused, and only then raise the blocking level to match. It is a dress rehearsal for a stricter policy, run against live traffic, with the safety on.
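A minimal Python sketch of how the two levels interact with the threshold. The `Match` class, the `evaluate` function, and the rule IDs are hypothetical and chosen only for illustration; the point is that the paranoia levels select which matches count, while the threshold never moves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    rule_id: int          # hypothetical IDs, not real CRS rules
    paranoia_level: int   # the level the rule is tagged with (1 to 4)
    score: int            # points the rule would add


def evaluate(matches, blocking_pl=1, executing_pl=1, threshold=5):
    executing_pl = max(executing_pl, blocking_pl)
    blocking_score = 0
    log_only = []
    for match in matches:
        if match.paranoia_level <= blocking_pl:
            blocking_score += match.score       # counts toward a block
        elif match.paranoia_level <= executing_pl:
            log_only.append(match.rule_id)      # runs and logs, never blocks
        # rules above the executing level do not run at all
    decision = "deny" if blocking_score >= threshold else "pass"
    return {"decision": decision, "score": blocking_score, "log_only": log_only}


matches = [Match(10001, 1, 3), Match(10002, 2, 5), Match(10003, 4, 5)]

# Dress rehearsal: PL2 rules run and log, but only PL1 can block.
print(evaluate(matches, blocking_pl=1, executing_pl=2))
# {'decision': 'pass', 'score': 3, 'log_only': [10002]}

# After tuning, raise the blocking level: same threshold, more rules count.
print(evaluate(matches, blocking_pl=2, executing_pl=2))
# {'decision': 'deny', 'score': 8, 'log_only': []}
```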

There is a related mechanism for cautious rollout: sampling mode, where you route only a configurable percentage of traffic through CRS and let the rest bypass it entirely. That lets you turn the whole rule set on for, say, one percent of requests, watch what breaks, and ramp up. It is the same instinct as the executing-versus-blocking split, applied at the level of whole requests rather than whole rules.
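The sampling decision amounts to a per-request coin flip. The sketch below shows that idea only; it makes no claim about how the CRS derives its random value, and `inspected_by_crs` is a hypothetical name.

```python
import random


def inspected_by_crs(sampling_percentage: float, rng: random.Random) -> bool:
    """Decide per request whether it goes through the rule set at all."""
    return rng.random() * 100 < sampling_percentage


rng = random.Random(2024)  # seeded only so the sketch is repeatable
total = 100_000
inspected = sum(inspected_by_crs(1, rng) for _ in range(total))
print(f"{inspected} of {total} requests inspected at 1% sampling")
```

Ramping up is then a matter of raising one number, with 100 meaning every request is inspected and 0 meaning none are.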

Where the WAF sits

None of the scoring matters if the WAF cannot see the traffic, and where it sits in the path determines what it can see and what it can do about it. There are three broad placements, and the choice is a trade between inspection depth, latency, and how much the WAF is allowed to interfere.

The most common today is reverse-proxy mode. The WAF has its own IP address and terminates the client’s TCP connection itself, then opens a separate connection to the origin. Because it is a full endpoint on both sides, it sees the complete, reassembled request, can decrypt TLS, can buffer and fully parse the body, and can rewrite or reject anything before the origin hears about it. This is where a cloud WAF lives, and it is why a CDN is the natural home for one: the edge already terminates TLS and proxies every request, so adding inspection is incremental. The cost is that the WAF is now a man in the middle by design, with all the latency and failure-mode questions that implies, and the origin sees the WAF’s IP rather than the client’s unless the chain is preserved through forwarding headers.

A transparent bridge sits inline at layer 2. It does not take an IP, does not terminate the TCP connection, and does not rewrite packets; it lets the link pass through and inspects it in flight. That makes it nearly invisible to deploy, with no IP renumbering, but inspecting a stream you are not terminating is harder, especially once TLS is in play, and the engine has less room to cleanly reject a request mid-flight.

The third placement is out-of-band, where the WAF sees a mirrored copy of the traffic from a switch SPAN port or a tap. It can log and alert and score, but it cannot block, because the real request has already reached the origin by the time the mirror arrives. This is detection without enforcement, useful for tuning a rule set against production traffic before you dare put it inline, and useless for actually stopping an attack in progress.

The placement also fixes where the WAF lives relative to the rest of the edge stack. A WAF is a signature-and-scoring engine reasoning about HTTP semantics. It is not a bot manager reasoning about TLS fingerprints and browser telemetry, and it is not a rate limiter counting requests per IP, though all three commonly run on the same box. A modern edge runs the bot scoring (see server-side vs client-side bot detection), the WAF, and a rate limiter as separate stages, and they catch different things. The WAF will happily wave through a flood of perfectly-formed requests that contain no attack payload, because nothing in any individual request looks malicious; stopping that is the rate limiter’s job and the layer 7 DDoS problem, not the WAF’s. The reverse is also true: a single hand-crafted injection from a residential IP with a clean browser fingerprint sails past the bot manager and gets caught, if at all, by the WAF.

The reason WAFs have a bad reputation: false positives

Everything above is the part that works on paper. The part that determines whether anyone keeps the WAF in blocking mode is false-positive tuning, and it is where most WAF deployments quietly die, downgraded to detection-only after one too many 403s on a legitimate checkout.

The mechanics of a false positive are mundane. A rule written to catch an attack pattern also matches a benign input that happens to share the pattern. The classic case is the SQL-character anomaly rules: a customer whose password legitimately contains ' or -- or whose comment field contains the word select trips a SQLi rule, accumulates score, and gets blocked from their own account. Rich-text editors that emit HTML trip XSS rules. Base64 blobs in a legitimate API parameter trip restricted-character rules. None of these are attacks. All of them score.

Tuning is the disciplined process of teaching the rule set about your specific application’s legitimate weirdness, and the workflow is well established. You start from the logs. ModSecurity’s audit log records, for every blocked request, which rule IDs fired and against which variables. You link the access log to the error log by the shared request ID, pull out the offending rule IDs, and rank requests by how much score they accumulated, because the worst offenders give you the most relief per fix. The netnea tutorials, which are the standard reference for this, walk through exactly this grep-and-rank loop.
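Once the matches have been pulled out of the logs, the ranking step is a short aggregation. The sketch below assumes the log lines have already been parsed into records; the record shape, the request IDs, and the scores are invented for illustration and are not ModSecurity's audit log format.

```python
from collections import Counter, defaultdict

# Hypothetical, already-parsed records: one entry per rule match.
matches = [
    {"request_id": "req-a", "rule_id": 942450, "score": 5},
    {"request_id": "req-a", "rule_id": 942100, "score": 5},
    {"request_id": "req-a", "rule_id": 941100, "score": 5},
    {"request_id": "req-b", "rule_id": 942450, "score": 5},
    {"request_id": "req-c", "rule_id": 942450, "score": 5},
    {"request_id": "req-c", "rule_id": 920273, "score": 5},
]


def rank_requests(records):
    """Total score per request, highest first."""
    totals = defaultdict(int)
    for record in records:
        totals[record["request_id"]] += record["score"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def rank_rules(records):
    """How often each rule fired, most frequent first."""
    return Counter(record["rule_id"] for record in records).most_common()


print(rank_requests(matches))   # [('req-a', 15), ('req-c', 10), ('req-b', 5)]
print(rank_rules(matches)[0])   # (942450, 3)
```

The first ranking shows which requests to investigate first; the second shows which single exclusion would quiet the most noise.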

Once you know rule 942450 (“SQL Hex Encoding Identified”) is firing on a cookie that legitimately carries hex, you have two broad tools. A startup-time exclusion removes or narrows a rule when the configuration loads. SecRuleRemoveById 920273 deletes a rule entirely. SecRuleUpdateTargetById 942450 "!REQUEST_COOKIES" keeps the rule but stops it from looking at cookies, so it still guards your query parameters while ignoring the field that was causing trouble. The second form is almost always the right one: surgical exclusion of the specific variable, not deletion of the whole rule. A runtime exclusion is more conditional. It uses a rule that fires only for requests matching some characteristic, a particular path, say, and uses ModSecurity’s ctl action to disable a target just for those requests:

SecRule REQUEST_URI "@beginsWith /app/search" \
    "phase:1,nolog,pass,id:10002,\
    ctl:ruleRemoveTargetByTag=attack-sqli;ARGS:keys"

That tells the engine: for requests to the search endpoint only, stop the SQLi-tagged rules from inspecting the keys argument, because that field is a free-text search box where SQL-ish words are normal. Everywhere else on the site, those rules stay fully armed.
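The effect of that exclusion can be modeled as a filter on what a rule is allowed to see. This is a sketch of the behavior described above, not of the engine's internals; `inspected_args` is a hypothetical helper, and the path and tag are taken from the example rule.

```python
def inspected_args(rule_tags, request_path, args):
    """Return the arguments a rule may inspect for this request."""
    if request_path.startswith("/app/search") and "attack-sqli" in rule_tags:
        # Runtime exclusion: SQLi-tagged rules skip the free-text 'keys' field.
        return {name: value for name, value in args.items() if name != "keys"}
    return dict(args)


args = {"keys": "select a plan or upgrade", "page": "2"}

print(inspected_args({"attack-sqli"}, "/app/search", args))  # {'page': '2'}
print(inspected_args({"attack-sqli"}, "/app/login", args))   # both arguments
print(inspected_args({"attack-xss"}, "/app/search", args))   # both arguments
```

Only the combination of the search path and a SQLi-tagged rule narrows the view; other paths and other rule categories still see every argument.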

The discipline that separates a tuned WAF from an abandoned one is the order of operations. You do not start in blocking mode at a high paranoia level and tune under fire while real users eat 403s. You start in detection-only or at an executing level above the blocking level, gather a representative sample of real traffic, fix the false positives the logs reveal, and only then tighten. The CRS even structures its defaults around the gentlest possible start: paranoia level 1, threshold 5, a handful of exclusions. The whole executing-versus-blocking-paranoia design exists so that you can do this tuning against production traffic without production paying for it.

There is one honest limitation worth stating plainly. The exact regular expressions, the precise transformation chains, and the internal scoring nuances of the big commercial cloud WAFs are not public. What is public, and what this post has described, is the OWASP Core Rule Set, which is open source and is what runs underneath a surprisingly large fraction of the managed WAF offerings, including ones you pay for. When a vendor says they protect against the OWASP Top 10, there is a good chance the CRS or a close derivative is doing the work, with the vendor’s own tuning and threat intelligence layered on top. The model, signature plus parser plus anomaly score plus paranoia level, is the same whether you can read the rules or not.

What the design actually buys you

A WAF is a confession that the application underneath it cannot be fully trusted to defend itself. If every parameter were validated, every query parameterized, and every output encoded, the firewall would have nothing to catch. It exists because that ideal is never fully reached, and because the gap between “should be safe” and “is safe” is where attacks live. The anomaly-scoring model is the most honest engineering response to that reality: it stops pretending any single signal is decisive, lets weak signals accumulate, and concentrates the one irreversible decision into a single threshold you can reason about and tune.

What you give up is certainty. A scoring WAF at threshold 5 is making a probabilistic bet on every request, and the paranoia level and exclusions are how you shift the odds without ever eliminating them. Run it too loose and it is theater. Run it too tight and your own users are the casualties, which is why so many WAFs end their lives in detection-only mode, logging attacks they were bought to block. The engineering that matters is almost never the initial deployment. It is the weeks of reading audit logs, writing targeted exclusions, and slowly raising the paranoia level while watching the false-positive curve, until the firewall is tight enough to be worth the latency it adds. The CRS gives you the dials. Whether the thing in front of your application is a defense or a decoration comes down to whether anyone is turning them.


Frequently asked questions

What is the difference between a positive and negative security model in a WAF?

A positive (allowlist, default-deny) model enumerates exactly what valid traffic looks like and rejects everything else, which in theory blocks even unknown attacks but is hard to keep complete across hundreds of drifting endpoints. A negative (blocklist, default-allow) model starts from known-bad patterns and lets everything else through, making it easy to deploy with fewer false positives but blind to attacks until signatures catch up. Most production WAFs run a hybrid of the two.

How does the OWASP Core Rule Set decide to block a request using anomaly scoring?

In the CRS, an individual rule never blocks on its own; a matching rule only adds points to a running anomaly score based on its severity. CRITICAL adds 5, ERROR adds 4, WARNING adds 3, and NOTICE adds 2. A separate blocking rule then compares the accumulated total against a threshold, denying the request when the score is greater than or equal to it. The default inbound threshold is 5 and the outbound threshold is 4.

Does raising the CRS paranoia level make the WAF block more aggressively?

No. The paranoia level controls which rules are even loaded, not how aggressively the WAF blocks; the anomaly threshold controls blocking, and the two are independent knobs people often conflate. CRS rules are tagged with a level from 1 to 4, and raising the level enlarges the set of rules that can contribute score. More rules means more chances to catch real attacks and more chances to misread legitimate requests as attacks.

Why does the CRS use libinjection instead of relying only on regular expressions for SQLi and XSS?

Naive regex signatures are brittle against attackers expressing the same payload in many surface forms, such as comment insertion or mixed casing. For SQL injection and cross-site scripting, the CRS calls into libinjection, a dedicated parser that tokenizes input the way a SQL or HTML parser would and decides whether the token stream forms an injection. Reasoning about structure rather than spelling makes it harder to fool, though it is not perfect and has historically misfired on legitimate rich-text editors.

How do you tune away false positives in a ModSecurity CRS deployment without blocking real users?

You start from the audit logs, which record for every blocked request which rule IDs fired and against which variables, then rank requests by accumulated score to prioritize the worst offenders. A startup-time exclusion like SecRuleUpdateTargetById can stop a rule from inspecting a specific variable, which is usually preferable to deleting the whole rule, and a runtime exclusion uses the ctl action to disable a target only for matching requests. The discipline is to run in detection-only or with a higher executing than blocking level first, fix the false positives the logs reveal, and only then tighten.
