
The OWASP Core Rule Set: anatomy of the rules that protect most of the web

Copyright: MIT
[Image: OWASP CRS wordmark with rule-category numbers and the anomaly-score block rule]

If you run a web application firewall on a generic ruleset, there is a very good chance you are running the OWASP Core Rule Set, even if you have never opened one of its files. It ships inside Azure Application Gateway. It backs the default managed ruleset on a long list of cloud WAFs. It is the thing that fires when somebody sends ' OR 1=1-- at your login form and gets a 403 instead of a stack trace. And for a project that guards so much traffic, it is remarkably legible: a few hundred text files of regular expressions and small pieces of logic, all readable, all in a public Git repository, all under Apache 2.0.

The interesting part is not that the rules exist. It is how they decide. A single suspicious pattern does not block a request. The CRS adds up evidence, weighs it against a threshold, and only then commits to a verdict, and the aggressiveness of the whole thing is a dial you can turn. This post takes that machinery apart.

We will start with where the project came from, because the CRS and its engine have changed hands more than once and the names matter. Then the rule categories and the numbering scheme that organises them. Then the two ideas that make the CRS what it is: anomaly scoring and paranoia levels. After that, the engines that actually execute the rules, ModSecurity and the newer Go-based Coraza, and what the 2024 decoupling of the rules from any single engine means. We close on the bug-bounty reckoning that shaped version 4 and what it says about the limits of a regex-driven blocklist.

2002 to 2024: a project that outlived its company twice

ModSecurity came first, and the rules came later. Ivan Ristić released the first version of ModSecurity in November 2002 as a module for the Apache HTTP Server, a way to watch application traffic and act on it from inside the web server itself. He built a company around it, Thinking Stone, which Breach Security acquired in 2006. Trustwave bought Breach Security in 2010 and relicensed ModSecurity under Apache 2.0, where it has stayed.

The Core Rule Set grew out of that ecosystem. Early copyright records put the start of the ruleset around 2006, maintained for years by Trustwave’s SpiderLabs team. For most of that history the CRS and ModSecurity were spoken of in one breath, often as “the ModSecurity Core Rule Set,” which is why so much old documentation conflates the rules with the engine. They are separate things. One is a language of detection rules. The other is software that reads them.

That separation became official policy when the money left. In 2021 Trustwave announced end-of-sale of its commercial ModSecurity support effective August 1, 2021, with end-of-life of that support set for July 1, 2024. The engine was not being killed, but its commercial steward was stepping back, and the open-source project had to plan for a future without a corporate parent. The CRS had already moved to its own home at github.com/coreruleset/coreruleset and its own governance under OWASP. ModSecurity itself followed in January 2024, when Trustwave handed the project to the OWASP Foundation, with the transfer commencing on January 25, 2024. For the first time the rules and the engine lived under the same roof, both free, both open.

Timeline:

  • 2002: ModSecurity (Ristić)
  • ~2006: CRS begins (SpiderLabs)
  • 2010: Trustwave buys Breach
  • 2021: Trustwave end-of-sale (EOS) of support
  • 2024: ModSecurity to OWASP

The rules and the engine were always separate. In 2024 they finally shared an owner.

*The CRS predates its current name and outlived two corporate stewards before landing, with ModSecurity, under OWASP.*

A 2024 detail worth holding onto: the project formally dropped “ModSecurity” from its name. It is the OWASP Core Rule Set now, not the OWASP ModSecurity Core Rule Set, because it no longer assumes ModSecurity is the engine underneath. That rename is not cosmetic. It is the project admitting that the rules have to outlast any single implementation.

The 9xx files: how the rules are organised

Open the rules/ directory and the structure announces itself. Each file is a category, each category owns a thousand-number block of rule IDs, and the filename carries both. REQUEST-942-APPLICATION-ATTACK-SQLI.conf holds the SQL injection rules, all numbered between 942000 and 942999. REQUEST-941-APPLICATION-ATTACK-XSS.conf holds cross-site scripting, 941000 to 941999. The number is the namespace.
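To make "the number is the namespace" concrete, here is a minimal sketch that derives a category's ID block from its filename. The helper name id_block and the exact filename pattern are our own assumptions for illustration; nothing like this ships with the CRS.

```python
import re

# Assumed filename shape, based on the two examples above: SIDE-NNN-CATEGORY.conf
_RULE_FILE = re.compile(r"^(REQUEST|RESPONSE)-(\d{3})-([A-Z0-9-]+)\.conf$")


def id_block(filename: str) -> tuple[str, range]:
    """Return (category name, range of rule IDs) for a CRS-style rule filename."""
    match = _RULE_FILE.match(filename)
    if match is None:
        raise ValueError(f"not a CRS-style rule filename: {filename!r}")
    prefix = int(match.group(2))
    # Each category owns the thousand IDs that start at prefix * 1000.
    return match.group(3), range(prefix * 1000, prefix * 1000 + 1000)


name, ids = id_block("REQUEST-942-APPLICATION-ATTACK-SQLI.conf")
print(name, ids.start, ids.stop - 1)  # APPLICATION-ATTACK-SQLI 942000 942999
```

Given only a rule ID from a log line, the same arithmetic runs backwards: integer-divide by 1000 and you have the file to open.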

The request-side rules run roughly in order of the HTTP transaction they inspect. The low 900-thousands handle setup and protocol sanity before the attack-specific rules ever look at a payload. Initialisation lives at 901. Method enforcement at 911. Scanner and bad-bot detection at 913. Protocol enforcement at 920, which checks that the request is well-formed HTTP at all, with protocol-attack rules for things like request smuggling and response splitting at 921. Then the application-attack families: local file inclusion and path traversal at 930, remote file inclusion at 931, remote command execution at 932, PHP injection at 933, Node.js injection at 934, cross-site scripting at 941, SQL injection at 942, session fixation at 943, and Java attacks including deserialization at 944. The response side mirrors this, with outbound rules in the 950-thousands catching data leakage and error disclosure on the way out.

Request processing, top to bottom:

  • 901: initialization
  • 911: method enforcement
  • 913: scanner / bad-bot detection
  • 920: protocol enforcement
  • 921: protocol attack (smuggling, splitting)
  • 930–934: LFI / RFI / RCE / PHP / Node.js
  • 941–944: XSS / SQLi / session fixation / Java
  • 949: blocking evaluation (inbound)
  • 95x: outbound (data leakage, errors)

Protocol sanity runs first; attack-family rules feed a score; rule 949 decides; 95x checks the response.

*Each category owns a thousand-number ID block. The attack-family rows (930–934 and 941–944, orange in the original figure) are what everyone pictures when they think "WAF rule."*

Within each file the convention is just as strict. The rules numbered 94x000 through 94x099 are structural plumbing, not detection logic, and the project asks you to leave them alone. Detection rules start at 94x100. That low-number reservation is what lets the CRS reorganise its internals across versions without breaking the rule IDs that operators have written exclusions against.
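The reservation is easy to check mechanically. A small sketch, with hypothetical function names and made-up input IDs, that separates the plumbing slice from detection rules when reviewing a list of IDs you are about to write exclusions for:

```python
def is_plumbing(rule_id: int) -> bool:
    """True if the ID sits in the reserved first hundred of its category block."""
    return rule_id % 1000 < 100


def detection_only(rule_ids: list[int]) -> list[int]:
    """Drop plumbing IDs, keeping the ones that are actual detection rules."""
    return [rid for rid in rule_ids if not is_plumbing(rid)]


# Illustrative IDs only: two from the reserved slice, two detection rules.
print(detection_only([942011, 942100, 941099, 941100]))  # [942100, 941100]
```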

One file deserves a special mention because it is the brain of the operation. REQUEST-949-BLOCKING-EVALUATION.conf contains no attack signatures at all. Its job is to look at the score every other rule has been accumulating and decide whether to block. Rule 949110, “Inbound Anomaly Score Exceeded,” is the one that actually returns the 403. Everything upstream of it only adds numbers.

Anomaly scoring: the part that makes it work

Here is the idea that separates the CRS from a naive blocklist. Most WAFs of the early 2000s worked in what the project calls traditional or self-contained mode: one rule matches, the request dies. That is simple and it is brittle. Any single rule tuned aggressively enough to catch real attacks will also catch real users, and any single rule tuned loosely enough to spare real users will miss real attacks. A blocklist made of independent tripwires has no way to say “this looks a little suspicious” versus “this is clearly an attack.”

Anomaly scoring fixes that by making the rules vote. When a rule matches, it does not block. It adds points to a running counter, tx.anomaly_score, using the engine’s setvar action. How many points depends on the rule’s declared severity. A CRITICAL rule adds 5. An ERROR rule adds 4. A WARNING rule adds 3. A NOTICE rule adds 2. These values live in variables, tx.critical_anomaly_score and friends, so an operator can retune the whole weighting without editing individual rules.

After every request rule has run, rule 949110 compares the accumulated inbound score against a threshold, tx.inbound_anomaly_score_threshold, which defaults to 5. Meet or exceed it and the request is blocked. There is a matching outbound path for responses, with tx.outbound_anomaly_score_threshold defaulting to 4, which catches things like a database error or a chunk of source code leaking into the response body.

Worked example (rule IDs shown are illustrative):

  • incoming request
  • 942100 SQLi match: +5
  • 920270 bad chars: +3
  • 920170 GET w/ body: +2
  • score = 10
  • 949110: 10 ≥ 5 → 403

No single rule blocks. Weighted matches accumulate, and 949110 compares the total to the threshold (default 5 inbound).

*Anomaly scoring turns a set of independent tripwires into a jury. One borderline match is noise; several together cross the line.*
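The arithmetic is simple enough to model in a few lines. This is a sketch of the accounting only, not of how an engine evaluates SecLang: the Match class and both function names are ours, and the severities assigned to the three rule IDs are chosen to reproduce the illustrative figure, not taken from the rule files.

```python
from dataclasses import dataclass

# Default per-severity weights quoted above (tx.critical_anomaly_score and friends).
SEVERITY_POINTS = {"CRITICAL": 5, "ERROR": 4, "WARNING": 3, "NOTICE": 2}


@dataclass(frozen=True)
class Match:
    rule_id: int
    severity: str


def inbound_score(matches: list[Match]) -> int:
    """Add up what each matched rule contributes, like repeated setvar increments."""
    return sum(SEVERITY_POINTS[m.severity] for m in matches)


def blocking_evaluation(matches: list[Match], threshold: int = 5) -> bool:
    """Stand-in for rule 949110: block once the total reaches the threshold."""
    return inbound_score(matches) >= threshold


# Severities here are assumptions picked to match the +5 / +3 / +2 in the figure.
request = [Match(942100, "CRITICAL"), Match(920270, "WARNING"), Match(920170, "NOTICE")]
print(inbound_score(request), blocking_evaluation(request))  # 10 True
```

Retuning the weights means editing one dictionary, which mirrors why the CRS keeps them in variables instead of hard-coding points into each rule.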

The default threshold of 5 is set so that a single CRITICAL match is enough to block. That sounds like it defeats the purpose of accumulation, and at the default it nearly does. The power shows up when you raise the threshold. Set it to 10 or higher and you tell the CRS that one critical rule firing on its own is not enough; you want corroboration. That is exactly the tuning knob a busy site reaches for when a single over-eager rule is generating false positives but the site still wants to block requests that trip several rules at once. The score is a confidence measure, and the threshold is where you decide how much confidence equals a block.
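One way to read a threshold is as "how many independent matches of a given severity does it take to block?" The sketch below works that out with the default weights; matches_needed is a hypothetical helper, and it assumes every match comes from a different rule of the same severity.

```python
import math

# Default severity weights from the scoring section.
POINTS = {"CRITICAL": 5, "ERROR": 4, "WARNING": 3, "NOTICE": 2}


def matches_needed(threshold: int, severity: str) -> int:
    """Smallest number of same-severity matches whose total reaches the threshold."""
    return math.ceil(threshold / POINTS[severity])


for threshold in (5, 10):
    print(threshold, {sev: matches_needed(threshold, sev) for sev in POINTS})
# 5 {'CRITICAL': 1, 'ERROR': 2, 'WARNING': 2, 'NOTICE': 3}
# 10 {'CRITICAL': 2, 'ERROR': 3, 'WARNING': 4, 'NOTICE': 5}
```

At 5, one CRITICAL rule is judge and jury. At 10, nothing blocks alone.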

Two operational modes ride on top of this. Detection-only mode runs every rule and records the scores and the alerts but never actually blocks, which is how every sane deployment starts: you watch the logs for a week, see what would have been blocked, and tune away the false positives before you ever flip to blocking. Blocking mode then enforces. Version 4 added early blocking, an option (tx.early_blocking) that also evaluates the score at the end of phase 1 (request headers) and phase 3 (response headers) rather than waiting for the request or response body, so an obvious attack can be cut off before the engine spends cycles parsing the rest of it.

If you have read our piece on how a WAF actually works, this is the scoring layer it describes, viewed from inside the ruleset rather than from the engine.

Paranoia levels: the aggressiveness dial

Scoring decides how much evidence blocks a request. Paranoia levels decide how much the CRS looks for in the first place. The two are independent, and conflating them is the single most common misunderstanding about the ruleset.

A paranoia level is, mechanically, a tag on a rule. Every detection rule belongs to a level from 1 to 4. When you set the paranoia level in the configuration (rule 900000 in crs-setup.conf, which sets tx.blocking_paranoia_level in version 4 and tx.paranoia_level in the 3.x line), the CRS enables every rule tagged at or below that number and skips the rest. Level 1 is the default, and the project describes it as the set of rules that hardly ever produce a false alarm: “CRS for everybody running an HTTP server on the internet.” It is meant to be deployed without much tuning.

Climb the dial and the bargain shifts. Level 2 adds rules that catch more attacks at the cost of occasional false positives on legitimate traffic. Level 3 adds detection for specialised attacks and brings what the documentation bluntly calls “online banking level security with lots of false positives.” Level 4 is the top, with rules so aggressive they flag almost every possible attack and a great deal of innocent traffic along with it, to be used, in the project’s own words, at one’s own risk.

Paranoia levels, in order of rising false positives:

  • PL1: default
  • PL2: tuning needed
  • PL3: banking-grade
  • PL4: at your own risk

Each level adds rules. Coverage rises, but so does the noise floor. PL1 is meant to need almost no tuning.

*Paranoia level is a coverage dial, not a sensitivity threshold. Higher levels switch on more rules; the score threshold is a separate decision.*
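Mechanically, the dial is a filter over tagged rules. A minimal sketch, with a hypothetical Rule type and made-up rule IDs and level assignments, showing that raising the level only ever adds rules:

```python
from typing import NamedTuple


class Rule(NamedTuple):
    rule_id: int
    paranoia_level: int  # 1 to 4


def enabled_rules(rules: list[Rule], configured_level: int) -> list[Rule]:
    """Keep every rule tagged at or below the configured paranoia level."""
    if configured_level not in (1, 2, 3, 4):
        raise ValueError("paranoia level must be between 1 and 4")
    return [r for r in rules if r.paranoia_level <= configured_level]


# Invented IDs and levels, purely for illustration.
ruleset = [Rule(999101, 1), Rule(999102, 2), Rule(999103, 3), Rule(999104, 4)]
for level in (1, 2, 3, 4):
    print(level, [r.rule_id for r in enabled_rules(ruleset, level)])
```

Note that nothing in this filter touches the score threshold. The two settings never meet.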

There is a subtler companion setting, the executing paranoia level (named the detection paranoia level in version 4). It lets you run the rules from a higher level without counting their hits toward the blocking score, so you can see what level 3 would have flagged on your real traffic before you commit to enforcing it. That is the disciplined way to climb the dial: enable the next level in executing mode, read the alerts it generates against genuine requests, write exclusions for the false positives, and only then promote it to the real paranoia level. The cost of skipping that step is a wall of 403s for ordinary users, which is how the CRS earns its reputation for being painful when deployed carelessly and quiet when deployed with patience.
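The split between "runs" and "counts" can be sketched as two sums over the same matches. The function name, the tuple layout, and the choice to reject an executing level below the blocking level are assumptions of this sketch, not a description of the CRS internals.

```python
def split_scores(matches: list[tuple[int, int]],
                 blocking_level: int,
                 executing_level: int) -> tuple[int, int]:
    """Return (score that counts toward blocking, score that is only observed).

    Each match is (paranoia level of the rule, anomaly points it scored).
    """
    if executing_level < blocking_level:
        # Sketch-level choice: treat this as a configuration mistake.
        raise ValueError("executing level must not be below the blocking level")
    counted = sum(pts for lvl, pts in matches if lvl <= blocking_level)
    observed = sum(pts for lvl, pts in matches
                   if blocking_level < lvl <= executing_level)
    return counted, observed


# A request that trips one PL1 rule (3 points) and one PL3 rule (5 points).
hits = [(1, 3), (3, 5)]
print(split_scores(hits, blocking_level=1, executing_level=3))  # (3, 5)
print(split_scores(hits, blocking_level=3, executing_level=3))  # (8, 0)
```

With blocking at 1 and executing at 3, the request scores 3 and passes at the default threshold while the PL3 hit still lands in the logs. Promote level 3 and the same request scores 8 and is blocked.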

The standard tuning move, once you find a rule that misfires on a known-good parameter, is a rule exclusion: a small companion rule that tells the engine to skip a specific rule ID for a specific parameter, path, or site, rather than disabling the rule globally. CRS 4 leans on this hard, shipping curated exclusion plugins for applications like WordPress and Nextcloud so operators do not have to rediscover the same false positives every other large WordPress install has already mapped.
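The important property of an exclusion is its narrowness. As a rough model (the Exclusion type, its fields, and the example path and parameter are all hypothetical, and real exclusions are written in SecLang, not Python), an exclusion silences one rule for one parameter under one path and leaves everything else alone:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exclusion:
    rule_id: int
    path_prefix: str
    parameter: str


def is_excluded(rule_id: int, path: str, parameter: str,
                exclusions: list[Exclusion]) -> bool:
    """True if some exclusion covers this exact rule, path, and parameter."""
    return any(
        e.rule_id == rule_id
        and path.startswith(e.path_prefix)
        and e.parameter == parameter
        for e in exclusions
    )


# Hypothetical: rule 942100 misfires on a free-text "content" field in an admin area.
tuned = [Exclusion(942100, "/admin/posts/", "content")]
print(is_excluded(942100, "/admin/posts/edit", "content", tuned))  # True
print(is_excluded(942100, "/login", "username", tuned))            # False
```

The rule still guards the login form. Disabling it globally would have traded one false positive for a hole everywhere.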

The engines: ModSecurity, and then Coraza

A ruleset is inert. Something has to read the rules, parse the HTTP transaction, run the regular expressions, and act on setvar and the rest. For two decades that something was ModSecurity, and the rules are written in ModSecurity’s rule language, SecLang, with its SecRule directives, transformation functions, and phases.

ModSecurity’s own architecture forked along the way. The classic version 2 is an Apache module, tightly bound to the Apache request lifecycle. Version 3, announced as libmodsecurity in January 2018, pulled the engine out into a standalone library with a thin connector for each web server, so the same engine could sit behind Apache, Nginx, or others through a small shim. That redesign is what made ModSecurity portable beyond Apache, and it is the version most new deployments target.

The bigger structural change is younger. When Trustwave’s commercial support wound down and the engine’s long-term future looked uncertain, the CRS project decided not to bet its survival on a single C codebase. In December 2021 it announced Coraza, a clean-room reimplementation of a SecLang engine written in Go by Juan Pablo Tosso. Coraza reads the same rules, speaks the same SecLang, and passes the full CRS test suite, which is the bar that matters: if it passes the test suite, the rules behave the same way they do on ModSecurity. Coraza runs as a library and plugs into Go-friendly platforms like Caddy and Traefik, with further integrations developed over time.

OWASP Core Rule Set (SecLang .conf files) runs on:

  • ModSecurity: C · Apache / Nginx
  • Coraza: Go · Caddy / Traefik

Same rules, two engines. Coraza passes the full CRS test suite, so the verdicts match.

*The 2021 to 2024 reorganisation decoupled the rules from any one engine. Pass the test suite and you are a valid CRS host.*

This is why the version-4 rename mattered. With ModSecurity now an OWASP project and Coraza a credible second implementation, the rules genuinely no longer belong to one engine. One concrete consequence shows up in the regular expressions themselves: CRS 4 rewrote every rule that depended on PCRE-only features so the patterns also run on RE2 and Hyperscan, the linear-time engines that high-throughput WAFs favour because they cannot be driven into catastrophic backtracking. A rule that only worked on PCRE was a rule that quietly assumed ModSecurity. Removing that assumption is housekeeping with teeth.
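To get a feel for what "PCRE-only" means in practice, here is a rough heuristic that flags the constructs backtracking engines accept and RE2 rejects: lookaround, backreferences, atomic groups, and possessive quantifiers. It is our own sketch, not the project's tooling, and it is deliberately crude: it handles character classes and escapes only approximately and will be wrong on unusual patterns.

```python
def pcre_only_features(pattern: str) -> set[str]:
    """Heuristically list constructs in a regex that RE2 does not support."""
    found: set[str] = set()
    in_class = False          # inside [...]
    prev_quantifier = False   # previous token was *, +, ? or }
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            # \1 to \9 outside a character class is a backreference.
            if not in_class and pattern[i + 1:i + 2] in tuple("123456789"):
                found.add("backreference")
            i += 2
            prev_quantifier = False
            continue
        if in_class:
            in_class = ch != "]"
        elif ch == "[":
            in_class = True
        elif pattern.startswith(("(?=", "(?!", "(?<=", "(?<!"), i):
            found.add("lookaround")
        elif pattern.startswith("(?>", i):
            found.add("atomic group")
        elif ch == "+" and prev_quantifier:
            found.add("possessive quantifier")
        prev_quantifier = (not in_class) and ch in "*+?}"
        i += 1
    return found


print(pcre_only_features(r"(?i)union\s+select"))      # set()
print(pcre_only_features(r"(['\"])\s*or\s*\1"))       # {'backreference'}
```

The second pattern is the kind of thing that had to be rewritten: matching "the same quote character again" is natural with a backreference and has to be spelled out as alternatives for a linear-time engine.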

For readers coming from the offensive side, the patterns and encodings these rules try to catch are the same ones covered in WAF evasion concepts and, at the protocol layer, in HTTP request smuggling, which is exactly what the 921 protocol-attack family exists to detect.

Version 4 and the bug-bounty reckoning

CRS 4.0.0 shipped on February 14, 2024, after a development cycle long enough to become a running joke in the project’s own announcements. Roughly 500 changes went in. The headline feature was a plugin architecture, with a reserved rule-ID range from 9,500,000 to 9,999,999 and an official registry, that moved optional and application-specific functionality out of the core and into installable plugins. The point of that is attack-surface reduction. If you do not run Nextcloud, you should not be carrying Nextcloud’s exclusion rules, and now you are not.
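The reserved range makes it trivial to tell from an ID alone where a rule came from. A small sketch: the plugin range is the one quoted above, while treating 900,000 to 999,999 as the core block is our assumption, inferred from the 9xx category numbering, and the function name is hypothetical.

```python
PLUGIN_IDS = range(9_500_000, 10_000_000)  # 9,500,000 to 9,999,999, per the text
CORE_IDS = range(900_000, 1_000_000)       # assumption: the 9xx category blocks


def rule_origin(rule_id: int) -> str:
    """Classify a rule ID as 'plugin', 'core', or 'other' by its number alone."""
    if rule_id in PLUGIN_IDS:
        return "plugin"
    if rule_id in CORE_IDS:
        return "core"
    return "other"


for rid in (949110, 9_500_000, 9_999_999, 10_000_000, 1234):
    print(rid, rule_origin(rid))
```

The practical use is in log triage: an alert with a seven-digit ID starting at 95 or above came from something you chose to install, not from the core.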

The plugins are real features, not just exclusion bundles. One scans uploaded files through ClamAV and blocks on a virus hit. Another does fake-bot detection, checking whether a client claiming to be Googlebot or Amazon actually resolves to that company’s address space via reverse DNS, which is a far stronger signal than trusting the User-Agent string. Version 4 also added web-shell detection on the response side, dropped HTTP/0.9 support to kill a class of false positives, and added support for HTTP/3.
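The reverse-DNS idea is worth seeing in miniature. What follows is a generic forward-confirmed reverse DNS check, not the plugin's actual code: look up the PTR record for the client address, require the hostname to sit under a domain the crawler's operator controls, then resolve that hostname forward and require it to point back at the same address. The function name, the injectable resolver parameters, and the suffix list passed in are all assumptions of this sketch, and the default resolvers handle IPv4 only.

```python
import socket
from typing import Callable, Iterable


def _reverse(ip: str) -> str:
    # PTR lookup; raises an OSError subclass when there is no record.
    return socket.gethostbyaddr(ip)[0]


def _forward(hostname: str) -> list[str]:
    # IPv4 A-record lookup; raises an OSError subclass on failure.
    return socket.gethostbyname_ex(hostname)[2]


def is_genuine_crawler(ip: str,
                       allowed_suffixes: Iterable[str],
                       reverse: Callable[[str], str] = _reverse,
                       forward: Callable[[str], list[str]] = _forward) -> bool:
    """Forward-confirmed reverse DNS: PTR must match a suffix and resolve back."""
    try:
        hostname = reverse(ip).rstrip(".").lower()
    except OSError:
        return False
    if not any(hostname.endswith("." + suffix) for suffix in allowed_suffixes):
        return False
    try:
        # Anyone can publish a PTR record claiming any name for their own
        # address space, so the forward lookup is the step that counts.
        return ip in forward(hostname)
    except OSError:
        return False
```

A call might look like is_genuine_crawler(client_ip, ["googlebot.com"]), with the suffix list taken from whatever the crawler's operator documents. The resolvers are parameters so the check can be exercised without touching the network.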

The most interesting thing about version 4 is not a feature. It is why it took so long. In the spring of 2022, two days after the first release candidate, the project ran a bug bounty. Yahoo’s security team funded it and the Intigriti platform ran the triage, and they recruited specifically from people with a public record of WAF bypasses. The hunters found a lot. The roughly 175 reports contained 511 individual payloads, and among them were close to a dozen full or partial ruleset bypasses, the worst kind of finding, where an attacker can structure a request so that the CRS waves the real attack through.

The project made a decision at that point that is worth respecting. Rather than ship version 4 on schedule with the holes documented as known issues, it held the release and fixed the findings. That is the delay. CRS 4 is late because it spent the better part of two years closing bypasses that a focused group of specialists had just proven were exploitable. A general-purpose regex blocklist sitting in front of every kind of application has an enormous surface, and the bounty was the project measuring that surface honestly.

That honesty is the right note to end on, because it points at the structural limit of the whole approach. The CRS is a blocklist, and a blocklist enumerates badness. The space of malicious inputs is larger than any list, and every new encoding trick, every parser quirk, every freshly disclosed gadget is a new way around a pattern that was written before anyone knew the trick existed. Anomaly scoring softens this by refusing to bet everything on one rule. Paranoia levels let an operator buy more coverage when the asset is worth the false positives. Neither changes the fundamental shape of the problem. What the CRS offers is not a wall but a well-instrumented, openly auditable, community-tested approximation of one, which for a default ruleset guarding a large share of the public web turns out to be a genuinely useful thing to have. The version-4 bounty is the proof: the holes were found by the good guys first, in public, and closed in the open, which is the most you can ask of a blocklist and more than most ever get.


Sources & further reading

  • OWASP CRS Project (2026), coreruleset/coreruleset — the official Git repository, rule files, license, and supported-engine statement.
  • OWASP CRS (2024), Anomaly Scoring — the documentation page defining severity scores, default thresholds, and the blocking-evaluation mechanism.
  • OWASP CRS (2024), Paranoia Levels — definitions of PL1 through PL4 and the executing-paranoia-level concept.
  • OWASP CRS (2024), Let CRS 4 be your valentine! — the 4.0.0 release post, with the plugin architecture, early blocking, and feature list.
  • OWASP CRS (2023), What we learnt from our bug bounty program — the 2022 bounty writeup: 175 reports, 511 payloads, the bypasses, and the partners.
  • OWASP CRS (2024), Welcome the newest addition to the OWASP family: ModSecurity! — the January 2024 transfer of ModSecurity from Trustwave to OWASP.
  • OWASP (2021), Announcing Coraza — the introduction of the Go-based SecLang engine and its CRS compatibility.
  • Coraza Project (2026), corazawaf/coraza — the Go ModSecurity-compatible WAF library, with platform support and test-suite status.
  • Wikipedia (2026), ModSecurity — engine history from Ristić’s 2002 release through Breach Security, Trustwave, libmodsecurity, and the OWASP handover.
  • Microsoft Learn (2025), CRS and DRS rule groups and rules — a category-by-category listing of the rule groups as deployed in Azure’s managed WAF.
  • netnea (2024), Including the OWASP ModSecurity Core Rule Set — Christian Folini’s tutorial on installing and tuning the CRS, including scoring internals.
