ModSecurity for web-application firewalls

December 14, 2016

This article was contributed by Christian Folini

Many web applications depend on a web-application firewall (WAF) as an important part of their security strategy. There are a few free-software options in the WAF space, but most of them are limited in scope. The only multi-purpose project is ModSecurity, which has been around for almost 15 years now. In fact, ModSecurity is often the base that is used by commercial WAF offerings because of its flexibility and the granular control that it offers. But readers may not be familiar with ModSecurity, so an introduction to it is in order.

In 2006, the payment-card industry (PCI) mandated the use of a WAF in order to have a secure online offering, but did not specify a particular WAF implementation. That allowed many companies to offer WAFs of varying quality as the key component for PCI compliance. That variance gave WAFs a bad name; ModSecurity is one of the few brands that stands out from the rest.

It all started in 2002 when the original author, Ivan Ristić, wrote a tool to monitor application traffic in the Apache web server. Web servers do not give the administrator easy access to the request bodies and the responses. The same is true for reverse proxies that are put in front of these servers as gateways. So he wrote an Apache module that would provide this functionality and then developed a rule language to inspect said traffic and perform access control.

In 2006, Ristić sold ModSecurity but continued as a developer of it until 2009, when he left the project to pursue other ideas such as Hardenize. In 2010, ModSecurity ended up in the hands of Trustwave, where it was re-licensed under the Apache license (instead of the original GPLv2). The new license made it more attractive to integrate ModSecurity into other products, which led to its adoption by various commercial vendors. Apache is at the core of many commercial WAFs, and those WAF boxes now include a ModSecurity module as well.

The license change also allowed companies to feel more comfortable porting the software to other web servers. Hence, Microsoft contributed an IIS port. The ModSecurity developers ported the code to NGINX, where it found a growing user community. Since ModSecurity was originally an Apache module, the ports were difficult to do and their maintenance is painful. That is why a complete rewrite is underway. The idea is to develop ModSecurity into a standalone library that will communicate with the web server via an API. This library, which is in a functional alpha stage now, will eventually become libModSecurity 3.0. Meanwhile, the stable release branch stands at 2.9.1.

The rule language has always been part of ModSecurity; these days there is also a group of complementary features like schema validation of XML payloads, GeoIP lookup, support for remote blacklists, and even the injection of cross-site request forgery (CSRF or XSRF) tokens into the HTTP traffic.
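
As a hedged illustration of two of these features, the following sketch looks up the client's country and queries a remote blacklist. The database path, the country policy, and the choice of blacklist are assumptions made for this example, and the rule IDs are arbitrary:

    # Point ModSecurity at a GeoIP database (hypothetical path)
    SecGeoLookupDb /usr/share/GeoIP/GeoLiteCity.dat

    # Look up the client address; the result lands in the GEO collection
    SecRule REMOTE_ADDR "@geoLookup" "id:1005,phase:1,pass,nolog"

    # Deny clients whose address was located outside of Switzerland (example policy)
    SecRule GEO:COUNTRY_CODE "!@streq CH" "id:1006,phase:1,deny"

    # Deny clients listed on a remote blacklist (example RBL)
    SecRule REMOTE_ADDR "@rbl sbl-xbl.spamhaus.org" "id:1007,phase:1,deny"

Note that the blacklist check triggers a DNS lookup, so a production setup would likely want to cache the result rather than repeat the query for every request.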

Writing rules

But let's focus on the rule language, which is a domain-specific language for ModSecurity. The SecRule directive gives access to the full requests and responses via over one hundred different variables. Naturally, this covers all of the headers but it also allows access to individual arguments by parsing the payload of requests and responses as well. This feature also handles XML and JSON request content types, giving fine-grained access to individual parameters in the requests.
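
A rough sketch of how such variables are addressed follows. The header value, the XML document layout, and the rule IDs are made up for illustration, and request-body access is assumed to be enabled (SecRequestBodyAccess On):

    # A request header, addressed by name
    SecRule REQUEST_HEADERS:Referer "@contains evil.example" "id:1010,phase:1,deny"

    # Tell ModSecurity to parse XML request bodies ...
    SecRule REQUEST_HEADERS:Content-Type "@beginsWith text/xml" \
        "id:1011,phase:1,pass,nolog,t:lowercase,ctl:requestBodyProcessor=XML"

    # ... so that a single element can be selected with an XPath expression
    SecRule XML:/registration/surname/text() "!@rx ^[a-z0-9._-]*$" "id:1012,phase:2,deny"

The first two rules run in phase 1, when only the request headers are available; the third has to wait for phase 2, when the body has been parsed.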

A variety of operators allow for inspection of those variables. The workhorse is the regular expression operator. But numerical comparison operators like @lt and @gt exist as well and a fast parallel-matching operator complements the set.
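
To give an impression of the non-regex operators, here is a minimal sketch; the limit of 50 parameters, the list of scanner names, and the rule IDs are arbitrary assumptions for the example:

    # Numerical comparison: deny requests carrying more than 50 parameters
    # (the & prefix turns a collection into its number of elements)
    SecRule &ARGS "@gt 50" "id:1020,phase:2,deny"

    # Parallel matching: look for any of several phrases in a single pass
    SecRule REQUEST_HEADERS:User-Agent "@pm nikto sqlmap nessus" "id:1021,phase:1,deny"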

Depending on the return code of the operator, you can carry out a variety of actions. To deny access would be the most obvious one. But internal variables combined with persistent storage allow for a great many use cases that go beyond simple one-shot access control. HTTP is a stateless protocol up to version 1.1. ModSecurity, on the other hand, allows keeping track of a user session, caching expensive lookup operations, or communicating between different server threads in a straightforward way.
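
The following sketch shows what such state keeping might look like: a per-client counter that survives across requests. It assumes that a storage directory has been configured with SecDataDir; the /login path, the thresholds, and the rule IDs are invented for the example:

    # Open a persistent record keyed by the client IP address
    SecAction "id:1030,phase:1,pass,nolog,initcol:ip=%{REMOTE_ADDR}"

    # Count requests to the login path; forget the counter
    # 60 seconds after the most recent hit
    SecRule REQUEST_URI "@beginsWith /login" \
        "id:1031,phase:1,pass,nolog,\
        setvar:ip.login_requests=+1,\
        expirevar:ip.login_requests=60"

    # Deny once the client has accumulated more than ten such requests
    SecRule IP:LOGIN_REQUESTS "@gt 10" "id:1032,phase:1,deny"

Because the IP collection is stored on disk, the counter is shared between all server processes and threads handling requests from the same client.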

Let's look at a simple example of a rule:

    SecRule ARGS:surname "!@rx ^[a-z0-9._-]*$" "id:1000,deny"

This rule inspects the surname request parameter. It helps protect the service by enforcing a whitelist of allowed characters. The @rx defines the regular expression operator. It is negated and works on the pattern defined immediately afterward. The operator will return a match if the surname parameter contains any character outside of the set listed in the square brackets. If we have a match, the deny action comes into play: it blocks the request and returns an HTTP status code of "403 Forbidden" to the client by default. The id part of the rule is a unique integer identifier that is used in other rules and for logging; it is mandatory for every rule.
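
The same rule can be written more verbosely. The sketch below is an equivalent replacement for the short form, not an additional rule; it spells out the defaults that the short form relies on, assuming that SecDefaultAction has not been changed. The message text is invented for illustration:

    SecRule ARGS:surname "!@rx ^[a-z0-9._-]*$" \
        "id:1000,\
        phase:2,\
        deny,\
        status:403,\
        log,\
        msg:'Illegal character in parameter surname'"

Phase 2 is the point where the request body has been received and parsed, so parameters submitted via POST are available to the rule.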

The example rule could be inspecting the payload of an HTTP POST request on a registration form. There are many reasons to inspect the payload arguments before passing them to the application. Registration forms usually lead to inserts into SQL databases. Maybe we are facing a situation where an SQL-injection vulnerability was discovered in that part of the application. Our little rule would thus be meant to protect the database storage. But even if no vulnerability was discovered yet, such protection can be important. Unless you are sure that the invocation of the SQL insert statement is 100% safe, you may want to add a second layer of protection. The WAF is providing this security with rules like the one in our example.

A construct such as this also allows you to operate an insecure application with a so-called virtual patch. Virtual patching is a stupid idea in principle: it would be better to fix an application than to try to lock down something known to be insecure. But as we have seen, fixing applications can take a lot of time. Besides, an in-depth security approach will assume applications are always insecure so they need to be protected with an additional layer of security.

The rule language has a lot more power than what the example above would suggest. Take the following construct as a demonstration of that: it counts the number of occurrences of individual parameters in a request and denies the request if a parameter has been submitted more than once (an attack dubbed HTTP parameter pollution):

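    # Count how often each parameter name occurs in the request
    SecRule ARGS_NAMES "@unconditionalMatch" \
        "id:1001,\
        pass,\
        setvar:'TX.paramcounter_%{MATCHED_VAR_NAME}=+1'"

    # Deny if any counter exceeds 1; the chained rule captures the
    # parameter name into TX.1 for use in the alert message
    SecRule TX:/paramcounter_.*/ "@gt 1" \
        "id:1002,\
        deny,\
        msg:'HTTP Parameter Pollution (%{TX.1})',\
        chain"
        SecRule MATCHED_VARS_NAMES "TX:paramcounter_(.*)" \
            "capture"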

The first rule builds up an internal hash and counts the number of occurrences of every parameter. The second rule cycles over the hash and checks whether any of the parameters has a counter that is higher than 1. If that is the case, a chained rule uses a regular-expression back reference to extract the parameter name, which yields a well-formatted alert message.

Whitelisting

In 2005, firewall expert Marcus Ranum insisted that security systems need to be configured with a default deny policy on all levels. If you put that idea into practice on a WAF, you write a presumably long list of SecRule constructs like the one in our first example. You define value domains for every parameter of an application. But you will also cover the HTTP request method, the permissible HTTP headers, the cookies, and the list of acceptable location paths within the request line. You describe the application traffic down to the byte level. This is called positive security.

ModSecurity allows for such a rule set. However, the big security benefit is overshadowed by the cost of maintenance. Because the rule set is so tight, every change to the application will result in requests being blocked. That is not really practical. In real life, though, there is a benefit to using partial whitelists. Such a rule set only covers the parts of the application where you have identified specific threats, such as the login and registration forms. These resources are publicly accessible, which makes them easy targets and a good place for a partial whitelist. Once the user has authenticated, the risk is smaller and you can decide to go without the whitelisting for the rest of the application.
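
A partial whitelist for a login form could be sketched as follows. The /login path, the parameter names, the value domains, and the rule IDs are all assumptions made for this example; a real form would need its own description:

    # Skip the whitelist for every request outside of the login path.
    # One skip rule per phase, since a skip applies to the phase in
    # which it is triggered.
    SecRule REQUEST_URI "!@beginsWith /login" \
        "id:1040,phase:1,pass,nolog,skipAfter:END_WHITELIST_LOGIN"
    SecRule REQUEST_URI "!@beginsWith /login" \
        "id:1041,phase:2,pass,nolog,skipAfter:END_WHITELIST_LOGIN"

    # Only GET and POST are acceptable request methods
    SecRule REQUEST_METHOD "!@rx ^(?:GET|POST)$" "id:1042,phase:1,deny"

    # Only these two parameters may be submitted ...
    SecRule ARGS_NAMES "!@rx ^(?:username|password)$" "id:1043,phase:2,deny"

    # ... and the username is limited to a narrow value domain
    SecRule ARGS:username "!@rx ^[a-zA-Z0-9._-]{1,32}$" "id:1044,phase:2,deny"

    SecMarker END_WHITELIST_LOGIN

Everything between the skip rules and the marker applies to the login form only; requests to the rest of the application jump straight to the marker and are not affected.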

Still, rule writing is hard and takes some experience. Unfortunately, the online documentation of ModSecurity is not very comprehensive. It is also scattered over multiple sites and many blog posts and tutorials. This is why the ModSecurity Handbook by Ristić has achieved a quasi-official status. The book came out in 2010 and it is now showing its age. Ristić asked me to write the second edition of the handbook. I spent the whole summer on this venture and the manuscript is now complete. While the book is being finished, you can purchase an early-access online version from the publisher, Feisty Duck.

Predefined rule sets

Given the difficulty of writing rules and the disadvantages of whitelisting rules, most people do not write their own rules. They use predefined sets of blacklist rules instead. ModSecurity blacklist rules are a series of rules that try to catch exploit payloads before they hit a server. In their basic form, these lists of patterns look like a list of Snort signatures, or they resemble a long list of exploit patterns you might find in anti-virus databases. Using the blacklist rules on a web server leads to performance problems. And as with Snort or anti-virus software, you are only catching known exploits. New attacks will evade the rule set, as will re-implementations of known exploits with a different fingerprint.

So this approach has not been successful. What works better is generic blacklisting, that is, inspecting the traffic and looking for signs of an attack. If you parse an argument within a request and it smells like an SQL injection, then you can deny access. SQL injections come in a variety of forms, so you will need a variety of rules to catch them. The OWASP ModSecurity Core Rule Set is a collection of rules that provides this functionality. A second article will describe this rule set and the new features provided by the CRS 3.0 release.
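
To illustrate the difference from a signature list, here is a minimal sketch of a generic rule. It is not taken from the Core Rule Set; it merely assumes a ModSecurity 2.x build that includes the @detectSQLi operator, and the rule ID and message are invented:

    # Run every request parameter through the built-in SQL-injection detection
    SecRule ARGS "@detectSQLi" \
        "id:1050,\
        phase:2,\
        deny,\
        msg:'Possible SQL injection in %{MATCHED_VAR_NAME}'"

Instead of listing individual exploit strings, the operator tokenizes each parameter value and decides whether it looks like a fragment of SQL, so a single rule covers a whole class of payloads.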

Posted Dec 16, 2016 16:49 UTC (Fri) by smurf (subscriber, #17840)

Surnames may have spaces, non-ASCII letters, apostrophes, and whatnot. Summary: Please don't do that.

Posted Dec 17, 2016 4:50 UTC (Sat) by dune73 (guest, #17225)

Sure thing. It's a simple example with a simple regex.

The real world rules for free text fields are a bit more complex.

Posted Dec 17, 2016 11:15 UTC (Sat) by anselm (subscriber, #2796)

Actually, people may not even have surnames. Fortunately the original regex takes that into account; let's hope that the actual application does, too.

Posted Dec 18, 2016 4:57 UTC (Sun) by dune73 (guest, #17225)

It is tempting to do the full input validation via ModSecurity rules. But the client and the application are in a much better position to do so.

Not having a surname is a typical example. It's up to the application to decide what to do with such a registration. ModSecurity should concentrate on security and leave people without a surname alone.