SQL injection vulnerabilities in PostgreSQL
A recent urgent update to PostgreSQL vividly demonstrates the problems with validating user input that are the foundation of SQL injection attacks. Widely used techniques to escape characters in user input can still allow SQL injection when coupled with multibyte character encodings. While this problem was first discovered in PostgreSQL, today's security fix announcement for MySQL indicates a similar problem there as well.
As discussed in the LWN SQL injection article, inserting strings of user input into SQL queries can be hazardous. Many applications do little or no validation of strings entered by a user before dropping them into a query; this negligence can lead to a compromise of the entire database. Better behaved programs attempt to escape various troublesome characters (typically single-quote and backslash), but because of the multibyte-encoding problem, problems can remain.
It is not just database clients that need to validate user input, the database server needs to validate as well as the first bug shows. PostgreSQL allows the "\'" (backslash + single-quote) sequence to be used to represent a single-quote character in a query as well as the two single-quote character sequence ("''") that is the SQL standard. Unfortunately, the escaping code used by database clients often ignores the character encoding and just looks for bytes with a 0x27 ("'") value and replaces them with an escaped version. The security hole comes about because illegal multibyte character sequences can be used to enable quotes to slip past the escaping process. An example provided in the technical information describes how this can be done.
In the UTF8 encoding, the byte value 0xc8 introduces a two-byte character; the second byte must be within the range 0xa0-0xff. However, PostgreSQL would accept any value for the second byte and treat both bytes as a single character. A malicious user could enter "0xc8'text", which would be converted by the well meaning client to "0xc8''text" (or "0xc8\'text"); the server would then treat the 0xc8' or 0xc8\ sequence as a single character, leaving an unescaped single-quote in the input, effectively injecting the attacker-supplied text.
The second issue stems from certain far-eastern encodings where the value 0x5c ("\") is a valid value for the second byte of a two-byte character. In the SJIS encoding for example, the two-byte sequence 0x95 0x5c is a valid character, but a client that is not encoding-aware may try to escape the 'backslash' that it sees by doubling it. Adding single-quotes into the mix provides a means for a SQL injection. "0x95 0x5c'text" could become "0x95 0x5c\''text", which effectively inserts an unescaped single-quote into the query. It is interesting to note that 0x27 ("'") is not a valid value for the second byte of a two-byte character and, if PostgreSQL had rigidly adhered to the SQL standard and only accepted "''" to escape single-quotes, this issue would not exist.
There is a straightforward fix for the first problem: do not accept illegal multibyte character sequences and refuse to process queries that contain them. Unfortunately, the second problem is more complicated and there is no single simple fix on the database server side. If database clients did their escaping in an encoding aware manner, this problem would not exist; expecting this from all clients is hopeless, however. The PostgreSQL developers chose to disallow "\'" for any encoding that allows embedded 0x5c characters. This closes the hole for all clients that use "''" to escape single-quotes but still allows for injections for clients that use "\'". This change is likely to break those clients altogether, however.
Both of these problems could have been avoided by using prepared statements with placeholders (i.e. 'SELECT * FROM tbl WHERE id=?'). Even if the libraries did not implement the quoting correctly, the SQL engine would still not allow the parameter to be treated as anything but data for that particular spot in the query, thereby avoiding the injection. Another way to avoid this kind of problem is to use stored procedures. As these bugs show, it can be very difficult to appropriately filter and/or validate user input.
| Index entries for this article | |
|---|---|
| GuestArticles | Edge, Jake |
