Savannah.gnu.org compromised [LWN.net]

Savannah.gnu.org compromised

Posted Nov 30, 2010 16:03 UTC (Tue) by Rubberman (guest, #70320) [Link] (5 responses)

I have been designing and implementing large-scale transaction processing systems for decades. SQL injection attacks are unconscionable, in my opinion. ALL data values passed to a database access layer should by necessity be in the form of bound variables, and the underlying layer should NEVER just convert the data to standard SQL strings, but should indeed bind them to variables such as :data1, etc. This will TOTALLY eliminate the possibility of SQL injection compromises. Unfortunately, programmers (not real software engineers) are lazy and often trade safety and effectiveness for time to delivery. That extra 5 minutes to do it "right" will save many, many hours in the future, as in that old saying "You can pay me now, or you can pay me 10x later!".
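To make the difference concrete, here is a minimal sketch using Python's standard sqlite3 module. The table, column names and the hostile input are made up for illustration; only the contrast between pasting a value into the SQL text and binding it to a variable such as :name is the point.

```python
import sqlite3

# Hypothetical in-memory schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, real_name TEXT)")
conn.execute("INSERT INTO users (id, real_name) VALUES (:id, :name)",
             {"id": 1, "name": "alice"})

def find_user_unsafe(name):
    # The value is pasted into the SQL text, so the input can change the query.
    query = "SELECT id FROM users WHERE real_name = '" + name + "'"
    return conn.execute(query).fetchall()

def find_user_bound(name):
    # The value travels separately as a bound variable and is never parsed as SQL.
    return conn.execute("SELECT id FROM users WHERE real_name = :name",
                        {"name": name}).fetchall()

hostile = "nobody' OR '1'='1"
unsafe_rows = find_user_unsafe(hostile)  # the OR clause matches every row
bound_rows = find_user_bound(hostile)    # no user has that literal name
```

With the concatenated query the hostile string rewrites the WHERE clause; with the bound variable it is just an unusual name that matches nothing.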

Savannah.gnu.org compromised

Posted Nov 30, 2010 16:14 UTC (Tue) by Trelane (subscriber, #56877) [Link] (2 responses)

Seems to me that the SQL design decision of mixing commands and data in one big user-specified string is the root of the problem.

Savannah.gnu.org compromised

Posted Nov 30, 2010 16:33 UTC (Tue) by Cato (guest, #7643) [Link]

You could say the same about shell scripts where commands mix constants and variables, or almost any language that lets you execute a dynamically created string at run time.

At least SQL has provided bound variables for a very long time - even the early embedded SQL tools for C, COBOL, etc., provided them, so there's really no reason not to use them. The problem is that using bound variables takes a small amount of extra coding (and in old versions of MySQL, has a performance impact due to lack of query caching) - hence it's easier to skip the perceived extra hassle of using bound variables, with dire results for security.

To be fair, there's also the basic issue of input parameter validation which is behind so many web application vulnerabilities including remote command execution, file inclusion, cross-site scripting (XSS) and SQL injection.

However using bound variables is the cleanest way to stop SQL injections and doesn't depend on 100% correct input validation.

Savannah.gnu.org compromised

Posted Dec 1, 2010 10:22 UTC (Wed) by dgm (subscriber, #49227) [Link]

Nothing to do with SQL. The problem lies in the libraries used to send queries from your code to the server.

Those libraries should prevent any kind of constant value in the query string, and force all values through bound variables.

A pass-through should also exist, otherwise applications like a SQL expression editor or a database shell would be impossible, but they should be made so inconvenient as to prevent casual usage, and keep it for the things that really need this capability.
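A rough sketch of what such a library interface might look like, using Python's sqlite3. Everything here is hypothetical: the function names are invented, and the "no constants" check is deliberately crude (it rejects any quote or digit in the query text), so treat it as an illustration of the idea rather than a usable filter.

```python
import sqlite3

class LiteralInQueryError(ValueError):
    """Raised when the query text appears to contain an inline constant."""

def query(conn, sql, params=()):
    # Crude check (an assumption of this sketch): reject quotes and digits,
    # so every value has to arrive through `params` as a bound variable.
    if any(ch in sql for ch in "'\"") or any(ch.isdigit() for ch in sql):
        raise LiteralInQueryError("pass values as bound parameters: " + sql)
    return conn.execute(sql, params).fetchall()

def execute_raw_sql_i_accept_the_injection_risk(conn, sql):
    # Deliberately awkward pass-through for tools such as a database shell.
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
execute_raw_sql_i_accept_the_injection_risk(conn, "CREATE TABLE projects (name TEXT)")
query(conn, "INSERT INTO projects (name) VALUES (?)", ("savane",))
rows = query(conn, "SELECT name FROM projects WHERE name = ?", ("savane",))

try:
    query(conn, "SELECT name FROM projects WHERE name = 'savane'")
    rejected = False
except LiteralInQueryError:
    rejected = True
```

The long, uncomfortable name on the pass-through is the "inconvenient" part: it stays available for the tools that need it but is hard to reach for by accident.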

Language support needed

Posted Nov 30, 2010 19:18 UTC (Tue) by talex (guest, #19139) [Link]

With just a little help from the programming language, you can make doing it right easy, and doing it wrong extremely difficult. For example, E makes it trivially easy to define alternative ways to interpolate values into expressions (besides plain string interpolation) so you can embed SQL in E like this, despite the language not supporting SQL directly:

def addUser(id, name) {
    sql`INSERT INTO users (id, realName) VALUES ($id, $name)`
}

def getName(id) {
    def [name] := sql`SELECT realName FROM users WHERE id=$id`.singleton()
    return name
}
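For readers who don't know E, a loose Python analogue of the same idea might look like the following. This is not how E implements its quasi-literals; the sql() helper is invented for this sketch and simply rewrites each $name into a named bound variable for sqlite3 instead of interpolating the value into the string.

```python
import re
import sqlite3

_PLACEHOLDER = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")

def sql(conn, template, **values):
    # Rewrite each $name into a bound variable (:name); the values themselves
    # are handed to the driver separately and never become part of the text.
    bound_sql = _PLACEHOLDER.sub(lambda m: ":" + m.group(1), template)
    return conn.execute(bound_sql, values).fetchall()

conn = sqlite3.connect(":memory:")
sql(conn, "CREATE TABLE users (id INTEGER, realName TEXT)")

def add_user(user_id, name):
    sql(conn, "INSERT INTO users (id, realName) VALUES ($id, $name)",
        id=user_id, name=name)

def get_name(user_id):
    [(name,)] = sql(conn, "SELECT realName FROM users WHERE id=$id", id=user_id)
    return name

add_user(1, "Robert'); DROP TABLE users;--")
stored_name = get_name(1)
```

The query reads like string interpolation, but a hostile name is stored and returned as plain data.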

Savannah.gnu.org compromised

Posted Dec 1, 2010 2:23 UTC (Wed) by alankila (guest, #47141) [Link]

As a point of note, there are actually valid reasons to embed parameters in the query sometimes.

That is the case when the optimizer builds the query plan at statement-preparation time. Sometimes it helps a great deal to know what some important variable values are: for instance, will filtering by foo = 'bar' give 50 rows or 5000 rows out of a table with 50000 rows?

If all you say is foo = :foo, the database can't use detailed statistics, and for some use cases this can have a significant penalty.

Savannah.gnu.org compromised

Posted Nov 30, 2010 16:39 UTC (Tue) by madscientist (subscriber, #16861) [Link] (3 responses)

Unfortunately the original codebase that Savannah is derived from didn't do a good job of guarding against these types of security issues. There was a previous pass by the GNU folks to try to clean it up but apparently some areas were missed. It's one thing to build good security in when it's an original design goal: it's MUCH more complicated to add it on after the fact.

I'm not sure what the underlying language is for Savannah, but it's also the case that some languages are much better suited for this type of thing than others.

One nice thing is that, unlike some other site compromises, we can probably expect to get a very complete explanation of what happened from the FSF, and what steps are being taken in response.

Savannah.gnu.org compromised

Posted Nov 30, 2010 17:16 UTC (Tue) by lolando (guest, #7139) [Link] (2 responses)

> It's one thing to build good security in when it's an original design goal: it's MUCH more complicated to add it on after the fact.

I can confirm that. As developer of another derivative from the same original code (FusionForge, also coming from SourceForge via GForge), I spent many many *many* hours rewriting all our database access to use parametrized queries. I feel the pain of the Savannah guys.

Savannah.gnu.org compromised

Posted Nov 30, 2010 20:34 UTC (Tue) by Los__D (guest, #15263) [Link] (1 responses)

Have you considered contributing* the code to the Savannah folks, or has the code diverged too much?

* As in pointing them to your changes, in case they didn't know that they existed.

Savannah.gnu.org compromised

Posted Dec 1, 2010 9:31 UTC (Wed) by lolando (guest, #7139) [Link]

I probably mentioned it to them, yeah. But the patch itself is fairly intrusive and, as you guessed, our most recent common ancestor is more than 10 years old so it's of little practical value. Most of it is replacing db_query("SELECT foo FROM bar WHERE key='$value'") with db_query_params('SELECT foo FROM bar WHERE key=$1',array($value)), which can be semi-automated for about 80% of the queries. The only part of it that's not boring grunt-work (Perl be praised) is a mechanism to handle complex queries built on the fly with a varying number of tables in the join, a varying number of WHERE clauses, and so on, with a combinatorial explosion that rules out writing all the possible queries in advance and requires some dynamic stuff.

(If anyone's interested, https://fusionforge.org/scm/viewvc.php/trunk/src/common/i... has the implementation)
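As an illustration of the "dynamic stuff" (not the FusionForge implementation, which is behind the link above), a query builder can assemble the SQL text and the parameter list side by side. The build_select() helper below is hypothetical and uses ? placeholders rather than the $1 style shown above.

```python
def build_select(table, columns, filters):
    """Return (sql, params) for a SELECT with a varying number of WHERE clauses.

    `table`, `columns` and the filter column names must come from trusted
    code, never from user input; only the filter values are bound.
    """
    clauses, params = [], []
    for column, value in filters:
        clauses.append(column + " = ?")  # placeholder goes into the SQL text
        params.append(value)             # value goes into the parameter list
    sql = "SELECT " + ", ".join(columns) + " FROM " + table
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params

two_filters = build_select("bar", ["foo"], [("key", "abc"), ("owner", "x' OR '1'='1")])
no_filters = build_select("bar", ["foo"], [])
```

However many clauses end up in the query, user-supplied values only ever appear in the parameter list.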

Gna! too? was: Savannah.gnu.org compromised

Posted Nov 30, 2010 16:52 UTC (Tue) by AlexHudson (guest, #41828) [Link] (1 responses)

Looks like Gna! admins are investigating an issue too.

Safe to assume that someone's been prodding the Savane-based installs at least :(

Gna! too? was: Savannah.gnu.org compromised

Posted Nov 30, 2010 17:36 UTC (Tue) by AlexHudson (guest, #41828) [Link]

Looking at this:

https://mail.gna.org/public/project/2010-11/msg00036.html

It actually looks like the Gna! admins pulled their web interface as a preventive measure; no actual security problem has (yet?) been detected.

Savannah.gnu.org compromised

Posted Nov 30, 2010 22:45 UTC (Tue) by aliguori (subscriber, #30636) [Link] (1 responses)

We host our project on Savannah and once again, I'm incredibly happy that we use git. Once we figured out who was the last person that pushed, we were able to sync up our repository to another server and keep going as if nothing happened.

If we had been using a centralized revision control system (we used to use SVN on Savannah), it would have been a nightmare.

Savannah.gnu.org compromised

Posted Dec 1, 2010 10:10 UTC (Wed) by dgm (subscriber, #49227) [Link]

Not only that. You can also instantly tell if the repo has been compromised thanks to git hashing all its history.
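A small sketch of why that works. The blob id function follows Git's actual scheme (SHA-1 over a short header plus the content); the commit id function is a toy model, not Git's real commit format, and uses the blob id as a stand-in for the tree id.

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    # Git names a blob by the SHA-1 of "blob <size>\0" followed by the content.
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()

def toy_commit_id(parent_id: str, tree_id: str, message: str) -> str:
    # Toy model: the id covers the parent id, so changing any ancestor
    # changes the id of every commit that comes after it.
    return hashlib.sha1("\n".join([parent_id, tree_id, message]).encode()).hexdigest()

def history_ids(file_versions):
    ids, parent = [], ""
    for n, content in enumerate(file_versions):
        parent = toy_commit_id(parent, git_blob_id(content), "commit %d" % n)
        ids.append(parent)
    return ids

original = history_ids([b"v1\n", b"v2\n", b"v3\n"])
tampered = history_ids([b"v1\n", b"v2 plus backdoor\n", b"v3\n"])
```

The first id is identical in both histories, but the tampered second commit and everything after it get different ids, so comparing the tip against a trusted clone is enough to notice.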

Forges as a target

Posted Dec 1, 2010 10:02 UTC (Wed) by Trou.fr (subscriber, #26289) [Link]

After BerliOS was compromised in January (http://lwn.net/Articles/369633/), Savannah has now been targeted.
It is no surprise that forges are targeted: what a perfect way to add backdoors into distributions. While those two were detected (btw, the savannah folks are way more transparent than BerliOS, which basically said "fuck you" to people asking for more detail), how many more went undetected?

Savannah.gnu.org compromised

Posted Dec 1, 2010 11:49 UTC (Wed) by Karellen (subscriber, #67644) [Link] (8 responses)

From the TODO list:

Implement crypt-md5 support (like /etc/shadow, strong and LDAP-compatible) hashes, or possibly crypt-sha2

What? This wasn't already in place?

Savannah.gnu.org compromised

Posted Dec 2, 2010 14:51 UTC (Thu) by madscientist (subscriber, #16861) [Link] (7 responses)

I'm not sure what you're suggesting; the passwords WERE encrypted. They were just using an older version of the hash: the "traditional" UNIX hash algorithm. As far as I'm aware that algorithm is theoretically more vulnerable but it's not by any stretch trivially vulnerable. The main reason Linux et al. moved to newer algorithms was that many companies require NIST-certified encryption and the original wasn't.

Anyway, based on the reports provided the passwords were cracked by brute-force, not by decrypting them. A better hashing algorithm wouldn't have helped in that situation anyway (unless possibly the algorithm took orders of magnitude longer to run, which could have slowed the crack attempt--I'm not sure if the new algorithms are enough slower to make this a legitimate deterrent).

Savannah.gnu.org compromised

Posted Dec 2, 2010 16:43 UTC (Thu) by anselm (subscriber, #2796) [Link] (4 responses)

Being a variant of DES, the traditional Unix password encryption scheme can only distinguish a fairly small number of passwords (2⁵⁶, to be exact – a password consists of eight seven-bit characters, and if you enter something longer it will be truncated to length eight). In the decades since the scheme was originally put forward, computers have become fast enough so that it is no longer a big deal to work through all 2⁵⁶ possible passwords. It has also become reasonable to precompute and store tables of encrypted passwords, so if one gains access to a list of encrypted passwords it becomes trivial to find the corresponding unencrypted ones.

The more modern schemes use hash functions with much longer results (128 bits for MD5) and also allow longer unencrypted passwords, so precomputed tables are a lot less feasible. As far as brute-force attacks go, people have made progress doing those using cloud-based services or botnets, so even the weaker »strong« hash functions like MD5 or SHA-1 probably ought to be on their way out.

Savannah.gnu.org compromised

Posted Dec 3, 2010 16:36 UTC (Fri) by tialaramex (subscriber, #21167) [Link] (3 responses)

Traditional crypt is weaker than anyone would like for a new system, but I think you sell it very short here.

"no longer a big deal" might be a fair description of something which utterly collapses when reasonable resources are engaged, like say the LM hash where you can use pre-computed rainbow tables that fit easily on an ordinary desktop PC, then have password recovery take a few seconds (for disk seeks and re-computation) per hash.

But for crypt you could pick the sweet spot of CPU power per dollar, spend hundreds of thousands of dollars, and still take months per hash. If you can afford petabytes of storage to go with those CPUs you could pre-compute, and only do the work once, but that's yet more hundreds of thousands of dollars because now you're building a SAN as well as a compute farm.

Knowing this, and given that nobody thinks the NSA broke into Savannah, it's no surprise that the reason the passwords were trivially determined is that they used unsalted MD5, which is far worse than crypt despite the apparent improvement of 128 bits vs 56. Partial rainbow tables (covering e.g. "HaRdPwD" or "linux98") for unsalted MD5 are freely available anywhere files of dubious legality are traded and probably the black hats went from the list of hashes to their first few dozen passwords in seconds, with nothing more than a home PC.
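To show why unsalted hashes fall so quickly, here is a minimal sketch with a plain dictionary standing in for a precomputed table (real rainbow tables use hash chains to save space, which this does not attempt). The candidate list and the salt value are made up.

```python
import hashlib

def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode()).hexdigest()

# Precompute once; reuse against every stolen list of unsalted hashes.
candidates = ["HaRdPwD", "linux98", "password", "letmein"]
lookup = {md5_hex(word): word for word in candidates}

stolen_hashes = [md5_hex("linux98"), md5_hex("something-not-in-the-table")]
recovered = [lookup.get(h) for h in stolen_hashes]

# With a salt mixed in, the same password no longer hashes to a table entry,
# so the table would have to be rebuilt for every salt value.
salted_hash = md5_hex("Xy" + "linux98")  # "Xy" is a hypothetical salt
salted_hit = lookup.get(salted_hash)
```

Each lookup is a single dictionary access, which is why the first few dozen weak passwords can be recovered in seconds.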

Savannah.gnu.org compromised

Posted Dec 3, 2010 18:23 UTC (Fri) by paulj (subscriber, #341) [Link] (2 responses)

The problem is that this CPU power is reusable. If someone who has already generated long lists of string->DES(string) puts them online in tabular form, then others can make use of that work. And guess what, people have done this for DES (and MD5) for both unsalted and salted strings. Google for rainbow tables (NB: they're not necessarily available for free). LWN even did an article on them: http://lwn.net/Articles/208418/.

Rainbow tables - not well suited to crypt(3)

Posted Dec 4, 2010 15:26 UTC (Sat) by tialaramex (subscriber, #21167) [Link] (1 responses)

Rainbow tables are a good choice for plain MD5 (note, not PHK's MD5-based crypt(3) as used in some Unix systems) because it is unsalted.

Rainbow tables are a bad choice for traditional crypt(3) because it has salt. Not enough salt by modern standards, but still a 12-bit salt means your time-space tradeoff in choosing Rainbow tables just got 4096 times worse. You will need to crack many tens of thousands of equally valuable passwords to amortize the additional cost compared with brute force.

[NB the lwn article you linked is about a Windows hash scheme which uses unsalted DES. Microsoft chewed through three or four different lousy password hash schemes before finally getting one that doesn't suck worse than the one they could have cloned from Unix 30 years ago...]

To give some idea of what that 4096 number means in the real world:

If you can afford disk space and CPU time to build a Rainbow table for MD5 that covers the alphabet plus digits (A-Za-z0-9) for up to 8 characters (a lot of work, but enough to reverse many passwords), then if you try salted crypt(3) instead you'll only have resources for up to 6 characters instead.
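The arithmetic behind that drop from 8 to 6 characters can be checked directly. This sketch only counts candidate passwords; it assumes, as the paragraph above does for the moment, that both hashes cost the same to compute.

```python
def keyspace(alphabet_size: int, max_length: int) -> int:
    # Number of passwords of length 1..max_length over the given alphabet.
    return sum(alphabet_size ** n for n in range(1, max_length + 1))

ALPHANUMERIC = 26 + 26 + 10   # A-Za-z0-9
SALTS = 2 ** 12               # 12-bit salt in traditional crypt(3)

budget = keyspace(ALPHANUMERIC, 8)   # work affordable for unsalted MD5
per_salt = budget // SALTS           # what is left once every salt needs its own table

covers_five = per_salt > keyspace(ALPHANUMERIC, 5)
covers_six = per_salt >= keyspace(ALPHANUMERIC, 6)
```

Two alphanumeric characters are worth 62 * 62 = 3844 combinations, just under the 4096 salt values, so the same budget lands slightly short of the full 6-character space.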

You will miss any half-way secure passwords this way, which makes it of very limited value.

But the story gets better yet, because the above assumed the two hashes were equally expensive to calculate. But this isn't true unless you have the resources to use custom hardware; on general purpose CPUs crypt(3) was deliberately pessimised, and remains annoyingly expensive in CPU time for brute forcers or calculating Rainbow tables.

So the difference between traditional crypt(3) and MD5 ends up being way more than a constant factor 4096 for Rainbow tables. Enough more that I've never seen a public project to attempt even the somewhat useless six alphanumerics I described above for crypt(3).

Is it possible there's a private project? Yes. Is it likely? Hard to say, ask someone with more relevant expertise. Would it be expensive? Undoubtedly yes. So then is it likely script kiddies would use it to break into Savannah? No.

Rainbow tables - not well suited to crypt(3)

Posted Dec 4, 2010 16:57 UTC (Sat) by paulj (subscriber, #341) [Link]

The Savannah implementation was using unsalted hashes though, if I've read this story correctly. Though, I don't know why I replied to the person I did, they're clearly aware of rainbow tables. My comment seems quite misplaced as a response to that particular comment, apologies to them.

I take your point on salting, but I still don't think it's infeasible for DES crypt, given that the 2^12 fold increase is still of a significantly lower scale than the size of botnets that are readily available for sale on the black-hat market. I'd agree DES crypt() isn't a sufficiently juicy target anymore (what uses it anymore?), but if someone /did/ care enough to brute-force DES crypt and if they expected to do this regularly enough, wouldn't they be daft to NOT store the results? E.g. if you needed O(100)GB of space for the total rainbow table, and if you could risk stealing 10GB from each computer in your botnet, then obviously you're looking at O(40k) size botnet to store it. That seems well within realms of feasibility... If lower probabilities of password recovery suffice (cause you're not interested in recovering specific ones), you can use less space and/or a smaller botnet.
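One way to arrive at the O(40k) figure, assuming the O(100)GB refers to the table for a single salt value and the 2^12 salt factor is applied on top (an assumption; the paragraph above does not spell this out):

```python
per_salt_table_gb = 100     # assumed size of a table for one salt value
salt_values = 2 ** 12       # 12-bit salt in DES crypt
donated_gb_per_host = 10    # space taken from each machine in the botnet

total_gb = per_salt_table_gb * salt_values
hosts_needed = total_gb // donated_gb_per_host
```

Under those assumptions the total comes to roughly 400 TB spread over about 41,000 hosts.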

And just generally, regardless of the size of the salt, if you're going to spend time computing it, you might as well store the result. The multiplicative increase of the salt need /not/ apply fully to your storage requirements, because you can trade storage space for computation with longer chains.

Savannah.gnu.org compromised

Posted Dec 3, 2010 12:39 UTC (Fri) by Trou.fr (subscriber, #26289) [Link] (1 responses)

Apparently, the passwords were stored using raw (unsalted) MD5, for which current implementations on a single GFX card manage up to 1 billion tries/s. So basically if you have passwords with fewer than 10 characters and upper/lower/digits/special, you're at risk.
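A back-of-the-envelope sketch of what that rate means per password length. The 95-character alphabet (printable ASCII, including space) is an assumption standing in for "upper/lower/digits/special", and the figures are worst-case times for a single card to try every password of exactly that length.

```python
TRIES_PER_SECOND = 1_000_000_000   # the single-card rate quoted above
PRINTABLE_ASCII = 95               # assumed alphabet size

def days_to_exhaust(length: int) -> float:
    # Time to try every password of exactly `length` characters.
    return PRINTABLE_ASCII ** length / TRIES_PER_SECOND / 86400

for length in range(6, 10):
    print("%d chars: %.2f days" % (length, days_to_exhaust(length)))
```

Each extra character multiplies the time by 95, and each extra card divides it, so the practical cut-off depends on how much hardware the attacker brings.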

Storing passwords securely is easy: use a salt and many iterations of a secure hash function. SHA-256 crypt or bcrypt are good candidates. See http://www.openwall.com/crypt/
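A minimal sketch of the "salt plus many iterations" recipe using only Python's standard library. Note that it uses PBKDF2-HMAC-SHA256, not SHA-256 crypt or bcrypt as named above, and the iteration count is an arbitrary value chosen for illustration.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative value, not a recommendation

def hash_password(password: str, salt: bytes = None):
    # A fresh random salt per password makes precomputed tables useless.
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    _, digest = hash_password(password, salt)
    # Constant-time comparison of the stored and recomputed digests.
    return hmac.compare_digest(digest, expected)

salt_a, digest_a = hash_password("linux98")
salt_b, digest_b = hash_password("linux98")
```

The same password produces two different stored digests because the salts differ, and every guess costs the attacker the full iteration count.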

Source: http://www.fsf.org/blogs/sysadmin/savannah-and-www.gnu.or... (search for "unsalted MD5")

Savannah.gnu.org compromised

Posted Dec 3, 2010 13:35 UTC (Fri) by paulj (subscriber, #341) [Link]

This makes me glad I generate distinct random passwords for all administratively distinct services (I write them down in a secure place).

Reconstruction of the incident

Posted Dec 1, 2010 15:49 UTC (Wed) by codewiz (subscriber, #63050) [Link]

The Free Software Foundation website has published a detailed chronology of the incident.