Not logged in
Log in now
Create an account
Subscribe to LWN
LWN.net Weekly Edition for May 23, 2013
An "enum" for Python 3
An unexpected perf feature
LWN.net Weekly Edition for May 16, 2013
A look at the PyPy 2.0 release
A hash-based DOS attack on Btrfs
Posted Dec 13, 2012 22:14 UTC (Thu) by masoncl (subscriber, #47138)
I'm certainly happy to add a second hash function if there is a strong case for it, but so far I'm not convinced ;)
Posted Dec 13, 2012 22:52 UTC (Thu) by therealmik (subscriber, #87720)
While I guess server software, such as web servers, samba, etc could always store a random filename then create an index for it in a database, that's exactly what a directory is supposed to be doing anyway - and the database probably has the same vulnerability in whatever hash it uses.
So assuming that you come around to the fact that filename hash collisions are a bad thing, it leads us to:
- Directories probably need a random hash seed for their index (one for the whole FS probably won't cut it)
- crc is not an appropriate hash, as even with a random seed an attacker can still cause collisions
Posted Dec 13, 2012 23:06 UTC (Thu) by masoncl (subscriber, #47138)
1) Users share writable directories (either with each other or with admins)
2) Blocking the creation of specific filenames causes problems
3) Blocking the creation of those filenames by creating a file with that name is not an issue
Posted Dec 13, 2012 23:20 UTC (Thu) by man_ls (subscriber, #15091)
I would discard the original name and use a random filename, unless for some reason the original name was valuable for anything (e.g. for later user selection). Such use cases do exist, and besides there are many software packages that work this way. That using a new shiny filesystem would automatically convert them from ugly software to DoS targets is not admissible IMO.
Posted Dec 14, 2012 0:20 UTC (Fri) by cyanit (guest, #86671)
In addition to SipHash, taking the latest 64 bits of AES-CBC ciphertext of the zero-terminated filename should work and might be faster if the CPU can accelerate AES.
Posted Dec 14, 2012 0:48 UTC (Fri) by wahern (subscriber, #37304)
Siphash was designed for this exact scenario. Dan Bernstein isn't known for his lack of performance insights. The paper has gobs of analysis of performance characteristics.
Using a CRC as a hash is just stupid. And I say that as someone who learned that the hard way. People only used it because of the lack of a decent alternative--newer hashes like murmur, etc, didn't have enough proven guarantees to inspire confidence. Siphash was designed to be _the_ go to hash for data structures, with both decent performance and a rigorous security analysis.
Posted Dec 14, 2012 2:06 UTC (Fri) by masoncl (subscriber, #47138)
This isn't a new issue, reiserfs had a similar hashing scheme. Other hashes are different, but crc32c is implemented in hardware on newer CPUs, so there is a real performance benefit here.
Posted Dec 14, 2012 2:21 UTC (Fri) by therealmik (subscriber, #87720)
Another example would be somebody just wanting to hog a lot of in-kernel CPU time, and causing conflicts with their own filename, even though they're rate-limited by higher-level schemes (eg. server-level rate-limits).
Creating a conflict with a lock file created in the same directory is probably another example.
As a previous poster alluded, crc32c is good for detecting random corruption, not for hash tables or malicious corruption.
The CPU implementation of crc32c argument is also deeply flawed - extra tree walks are not worth the few cycles saved, as tree walks are vastly more expensive.
Remember that hash tables are only O(1) in a "perfect hash" scenario, so the hash table performance approaches what you'd expect from O(n) as the hash gets worse.
Posted Dec 14, 2012 3:30 UTC (Fri) by davidescott (guest, #58580)
Chris (if you are still monitoring the thread),
How would one go about strengthening the FS and making it more resistant to these kinds of attacks? I gather that BTRFS is capable of supporting other hashes -- what are the various options down the road? It it an option at filesystem creation time to use a more collision resistant hash? How big a hash can one use? If one had a hardware SHA-512 and wanted to waste the disk space could they? Are there ways to salt hashes using something secret in the directory structure? Could the Hash be replaced with red-black trees etc?
Given the current on disk layout where would the introduction of protections against this kind of attack introduce btrfs2?
Posted Dec 14, 2012 12:35 UTC (Fri) by masoncl (subscriber, #47138)
Salting etc are also more than possible. I personally don't think the hash makes the shared directories in any way secure, but I've been wrong many times before ;)
Posted Dec 14, 2012 16:09 UTC (Fri) by cesarb (subscriber, #6266)
But if the hash output is visible via telldir, the attacker can still find out the salt unless you are using strong crypto, which would not be very good for performance. I think the hash output is not visible via telldir on the ext* family.
You could play games like having a per-directory randomly chosen salt, but if you are getting to that point IMHO it is better to work on making the algorithms robust against degenerate edge cases.
Posted Dec 14, 2012 17:01 UTC (Fri) by davidescott (guest, #58580)
Hopefully nobody exports a "tell/read/seekdir" interface over php.
Posted Dec 14, 2012 17:35 UTC (Fri) by dakas (guest, #88146)
A CRC is not a cryptographic hash. Salting is useless, and identical prefixes have identical impact on the result.
Posted Dec 14, 2012 21:56 UTC (Fri) by masoncl (subscriber, #47138)
Posted Dec 14, 2012 2:13 UTC (Fri) by cyanit (guest, #86671)
Posted Dec 14, 2012 2:54 UTC (Fri) by cyanit (guest, #86671)
However, if other instructions can be scheduled by the out-of-order CPU in the meantime, then AES should be much faster, since the reciprocal throughput is actually 10 cycles on Sandy Bridge and should only take up one execution unit.
Without AES instructions things are most likely reversed.
Posted Dec 14, 2012 11:23 UTC (Fri) by daniel (subscriber, #3181)
Posted Dec 14, 2012 15:27 UTC (Fri) by drag (subscriber, #31333)
Posted Dec 15, 2012 2:11 UTC (Sat) by daniel (subscriber, #3181)
The most important result here is not the inappropriateness of CRC as a hash, but that Btrfs is a complex beast, still in development, a critical bug was just exposed, and likely a few more remain. Given wide enough testing and continued developer commitment it will become stable, but today it is not. From where I sit, it looks like Ext4 will be wearing its standard Linux filesystem crown for some time yet.
Posted Dec 15, 2012 6:17 UTC (Sat) by SEJeff (subscriber, #51588)
Posted Dec 15, 2012 13:18 UTC (Sat) by vonbrand (subscriber, #4458)
From a cursory look over, I'd say 220 *minutes* isn't just a bad hash degenerating to a linear search...
Posted Dec 15, 2012 2:46 UTC (Sat) by wahern (subscriber, #37304)
Posted Dec 14, 2012 10:30 UTC (Fri) by robert_s (subscriber, #42402)
Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds