Worst case performance and "impossibly unlikely" conditions
Posted Jan 11, 2012 23:56 UTC (Wed) by nix (subscriber, #2304)
In reply to: Worst case performance and "impossibly unlikely" conditions by epa
Parent article: Denial of service via hash collisions
The slower ordered-tree-based dictionary implementations... aren't necessarily slower. The last hash I implemented was a modified-splay-tree-based system in which each get of a given item rotated it 1/Nth of the way towards the top of the tree (N being customizable). Finding a random item scales as O(log n), which is admittedly slower than the O(1) of a perfect hash or a sufficiently empty bucket hash -- but repeated searches over the same set of keys end up O(1), or O(m) where m is the size of the searched set, and additions are always O(log n) as well. This is definitely not true of bucket hashes, which have to be periodically expanded, and perhaps contracted, for efficiency, leading to nasty spikes in the time profile. Having O(1) operations suddenly take on the order of O(n)-with-a-large-constant time while an expansion happens is... not good. I know the cost amortizes to O(1) per operation, but it's still a jagged and nasty way to get there.
That splay-tree-based hash had other nice properties, such as an iteration order stable under non-colliding modification, and even stable under collision with certain limitations (if you deleted the currently-iterated value, and earlier-iterated values had colliding hash values, you would see such values more than once). It was also really easy to lock for multithreaded use, and copying and freeing it was pleasantly easy.
But its primary attraction to me was the limited conditional count. Because of the absence of expansion code, the only conditional in the whole thing other than the odd-or-even part of the tree rotation code was the trivial five-line code to handle hash collisions, which could have been omitted had I chosen to record full 64-bit hash values. There was no rarely-executed expansion code, no rehashing, nothing rarely executed to harbour bugs. Or there wasn't until I added extra code to efficiently pack small-enough data items into the item pointers, anyway...
(The thing was under a proprietary license thanks to my old free-software-phobic employer, but I will probably be reimplementing it and a number of related data structures under the BSD license sometime soonish. I'm sick of not having them available. :)
Posted Jan 12, 2012 1:26 UTC (Thu) by cmccabe (guest, #60281)
There was a post about this in 2010 on the freebsd list:
http://lists.freebsd.org/pipermail/freebsd-hackers/2010-S...
Basically, it was a great idea in the 1980s, but I'm not sure how well it works these days.
Posted Jan 12, 2012 1:52 UTC (Thu) by wahern (subscriber, #37304)
It's easy enough to test. NetBSD, FreeBSD, and OpenBSD all have <sys/tree.h>, which contains preprocessor-based splay and red-black tree implementations that, other than the algorithm, are implemented almost identically. It's self-contained, so you can just download the single header file. See man 3 tree for the API.
Those kernels tend to rely on RB trees fairly heavily. OpenBSD uses <sys/tree.h> for almost everything, including PF and the VM. It's perhaps not as fast as the radix and other specialized trees in Linux, but it's simple and unlikely to spring any surprises in the future.
I wrote a left-leaning RB tree variant which is a drop-in replacement, except that the prefix is LLRB instead of RB. LLRBs use a simpler algorithm, so the code is easier to hack on if necessary.
Posted Jan 12, 2012 21:42 UTC (Thu) by cmccabe (guest, #60281)
Posted Jan 13, 2012 13:52 UTC (Fri) by nix (subscriber, #2304)
And I contend that in the vast majority of applications this is true. This site has a focus on kernel development, but most programs are not the kernel and most data structures are not randomly accessed on a large number of CPUs in a very hot path. And you need all of those things to be true at once before a splay tree's disadvantages become substantial. (That is not to say that it always has advantages over other implementations: I would only bother with a splay tree if I expected to be populating large hashes and then only using bits of them most of the time. But for some very common applications -- e.g. caching -- this is true.)
Posted Jan 18, 2012 23:36 UTC (Wed) by cmccabe (guest, #60281)
I'm still wary of splay trees because they generate dirty cache lines which need to be written back to main memory at some point. That eats into the available memory bandwidth. And of course, even if all of your accesses are single-threaded, there's always the possibility of false sharing.
Posted Jan 20, 2012 1:58 UTC (Fri) by Wol (subscriber, #4433)
The Pick database is based on bucket hashes, and modern implementations resize so fast the user won't notice. Time is proportional to the delta in data: given a 4K bucket, it takes exactly the same time to grow the file from 4K to 8K as from 1M (1024K) to 1028K -- 4K of new space either way.
Look up "linear hashing" on Wikipedia.
Cheers,
Wol
Posted Jan 26, 2012 18:09 UTC (Thu) by nix (subscriber, #2304)