LWN.net Logo

Worst case performance and "impossibly unlikely" conditions

Worst case performance and "impossibly unlikely" conditions

Posted Jan 13, 2012 13:52 UTC (Fri) by nix (subscriber, #2304)
In reply to: Worst case performance and "impossibly unlikely" conditions by cmccabe
Parent article: Denial of service via hash collisions

Indeed that is a downside. If your hash is often accessed from many different threads, and particularly if it is often randomly accessed, a splay tree may be a bad choice of implementation. However, if access is concentrated in a few threads (as is common for the vast majority of data structures but not e.g. for a hash tracking all memory in a multithreaded kernel), *or* if your access patterns exhibit strong locality of reference (so that item percolation to the top of the tree is worthwhile), *or* if your overheads are elsewhere such that you don't care about a bit of cache overhead (true everywhere but in hot paths), a splay tree is either OK or downright good.

And I contend that in the vast majority of applications this is true. This site has a focus on kernel development, but most programs are not the kernel and most data structures are not randomly accessed on a large number of CPUs in a very hot path. And you need all of those things to be true at once before a splay tree's disadvantages become substantial. (That is not to say that it always has advantages over other implementations: I would only bother with a splay tree if I expected to be populating large hashes and then only using bits of them most of the time. But for some very common applications -- e.g. caching -- this is true.)


(Log in to post comments)

Worst case performance and "impossibly unlikely" conditions

Posted Jan 18, 2012 23:36 UTC (Wed) by cmccabe (subscriber, #60281) [Link]

This seems like the kind of thing where benchmarking is your friend. Luckily, it's fairly easy to go back and forth between red-black trees and splay trees.

I'm still wary of splay trees because they generate dirty cache lines which need to be written back to main memory at some point. That eats into the available memory bandwidth. And of course, even if all of your accesses are single-threaded, there's always the possibility of false sharing.

Copyright © 2012, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds