LWN.net Logo

Binary trees considered harmful

Binary trees considered harmful

Posted Jun 29, 2006 9:04 UTC (Thu) by ncm (subscriber, #165)
Parent article: Trees II: red-black trees

The author/designer of Judy Trees remarks that, having measured them in comparison with other methods, "It is truly remarkable to me how much research has been done on binary trees and [that they are] still being taught." Red-black trees are quaint, at best, nowadays. Replacing them in the kernel with a more cache-aware (or, better, cache-oblivious) alternative ought to be an easy way to speed up affected kernel operations, except that it appears it would require rewriting zillions of search and insert functions. It seems unfortunate that the design of the kernel rbtrees is so intrusive, but there may have been no choice, in C.


(Log in to post comments)

Binary trees considered harmful

Posted Jun 29, 2006 16:23 UTC (Thu) by nix (subscriber, #2304) [Link]

What a baroque and lovely data structure. Alas it seems it's tied up in software patents...

Binary trees considered harmful

Posted Jun 29, 2006 17:21 UTC (Thu) by cventers (subscriber, #31465) [Link]

True, but so is RCU. I wonder if HP would give an explicit license to GPL
code (since they released the Judy library on Sourceforge as LGPL, along
with some "confidential" internal paper about Judy).

Binary trees considered harmful

Posted Jun 30, 2006 15:56 UTC (Fri) by nix (subscriber, #2304) [Link]

Yeah. It's a shame that the accessor macros and the code are so horribly named/unreadable. It's almost obfuscated-yet-heavily-commented...

Binary trees considered harmful

Posted Jul 6, 2006 16:49 UTC (Thu) by i3839 (guest, #31386) [Link]

Well, who knows, maybe it has something to do with this:
$ size  Judy-1.0.3/src/obj/.libs/libJudy.so.1.0.3 
   text    data     bss     dec     hex filename
 114515     760     160  115435   1c2eb Judy-1.0.3/src/obj/.libs/libJudy.so.1.0.3
$ size linux/lib/*tree*.o
   text    data     bss     dec     hex filename
   1823       0     128    1951     79f linux/lib/prio_tree.o
   2934      28      36    2998     bb6 linux/lib/radix-tree.o
   1544       0       0    1544     608 linux/lib/rbtree.o
For something which is avoiding cache misses as much as possible it seems a bit strange to be so bloated that the code to avoid all those cache misses causes so awfully lot cache misses itself. Of course, this isn't something you'll see when benchmarking the code...

Furthermore, whatever memory saving might be achieved by using Judy arrays, it has a long to go way before it made up for that 100K extra usage.

And to rub it in, hashes are almost as good as Judy arrays but are much simpler and smaller to implement. Only icky thing is their (re-)size problem, but that's less icky than a 100K footprint. Good b-tree (not a binary tree) implementations are also not significant worse.

I'm not trying to prove that Judy arrays are bad, they have their usages, but I do think that the kernel isn't the right place for them.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds