LWN.net Logo

Optimizing Linker Load Times

Optimizing Linker Load Times

Posted Jul 27, 2006 3:49 UTC (Thu) by bluefoxicy (guest, #25366)
In reply to: Optimizing Linker Load Times by martinfick
Parent article: Optimizing Linker Load Times

I actually proposed filling hash buckets with Patricia Tries, but never wrote and tested it. I like to put my ideas out there but keep in mind I'm not an expert on this stuff by far ;) (guys feel free to point out reasons this won't work)

A Patricia trie would basically let you find a bucket; locate the first character of the symbol; and then from there the string only goes as far as ALL symbols in that bucket go with a common prefix. Then it would fork towards each different symbol.

In other words you'd have this:

apple
applecore
appliance
brick

appl--e--core
    |-ianc
brick

The first is a list, the second is a Patricia Trie. The difference is that to find 'appliance' you have to compare 'apple', 'apple', 'appliance\0' with a list (if it's in that exact order); whereas with the Patricia Trie you would compare 'appl\0', 'e', 'iance\0'. 'brick' would be 'a' 'a' 'a' 'brick\0' vs 'a' 'brick\0'.

The size of the storage would be smaller and the data locality would thus be tighter. You would avoid problems with similar prefixes (hello C++ crap) and the trie could be implemented to give these guarantees fairly easily as long as they were kept small (i.e. big tries might have a pointer forward to the next segment a few pages ahead; so I said to stick them in hash buckets to keep them small and reduce the number of forks).

The data would be stored at each branch of the trie. So 'appl' would resolve, it would just resolve to "there is no value here"; and 'e' would store the dynsym entry for 'apple'; etc.

This could be done without breaking the ELF standard by using a different section and letting the dynamic linker use it instead of the typical sections.

The problem isn't so simple though. This would help with the case of "there's 3 symbols in every bucket that start with some 400 character string and we're wasting tons of cycles doing millions of character comparisons on load;" but overall it's not going to become instant look-up. You still have to walk through all loaded libraries; this is what -Bdirect linking takes care of for you.

In short, the reasons we haven't used binary tries are:

  • They're not optimal. Patricia Tries may not even be optimal, and they look (to me) to be a pretty big improvement over a flat linked list of strings!
  • They'd need a new section, which means more space usage on disk (no more memory usage though; we wouldn't use the original section with the linked list, so it'd never be faulted into memory)
  • Nobody has come up with a mechanism here, written it, tested it, and proposed it.
  • I'm not sure the speed boost here. Reading into Drepper's paper I'd guess ELIMINATING redundant character comparisons would offer at least 20%; but he only gives a guess, not a measurement, and doesn't say how this relates to overall dynamic link time. So hey, we could be talking about 5%, 10%, or 30% or 60% or who knows.


(Log in to post comments)

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds