Generic hashing functions and the platform problem

May 18, 2016

This article was contributed by Neil Brown

What is a kernel developer to do when they need a simple hashing function to use in a new hash table, and they find that the obvious choice provided by the kernel works poorly? The "right" option is to fix the common code. The "easy" option is to write a replacement or a workaround. The "best" option, it seems, is to make sure Linus Torvalds finds out, because this is just the sort of thing that he cares about.

Linux has a fairly simple and efficient set of hashing functions in include/linux/hash.h that work on simple input values: 32-bit or 64-bit numbers, or pointers that are cast to one of those depending on the architecture. The hash is computed by multiplying the input by one large number and ignoring any overflow, thus effectively taking the remainder of a division by another large number: 2³² or 2⁶⁴, depending on the word size. The required number of bits are then extracted from the result.

When Thomas Gleixner was testing his second patch set for making futexes faster, he discovered that the hash values returned weren't particularly evenly distributed and, as a result, he was receiving more collisions than expected. He addressed this problem by writing a simple alternative, sparking a sub-thread exploring the problems with these hash functions. As Torvalds found during subsequent investigations, Gleixner was not the first developer to decide it was easier not to fix the broken function. When working on the RAID 5/6 code for Btrfs, David Woodhouse, or possibly Chris Mason, discovered problems with the "hash_64()" function and helpfully provided a comment:

     * we shift down quite a bit.  We're using byte
     * addressing, and most of the lower bits are zeros.
     * This tends to upset hash_64, and it consistently
     * returns just one or two different values.

It was determined that zeros in the lower-order bits result in the remaining bits not being mixed well, so there is not much variation in the output — not a particularly useful property for a hash function. The chosen response was not to fix hash_64() but to shift down the input to hide the problem.

Some eight years ago, Matthew Wilcox made some changes to the interfaces for the hash function and noted in the commit message:

The 32-bit version is more efficient (and apparently gives better hash results than the 64-bit version), so users who are only hashing a 32-bit quantity can now opt to use the 32-bit version explicitly

So he, too, knew that there were problems with the hashing function, particularly the 64-bit version. It would be naive to think that these three are the only developers to have noticed a problem but none made the effort to fix it. In the 14 years since this code was introduced, several people have noticed a problem and no one has bothered to fix it. This is an example of what has become known as the "platform problem".

Fortunately Torvalds was interested in the futex changes and so paid attention to the second patch set from Gleixner. He made no comment on the futex changes, but did take interest in the hash changes, researching the problem and orchestrating the fix. The key issue turned out to be the number that the input is multiplied by. Supposedly based on recommendations from Donald Knuth, this number was chosen to be a prime and to be approximately the fractional part of the golden ratio multiplied by 2⁶⁴ (in the 64-bit case). A prime was chosen that was "bit sparse", having long runs of ones or zeros in the binary representation.

This latter choice was guided by the desire for the multiplication to be fast. On some hardware, shifts and addition or subtraction can be faster than general multiplication and with fewer runs in the binary pattern, fewer additions or subtractions are needed. For the 64-bit case, the pattern is particularly sparse, having a run of 33 one bits.

    /*  2^63 + 2^61 - 2^57 + 2^54 - 2^51 - 2^18 + 1 */
    #define GOLDEN_RATIO_PRIME_64 0x9e37fffffffc0001UL

This sparseness was the problem. Torvalds's research suggested that the primality was barely interesting and a misunderstanding. As George Spelvin later explained it:

One thing I note is that the advice in the comments to choose a prime number is misquoting Knuth! Knuth says (vol. 3 section 6.4) the number should be *relatively* prime to the word size, which for binary computers simply means odd.

It also appears that the golden ratio is not all that magical. It is just a convenient number that is, as Torvalds put it, "an irrational number that is not near to 0 or 1". The important property was to have "roughly 50% of the bits set in a random pattern" so the resultant mixing of bits would hide any patterns in the input. The current hash function, it seems, emphasizes exactly the wrong qualities.

Once the problem was understood, fixing it was relatively straightforward: just pick a better multiplier with more "randomness" in the bit pattern. This was done by approximating an irrational value and not worrying about primality or bit-sparseness. On hardware without 64-bit multiplication, the two 32-bit halves of the 64-bit input are hashed sequentially so two 32-bit multiplications are used. This doesn't return a 64-bit number, but no code ever wants more than 32 bits of hash, so not providing the full 64 is no loss.

Spelvin made good use of the opportunity provided by this need to adjust the hash functions and examined other hashing code, fixing up the hash_mem() and hash_str() function in the sunrpc code (used by nfsd) and suggesting improvements for some simplistic hashing used in the inode cache.

A minimal fix has landed for Linux 4.6-rc7 with a more complete fix expected for v4.7. It is good to know that this problem with the Linux kernel platform has been addressed, but it does lead one to wonder what other problems there are that have been conveniently ignored by many of us, and just need a little light to shine on them at the right time for the problem to be quickly fixed.

Index entries for this article
Kernel	Hashing functions
GuestArticles	Brown, Neil

Generic hashing functions and the platform problem

Posted May 19, 2016 9:36 UTC (Thu) by Sesse (subscriber, #53779) [Link] (1 responses)

The golden ratio actually does have a nice property in itself, beyond being irrational. It turns out that if you do ϕ, 2ϕ, 3ϕ etc. (all of them modulo 1), they will tend to fall as far as possible from any of the previous values. So you get very nice spreading properties for consecutive values, which can be an important special case for hashing.

Generic hashing functions and the platform problem

Posted May 19, 2016 13:51 UTC (Thu) by clemensg (guest, #94377) [Link]

Good point, just noticed the same thing when reading said chapter from Vol. 3.

Generic hashing functions and the platform problem

Posted May 19, 2016 9:54 UTC (Thu) by mjthayer (guest, #39183) [Link]

I see that I posted something similar to the "platform problem" article... if one is afraid to dive into "unknown" code and fix it, then at least adding a "todo" comment in the right place might be the next best thing. As long as reviewers are supportive of this and do not come down on people to just do the work.

Generic hashing functions and the platform problem

Posted May 20, 2016 15:17 UTC (Fri) by willy (subscriber, #9762) [Link] (3 responses)

I no longer remember who told me in 2008 that the 32-bit hash was "better". I certainly didn't do any experiments of my own to confirm this.

Perhaps we could add tests of the "goodness" of the hash functions to tools/testing? It's been invaluable for preventing regressions of the radix tree code.

Generic hashing functions and the platform problem

Posted May 21, 2016 0:00 UTC (Sat) by neilbrown (subscriber, #359) [Link] (2 responses)

> Perhaps we could ...

It is part of the human condition, visible in many communities, that it is easier to say "we could" or "we should" rather than "I will". I think the "platform problem" is, in part, how that condition manifests in this community.

Not that it is necessarily wrong to say "we could": people who say "I will" too often can easily get burnt out.

Some sort of balance is best: where many in the community feel empowered, encouraged, and even permitted, to say "I will", without feeling undue pressure (so no "You should"!).
Someone out there probably wants to take the test harness that Linus threw together (mentioned in the email thread) and put it in tools/testing in a useful way. What can we ... What can *I* do to permit, empower, and encourage that?

Generic hashing functions and the platform problem

Posted May 21, 2016 0:10 UTC (Sat) by willy (subscriber, #9762) [Link]

I think it's an honest acknowledgement of the length of our own TODO lists and the likelihood that this task would ever bubble up to the top of it.

But this is a great opportunity for someone who wants to improve the kernel in a quantifiable way to get involved. It's almost entirely user-space programming, but it involves working with kernel code. Everybody agrees we need better testing, so there's likely to be positive feedback.

Now, how best to bring this idea to the attention of someone who has time and inclination to work on it?

Generic hashing functions and the platform problem

Posted May 25, 2016 11:27 UTC (Wed) by robbe (guest, #16131) [Link]

If I can’t say "this should be fixed" without somebody piping up "fix it, then", I may just opt for keeping silent. Exactly the problem described in the article.

Alternatives? Anonymous bug reporting? Adding it to a list of newbie projects?

Generic hashing functions and the platform problem

Posted May 20, 2016 16:32 UTC (Fri) by jlargentaye (subscriber, #75206) [Link] (2 responses)

This is tangential, but I've long wondered about "George Spelvin", more accurately linux@horizon.net because "George Spelvin" is a classic pseudonym [1] which they only started using relatively recently.

They are obviously a very skilled hacker who have been around Linux for a long time. Though I can't find the reference immediately, I remember them correcting/optimizing someone's PPC(?) assembly code, then stating mild uncertainty because they "don't have access to hardware to test it." (As I remember it, the optimized code worked flawlessly).

[1] https://en.m.wikipedia.org/wiki/George_Spelvin

Generic hashing functions and the platform problem

Posted May 20, 2016 16:36 UTC (Fri) by jlargentaye (subscriber, #75206) [Link]

Ah, here was their assembly code without access to HW:

http://www.gelato.unsw.edu.au/archives/git/0504/1753.html

Of note, this was also work on a hash function.

Generic hashing functions and the platform problem

Posted May 24, 2016 17:47 UTC (Tue) by jzbiciak (guest, #5246) [Link]

Wouldn't it be something if "George Spelvin" was actually Knuth? ;-)

(It seems unlikely, but I'm amused by the possibility.)