"Big data" features coming in PostgreSQL 9.5

Posted Aug 8, 2015 6:01 UTC (Sat) by kleptog (subscriber, #1183)
In reply to: "Big data" features coming in PostgreSQL 9.5 by jberkus
Parent article: "Big data" features coming in PostgreSQL 9.5

I think it also has to do with the fact that in PostgreSQL the abbreviated keys approach allows the removal of a lot of indirect function calls from tight loops. In specialised code that knows beforehand that it's sorting text data replacing strcoll() with strxfrm() might not be that much of a win. But in the general PostgreSQL sorting code there is always an extra layer of indirection (for example TEXT values internally are not null-terminated, so you have to compensate for that).

"Big data" features coming in PostgreSQL 9.5

Posted Aug 8, 2015 7:24 UTC (Sat) by petergeoghegan (guest, #84275) [Link] (1 responses)

strxfrm() blobs are usually a little over 3 times as large as the original strings with glibc. As I go into in my blog post about abbreviated keys, this is because there are multiple "levels" (at least 3) in the blob, demarcated by sentinel bytes, whose order relative to each other relates to how heavily some aspect of a code point should be weighed. For example, with Latin scripts, primary alphabetical ordering is represented at the primary level (case and punctuation are represented at subsequent levels). You get a far higher concentration of entropy in the first 8 bytes than (say) the last 8 bytes. Sometimes that concentration is much higher than you'd expect, since (for example) whitespace and diacritics are also not represented at the primary weight level. Also, by just comparing the first 8 bytes, a significant amount of pointer chasing is avoided (other indirection is avoided too).

The idea that strxfrm() is not generally useful seems dubious, even leaving aside a technique like abbreviated keys. It's in the C standard. And Certainly, ICU offers something that's similar but very much more advanced. The glibc docs strongly suggest using strxfrm() where the space overhead is acceptable during a sort of a non-trivial number of strings.

"Big data" features coming in PostgreSQL 9.5

Posted Aug 9, 2015 12:20 UTC (Sun) by nix (subscriber, #2304) [Link]

I think the abbreviation is helpful but perhaps not the primary reason why strxfrm() works here but not for sort(1). The biggest problem with strxfrm() is that the blobs are so much larger than the input that the dcache hit simply ruins builds and comparisons using it: you blow the dcache building the blobs and then blow it again using them. Hence the cutover point from cache-dominated to memory-dominated comes much earlier when you're using strxfrm() blobs than when you're using straight strcoll(), which is a killer for things like sort(1) which are massively affected by cache, and repeatedly so, and tries to avoid hitting the disk.

Index building... is of a different order. The memory hierarchy is less important, because everything is hitting the disk anyway, and because you abbreviate the blobs their horrendous size is reduced to something well below one cacheline for every blob.

(This has been a genuine guess: no numbers or experiments were conducted and no strxfrm() blobs were harmed in the making of this post. But I have tried to use strxfrm() before, myself, long ago, and come to similar conclusions back then, though it was the Solaris strxfrm() bugs that really killed that idea.)