abbreviated keys for BYTEA?
Posted Aug 6, 2015 9:53 UTC (Thu) by andresfreund (subscriber, #69562)
In reply to: abbreviated keys for BYTEA? by zack
Parent article: "Big data" features coming in PostgreSQL 9.5
I guess you could devise something for bytea as well, but it'd have to look a bit different. Actually, it'd be much closer to the abbreviated key logic for text than to the one for numeric, just without having to care about locales. With numeric you have to care about NaN and such.
Do you regularly sort/index large numbers of bytea values?
Posted Aug 6, 2015 13:19 UTC (Thu) by zack (subscriber, #7062)
Thanks for your answer! 
> Do you regularly sort/index large numbers of bytea values? 
I'm storing lots of checksums (of various kinds: sha1 and sha256 for now), on the order of a few billion entries. 
I haven't yet firmly chosen the postgres datatype to do that. 
On the one hand, I'm inclined to implement a custom data type right away. It is my understanding that while doing that one can plug into the new sort support facilities that give the benefits of abbreviated keys (right? :-)). 
On the other hand, if an appropriate built-in data type (such as BYTEA or its variants) already has support for abbreviated keys, that would be a good incentive to start with it and migrate to a custom data type only later. 
     
    
Posted Aug 6, 2015 19:23 UTC (Thu) by jberkus (guest, #55561)
If so, well, "patches welcome".  Probably what you'd want to do with BYTEA is compare the first 8 bytes, sort, then compare the full values to break ties. 
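The "compare the first 8 bytes, then break ties on the full value" idea can be sketched outside the database. This is only an illustration in Python, not PostgreSQL's sort support code; the names `ABBREV_LEN`, `abbreviate` and `sort_with_abbreviated_keys` are made up for the example, and zero-padding short values is an assumption of this sketch.

```python
ABBREV_LEN = 8  # assumed prefix width, taken from the "first 8 bytes" suggestion


def abbreviate(value: bytes) -> int:
    """Pack the first ABBREV_LEN bytes into a big-endian unsigned integer.

    Shorter values are zero-padded on the right, so integer order on the
    abbreviation never contradicts bytewise order on the full value.
    """
    prefix = value[:ABBREV_LEN].ljust(ABBREV_LEN, b"\x00")
    return int.from_bytes(prefix, "big")


def sort_with_abbreviated_keys(values):
    """Sort bytes values by (abbreviation, full value).

    Tuple comparison looks at the cheap integer first and only falls back
    to the full bytes comparison when two abbreviations are equal.
    """
    return sorted(values, key=lambda v: (abbreviate(v), v))
```

For checksums, which are effectively random, two distinct values will almost never share their first 8 bytes, so the tie-break on the full value should rarely be needed.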
On the other hand, you could just store the checksums as NUMERICs, if you are fine with converting to hex and back. 
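As a rough sketch of the hex round trip this would involve (Python, with hypothetical helper names `digest_to_int` and `int_to_digest`; nothing here is a PostgreSQL API):

```python
import hashlib


def digest_to_int(hex_digest: str) -> int:
    """Interpret a hex checksum as an unsigned integer."""
    return int(hex_digest, 16)


def int_to_digest(value: int, digest_bytes: int) -> str:
    """Convert back to lowercase hex, restoring leading zeros.

    The digest width must be supplied (20 bytes for sha1, 32 for sha256),
    because the integer form does not remember leading zero digits.
    """
    return format(value, "0{}x".format(digest_bytes * 2))


sha1_hex = hashlib.sha1(b"example").hexdigest()
as_number = digest_to_int(sha1_hex)
assert int_to_digest(as_number, 20) == sha1_hex
```

For digests of the same length, integer order matches the order of the hex strings, so sorting the numbers gives the same result as sorting the checksums themselves.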
     
    
Posted Aug 7, 2015 18:58 UTC (Fri) by zack (subscriber, #7062)
Index (and specifically b-tree) build time and maintenance are my main concerns, yes. 
But I was under the impression that abbreviated keys are also relevant for (b-tree) index lookups and uniqueness constraint checks, because of the comparisons needed to descend to the actual index values, no matter how shallow the index is. Maybe that impression is wrong? 
     
    
Posted Aug 7, 2015 21:17 UTC (Fri) by jberkus (guest, #55561)
Anyway, I encourage you to bring up the idea of abbreviated keys for BYTEA on a PostgreSQL mailing list. Nobody's opposed to extending abbreviated keys further; we just ran out of time for 9.5. 
     
Posted Feb 5, 2016 11:17 UTC (Fri) by petergeoghegan (guest, #84275)
     