
Tablesample with proportion really constant time?

Posted Aug 6, 2015 11:50 UTC (Thu) by dskoll (subscriber, #1630)
In reply to: Tablesample with proportion really constant time? by epa
Parent article: "Big data" features coming in PostgreSQL 9.5

I think it means that if you pick M rows out of N with N much bigger than M, the time is dependent on M only and not on N. At least, that's how I read it.

It's not constant with respect to the percentage of rows, of course, because in that case M depends on N.
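
As a sketch of what I mean (the table name is made up): the argument to SYSTEM and BERNOULLI is a percentage of the table, so the expected sample size grows along with N:

    -- 'events' is a hypothetical table name.
    -- The argument is a percentage: this asks for roughly 0.001% of the
    -- rows, so the expected sample size M scales with the table size N.
    SELECT * FROM events TABLESAMPLE SYSTEM (0.001);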



Tablesample with proportion really constant time?

Posted Aug 6, 2015 18:47 UTC (Thu) by jberkus (guest, #55561)

That's right.

And it's not precisely constant time; it will take longer to pull 100 rows out of a billion-row table than out of a million-row table. However, the increase in time is incremental (and small) rather than a multiple of the original request time, since we're just looking up data by page ID.

For example, I compared SYSTEM returning 100 rows from a 100,000-row table against a million-row table. Regardless of which table I used, the difference in request time was below significance thresholds. However, with BERNOULLI, the 100,000-row table took around 5ms, whereas the million-row table took around 14ms.
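
A rough sketch of that kind of comparison, not my exact test (table names are made up; the percentages are chosen so each query returns roughly 100 rows):

    -- t_100k has ~100,000 rows, t_1m has ~1,000,000 rows (hypothetical names).
    -- SYSTEM samples whole pages, so its cost tracks the pages fetched;
    -- BERNOULLI visits every row, so its cost tracks the table size.
    EXPLAIN ANALYZE SELECT * FROM t_100k TABLESAMPLE SYSTEM (0.1);
    EXPLAIN ANALYZE SELECT * FROM t_1m   TABLESAMPLE SYSTEM (0.01);
    EXPLAIN ANALYZE SELECT * FROM t_100k TABLESAMPLE BERNOULLI (0.1);
    EXPLAIN ANALYZE SELECT * FROM t_1m   TABLESAMPLE BERNOULLI (0.01);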

Tablesample with proportion really constant time?

Posted Aug 7, 2015 9:32 UTC (Fri) by epa (subscriber, #39769)

Thanks. I just meant that if you want constant time, you need to specify something other than the '0.001' proportion in the reviewer's example. It would have to be tablesample (1000 rows) or something similar; I'm not sure of the exact syntax.
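
For the record, the in-core SYSTEM and BERNOULLI methods only accept a percentage; a fixed row count seems to need the tsm_system_rows contrib module, roughly like this (a sketch; syntax not double-checked):

    -- tsm_system_rows ships as a contrib module in 9.5 and adds a sampling
    -- method that takes a row count instead of a percentage.
    -- 'big_table' is a hypothetical table name.
    CREATE EXTENSION tsm_system_rows;
    SELECT * FROM big_table TABLESAMPLE SYSTEM_ROWS (1000);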

