Query planners are still dumb
Posted Aug 13, 2024 15:30 UTC (Tue) by cen (subscriber, #170575)
Parent article: Changes coming in PostgreSQL 17
Posted Aug 14, 2024 7:24 UTC (Wed) by Sesse (subscriber, #53779) [Link] (5 responses)
Posted Aug 14, 2024 10:33 UTC (Wed) by cen (subscriber, #170575) [Link] (4 responses)
Posted Aug 14, 2024 10:39 UTC (Wed) by Sesse (subscriber, #53779) [Link]
- Giving wrong results
- Completely bogus optimization-wise
- Solved better on a different layer of the database
- Only relevant to old versions of certain databases
So I don't think you'll see this happen in most databases anytime soon. Remember that a query optimizer can only do things that are guaranteed to give the correct result; it needs to be helpful rather than hurtful in the vast majority of cases; and finally, the case needs to be common enough (ideally, in non-broken code) to warrant adding extra logic for it.
Posted Sep 27, 2024 7:33 UTC (Fri) by ringerc (subscriber, #3071) [Link]
I'm not sure quite what you're proposing for a way for the DB to help with pagination w/o keeping a query context around. Keeping such a context open means your pagination will actually be correct, at the cost of extra DB resources held busy during user think-time.
PostgreSQL already does its best to re-order joins and push down filters to select the desired subset of data from a query, but this is generally not feasible if you're using LIMIT and OFFSET for pagination rather than key-ranges.
Posted Sep 27, 2024 7:41 UTC (Fri) by ringerc (subscriber, #3071) [Link] (1 response)
This doesn't really make sense with PostgreSQL's heap table structure: you have to read the row to get the key and to ensure that the row the b-tree index points to is visible, even in an index scan.
The only case this strategy would be likely to be useful at all is where an index-only scan can be performed due to an up-to-date visibility map, and where the query is a trivial ORDER BY key_col LIMIT ... OFFSET ... such that the total ordering is completely dependent on that one index.
In that case I'm not sure if PostgreSQL will skip fetching the heap rows skipped by an OFFSET clause, and I agree it probably could if it doesn't. It generally will skip the much more expensive operation of fetching any external TOASTed data though.
LIMIT / OFFSET pagination is kind of terrible anyway. In most cases where pagination is unavoidable, it's better to use a predicate filter on the ordered attribute, e.g. `SELECT ... FROM sometable WHERE sortkey > $1 ORDER BY sortkey LIMIT 50` - where "$1" is the value of the last entry in the previously-seen page. Not only does this generally execute much more efficiently, but its results won't skip rows if the row-set changes between page requests.
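To make the contrast concrete, here is a minimal sketch of that keyset-pagination pattern. It uses SQLite via Python's sqlite3 module purely for illustration (the thread is about PostgreSQL, but the SQL shape is the same); the table and column names match the hypothetical `sometable`/`sortkey` example above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sometable (sortkey INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO sometable VALUES (?, ?)",
                 [(i, f"row {i}") for i in range(1, 201)])

def fetch_page(conn, last_seen, page_size=50):
    # Keyset pagination: filter on the ordered column instead of using OFFSET.
    # An index on sortkey satisfies both the WHERE filter and the ORDER BY,
    # so the database need not scan and discard all the skipped rows.
    return conn.execute(
        "SELECT sortkey, payload FROM sometable "
        "WHERE sortkey > ? ORDER BY sortkey LIMIT ?",
        (last_seen, page_size)).fetchall()

page1 = fetch_page(conn, 0)                 # first page: sortkey 1..50
page2 = fetch_page(conn, page1[-1][0])      # resume from the last key seen
```

Note that the client resumes from the last *key* it saw, not a row count, which is why concurrent inserts or deletes between page requests can't cause rows to be skipped or repeated.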
Posted Oct 1, 2024 13:08 UTC (Tue) by raven667 (subscriber, #5198) [Link]
I hadn't thought of that optimization before; I usually skip pagination and just dump the whole result set to the client and let them choke on it, but I'll keep that technique in my back pocket because it looks useful. I do use the Netbox REST API a lot, which is a Django app, and have to deal with the performance impact of paginated queries. I wonder whether integrating this into the Django ORM for those queries would make any difference, or whether all the overhead is in JSON serialization and round trips between pages. GraphQL seems faster, but I don't understand it at all, so it's not that useful in practice when I don't know how to do what I want.