|
|
Subscribe / Log in / New account

Python sets, frozensets, and literals

Python sets, frozensets, and literals

Posted Jan 25, 2022 3:09 UTC (Tue) by NYKevin (subscriber, #129325)
In reply to: Python sets, frozensets, and literals by Wol
Parent article: Python sets, frozensets, and literals

I can write code like this, long after my database is already deployed and populated, and a migration would be expensive:

WITH recent_invoices AS (
SELECT *
FROM invoices
WHERE age(when_issued) < '30 days' -- your syntax may vary here.
)
SELECT p.id, p.name, p.inventory_count - COALESCE(SUM(itm.quantity), 0) AS forecasted_surplus
FROM products p
LEFT JOIN (
invoice_items itm JOIN recent_invoices inv ON itm.invoice_id = inv.id
) ON itm.product_id = p.id
GROUP BY p.id, p.name, p.inventory_count;

I probably also need to run a CREATE INDEX on the invoices(when_issued) column, but that's not too expensive to run once (it's basically a full table scan, with writes). This should (on most reasonable systems) create a B-tree index which can keep track of the order of invoices (i.e. it doesn't just stick them into hash buckets). Then I can get monthly forecasts of my inventory based on last month's demand.

Note that we cannot assume that recently-inserted invoices are necessarily issued more recently than older invoices, because new invoices may need to be backdated, so we can't simply loop over the full table backwards and stop when we start hitting old invoices, because the table is not necessarily in the right order. We have to use an index, or do a full table scan going back through (potentially) years of ancient invoices. Also note that adding this query does not require significant changes to our existing schema - we run one very simple CREATE INDEX command, do some load testing just in case, and that's it, we're ready to roll this out to prod.

Finally, while this may look complicated, I banged it out in about an hour, and frankly it's not that hard to understand for people who use SQL regularly. Quick translation:

* WITH name AS SELECT ... turns another query into a temporary table ("common table expression"). Here all it does is pull the "within the past month" logic out so that the FROM part of the main query does not get too hard to read, but you can also use it if you're going to do the same sub-select more than once.
* The join in parentheses needs to happen first, so that we throw away any invoice_items which don't match one of the existing invoices. We could do this as a second CTE, but that's uglier IMHO.
* LEFT JOIN is an abbreviation of LEFT OUTER JOIN, and has the effect of manufacturing an all-NULL invoice and invoice_item for any products which were not purchased in the past month.
* COALESCE() turns the NULL back into a zero.

With your solution... well, I'm not sure exactly how Pick works, but I tend to imagine you will at least need to introduce a products table so that you know how many are in inventory, and then do an outer join against the invoices table.


to post comments

Python sets, frozensets, and literals

Posted Jan 25, 2022 3:24 UTC (Tue) by NYKevin (subscriber, #129325) [Link]

> you will at least need to introduce a products table

Actually, your solution already has that information.

Python sets, frozensets, and literals

Posted Jan 25, 2022 8:57 UTC (Tue) by Wol (subscriber, #4433) [Link]

Okay, so have I got this right? You're creating a report on everything sold in the last month?

Like you, I need to create an index on DATE, so evens there. I'd rather put a date in the query, but adding another column to the invoice is easy

ED DICT INVOICES
... create AGE as TODAY - INVOICE.DATE

I think Pick would struggle with me creating an index on a dynamically created field ... :-)

Then it's something as simple as

LIST INVOICES WITH INVOICE.DATE > 25/12/2022 ITEM.NAME ITEM.QUANTITY GRAND.TOTAL ITEM.QUANTITY BY ITEM.NAME.

Okay I'd need to make sure I got the syntax right, but it cost me more time to understand what you were doing than to come up with my first cut at the Pick equivalent. And now I see you're also comparing against current inventory ...

LIST INVOICES WITH INVOICE.DATE > 25/12/2022 ITEM.CODE ITEM.NAME ITEM.QUANTITY GRAND.TOTAL ITEM.QUANTITY BY ITEM.NAME EVAL(TRANSLATE PRODUCTS ITEM.CODE STOCK) AS STOCK EVAL (STOCK - ITEM.QUANTITY) AS SURPLUS

There's probably a bit more to it than that, I've never had to write that sort of query before, but that's about it. Nested Pick queries are more likely to be written as procedures than command lines. (Note the translate instead of a join...)

Cheers,
Wol


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds