Where does the speedup come from? And my own experiences
Posted Feb 27, 2025 11:24 UTC (Thu) by mathstuf (subscriber, #69389)
In reply to: Where does the speedup come from? And my own experiences by anton
Parent article: Python interpreter adds tail calls
Did you miss this paragraph from the article?
> The solution comes from another attribute, preserve_none. Functions with this attribute don't preserve any registers from the caller. The calling convention was originally created for the Glasgow Haskell Compiler (GHC), which is why old documents sometimes refer to it as "ghccc", but C++11 and C23 standardized the attribute for C++ and C, respectively [A reader correctly pointed out that preserve_none has not actually been standardized yet].
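For concreteness, here is a minimal sketch of that pattern, assuming Clang 19 or newer on x86-64 or AArch64 (preserve_none and musttail are Clang extensions, not standard C); the toy VM and all names are invented for illustration, not taken from CPython:

```c
#include <stdint.h>
#include <stdio.h>

typedef struct VM VM;

/* preserve_none changes the calling convention so handlers keep no
 * callee-saved registers for their caller, freeing nearly every
 * register for the interpreter's own hot state. */
typedef int64_t (__attribute__((preserve_none)) *handler_t)(VM *vm);

struct VM {
    const uint8_t *ip;  /* bytecode instruction pointer */
    int64_t acc;        /* single accumulator, for illustration */
};

enum { OP_HALT, OP_INC };

static handler_t dispatch_table[2];

/* musttail forces a genuine tail call: each handler jumps to the
 * next one instead of growing the C stack. */
#define DISPATCH(vm) \
    __attribute__((musttail)) return dispatch_table[*(vm)->ip](vm)

__attribute__((preserve_none))
static int64_t op_halt(VM *vm) {
    return vm->acc;
}

__attribute__((preserve_none))
static int64_t op_inc(VM *vm) {
    vm->acc++;
    vm->ip++;
    DISPATCH(vm);  /* compiles to a jump, not a call */
}

int main(void) {
    dispatch_table[OP_HALT] = op_halt;
    dispatch_table[OP_INC]  = op_inc;
    static const uint8_t code[] = { OP_INC, OP_INC, OP_INC, OP_HALT };
    VM vm = { .ip = code, .acc = 0 };
    printf("acc = %lld\n", (long long)dispatch_table[*vm.ip](&vm));
    return 0;
}
```

Compiled with clang -O2, each handler ends in a direct jump to the next handler, and the preserve_none convention means the VM state can stay in registers across the whole chain without caller-saved spills.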
Where does the speedup come from? And my own experiences
Posted Feb 27, 2025 13:48 UTC (Thu) by daroc (editor, #160859)
As for whether I've attributed the performance improvements to the right things in the article — modern performance is complicated. The above description is synthesized from the discussion between the Python developers about the rationale and effects of the change; it makes sense to me, based on my understanding of compiler optimization pipelines, but it is entirely possible that both they and I are mistaken about the deeper causes here.
In particular, I think that the LTO comment can make sense — as mentioned, GCC and Clang have limits on how large a function they're willing to create by inlining things. By breaking the interpreter up into smaller functions, you have more opportunities to inline the other functions in the Python codebase that each instruction calls, and then optimize those in context.
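To make that inlining argument concrete, here is a hedged sketch (all names invented, not CPython's actual code) contrasting the two shapes of interpreter loop:

```c
#include <stdint.h>

typedef struct { int64_t stack[64]; int top; } VM;

static int64_t vm_pop(VM *vm)             { return vm->stack[--vm->top]; }
static void    vm_push(VM *vm, int64_t v) { vm->stack[vm->top++] = v; }

/* A shared helper: the kind of function the compiler would like to
 * inline into each instruction and specialize in context. */
static int64_t binary_add(int64_t a, int64_t b) { return a + b; }

/* Monolithic style: this one case sits among hundreds inside a single
 * enormous function, which eats into the inliner's size budget, so
 * binary_add() (or its own callees) may not get inlined here. */
void run_switch(VM *vm, const uint8_t *ip) {
    for (;;) {
        switch (*ip++) {
        case 1: vm_push(vm, binary_add(vm_pop(vm), vm_pop(vm))); break;
        /* ... hundreds more cases ... */
        default: return;
        }
    }
}

/* Per-opcode style: each handler is a small function, well under the
 * compiler's growth limits, so the helper can be inlined and
 * optimized for this one instruction. */
void op_add(VM *vm) {
    vm_push(vm, binary_add(vm_pop(vm), vm_pop(vm)));
}
```

With LTO the same reasoning applies across translation units: because each handler stays small, helpers defined elsewhere in the codebase can still be inlined into it and specialized per opcode, which is harder when everything funnels through one giant switch.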
Perhaps as the Faster CPython project keeps working on these things we'll see a more detailed performance breakdown that could shed some light on the matter.