CPython, C standards, and IEEE 754
Perhaps February was "compiler modernization" month. The Linux kernel recently decided to move to the C11 standard for its code; Python has just undergone a similar process for determining which flavor of C to use for building its CPython reference implementation. A calculation in the CPython interpreter went awry when built with a pre-release version of the upcoming GCC 12; that regression led down a path that ended up with the adoption of C11 for CPython as well.
A bug that was fixed in early February started the ball rolling for Python. Victor Stinner encountered a GCC regression that caused CPython not to get the expected IEEE 754 floating-point NaN (not a number) value in a calculation. An LWN article sheds some light on NaNs (and how they are used in Python) for those who need a bit more background. The calculation was using the HUGE_VAL constant, which is defined as an ISO C constant with a value of positive infinity; the code set the value of the internal Py_NAN constant used by the interpreter to HUGE_VAL*0, which should, indeed, evaluate to a NaN. Multiplying infinity by zero is defined to produce a NaN in IEEE 754; multiplying it by any other finite number simply yields an infinity.
During his investigation of the problem, Stinner found that instead of the calculation, Python could simply use the NAN constant defined in <math.h>, as long as a C99 version of the header file was used.
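To make the two approaches concrete, here is a minimal C sketch; it is not CPython's actual source. The helper names `nan_from_multiplication()`, `nan_from_macro()`, and `is_nan_by_comparison()` are hypothetical, and the code assumes a platform with IEEE 754 doubles and a C99 `<math.h>`.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper (not CPython code): derive a NaN the old way,
 * by multiplying positive infinity by zero. The volatile qualifier is
 * only there to discourage compile-time constant folding. */
double nan_from_multiplication(void)
{
    volatile double inf = HUGE_VAL;  /* +infinity on IEEE 754 platforms */
    return inf * 0.0;
}

/* Hypothetical helper: use the C99 NAN macro from <math.h> instead. */
double nan_from_macro(void)
{
    return (double)NAN;
}

/* Under IEEE 754, a NaN is the only value that compares unequal to itself. */
bool is_nan_by_comparison(double x)
{
    return x != x;
}
```

On such a platform, `isnan()` should report a NaN for the result of either helper, while `HUGE_VAL * 2.0` stays infinite rather than becoming a NaN.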
As part of the bug discussion, Petr Viktorin said that PEP 7 ("Style Guide for C Code") should be updated to reflect the need for the C99 header file. So Stinner duly created a pull request for a change to the PEP, but Guido van Rossum said that a change of that nature should be discussed on the python-dev mailing list.
That led Stinner to post a message to discuss the change on February 7. As it turns out, there are actually two bugs fixed by Stinner that require parts of the C99 math API; bug 45440 reported a problem with the CPython Py_IS_INFINITY() macro; the fix for that also involved using the C99 <math.h>. As Stinner noted, C99 is now 23 years old, and support for it in compilers is widespread; GCC, Clang, and Microsoft Visual C (MSVC) all support the needed features.
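As a rough illustration of why the C99 math API is attractive here, the sketch below contrasts an arithmetic infinity test with the C99 `isinf()` macro. Both function names are hypothetical, and the "legacy" version is only an assumed example of the kind of pre-C99 workaround such a macro might use; it is not presented as CPython's actual Py_IS_INFINITY() definition.

```c
#include <math.h>

/* Hypothetical stand-in for a pre-C99 infinity test (an assumption for
 * illustration only): a nonzero value that is unchanged by halving is
 * treated as infinite. Tricks like this depend on how the compiler and
 * hardware evaluate floating-point expressions. */
int is_infinity_legacy(double x)
{
    return x != 0.0 && x * 0.5 == x;
}

/* C99 version: isinf() is a classification macro provided by <math.h>. */
int is_infinity_c99(double x)
{
    return isinf(x) != 0;
}
```

The C99 form states the intent directly and leaves the details of classification to the C library.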
Floating point
Mark Dickinson pointed out that the existence of the NAN constant is not required by C99 directly; it is only present if IEEE 754 floating point is enabled as well. He thought that it made sense for CPython to require IEEE 754, but wondered whether Python, the language, should also require it. Stinner said that all modern computers support IEEE 754; even embedded devices without a floating-point unit (FPU) typically support it in software. "Nowadays, outside museums, it's hard to find computers which don't implement IEEE 754."
Stinner was in favor of requiring IEEE 754 for CPython; Gregory P. Smith agreed. Brett Cannon wondered if there was even any ability to test with systems that lacked the support:
Do we have a buildbot that has a CPU or OS that can't handle IEEE-754? What are the chances we will get one? If the answers are "none" and "slim", then it seems reasonable to require NaN and IEEE-754.
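Whether a given toolchain provides NAN and an IEEE 754-style double can also be probed from C itself. The sketch below is one possible set of checks, with hypothetical function names. It relies on the C99 rule that `NAN` is defined only when the implementation supports quiet NaNs for `float`, on the optional `__STDC_IEC_559__` macro (which some compilers do not define even when they use IEEE 754 arithmetic), and on `<float.h>` parameters that match the binary64 format.

```c
#include <float.h>
#include <math.h>

/* Returns 1 when <math.h> defines NAN, meaning quiet NaNs are supported. */
int have_nan_macro(void)
{
#ifdef NAN
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 when the implementation claims IEC 60559 (IEEE 754)
 * conformance via the optional __STDC_IEC_559__ macro. */
int claims_iec_559(void)
{
#ifdef __STDC_IEC_559__
    return 1;
#else
    return 0;
#endif
}

/* Heuristic only: checks that double has the radix, precision, and
 * exponent range of IEEE 754 binary64. */
int double_looks_like_binary64(void)
{
    return FLT_RADIX == 2 && DBL_MANT_DIG == 53 && DBL_MAX_EXP == 1024;
}
```

A configure-style check built from functions like these could fail early on a platform that lacks the required support, rather than producing wrong results at run time.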
Stinner reported that all of the buildbot machines supported IEEE 754, so the path was clear to require it. In terms of the Python language, Christopher Barker said that IEEE 754 support should not be required for all implementations of Python, but that it should be recommended. Steve Dower agreed that leaving it up to Python implementations made sense: "Otherwise, we would prevent _by specification_ using Python as a scripting language for things where floats may not even be relevant." He said that making it a requirement would inhibit adoption: "The more 'it's-only-Python-if-it-has-X' restrictions we have, the less appealing we become."
Which C?
Switching to C99 makes sense if the compilers being used to build CPython support it, Cannon said. Viktorin asked about MSVC's support for all of C99; he did not find any documentation saying that it did, so it might be better to consider C11, which is supported. "Do we need to support a subset like 'C99 except the features that were removed in C11'?" Dower, who works on Python at Microsoft, said that he had not found an answer to the C99 question either:
All the C99 library is supposedly supported, but there are (big?) gaps in the compiler support. Possibly these are features that were removed in C11? I don't know what is on that list.[...] Personally, I see no reason to "require" C99 as a whole. We have a style guide already, and can list the additional compiler features that we allow along with guidance on updating existing code and ensuring compatibility.
I don't see much risk requiring the C99 standard library, though. It's the compiler features that seem to have less coverage.
Stinner suggested tying the wording to what was supported in MSVC, but H. Vetinari thought a better formulation might be "'C99 without the things that became optional in C11', or perhaps 'C11 without optional features'". That led Viktorin to wonder why C11 would not make a better target: "[...] the main thing keeping us from C99 is MSVC support, and since that compiler apparently skipped C99, should we skip it as well?"
Cannon said that he found a list of optional C11 features, none of which were really needed; if the "C11 without optional features" flavor is widely supported, as it would seem that it is, "I think that's a fine target to have".
Meanwhile, both Inada Naoki and Viktorin were excited about using C11's anonymous union feature in CPython.
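For readers who have not used the feature, here is a small sketch of a C11 anonymous union. The `tagged_value` type and its helper functions are hypothetical and are not taken from CPython; they only show how the union's members become directly accessible on the enclosing struct.

```c
#include <stdint.h>

/* Hypothetical tagged value; not a CPython structure. */
typedef struct {
    int is_float;          /* selects which union member is valid */
    union {                /* anonymous: the union has no member name */
        int64_t as_int;
        double  as_float;
    };
} tagged_value;

tagged_value make_int(int64_t v)
{
    tagged_value t;
    t.is_float = 0;
    t.as_int = v;          /* accessed directly, not as t.u.as_int */
    return t;
}

tagged_value make_float(double v)
{
    tagged_value t;
    t.is_float = 1;
    t.as_float = v;
    return t;
}

/* Read the stored number as a double, whichever member is active. */
double to_double(tagged_value t)
{
    return t.is_float ? t.as_float : (double)t.as_int;
}
```

Before C11, standard C required a named union member here, which adds an extra level to every field access.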
Viktorin also said that in order to keep the CPython public header files compatible with C++, anonymous unions could not be used in them, though Inada said that C++ does support them, "with some reasonable limitations". While CPython aims to be compatible with C++ at the API level, it is hard to completely specify what that means, or even which version of the C standard is supported, as Smith pointed out:
We're likely overspecifying in any document we create about what we require because the only definition any of us are actually capable of making for what we require is "does it compile with this compiler on this platform? If yes, then we appear to support it. can we guarantee that? only with buildbots or other CI [continuous integration]" - We're generally not versed in specific language standards (aside from compiler folks, who is?), and compilers don't comply strictly with all the shapes of those anyways for either practical or hysterical reasons. So no matter what we claim to aspire to, reality is always murkier. A document about requirements is primarily useful to give guidance to what we expect to be aligned with and what is or isn't allowed to be used in new code. Our code itself always has the final say.
The final result was a rather small patch to PEP 7 to say that CPython 3.11 and beyond use C11 without the optional features (and that the public C API should be compatible with C++). In addition, bug 46656 and a February 25 post from Stinner document the changes to the floating-point requirements; interestingly, they do not mention IEEE 754, just a requirement for a floating-point NaN. While it may have seemed like a bit of a yak-shaving exercise along the way, the GCC regression eventually led to a better understanding of which flavor of C is supported for building CPython, along with moving to a flavor from this century. All in all, a good "day's" work.
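The "C11 without the optional features" target can be related to the macros that the C11 standard itself defines. The following sketch uses hypothetical function names and is not part of PEP 7 or CPython. It reports the standard version the compiler advertises and counts how many of four optional C11 features (atomics, complex numbers, threads, and variable-length arrays) the implementation says it lacks. Code that sticks to the mandatory subset should not care about the count.

```c
/* Returns the value of __STDC_VERSION__, or 0 if the compiler does not
 * define it (for example, in a C89 mode). C11 is 201112L. */
long c_standard_version(void)
{
#ifdef __STDC_VERSION__
    return __STDC_VERSION__;
#else
    return 0L;
#endif
}

/* Returns 1 when the compiler advertises C11 or later. */
int is_at_least_c11(void)
{
    return c_standard_version() >= 201112L;
}

/* Counts optional C11 features the implementation declares missing,
 * using the standard __STDC_NO_*__ feature-test macros. */
int missing_optional_feature_count(void)
{
    int missing = 0;
#ifdef __STDC_NO_ATOMICS__
    missing++;
#endif
#ifdef __STDC_NO_COMPLEX__
    missing++;
#endif
#ifdef __STDC_NO_THREADS__
    missing++;
#endif
#ifdef __STDC_NO_VLA__
    missing++;
#endif
    return missing;
}
```

A compiler that defines some of these macros can still be a suitable target, provided the code being built avoids the corresponding features.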
Posted Mar 9, 2022 23:52 UTC (Wed)
by bartoc (guest, #124262)
Sidenote: while msvc is pretty unlikely to ever support them I actually don't think VLAs are that horrible of a feature. Compilers just have to emit stack probes! which they were all really bad about doing initially (and some even still don't with default settings!). IMO the standard should have required them to emit them unconditionally (unless the compiler can prove the size is actually < 1 page, for sure). I should write a paper, tbh.
There are ARM processors without hardware IEEE 754 support but with software IEEE 754 support, though of course the performance is not the same.
Even VAX could in theory have software IEEE 754 support.