
Python and the infinite

By John Coggeshall
October 13, 2020

A recent proposal on the python-ideas mailing list would add a new way to represent floating-point infinity in the language. Cade Brown suggested the change; he cited a few different reasons for it, including fixing an inconsistency in the way the string representation of infinity is handled in the language. The discussion that followed branched in a few directions, including adding a constant for "not a number" (NaN) and a more general discussion of the inconsistent way that Python handles expressions that evaluate to infinity.

In general, Python handles floating-point numbers, including concepts like infinity, following the standards laid out by IEEE 754. Positive and negative infinity are represented by two specific floating-point values in most architectures. Currently, representing a floating-point infinite value in Python can be done using a couple of different mechanisms. There is the float() function, which can be passed the string "inf" to produce infinity, and there is the inf constant in the math library, which is equivalent to float('inf'). Brown provided several reasons why he believed a new, identical, and built-in constant was necessary. One of his reasons was that he felt that infinity is a "fundamental constant" that should be accessible from Python without having to call a function or require a library import.
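A quick interactive check, using only the standard library, shows that the existing spellings produce the same value:

```python
import math

# The two current ways to spell floating-point infinity in Python.
a = float('inf')   # via the float() constructor
b = math.inf       # via the math module constant, added in Python 3.5

print(a == b)               # True: the values are identical
print(a > 1e308)            # True: greater than any finite double
print(-a == float('-inf'))  # True: negative infinity works the same way
```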

To highlight the issue, Brown provided an example using the repr() and eval() functions. The repr() function converts a data structure to a printable string that in many cases can be evaluated back into the original data structure using eval(). Brown noted that, unlike other floating-point values, using eval() on the printable representation of inf results in an exception unless inf has been imported from math:

    >>> eval(repr(float('inf')))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<string>", line 1, in <module>
    NameError: name 'inf' is not defined

Brown's post created a mega-thread around how floating-point infinity values should be handled in Python. To some, it wasn't clear why importing inf was a bad solution to Brown's problem. Additionally, Greg Ewing noted that using the eval() function in the way suggested by Brown can be "a serious security problem in some contexts." The eval() function evaluates a string as Python code, which is dangerous if the string contains user-supplied data. Ewing suggested that the safer function to use would be literal_eval() from the ast library, which only evaluates literal values. Using that function would only work, however, if inf and nan were built-in constants.
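The trade-off Ewing describes can be demonstrated directly: eval() resolves inf once it is a visible name, while ast.literal_eval() rejects it, because a bare name is not a literal:

```python
import ast
from math import inf

# eval() works once inf has been imported, but it executes
# arbitrary code, which is dangerous with untrusted input.
print(eval('inf'))   # inf

# ast.literal_eval() is safe, but rejects names like inf and nan
# because they are names, not literals.
try:
    ast.literal_eval('inf')
except ValueError as e:
    print('literal_eval rejected it:', e)
```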

Christopher Barker agreed, but noted another issue: "there is no way to use literal_eval that gets you an inf (or NaN value). Which is a real, though maybe not important, use case." Later in the thread, Barker further refocused the conversation on the fundamental point that Brown's post was leading to:

What's special is that we have a literal syntax for only a few special fundamental types: floats, ints, strings. (I [think] that's it, yes?), as well as "display" versions of a few core container types: list, dict, etc.

So the limitation here is that floats have a literal, and can have a huge number of values, all of which can be created by a literal, except inf,[-inf], and NaN.

Barker did not think that it was a "critically important" issue, but he did feel that extending the concept of a valid literal for the float type to include inf and nan was useful. He also claimed that it shouldn't require a new keyword. Barker admitted that if a new keyword were the only way to implement Brown's idea, it would be "dead in the water." Ewing suggested a possible implementation using a "special bytecode that looks it up as a name, and if that doesn't work, returns float('inf')," but noted that it was more straightforward and functionally the same to move math.inf and math.nan to __builtins__.

The problem with adding these constants to __builtins__, as Ewing noted, was that Guido van Rossum was not in favor of that idea. To that point, Barker responded by stating that, while Van Rossum has a "very well respected opinion", he is no longer the Benevolent Dictator for Life who could make this decision unilaterally.

Barker clarified that his goal was "to do that little bit more to make Python better support the float special values out of the box." He noted that, while PEP 754 ("IEEE 754 Floating Point Special Values") was rejected due to inactivity, almost all of the suggestions it made had ultimately been implemented, except for a version of the built-in literals for infinity and NaN now being proposed.

Van Rossum responded by expressing his willingness to sponsor a PEP for the Python steering council to consider. While he did not personally agree with the idea, he committed to remaining neutral so that Barker and Brown could receive "a fair hearing" with that body. With Van Rossum's sponsorship support, the duo put together a draft PEP that is now available on GitHub. Reflecting the thread, the PEP proposes "the introduction of built-in variables for floating point 'Infinity' (inf) and 'Not a Number' (nan)." Also included in the draft is an example summarizing the impact of the proposed change:

>>> inf
inf
>>> inf > 1e9
True
>>> eval('inf')
inf
>>> assert inf == eval('inf')
>>> inf == eval(repr(inf))
True
>>> import ast
>>> inf == ast.literal_eval('inf')
True

The discussion that evolved from Brown's post did more than lead to a new PEP; it also sparked an interesting conversation around how Python handles expressions that evaluate to mathematical infinity. Paul Bryan provided a relevant example of some quirks of the IEEE specification, specifically how subtracting infinity from itself results in NaN, but comparing the NaN from that operation doesn't equal math.nan:

>>> import math
>>> math.inf - math.inf
nan
>>> (math.inf - math.inf) == math.nan
False

Steven D'Aprano explained this strange behavior. According to IEEE 754, NaN values are, by definition, never equal to each other. To be otherwise, D'Aprano explained, would result in nonsensical expressions like sqrt(-2) == sqrt(-3) evaluating to True.
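A short session illustrates both the rule and its consequence:

```python
import math

x = math.inf - math.inf   # an IEEE 754 "invalid operation", yielding nan
print(x != x)             # True: NaN is the only float unequal to itself
print(x == math.nan)      # False: two NaNs never compare equal
print(math.isnan(x))      # True: math.isnan() is the correct test
```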

The conversation prompted a question from Stephen J. Turnbull, who asked if "base" Python could "create an infinity." Van Rossum had suggested earlier in the thread that you could produce inf from the value 1e1000 "in a pinch". However, Turnbull noted that a value like 1e1000, which cannot be represented in IEEE floating point, is technically an overflow rather than an infinity. While IEEE 754 permits mapping an overflow to infinity, that is not the same thing as the abstract mathematical concept of infinity. In another message, he expanded on this idea:

As Steven [D'Aprano] points out, [1e1000 is] an overflow, and IEEE *but not Python* is clear about that. In fact, none of the actual infinities I've tried (1.0 / 0.0 and math.tan(math.pi / 2.0)) result in values of inf. The former raises ZeroDivisionError and the latter gives the finite value 1.633123935319537e+16.

Turnbull explained that arithmetic involving infinite values only makes sense if it is carefully analyzed in the context of a specific calculation. The problem is that Python is inconsistent on the matter: some expressions that should evaluate to the IEEE infinity do not, some evaluate to inf when they are really overflows, and still others raise exceptions. As Turnbull said:

I prefer to think of it as being honest: this isn't infinity, this is overflow — and the way Python treats infs, here be Dragons, all bets are off, "magic is loose in the world", and anything can happen.
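The behaviors Turnbull describes are easy to reproduce in current CPython:

```python
import math

# Division by zero raises an exception instead of returning IEEE infinity.
try:
    1.0 / 0.0
except ZeroDivisionError as e:
    print('raises:', e)

# tan() at the approximate pole returns a large finite value, because
# math.pi / 2 is only the nearest double to the true pole.
print(math.tan(math.pi / 2.0))   # 1.633123935319537e+16

# Yet an overflowing literal silently maps to inf.
print(1e1000)                    # inf
```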

Ben Rudiak-Gould provided some examples of Python's inconsistencies when dealing with various mathematical expressions that represent an infinity. Each one should ideally evaluate to inf, but most raised exceptions like ValueError or OverflowError instead; "I get the impression that little planning has gone into this", he added.
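Two of the cases cited in that message can be reproduced directly:

```python
import math

# lgamma() has a pole at 0 with a two-sided limit of +inf,
# but raises ValueError rather than returning math.inf.
try:
    math.lgamma(0)
except ValueError as e:
    print('lgamma(0):', e)

# exp(1000) overflows the double range; under IEEE 754 default handling
# it could round to inf, but Python raises OverflowError instead.
try:
    math.exp(1000)
except OverflowError as e:
    print('exp(1000):', e)
```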

Brown agreed that "a look should be taken at the structure of math-based errors and exceptions," adding that "the exceptions make little sense." Since the types of changes required to make Python consistent in this regard would represent a significant backward-compatibility break, it seems unlikely that they will be addressed soon.

As for the PEP, the next step is for Brown and Barker to submit the draft as a pull request to the PEP repository, have a PEP number assigned, and bring the proposal up again for further discussion. Before the steering council or its delegate can accept or reject it, though, the PEP will need to be discussed on the python-dev mailing list. Time will tell whether Brown's wish to move inf and nan into __builtins__ is granted, and what, if anything, comes of the inconsistent way that Python handles concepts like infinity throughout the language.


Index entries for this article
Python; Floating point



Python and the infinite

Posted Oct 13, 2020 18:02 UTC (Tue) by mb (subscriber, #50428) [Link] (2 responses)

>In fact, none of the actual infinities I've tried (1.0 / 0.0

Division by zero is not an actual infinity.

https://en.wikipedia.org/wiki/Division_by_zero

Python and the infinite

Posted Oct 13, 2020 18:44 UTC (Tue) by NYKevin (subscriber, #129325) [Link]

This is in the context of IEEE 754, for which 1.0 / 0.0 most certainly should be (positive) infinity (assuming the zero is positive, of course). IEEE 754 does allow 1.0 / 0.0 to trap, but this is explicitly an optional extension, and not the default way that things are "supposed to" work.

(Why do that? Because IEEE 754 is designed for doing calculations with a limited number of significant digits. The zero may actually stand for a non-zero quantity which is too small to represent, and was rounded to zero. This is also why we preserve the sign of the zero in the rounding step - so we can reproduce the correctly-signed infinity.)
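That rationale is visible in Python itself, which preserves the sign of zero even though the two zeros compare equal:

```python
import math

neg_zero = -0.0
print(neg_zero == 0.0)               # True: the zeros compare equal
print(math.copysign(1.0, neg_zero))  # -1.0: but the sign is preserved

# Underflow rounds a tiny negative result to -0.0, keeping the sign
# that a later division would need in order to produce -inf.
print(-1e-323 / 1e308)               # -0.0
```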

Python and the infinite

Posted Oct 24, 2020 13:45 UTC (Sat) by eduperez (guest, #11232) [Link]

And infinity is not even a number...

Python and the infinite

Posted Oct 13, 2020 19:33 UTC (Tue) by smurf (subscriber, #17840) [Link] (1 responses)

> math.pi / 2.0

Since π is a transcendental number it isn't representable in a computer's floating point format, thus the above expression is not equal to π/2, thus the resulting large-but-not-infinite value is correct.

Python and the infinite

Posted Oct 14, 2020 22:39 UTC (Wed) by marcH (subscriber, #57642) [Link]

Nice.

Of course a "transcendental" number isn't even required to have fun with IEEE rounding. Stackoverflow (pun intended) is enough:

>>> a = 1.9999999999999998e+00
>>> b = 1 / (1 / a)
>>> 1 / (a - b)
4503599627370496.0

=> large but not infinite.

Python and the infinite

Posted Oct 13, 2020 20:21 UTC (Tue) by josh (subscriber, #17465) [Link] (1 responses)

The type `float` is available in Python without importing anything. Why not add inf and nan as attributes on `float`, so that they're canonically spelled `float.inf` and `float.nan`?
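One way to picture the suggestion is a float subclass carrying the two class attributes. This is a hypothetical sketch (the xfloat name is invented); the real built-in float type has no such attributes:

```python
# Hypothetical sketch of the proposed float.inf / float.nan spelling,
# modeled with a subclass since CPython's float has no such attributes.
class xfloat(float):
    inf = float('inf')
    nan = float('nan')

print(xfloat.inf)        # inf
print(xfloat.inf > 1e9)  # True
```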

Python and the infinite

Posted Oct 13, 2020 21:47 UTC (Tue) by NYKevin (subscriber, #129325) [Link]

Those aren't literals, so ast.literal_eval() won't accept them.

IMHO, however, this whole discussion is Not A Bug (except for the "some calculations should produce inf, but they trap instead" part - that should be fixed, but I imagine that's easier said than done). Nobody ever promised that eval(repr(x)) == x. The purpose of repr() is to be unambiguous, not to round-trip perfectly. The proper "I want to round-trip data to/from a string-ish type, and I don't care about arbitrary code execution" function is pickle.dumps(). Or, for simple data structures including floats, json.dumps() (which also patches the arbitrary code execution hole).
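Both round-trip paths handle the special values, though json does so only through a non-standard extension:

```python
import json
import math
import pickle

data = [math.inf, -math.inf, math.nan]

# pickle round-trips the special values exactly.
restored = pickle.loads(pickle.dumps(data))
print(restored[0] == math.inf)   # True
print(math.isnan(restored[2]))   # True

# json emits Infinity/-Infinity/NaN, which strict JSON parsers reject;
# pass allow_nan=False to make dumps() raise ValueError instead.
print(json.dumps(data))          # [Infinity, -Infinity, NaN]
```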

Python and the infinite

Posted Oct 13, 2020 21:48 UTC (Tue) by koh (subscriber, #101482) [Link] (2 responses)

I disagree with Ben who writes

> [...] lgamma(0) raises a ValueError,
> which isn't even a subclass of ArithmeticError. The function has a pole at
> 0 with a well-defined two-sided limit of +inf. If it isn't going to return
> +inf then it ought to raise ZeroDivisionError, which should obviously be a
> subclass of OverflowError.

The subclassing should be different: The ValueError actually should be an OutOfDomainError and as such be a super-class of the more specific DivisionByZero, however it should also be a subclass of ArithmeticError. OverflowError on the other hand has nothing to do with arithmetic but with representation. Overflows are not specific to functions: Functions have a domain and therefore there might be values outside of their domain. Overflows in contrast only occur when the value is not representable in the specific type, but that implies that there actually is a value and the case that "the argument" (whatever that might be in the context of representations) "is not in the domain" cannot happen. IEEE specifies a way of handling overflows. Other types might as well. The underlying functions do not have that "represented type" context.

For instance, integer division in some languages results in an integer again, others define its result to be of rational type. In the first case, over-/underflows and out-of-domain can happen, while in the second case there can not be an underflow.
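The hierarchy sketched above might look like this (hypothetical exception names, not part of the stdlib):

```python
# Hypothetical restructuring: domain errors as arithmetic errors,
# with division by zero as a specific kind of domain error.
class OutOfDomainError(ArithmeticError):
    """The argument lies outside the mathematical domain of the function."""

class DivisionByZero(OutOfDomainError):
    """Division by zero, a special case of an out-of-domain argument."""

print(issubclass(DivisionByZero, ArithmeticError))  # True
```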

Python and the infinite

Posted Oct 13, 2020 22:12 UTC (Tue) by NYKevin (subscriber, #129325) [Link] (1 responses)

Overflow, as you describe it, isn't really a thing in Python, outside of "weird" library modules like ctypes and struct (which could probably just use ValueError anyway). Python's integers are arbitrary precision, and as you say, its floating point numbers should silently over/underflow as specified in IEEE 754 (but as the article notes, this is unfortunately not the case). That would leave the fractions module, but that's also arbitrary precision, and the decimal module, but that has its own little universe of "signals" for this sort of thing (see https://docs.python.org/3.9/library/decimal.html#signals). So maybe they should just kill OverflowError altogether?

Python and the infinite

Posted Oct 13, 2020 23:06 UTC (Tue) by koh (subscriber, #101482) [Link]

Interesting, thanks for the pointer to decimal's signals! Looks like they've taken a more sane approach there than for the math.* float functions. It is similar to what MPFR does, except they have a "context" maintaining "working precisions", etc. while MPFR has that for every operation (by way of the initialisation of the result for an operation). Overflow there seems to appear in order to bound the exponent which is also maintained by the context.

What does puzzle me is why they
a) keep the concepts of NaN and infinities even for decimal,
b) additionally allow for Overflow, Underflow and Subnormal "signals", and
c) let some of those functions raise OverflowError (which is different from the Overflow signal), e.g. as_integer_ratio() on the above infinities.

Wouldn't a) + b) be enough? Or c) without the others?

My take on IEEE-754 default semantics (no traps, just non-normal value representations) is "no alteration to control flow", which makes sense for numerical algorithms expecting a few spurious outliers in mega-/giga-/whatever large quantity of samples and not aiming for rigor. Looks like the motivation for Python's operations inside the decimal type was the same here, only c) exists if you go out of this type. But instead of making use of Python's non-strictly typedness they chose to force altering control flow by raising errors instead of returning None or something else. Is that a common pattern in Python's stdlib?

Python and the infinite

Posted Oct 14, 2020 8:32 UTC (Wed) by MKesper (subscriber, #38539) [Link] (1 responses)

Since the types of changes required to make Python consistent in this regard would represent a significant backward-compatibility break, it seems unlikely that they will be addressed soon.

Could this be put behind some kind of feature gate à la from __future__ import consistent_inf?

Python and the infinite

Posted Oct 14, 2020 12:44 UTC (Wed) by hkario (guest, #94864) [Link]

Python has made changes like that before, with async first being introduced as a builtin and then promoted to keyword. If `inf` remains a builtin, that still allows you to do something like `inf = "abracadabra"` and use `inf` as a regular string variable, just like you can do with `bytes` now

so no, it will not break existing code

Python and the infinite

Posted Oct 14, 2020 11:03 UTC (Wed) by LtWorf (subscriber, #124958) [Link] (1 responses)

Meanwhile the only way known to me to get the type of None, to give type hints, is to do type(None)

Python and the infinite

Posted Oct 14, 2020 15:19 UTC (Wed) by mgedmin (guest, #34497) [Link]

You don't have to use type(None) in type hints; you're allowed to use None itself.

https://docs.python.org/3/library/typing.html#type-aliases
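In annotations, None is accepted directly and is understood as type(None):

```python
from typing import Optional, Union

# None in a return annotation stands for type(None).
def greet(name: str) -> None:
    print(f"hello, {name}")

# Optional[str] is shorthand for Union[str, None].
print(Optional[str] == Union[str, None])  # True
```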

Python and the infinite

Posted Oct 14, 2020 18:35 UTC (Wed) by kevincox (subscriber, #93938) [Link]

It bothers me that they are lowercase unlike other constants. I think they should be Inf and NaN to match True, False and None.

Python and the infinite

Posted Oct 14, 2020 22:43 UTC (Wed) by marcH (subscriber, #57642) [Link] (1 responses)

tl;dr : stick to Julia for number crunching?

Python and the infinite

Posted Oct 15, 2020 19:09 UTC (Thu) by smitty_one_each (subscriber, #28989) [Link]

I was thinking that, on the one hand, python isn't the preferred choice for serious mathematical endeavors;
on the other hand, giving python substantial mathematical heft would be useful.

Python and the infinite

Posted Oct 15, 2020 8:30 UTC (Thu) by Homer512 (subscriber, #85295) [Link]

Any reason why the overflow / zero division error cannot be handled with a thread-local context? The current behavior could then remain the default for backwards compatibility. Something like this:

with math.trap(None):
    x = y / z

with math.trap():
    try:
        u = v / w
    except ZeroDivisionError:
        pass


Copyright © 2020, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds