
Toward a conclusion for Python dictionary "addition"

By Jake Edge
January 8, 2020

One of Guido van Rossum's last items of business as he finished his term on the inaugural steering council for Python was to review the Python Enhancement Proposal (PEP) that proposes new update and union operators for dictionaries. He would still seem to be in favor of the idea, but the decision will be up to the newly elected steering council and whoever the council chooses as the PEP-deciding delegate (i.e. BDFL-Delegate). Van Rossum provided some feedback on the PEP and, inevitably, the question of how to spell the operator returned, but the path toward getting a decision on it is now pretty clear.

PEP 584 ("Add + and += operators to the built-in dict class") has been in the works since last March, but the idea has been around for a lot longer than that. LWN covered a discussion back in March 2015, though it had come up well before that as well. It is a seemingly "obvious" language enhancement, at least for proponents, that would simply create an operator for dictionaries to either update them in-place or to easily create a combination of two dictionaries:

>>> d = {'spam': 1, 'eggs': 2, 'cheese': 3}
>>> e = {'cheese': 'cheddar', 'aardvark': 'Ethel'}
>>> d + e
{'spam': 1, 'eggs': 2, 'cheese': 'cheddar', 'aardvark': 'Ethel'}
>>> e + d
{'cheese': 3, 'aardvark': 'Ethel', 'spam': 1, 'eggs': 2}

>>> d += e
>>> d
{'spam': 1, 'eggs': 2, 'cheese': 'cheddar', 'aardvark': 'Ethel'}

As can be seen, the operation would not be commutative: the value for any shared key comes from the second (i.e. right-hand) operand, which makes the result order-dependent. There are some who do not see the operators as desirable features for the language, but the most vigorous discussion over the last year or so has been about their spelling, with a strong preference for using | and |= among participants in those threads, including Van Rossum.

At the beginning of December, Van Rossum posted his review of the PEP to the python-ideas mailing list. He encouraged the authors (Brandt Bucher and Steven D'Aprano) to request a BDFL-Delegate for the PEP from the steering council, noting that he would not be on the council after the end of the year. D'Aprano indicated that he would be doing so. Apparently that happened, because, tucked away in the notes from the November and December steering council meetings was a mention that a BDFL-Delegate had been assigned—none other than Van Rossum himself.

In his review, he came down strongly in favor of | and |= and had some other minor suggestions. He said: "All in all I would recommend to the SC to go forward with this proposal, targeting Python 3.9, assuming the operators are changed to | and |=, and the PEP is brought more in line with the PEP editing guidelines from PEP 1 and PEP 12." Given that, and given that he is the decision maker for the PEP, it would seem to be smooth sailing for its acceptance.

That did not stop some from voicing objections to the PEP as a whole or the spelling of the operator in particular, of course, though the discussion was collegial as is so often the case in the Python world. Van Rossum thought that | might be harder for newcomers, but was not particularly concerned about that: "I don't think beginners should be taught these operators as a major tool in their toolbox". But Ryan Gonzalez thought that beginners might actually find that spelling easier because of its congruence to the Python set union operator.

Serhiy Storchaka is not a fan of the PEP in general, but believes that | is a better choice than +. He thinks there are already other ways to accomplish the same things that the operators would provide and that their use may be error-prone. He also had a performance concern, but Brett Cannon pointed out that it might only exist for CPython; PyPy and other Pythons might not have the same performance characteristics. Furthermore:

To me this PEP is entirely a question of whether the operators will increase developer productivity and not some way to do dict merging faster, and so performance questions should stay out of it unless it's somehow slower than dict.update().
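For reference, the "other ways" that Storchaka alludes to presumably include the following; all three use only existing language features and the standard library. The variable names are illustrative, and the list is not meant to be exhaustive.

```python
from collections import ChainMap

d = {'spam': 1, 'eggs': 2, 'cheese': 3}
e = {'cheese': 'cheddar', 'aardvark': 'Ethel'}

# 1. Dictionary unpacking (Python 3.5+): builds a new dict, later keys win.
unpacked = {**d, **e}

# 2. Copy, then update in place: same result, but it takes two statements.
copied = d.copy()
copied.update(e)

# 3. ChainMap: a live view rather than a copy. The first mapping wins
#    lookups, so e is listed first to get the same precedence.
chained = ChainMap(e, d)
```

All three give 'cheese' the value from e, and none of them modifies d. The proposed operators would mostly be a more compact spelling of the first two.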

Marco Sulla made the argument that using | is illogical because sets also support a set difference operation using -, while the PEP does not propose that operator for dictionaries (though it should be noted that a previous incarnation of the PEP did have "subtraction", but it was not well-received and was dropped). Andrew Barnert felt that "illogical" was not the right reason to choose one spelling over the other:

It’s logical to spell the union of two dicts the same way you spell the union of two sets; it’s also logical to spell the concatenation of two dicts the same way you spell the concatenation of two lists. The question is which one is a more useful analogy, or which one is less potentially confusing, not which one you can come up with a convoluted way of declaring illogical if you really try.
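The two analogies Barnert mentions can be put side by side using built-in types. The sketch below also includes collections.Counter, a dict subclass in the standard library that already defines both + and | with different meanings; it is shown here only as an illustration of how the two spellings can diverge, not as part of the PEP.

```python
from collections import Counter

# Sets: | is union, so the shared element appears once.
set_union = {1, 2} | {2, 3}

# Lists: + is concatenation, so the shared element appears twice.
list_concat = [1, 2] + [2, 3]

# Counter defines both operators: + adds the counts for each key,
# while | keeps the larger of the two counts.
a = Counter(spam=1, eggs=2)
b = Counter(eggs=3, cheese=1)
counts_added = a + b
counts_max = a | b
```

For plain dictionaries neither analogy is exact: keys behave like a set, but the values for shared keys have to come from somewhere, which is where the right-hand-wins rule enters.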

Sulla continued by saying that, since list and string subtraction make no sense, it is an unfair comparison. But Chris Angelico pointed out that is not necessarily the case either, since that operation does make sense in some contexts. While he doesn't necessarily think Python should add support for those use cases, "I do ask people to be a little more respectful to the notion that these operations are meaningful". What followed was a bit of a digression into mathematics and the meaning of various operations, much of which had little to do with Python.

There were two offshoots of the discussion. "Random832" suggested a generic way to add an operator specific to a module: all code in the module could use the operator but it would not bubble out from there. Cannon thought it could be quite confusing to programmers who did not realize the operator was redefined. "And debugging this wouldn't be fun either." Storchaka brought up some performance concerns, which could perhaps be worked around, but the general reaction to Random832's idea was negative.

Jonathan Fine thought that since the proposed | operator gives preference to the right operand ("merge-right" in his terminology), there was a need for a merge-left operation. He called it gapfill(), which was a puzzling name choice to some; it would only add values for keys in the right-hand operand that were not present in the left-hand one. While the use case of, say, filling in defaults for a dictionary that holds command-line options is reasonable, there are a number of other ways to do that (as is also true for |, however). Fine did not propose that an operator be added but did note that some other Python operations could be seen to give preference to the left-hand operand, which might make the merge-right | operator confusing. There was not a lot of reaction to the idea, and it doesn't look to be going anywhere for now.
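A merge-left operation along the lines Fine describes might look something like the following. This is a guess at the semantics from the description above, not his actual proposal: the signature, the choice to return a new dictionary rather than modify the left operand, and the example option names are all assumptions.

```python
def gapfill(left, right):
    """Hypothetical merge-left helper: values in `left` win, and `right`
    only supplies keys that `left` lacks. Returns a new dict."""
    filled = dict(left)
    for key, value in right.items():
        filled.setdefault(key, value)   # no effect if the key already exists
    return filled


# Made-up option names, following the command-line use case.
defaults = {'verbose': False, 'output': 'out.txt', 'jobs': 1}
cli_options = {'verbose': True, 'jobs': 4}

settings = gapfill(cli_options, defaults)
```

The same contents can be had today with {**defaults, **cli_options}, that is, by swapping the operands of a merge-right, which may be part of why the idea drew little reaction.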

D'Aprano plans to update the PEP based on the feedback from Van Rossum and others. It presumably also needs to run the gauntlet of the python-dev mailing list before Van Rossum can decide its fate. There is still plenty of time for all of that before the Python 3.9 release, even though the project adopted a 12-month release cycle a few months back. Python 3.9 is due in early October; it's a pretty good bet that | and |= for dictionaries will make the cut. Even if they do not, though, one of the goals was to put the subject to rest once and for all; a rejected PEP would serve as a place to point those who ask about dictionary "addition" in the future.


Index entries for this article
Python: Dictionaries
Python: Python Enhancement Proposals (PEP)/PEP 584


Posted Jan 9, 2020 13:21 UTC (Thu) by mina86 (guest, #68442) [Link] (2 responses)

Is there an example of a type where the pipe operator is not commutative (as in ‘x | y’ yields a different result from ‘y | x’)? Preferring ‘|’ over ‘+’ seems strange to me, as it would set an unnecessary precedent. Using plus wouldn’t set such a precedent, and programmers are already used to addition not being commutative for all types.

Posted Jan 9, 2020 13:49 UTC (Thu) by maltekraus (subscriber, #129184) [Link] (1 responses)

Sets: {1} | {True} gives {1}, whereas {True} | {1} gives {True}.

Posted Jan 9, 2020 18:50 UTC (Thu) by NYKevin (subscriber, #129325) [Link]

I'm not entirely convinced that counts:

Python 3.7.5rc1 (default, Oct  2 2019, 04:19:31) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 1 == True
True
>>> {1} == {True}
True
>>> {'spam': 1, 'eggs': 2, 'cheese': 'cheddar', 'aardvark': 'Ethel'} == {'cheese': 3, 'aardvark': 'Ethel', 'spam': 1, 'eggs': 2}
False

Posted Jan 9, 2020 13:47 UTC (Thu) by jezuch (subscriber, #52988) [Link] (1 responses)

> Jonathan Fine thought that since the proposed | operator gives preference to the right operand ("merge-right" in his terminology), there was a need for a merge-left operation.

And thus the compromise!

|+

and

+|

:)

Posted Jan 10, 2020 12:43 UTC (Fri) by cpitrat (subscriber, #116459) [Link]

Reading this same part I thought |< and |> would provide both functionalities in an expressive way but your option has the benefit of making the + proponents happy... or not?

Posted May 29, 2020 0:56 UTC (Fri) by 0thgen (guest, #139205) [Link] (1 responses)

I'm a big fan of using the addition symbol.

It seems like the correct symbol for the union of two sets is `∪` anyway, so it's not as if using `|` is more formally correct
(source: https://en.wikipedia.org/wiki/Union_(set_theory))

Posted Mar 22, 2024 8:06 UTC (Fri) by SomeOtherGuy (guest, #151918) [Link]

| is more formally correct: the definition of union is an axiom of set theory (another axiom shows it's unique), but basically:

The union of A and B contains an element x if and only if (x is in A _OR_ x is in B)

Intersection is "and"

Some authors (notably Russian), e.g. Krzysztof Maurin, use a big "V"-like symbol for union (which is a symbol for or) and a "^"-like symbol for intersection (^, an upside-down vee, being a symbol for and)

BUT this doesn't really matter! Python isn't APL :P

Also, a map/dict represents a function. Set-theory-wise, a function from set A to set B is a relation (a subset of A times B, the Cartesian product) such that for each a in A there exists exactly one b in B with f(a)=b (though several a in A may map to the same b).

This means that if we write f(a)=b and f(a)=c, then b=c; a function is a special kind of relation.

Functions are not closed under set union; that is, the union of two functions need not be a function (indeed, it's usually not)

Isn't maths fun :)


Copyright © 2020, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds