|
|
Log in / Subscribe / Register

A "frozen" dictionary for Python

By Jake Edge
December 4, 2025

Dictionaries are ubiquitous in Python code; they are the data structure of choice for a wide variety of tasks. But dictionaries are mutable, which makes them problematic for sharing data in concurrent code. Python has added various concurrency features to the language over the last decade or so—async, free threading without the global interpreter lock (GIL), and independent subinterpreters—but users must work out their own solution for an immutable dictionary that can be safely shared by concurrent code. There are existing modules that could be used, but a recent proposal, PEP 814 ("Add frozendict built-in type"), looks to bring the feature to the language itself.

Victor Stinner announced the PEP that he and Donghee Na have authored in a post to the PEPs category of the Python discussion forum on November 13. The idea has come up before, including in PEP 416, which has essentially the same title as 814 and was authored by Stinner back in 2012. It was rejected by Guido van Rossum at the time, in part due to its target: a Python sandbox that never really panned out.

frozendict

The idea is fairly straightforward: add frozendict as a new immutable type to the language's builtins module. As Stinner put it:

We expect frozendict to be safe by design, as it prevents any unintended modifications. This addition benefits not only CPython's standard library, but also third-party maintainers who can take advantage of a reliable, immutable dictionary type.

While frozendict has a lot in common with the dict built-in type, it is not a subclass of dict; instead, it is a subclass of the base object type. The frozendict() constructor can be used to create one in various ways:

    fd = frozendict()           # empty
    fd = frozendict(a=1, b=2)   # frozen { 'a' : 1, 'b' : 2 }
    d = { 'a' : 1, 'b' : 2 }
    fd = frozendict(d)          # same
    l = [ ( 'a', 1 ), ( 'b', 2 ) ]
    fd = frozendict(l)          # same
    fd2 = frozendict(fd)        # same
    assert d == fd == fd2       # True

As with dictionaries, the keys for a frozendict must be immutable, thus hashable, but the values may or may not be. For example, a list is a legitimate type for a value in either type of dictionary, but it is mutable, making the dictionary as a whole (frozen or not) mutable. However, if all of the values stored in a frozendict are immutable, it is also immutable, so it can be hashed and used in places where that is required (e.g. dictionary keys, set elements, or entries in a functools.lru_cache).

As might be guessed, based on the last line of the example above, frozen dictionaries that are hashable can be compared for equality with other dictionaries of either type. In addition, neither the hash() value nor the equality test depend on the insertion order of the dictionary, though that order is preserved in a frozen dictionary (as it is in the regular variety). So:

    d = { 'a' : 1, 'b' : 2 }
    fd = frozendict(d)
    d2 = { 'b' : 2, 'a' : 1 }
    fd2 = frozendict(d2)
    assert d == d2 == fd == fd2

    # frozendict unions work too, from the PEP
    >>> frozendict(x=1) | frozendict(y=1)
    frozendict({'x': 1, 'y': 1})
    >>> frozendict(x=1) | dict(y=1)
    frozendict({'x': 1, 'y': 1})
For the unions, a new frozen dictionary is created in both cases; the "|=" union-assignment operator also works by generating a new frozendict for the result.

Iteration over a frozendict works as expected; the type implements the collections.abc.Mapping abstract base class, so .items() returns an iterable of key-value tuples, while .keys() and .values() provide the keys and values of the frozen dictionary. For the most part, a frozendict acts like a dict that cannot change; the specific differences between the two are listed in the PEP. It also contains a lengthy list of places in the standard library where a dict could be switched to a frozendict to "enhance safety and prevent unintended modifications".

Discussion

The reaction to the PEP was generally positive, with the usual suggestions for tweaks and more substantive additions to the proposal. Stinner kept the discussion focused on the proposal at hand for the most part. One part of the proposal was troubling to some: converting a dict to a frozendict was described as an O(n) shallow copy. Daniel F Moisset thought that it would make sense to have an in-place transformation that could be O(1) instead. He proposed adding a .freeze() method that would essentially just change the type of a dict object to frozendict.

However, changing the type of an existing object is fraught with peril, as Brett Cannon described:

But now you have made that dictionary frozen for everyone who holds a reference to it, which means side-effects at a distance in a way that could be unexpected (e.g. context switch in a thread and now suddenly you're going to get an exception trying to mutate what was a dict a microsecond ago but is now frozen). That seems like asking for really nasty debugging issues just to optimize some creation time.

The PEP is not aimed at performance, he continued, but is meant to help "lessen bugs in concurrent code". Moisset noted, that dictionaries can already change in unexpected ways via .clear() or .update(), thus the debugging issues already exist. He recognized that the authors may not want to tackle that as part of the PEP, but wanted to try to ensure that an O(1) transformation was not precluded in the future.

Cannon's strong objection is to changing the type of the object directly. Ben Hsing and "Nice Zombies" proposed ways to construct a new frozendict without requiring the shallow copy—thus O(1)—by either moving the hash table to a newly created frozendict, while clearing the dictionary, or by using a copy-on-write scheme for the table. As Steve Dower noted, that optimization can be added later as long as the PEP does not specify that the operation must be O(n), which would be a silly thing to do, but that it sometimes happens "because it makes people stop complaining", he said in a footnote. In light of the discussion, the PEP specifically defers that optimization to a later time, suggesting that it could also be done for other frozen types (tuple and frozenset), perhaps by resurrecting PEP 351 ("The freeze protocol").

On December 1, Stinner announced that the PEP had been submitted to the steering council for pronouncement. Given that Na is on the council, though will presumably recuse himself from deciding on this PEP, he probably has a pretty good sense for how it might be received by the group. So it seems likely that the PEP has a good chance of being approved. The availability of the free-threaded version of the language (i.e. without the GIL) means that more multithreaded Python programs are being created, so having a safe way to share dictionaries between threads will be a boon.


Index entries for this article
PythonDictionaries
PythonPython Enhancement Proposals (PEP)/PEP 814


to post comments

Complexity specification

Posted Dec 5, 2025 9:01 UTC (Fri) by jeeger (subscriber, #104979) [Link] (7 responses)

As Steve Dower noted, that optimization can be added later as long as the PEP does not specify that the operation must be O(n)
I might be misremembering from my Uni days, but all O(1) algorithms are also O(n), so the statement doesn't make sense. I'd be happy for someone to correct me though.

Complexity specification

Posted Dec 5, 2025 9:29 UTC (Fri) by taladar (subscriber, #68407) [Link] (6 responses)

O(1) means constant time, O(n) means the time scales with the number of inputs.

Complexity specification

Posted Dec 5, 2025 9:55 UTC (Fri) by intelfx (subscriber, #130118) [Link] (5 responses)

> O(1) means constant time, O(n) means the time scales with the number of inputs.

Obviously, yes. What the GP is saying is that functions that are O(1) are also, strictly speaking, O(n), since the big-O notation only defines, informally, an "upper bound" on the algorithmic complexity.

(The answer here is that engineers tend to casually use the big-O notation where they really mean Knuth's Θ notation instead.)

Complexity specification

Posted Dec 7, 2025 12:29 UTC (Sun) by Baughn (subscriber, #124425) [Link] (4 responses)

And can you blame us? My keyboard does not have theta on it.

If he'd used uppercase-T instead, then we'd use it.

Complexity specification

Posted Dec 7, 2025 13:21 UTC (Sun) by excors (subscriber, #95769) [Link] (3 responses)

> My keyboard does not have theta on it.

As is often the case, Knuth solved that problem too, by inventing TeX half a century ago. Now we just need LWN to implement server-side KaTeX rendering.

Complexity specification

Posted Dec 7, 2025 13:57 UTC (Sun) by dskoll (subscriber, #1630) [Link] (2 responses)

I solved it in a horrible way. Look up "theta" on Wikipedia, then paste the result: Θ

Complexity specification

Posted Dec 7, 2025 15:10 UTC (Sun) by adobriyan (subscriber, #30858) [Link] (1 responses)

Install kcharselect or equivalent.

Complexity specification

Posted Dec 10, 2025 1:56 UTC (Wed) by raven667 (subscriber, #5198) [Link]

I will also add that GNOME Characters is a search provider, so you can enable it then just hit Meta and type "theta" and it'll return all the Unicode codepoints with "theta" in their names, arrow down to select and paste in. Very handy.

Hashable mappings

Posted Dec 5, 2025 11:45 UTC (Fri) by iabervon (subscriber, #722) [Link] (3 responses)

I'm looking forward to being able to check whether two unordered collections of mappings whose keys and values are strs are equivalent without a huge hassle. The code we've needed to consider [{"t": "a", "n": "b"}, {"t": "x", "n": "y"}] the same as both [{"t": "x", "n": "y"}, {"t": "a", "n": "b"}] and [{"n": "b", "t": "a"}, {"n": "y", "t": "x"}] (decoded from JSON) has been quite a pain. There wasn't a way to have a value built out of standard library data structures that can be accessed as expected and also makes these compare equal.

Next, I want a flag to json.loads() that causes it to return hashable values instead of mutable ones (without the caller needing to know how to accomplish that).

Hashable mappings

Posted Dec 6, 2025 0:49 UTC (Sat) by AdamW (subscriber, #48457) [Link] (2 responses)

Huh. I just read the PEP, and that's in interesting detail.

It says frozendicts will be ordered, but hashes and comparisons will not care about the order.

So frozendict({"a": "b", "c": "d"}) and frozendict({"c": "d", "a": "b"}) will have the same hash and compare as equal, but they're not really the same?

I don't know how I feel about that!

Hashable mappings

Posted Dec 6, 2025 5:02 UTC (Sat) by NYKevin (subscriber, #129325) [Link]

They are "not really the same" in the sense that, if you iterate over them, you'll observe two different orders. In all other respects, they're functionally identical.

Whether this is a problem is debatable, but it is also moot. Non-frozen dicts have behaved this way forever, so making frozendict behave differently would be pretty terrible language design.

Hashable mappings

Posted Dec 6, 2025 5:23 UTC (Sat) by iabervon (subscriber, #722) [Link]

It's already the case that {"a": "b", "c": "d"} and {"c": "d", "a": "b"} compare as equal, but are distinguishable with respect to the order that iterators visit the items.

The history is that the iterator order used to be unpredictable, so the same object might give different orders when traversed multiple times and objects constructed by adding the items in different order might give the same order when traversed multiple times. However, a more recent implementation of dict started to traverse the items in the order the keys were first added, just because that was more convenient, and then the language changed to guarantee this. Of course, that meant that there was now something you could reliably determine about dicts that wasn't included in the equality rules that had always existed.


Copyright © 2025, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds