
Unpacking for Python comprehensions

By Jake Edge
November 21, 2025

Unpacking Python iterables of various sorts, such as dictionaries or lists, is useful in a number of contexts, including for function arguments, but there has long been a call for extending that capability to comprehensions. PEP 798 ("Unpacking in Comprehensions") was first proposed in June 2025 to fill that gap. In early November, the steering council accepted the PEP, which means that the feature will be coming to Python 3.15 in October 2026. It may be something of a niche feature, but it is an inconsistency that has been apparent for a while—to the point that some Python programmers assume that it is already present in the language.

Unpacking

One of the most common use cases for unpacking is to pass a list or dictionary to a function as a series of its elements rather than as a single object. The "*" unpacking operator (and its companion, "**", for dictionaries) can be used to expand an iterable in that fashion; a simple example might look like the following:

    def foo(a, b):
        return a*b

    l = [ 2, 4 ]  # list
    t = ( 3, 5 )  # tuple
    d = { 'a' : 2, 'b' : 9 } # dict
    foo(*l)   # foo(2, 4) == 8
    foo(*t)   # foo(3, 5) == 15
    foo(**d)  # foo(a=2, b=9) == 18
In each case, the unpacking operator extracts the elements of the iterable to turn them into individual arguments to the function—keyword arguments for the dictionary.

Python comprehensions provide a mechanism to build a list or other iterable using a compact syntax, rather than a full loop. A classic example of that appears in the Python documentation linked just above:

    squares = []
    for x in range(10):
        squares.append(x**2)

    # can be replaced with

    squares = [ x**2 for x in range(10) ]
For both of those, the result is a list with the squares of the numbers zero through nine.

The unpacking operator can already be used to create iterables, such as:

    a = [ 1, 2 ]
    b = [ 3, 4 ]
    c = [ *a, *b ]  # [ 1, 2, 3, 4 ]

    # dictionaries can be merged in similar fashion

    newdict = { **d1, **d2, **d3 }
In the latter example, the order of the dictionaries matters; keys that are duplicated take their value from the last dictionary where they were set (i.e. d3[key] takes precedence over the value for key in d1 or d2).
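The last-wins rule can be seen in a short sketch; the dictionaries and keys here are made up purely for illustration:

```python
# Hypothetical dictionaries; the keys are chosen only for illustration.
d1 = {"host": "localhost", "port": 80}
d2 = {"port": 8080}
d3 = {"debug": True}

merged = {**d1, **d2, **d3}
# 'port' comes from d2, the last dictionary that sets it
print(merged)  # {'host': 'localhost', 'port': 8080, 'debug': True}
```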

But what if there is a list of lists with a length that is not known except at run time? Currently, there is no easy way to use a list comprehension to build the flattened list of all of the entries of each list; it can be done using a comprehension with two loops, but that is error prone. There are other possibilities too, as described in the "Motivation" section of the PEP, but all of them suffer from semi-obscurity or complexity. Instead, Python will be allowing unpacking operators in comprehensions:

    # from the PEP
    
    [*it for it in its]  # list with the concatenation of iterables in 'its'
    {*it for it in its}  # set with the union of iterables in 'its'
    {**d for d in dicts} # dict with the combination of dicts in 'dicts'
    (*it for it in its)  # generator of the concatenation of iterables in 'its'

    # a usage example

    a = [ 1, 2 ]
    b = [ 3, 4, 5 ]
    c = [ 6 ]
    its = [ a, b, c ]
    [ *it for it in its ]   #  [ 1, 2, 3, 4, 5, 6 ]

    # current double-loop version

    [ x for it in its for x in it ]  #  [ 1, 2, 3, 4, 5, 6 ]
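For comparison, the alternatives that the PEP's "Motivation" section weighs can be tried today; itertools.chain.from_iterable() is the usual idiom for flattening one level of nesting:

```python
from itertools import chain

its = [[1, 2], [3, 4, 5], [6]]

# the standard-library way to flatten one level of nesting
flat = list(chain.from_iterable(its))   # [1, 2, 3, 4, 5, 6]

# the double-loop comprehension produces the same result
assert flat == [x for it in its for x in it]
```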

Discussion

The PEP authors, Adam Hartz and Erik Demaine, actually proposed the idea back in 2021 on the python-ideas mailing list. As noted in that message, though, the idea also came up in 2016 and perhaps even before that. In 2021, the proposal was generally well-received, and reached the pre-PEP stage, but was unable to attract a core developer as a sponsor. In late June, Hartz posted a lengthy pre-PEP message to the ideas category of the Python discussion forum, "hoping to find a sponsor for moving forward with the PEP process if there's still enthusiasm behind this idea".

Hartz noted that the idea had been raised in October 2023, as well, so it is a feature that is frequently brought up—generally to nodding approval. That 2023 message was posted by Alex Prengère, who was quick to reply to the pre-PEP saying that he had been working on unpacking in comprehensions as well. He, along with others, wondered about support for unpacking in asynchronous comprehensions (as described in PEP 530); it was not mentioned in the pre-PEP, but the implementation would allow them, he said. Hartz said that the intent was to support them and that he would update the text to reflect that.

There was also some discussion regarding the syntax for making function calls using a generator comprehension (e.g. f(x for x in it)); there is some ambiguity in the meaning, as Ben Hsing noted. PEP 448 ("Additional Unpacking Generalizations") added the ability to use the unpacking operators in more contexts, including function call arguments, but it explicitly did not extend that to generator comprehensions because it was not clear which meaning should be chosen. As Hsing put it:

That is, which one of these is intended?
    f(*x for x in it) == f((*x for x in it))
or:
    f(*x for x in it) == f(*(x for x in it))
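Only the second interpretation is expressible in current Python; a small sketch, with an illustrative f() that simply reports its arguments, shows what it does:

```python
def f(*args):
    # illustrative stand-in: just return the arguments received
    return args

it = [(1, 2), (3, 4)]

# f(*(x for x in it)): the generator is unpacked into separate arguments
result = f(*(x for x in it))
print(result)  # ((1, 2), (3, 4))

# The other reading, f((*x for x in it)), would instead pass a single
# generator yielding 1, 2, 3, 4 -- expressible only with PEP 798 syntax.
```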

In the thread, Hartz and others argued that, since the language already allows a generator comprehension (without any unpacking operators) as an argument without requiring an extra set of parentheses (e.g. f(x for x in it)), the same should be true for those with the unpacking operator. In the reply linked above, Hartz noted that the error message for the syntax error points in that direction as well:

A little bit of support in this direction, perhaps, comes from the way that the syntax error for f(*x for x in it) is reported in 3.13, which suggests that this is interpreted as f(<a single malformed generator expression>) rather than as f(*<something>):
    >>> f(*x for x in its)
      File "<python-input-0>", line 1
        f(*x for x in its)
          ^^
    SyntaxError: iterable unpacking cannot be used in comprehension

The draft PEP continued to be discussed; it was updated a few days after it was first posted, on June 25, and then again on July 3. The latter posting caught the eye of core developer Jelle Zijlstra who said that it "is a nice little feature that I've missed several times in the past" and that he was willing to sponsor it. That started its path into the CPython mainline.

The now-numbered PEP 798 was posted for discussion in the PEPs category of the forum on July 19. Along the way, the PEP had picked up some extra pieces, including a section with examples of where standard library code could be simplified using the feature and an appendix on support in other languages. Most of the comments at that point were about other features that might also be considered, though Hartz and Zijlstra tried to keep things focused on the PEP itself.

One outstanding issue was the treatment of synchronous generator expressions versus the asynchronous variety. The PEP, in the form discussed at the time, made a distinction between the two because "yield from" is not permitted in asynchronous generators; as we will see, that would change. Another appendix goes into more detail; the difference comes down to whether the generator-protocol methods, such as send(), can be used. There are two ways that the semantics of an unpacking generator expression could be defined:

    g = (*x for x in it)

    # could be:

    def gen():
        for x in it:
            yield from x
    g = gen()

    # or:

    def gen():
        for x in it:
            for i in x:
                yield i
    g = gen()
Either of those works for synchronous generators, but:
    g = (*x async for x in ait())

    # must be:

    async def gen():
        async for x in ait():
            for i in x:
                yield i
    g = gen()
So the question is whether it makes sense to define the synchronous semantics differently so that those comprehensions could potentially use the generator-protocol methods. Hartz ran a poll in the thread, with several possibilities for the semantics, but no real consensus was reached—perhaps unsurprising given the esoteric nature of the question and that thread participants had likewise been unable to converge on the semantics.
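The practical difference is visible with plain generators today: with yield from, a value passed to send() reaches the inner generator, while an explicit loop absorbs it. A minimal sketch, with made-up generators:

```python
def inner():
    # yields 1, then yields back whatever was sent in
    received = yield 1
    yield received

def delegating():
    # yield from forwards send() to the inner generator
    yield from inner()

def looping():
    # the explicit loop does not forward sent values
    for i in inner():
        yield i

g = delegating()
next(g)            # 1
print(g.send(42))  # 42 -- send() reaches inner()

g = looping()
next(g)            # 1
print(g.send(42))  # None -- the sent value is absorbed by the loop
```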

In mid-September, after more than a month of quiet in the thread, Hartz submitted the PEP to the steering council for consideration. The council started looking at it a month later, with council member Pablo Galindo Salgado noting that the group was uncomfortable with positioning the new syntax "as offering 'much clearer' code compared to existing alternatives (such as itertools.chain.from_iterable, explicit loops, or nested comprehensions)" because readability is in the eye of the beholder. Instead, the council suggested that "the stronger, more objective argument is syntactic consistency as extending Python's existing unpacking patterns naturally into comprehensions". Hartz agreed and adjusted the PEP accordingly.

In the thread, "Nice Zombies" highlighted part of the "Rationale" section of the PEP, which nicely illustrates the argument for syntactic consistency:

This proposal was motivated in part by a written exam in a Python programming class, where several students used the proposed notation (specifically the set version) in their solutions, assuming that it already existed in Python. This suggests that the notation represents a logical, consistent extension to Python's existing syntax. By contrast, the existing double-loop version [x for it in its for x in it] is one that students often get wrong, the natural impulse for many students being to reverse the order of the for clauses.

One of the examples given in the PEP shows how an explicit loop to create a set could be changed in the shutil module of the standard library:

    # current:
    ignored_names = []
    for pattern in patterns:
        ignored_names.extend(fnmatch.filter(names, pattern))
    return set(ignored_names)

    # proposed:
    return {*fnmatch.filter(names, pattern) for pattern in patterns}
Instead of extending the list from the iterable returned by fnmatch.filter(), then converting it to a set, the new syntax allows creating the set directly. The existing code could have taken advantage of set.update() to avoid using the list, but the new syntax is in keeping with the ideas behind comprehensions—and was apparently intuitively obvious, but wrong, to Python students.
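That set.update() alternative can be sketched as follows; fnmatch is from the standard library, but the file names and patterns here are invented for illustration:

```python
import fnmatch

# made-up inputs for illustration
names = ["a.py", "b.txt", "c.py", "d.md"]
patterns = ["*.py", "*.md"]

# avoids the intermediate list by updating the set directly
ignored_names = set()
for pattern in patterns:
    ignored_names.update(fnmatch.filter(names, pattern))

print(ignored_names == {"a.py", "c.py", "d.md"})  # True
```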

In its announcement of the PEP's acceptance, the SC also decided the question about generator comprehensions: "we require that both synchronous and asynchronous generator expressions use explicit loops rather than yield from for unpacking operations". That removes some advanced use cases "that are rarely relevant when writing comprehensions" but simplifies the mental model for the new feature. "We don't believe that developers writing comprehensions should have to think about the differences between sync and async generator semantics or about generator delegation protocols."

While it is certainly useful, the feature is not revolutionary in any sense; it simply fills a fairly longstanding hole that has been noticed and discussed several times over the years. Python is a mature language at this point, so revolutions are likely to be few and far between—if not absent entirely. The whole tale shows, however, that, with some persistence, a well-written PEP, and a well-shepherded discussion (by Hartz, joined by Zijlstra; Demaine was absent this time around), changes can be made. Future Python students can rejoice starting next October.



Would’ve failed the question too.

Posted Nov 21, 2025 19:06 UTC (Fri) by glueless (guest, #180321) [Link]

I’m surprised that set example *wasn’t* valid syntax already, but then again, I remember trying that before and hitting the same issue! This definitely makes a lot more sense and I’m looking forward to this feature.

Pythonic Extraction and Report Language

Posted Nov 22, 2025 1:14 UTC (Sat) by Cyberax (✭ supporter ✭, #52523) [Link] (1 responses)

Is it just me, or the "current" notation is just better than the proposal?

Also, when do we get to have a nice periodic table of operators, like in Perl ( https://www.ozonehouse.com/mark/periodic/ )?

Pythonic Extraction and Report Language

Posted Nov 22, 2025 15:35 UTC (Sat) by rsidd (subscriber, #2582) [Link]

Are you referring to the set example
    return {*fnmatch.filter(names, pattern) for pattern in patterns}
or to the list PEP example?
    [ *it for it in its ]   #  [ 1, 2, 3, 4, 5, 6 ]

In both cases I find the new version better, but in the set example, I agree it's not a big difference. For lists I have done the nested-for version many times, it's not pretty.

List comprehension example looks strange?

Posted Nov 22, 2025 15:45 UTC (Sat) by rsidd (subscriber, #2582) [Link] (9 responses)

I thought list comprehensions (which are very cool, and I believe were borrowed from Haskell into Python) are a syntactic sugar for map/filter and not for imperative code. That is,
    squares = [ x**2 for x in range(10) ]
is a shorter version of
    squares = list(map(lambda x: x**2, range(10)))
and not
    squares = []
    for x in range(10):
        squares.append(x**2)
That is, the first two are purely functional with no side-effects (and, at least in Haskell-equivalent, the first is just syntactic sugar for the second), while the third is imperative, with side-effects (it increments the initially null list one element at a time), and at least to a functional programmer, ugly.

After writing this, I looked at the linked python docs and the docs indeed make the same point, that the comprehension is an alternative syntax for map/filter and not for the imperative version.

List comprehension example looks strange?

Posted Nov 22, 2025 19:09 UTC (Sat) by iabervon (subscriber, #722) [Link] (8 responses)

I believe that list(iterable) actually initializes the return value empty and adds each item from the iterable, but you can't tell within the language because the value isn't in the environment until everything has been added. So it isn't actually different, so long as you don't open-code the imperative one (in which case you'd have the still-being-mutated list in your environment).

If you hang out with Haskell implementors, you find out that the runtime is actually doing a lot of imperative things that it can prove you can't tell are happening.

List comprehension example looks strange?

Posted Nov 23, 2025 3:39 UTC (Sun) by rsidd (subscriber, #2582) [Link] (7 responses)

Well yes, functional programming refers to the high-level language and guarantees that it gives. Low-level implementations cannot be functional, bits have to be flipped ...

List comprehension example looks strange?

Posted Nov 23, 2025 14:38 UTC (Sun) by iabervon (subscriber, #722) [Link] (6 responses)

That's directly true in that you need to fill in values in newly allocated memory and reclaim memory, but there's also the less required case of mutating objects as the only reference gets replaced with a very similar object. For example, there's a lot of code where the high-level description says you get a new list with one more element, but the low-level implementation actually mutates the list in place because there won't be a reference to the old list afterwards. Also, recursive loops with tail calls get further optimized into mutating the stack frame instead of popping it off and pushing a new frame with some bindings in common, resulting in what would sensibly decompile to an imperative loop. These go beyond the need to deal with finite machines that can't just find you memory with the bit pattern you want already there, to changing allocated memory that's been read before in a way that's only necessary for getting acceptable performance.

List comprehension example looks strange?

Posted Nov 23, 2025 16:11 UTC (Sun) by rsidd (subscriber, #2582) [Link] (5 responses)

Yes, pure functional programming even if possible at low level would be horribly inefficient. Tail recursion is a good example: recursion is inefficient, every functional language does tail recursion (more generally, tail call optimization) turning function calls into loops or just removing them when the result is being returned directly.

I find that learning a bit about functional programming (ocaml, Haskell) improved my programming in other languages; but the "pure" functional approach of Haskell is not worth it for me. Things like matrix multiplications, dynamic programming, etc, if implemented naively in Haskell are incredibly inefficient. Iteration over the indices in matrices is just the most natural way to describe such things. You can still aim to write your functions to not have side effects as far as possible: eg, when returning a new matrix as a product of two input matrices, don't modify the input, but use for loops when that's just more readable.

To do it in Haskell, you need to get into lazy evaluation, memoization, ... which, for most programmers, just adds layers of complexity. Or you could use a functional language that implements matrix multiplication as a primitive (using the Fortran-based BLAS under the hood, which is decidedly not functional), but you're still stuck when implementing your clever new algorithm.

Matrix multiplication

Posted Nov 23, 2025 19:39 UTC (Sun) by willy (subscriber, #9762) [Link] (1 responses)

I've never implemented a matrix multiplication algorithm myself, but I have read with fascination the stories of those who have.

It's a genuinely hard problem when you take into account realities like the size of L1/L2/L3 caches, and for the gargantuan cases, even DRAM.

I don't think it's even possible to create an efficient algorithm that mutates one of the input matrices. Maybe if both matrices are small and square, but those aren't the interesting cases.

I'd say that functional languages really should have matrix multiplication as a primitive (or library function) since that opens up the possibility of using a compiler intrinsic or even CPU instruction for the cases where it fits. So I 100% agree with you about using BLAS or similar under the covers; why implement it more than once? If it's good enough for the meteorologists, it's probably good enough for me.

Matrix multiplication

Posted Nov 24, 2025 4:34 UTC (Mon) by rsidd (subscriber, #2582) [Link]

I was talking about the trivial algorithm, good enough for small matrices:
A_{ik} = \sum_j B_{ij} C_{jk}
The natural way to write it is with three nested for-loops. But it also applies to cleverer implementations. Also, it's fine if you don't mutate B or C, but the normal way to do it is to initialize A and then fill it up with those for-loops; but that's not "functional" enough for Haskell.

Haskell matrix multiplication

Posted Nov 24, 2025 23:36 UTC (Mon) by DemiMarie (subscriber, #164188) [Link]

Haskell has mutable arrays in the ST monad, so you can use algorithms that need mutation without it appearing in your API.

You still won’t be able to match the performance of the hand-tuned SIMD assembly language in OpenBLAS, though.

List comprehension example looks strange?

Posted Nov 25, 2025 14:31 UTC (Tue) by mathstuf (subscriber, #69389) [Link] (1 responses)

> To do it in Haskell, you need to get into lazy evaluation, memoization

IIUC, Eigen does this in C++. Operators return thunks rather than actually performing the operation right away so that when it is evaluated, an optimizer can go over the entire expression and try to optimize it at that point (e.g., noticing common subexpressions and only doing it once). It's also the reason you shouldn't use `auto` for Eigen types at API boundaries.

List comprehension example looks strange?

Posted Nov 25, 2025 17:04 UTC (Tue) by excors (subscriber, #95769) [Link]

As I understand it, the "optimiser" in this case is not the C++ compiler's optimiser. Lazy evaluation allows Eigen to provide its own higher-level optimisation steps, executed at compile time with template metaprogramming.

E.g. it will recognise that `m1 = m1 * m2` must allocate a temporary, whereas `m1 = m1 + m2` or `m1 = m2 * m3` can output directly into m1. (It can tell the difference because it defers the decision until the implementation of `=`.)

It can analyse a complex matrix multiplication involving transpose/adjoint/conjugate/scalar-multiplication/etc, and refactor the expression so the core is a pure multiplication that can be passed to a hand-optimised GEMM routine. And it can vectorise expressions into SIMD intrinsics, making use of its knowledge of alignment and aliasing etc, instead of relying on the compiler's more limited auto-vectorisation.

(https://libeigen.gitlab.io/eigen/docs-5.0/TopicLazyEvalua..., https://libeigen.gitlab.io/eigen/docs-5.0/TopicWritingEff...)

I think that's quite different to Haskell. Haskell's laziness is (semantically) at run-time, so it applies directly to all your numerical algorithms, whereas Eigen is only lazy in the metaprogram and generates eagerly-evaluated C++ code at compile time. The important part is the ability to do compile-time execution that can manipulate the program's matrix expressions - lazy evaluation is not used because it's a good mechanism, it's used because C++ doesn't provide any less horrible ways for metaprograms to manipulate expressions.

This is NOT just a syntactic change

Posted Nov 23, 2025 18:15 UTC (Sun) by xi0n (guest, #138144) [Link] (1 responses)

Rather, it is fundamentally extending the set of operations that comprehensions can do. Right now, they are a map/filter process; after this change, they’ll be a map/filter/reduce process, except the reduce part is limited to only one possible function.

Given that it alters comprehensions on a deep level, I don’t think this PEP should be just waved through, as the sentiment on the mailing list seems to suggest. Care should be taken to understand how it interacts with all types of comprehensions, including potential ones that don’t exist yet, and how to deal with the possibility of further enhancements that the proposal enables (most obvious being the choice of function for the reduce part).

This is NOT just a syntactic change

Posted Nov 30, 2025 0:37 UTC (Sun) by NYKevin (subscriber, #129325) [Link]

> after this change, they’ll be a map/filter/reduce process, except the reduce part is limited to only one possible function.

You're straining to make it sound exotic and non-standard, but in reality Python already has a name for this reduce-like operation, as do many other languages. Python calls it itertools.chain.from_iterable(). Rust calls it Iterator::flatten(). Some versions of SQL call it UNNEST(). And so on. The proposal merely attaches syntactic sugar to a well-known operation that already exists in Python's stdlib (and those of many other languages).

The main "downside," if you can call it that, is that it doesn't neatly fit into the worldview in which all looping must be expressed in terms of the Three Magic Words. But Python never conformed to that interpretation, and neither do most of the other languages with this feature. Rust even has Iterator::flat_map(), which is literally the exact combination of mapping and reducing that Python would provide with this syntax extension.


Copyright © 2025, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds