
"Structural pattern matching" for Python, part 2

Posted Sep 2, 2020 21:32 UTC (Wed) by mathstuf (subscriber, #69389)
In reply to: "Structural pattern matching" for Python, part 2 by nybble41
Parent article: "Structural pattern matching" for Python, part 2

Changing variable scope to be anything other than the parent function/class/module declaration would be a massive change. Anything digging around in frame objects would need to learn to poke wherever these new scopes end up being declared. You'd basically need to add variable declarations too since this fairly common pattern would be broken:

if foo is not None:
    local_foo = foo
else:
    local_foo = DefaultValue()
(Sure, here `foo` could probably be reused, but code like this still exists.)
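
A minimal sketch of why that pattern works today: a name bound inside an `if`/`else` branch belongs to the enclosing function's scope, so it is still usable after the branch ends. `DefaultValue` and `pick` are hypothetical names used only for illustration.

```python
class DefaultValue:
    """Hypothetical stand-in for the fallback object in the snippet above."""


def pick(foo=None):
    if foo is not None:
        local_foo = foo
    else:
        local_foo = DefaultValue()
    # local_foo is still visible here: if/else does not open a new scope.
    return local_foo


print(pick(42))               # 42
print(type(pick()).__name__)  # DefaultValue
```

With block-level scoping and no declaration syntax, `local_foo` would presumably vanish at the end of each branch, which is the breakage being described.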



"Structural pattern matching" for Python, part 2

Posted Sep 2, 2020 23:24 UTC (Wed) by nybble41 (subscriber, #55106) [Link] (7 responses)

> Changing variable scope to be anything other than the parent function/class/module declaration would be a massive change.

Yes, I anticipated that. I think it would be worth it, since the only way to get reasonable scopes in current Python involves nested functions or lambdas, but in the end that's not for me to decide.

> Anything digging around in frame objects would need to learn to poke wherever these new scopes end up being declared.

Making it harder to mess around with frame objects seems like a feature, not a bug. Making them completely inaccessible would be even better. (And, apart from debugging, how would you even use such a thing without creating a very tight coupling with the function's internal implementation? I'm not seeing any realistic scenario where this code would not need to be updated anyway.)

> You'd basically need to add variable declarations too…

I think that could be avoided by simply defaulting to function scope, as current Python does, when assigning to an unknown variable. Nested block scopes would be introduced only by specific statements where binding makes sense, such as loop variables in "for" statements, "as" clauses in "with" statements (as per the original version of PEP 343[1]), or placeholders in "match" statements. A "local" keyword could be added to explicitly introduce scoped variables into the current block, similar to the "global" and "nonlocal" keywords, if there is a demand for it.

[1] https://docs.python.org/2.5/whatsnew/pep-343.html
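
For context, a small sketch of the current behaviour that this proposal would change: the names bound by "for" and by "with ... as" outlive their statements. `leak_demo` is a made-up name, and `io.StringIO` is used only as a convenient context manager.

```python
import io


def leak_demo():
    for item in range(3):
        pass
    with io.StringIO("text") as stream:
        pass
    # Both names are still bound after the statements that introduced them.
    return item, stream.closed


print(leak_demo())  # (2, True)
```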

"Structural pattern matching" for Python, part 2

Posted Sep 3, 2020 7:06 UTC (Thu) by NYKevin (subscriber, #129325) [Link] (6 responses)

> Yes, I anticipated that. I think it would be worth it, since the only way to get reasonable scopes in current Python involves nested functions or lambdas, but in the end that's not for me to decide.

How exactly would you define "reasonable?"

IMHO, the Python rule is very straightforward and easy to remember. Moreover, I've never run into a situation where it actually causes a problem of some sort. Sure, variables hang around for a little longer than they would otherwise, but so what? If you really care, you can explicitly write del x, which is probably a Good Idea anyway, because if you care, then you should call attention to the fact that you care (not-caring is the default state of affairs, so x must be special in some way).
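
A short sketch of the explicit `del` approach, assuming a temporary that should not be touched after one use (`process` and `scratch` are illustrative names):

```python
def process():
    scratch = [n * n for n in range(4)]
    total = sum(scratch)
    del scratch  # explicit signal: this name must not be used past this point
    try:
        scratch
    except UnboundLocalError:
        return total, "scratch is gone"
    return total, "scratch still bound"


print(process())  # (14, 'scratch is gone')
```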

> Making it harder to mess around with frame objects seems like a feature, not a bug. Making them completely inaccessible would be even better.

And while we're at it, let's put the e in creat(2), where it should've been all along.

You can't break backcompat on a feature just because you personally dislike that feature. It exists. It needs to be supported. Unless we want to do *another* 3.x-style flag day, frame objects are here to stay.

> I think that could be avoided by simply defaulting to function scope as current Python does when assigning to an unknown variable.

As I mentioned in the comments the last time we discussed this idea:

- There is no such thing as an "unknown" variable, except at the global scope.
- It is always possible to determine at compile time which variables belong to which scopes, with the exception of the global scope. The bytecode compiler will then emit bytecode that only consults the single scope where the variable actually lives, or consults the global scope if the variable is not recognized.
- The global scope is not known at compile time, for a number of reasons, most obviously the fact that from foo import * is legal at the global scope (it's a SyntaxError anywhere else). This doesn't matter, because the "unknown = global" algorithm is good enough for the current design.
- Because the global scope is not known at compile time, it is not possible to determine at compile time whether an unrecognized variable in a match expression is a global or a nonexistent variable that should be created and bound to the case's scope.
- Therefore, allowing match expressions to create a new scope does not actually solve this problem, and would introduce a significant amount of complexity for no adequately justified reason.
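
The compile-time decision described in the list above can be observed on code objects. This is only a sketch: `reader`, `writer`, `confused`, and `counter` are illustrative names, and `counter` is deliberately never defined at module level.

```python
def reader():
    return counter   # never assigned here: compiled as a global lookup


def writer():
    counter = 1      # assigned here: compiled as a local
    return counter


def confused():
    value = counter  # local, because of the assignment on the next line
    counter = 2
    return value


print(reader.__code__.co_varnames)  # ()
print(reader.__code__.co_names)     # ('counter',)
print(writer.__code__.co_varnames)  # ('counter',)

try:
    confused()
except UnboundLocalError as exc:
    print("local decided at compile time:", type(exc).__name__)
```

Nothing here consults the module's globals until the code runs, which is why the compiler cannot tell an existing global from a name that should be freshly bound.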

"Structural pattern matching" for Python, part 2

Posted Sep 3, 2020 14:17 UTC (Thu) by sbaugh (guest, #103291) [Link] (5 responses)

>Moreover, I've never run into a situation where it actually causes a problem of some sort.

Isn't the fact that Python functions aren't true closures a direct consequence of its lack of proper lexical scope? I'd say that causes a *lot* of problems:

>>> fs = [lambda: i for i in range(5)]
>>> [f() for f in fs]
[4, 4, 4, 4, 4]

"Structural pattern matching" for Python, part 2

Posted Sep 3, 2020 22:48 UTC (Thu) by NYKevin (subscriber, #129325) [Link] (4 responses)

lambda does create a new lexical scope already, and it doesn't help with that problem. In order for your example to work "correctly," you would need to establish a new lexical scope for each iteration (not just one for the whole loop, because it would still have i=4 at the end of the loop). I would find that very hard to reason about, because that's not what I think of when I hear the word "lexical."

What (I think) you really want is capture-by-value semantics instead of capture-by-name, which has nothing to do with whether or not the loop has a separate scope.
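
As a sketch of what "a new scope for each iteration" amounts to in today's Python, a helper function can supply that scope. `make_getter` is a hypothetical name.

```python
def make_getter(value):
    # Each call creates a fresh scope, so each lambda closes over its own `value`.
    return lambda: value


late = [lambda: i for i in range(5)]         # all share the comprehension's `i`
fresh = [make_getter(i) for i in range(5)]   # one scope per element

print([f() for f in late])   # [4, 4, 4, 4, 4]
print([f() for f in fresh])  # [0, 1, 2, 3, 4]
```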

"Structural pattern matching" for Python, part 2

Posted Sep 9, 2020 13:37 UTC (Wed) by milesrout (subscriber, #126894) [Link] (3 responses)

And of course there is also a very easy solution: lambda i=i: i captures i “properly”. Perhaps not very intuitive for beginners, but once you've learnt it you've learnt it.
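
Applied to the earlier list-of-lambdas example, the default-argument idiom looks roughly like this (a sketch; the last line shows one caveat, namely that a caller can override the default):

```python
fs = [lambda i=i: i for i in range(5)]

print([f() for f in fs])  # [0, 1, 2, 3, 4]
print(fs[2](10))          # 10
```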

"Structural pattern matching" for Python, part 2

Posted Sep 9, 2020 14:37 UTC (Wed) by mathstuf (subscriber, #69389) [Link] (2 responses)

Hmm. That doesn't seem fool-proof:

>>> d = []
>>> f = lambda x=d: print(x)
>>> f()
[]
>>> d.append(3)
>>> f()
[3]

Unless I'm missing what you're trying to capture here?

"Structural pattern matching" for Python, part 2

Posted Sep 10, 2020 2:11 UTC (Thu) by milesrout (subscriber, #126894) [Link] (1 responses)

That's conflating two separate things: capturing names and capturing names that refer to mutable values.

>>> a = 0
>>> b = []
>>> f = lambda: print(a)
>>> g = lambda a=a: print(a)
>>> h = lambda: print(b)
>>> i = lambda b=b: print(b)
>>> (f(), g(), h(), i())
0
0
[]
[]
(None, None, None, None)
>>> a = 1; b.append(3)
>>> (f(), g(), h(), i())
1
0
[3]
[3]
(None, None, None, None)

f captures the *name* 'a', which refers to the immutable value 0. g has an optional parameter, the default value of which is the value of a i.e. the immutable value 0. Like any Python function, g will store internally references to the default values for its parameters. It can't 'capture' the variable a because you could put *any* expression as the default value for g's parameter, and that expression is evaluated *when the function is defined*, with the resulting value stored inside the closure object somewhere.

When f is called, it prints a i.e. the value referred to by the name 'a' that f captured. If 'a' has changed then what f prints will change. When g is called, it prints the value of its parameter or if it isn't given one, the default value, which is zero. The 'a' in f refers not to the value that 'a' had when f was created, but to the name 'a' itself in the definition environment of f.

When a is rebound to 1, it doesn't change the value that a was originally bound to. 'a = 1' is an operation on the name 'a', not an operation on the value pointed to by a. We reassign 'a' so that it refers to a different value, but we don't modify the value that 'a' previously pointed to. So when f() is called a second time, it prints 1 (the new value pointed to by a) while when g() is called a second time it prints 0 (the value you can think of as being pointed to by some 'g.__internal_stuff__.parameters[0].default_value' attribute).

h and i actually work exactly the same way, but with one crucial difference: the value that the name 'b' refers to is a list and is thus mutable. So when we do the 'b.append(3)' operation we are not changing the name 'b'! We're modifying the value *pointed to by b*. When we defined i, we evaluated the expression "b" and pointed some internal "this is the default value of my parameter" field of the closure object at the result of that evaluation. Evaluating the expression "b" results in that list object, the list object pointed to by the name 'b'.

So when we call h(), it prints the value pointed to by the name 'b'. When we call i(), it prints the value of its parameter, if one is given, or else the value pointed to by some internal 'default' reference for that parameter. That default reference hasn't been changed; it still points to the same object. However, that object itself has been changed by the call to append. We only ever created *one* list object, and there are two references to it.
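
The internal default-value storage described above is exposed in CPython as the function's `__defaults__` tuple. A small sketch that inspects it, reusing the names from the transcript (the lambdas return rather than print, for brevity):

```python
a = 0
b = []
g = lambda a=a: a
i = lambda b=b: b

a = 1        # rebinds the name; g's stored default is untouched
b.append(3)  # mutates the one list that both `b` and i's default refer to

print(g.__defaults__)          # (0,)
print(i.__defaults__)          # ([3],)
print(i.__defaults__[0] is b)  # True
```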

The 'mutable default value' issue for Python functions basically has nothing to do with variable capture at all. It's present even for normal function definitions:

def foo(x=[]):
    x.append(1)
    print(x)

>>> foo()
[1]
>>> foo()
[1, 1]
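
For comparison, the usual way around the shared default is a `None` sentinel. This is a sketch that returns the list instead of printing it:

```python
def foo(x=None):
    if x is None:
        x = []  # a fresh list on every call
    x.append(1)
    return x


print(foo())  # [1]
print(foo())  # [1]
```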

The main problem that the variable capture issue causes happens even with immutable values, and in fact is primarily an immutable value issue. The first time I encountered it was with a GUI library. I was doing something like this:

>>> for i in range(5):
...     buttons[i].on_click(lambda: setattr(buttons[i], "colour", RED))

but of course the problem with this is that all the buttons will make the last button red when clicked. That's not because of a mutable value: the value pointed to by i doesn't change. The problem here is that a for loop assigns to the iteration variable. That code is the same as writing

>>> i = 0
>>> buttons[i].on_click(lambda: setattr(buttons[i], "colour", RED))
>>> i = 1
>>> buttons[i].on_click(lambda: setattr(buttons[i], "colour", RED))
etc.

And there it's clearer what's happening vis-à-vis name capture. The solution is of course to instead write buttons[i].on_click(lambda i=i: setattr(buttons[i], "colour", RED)).
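
A runnable sketch of the button example with the fix applied. The `Button` class, its `on_click`/`click` methods, and `RED` are hypothetical stand-ins, not any real GUI library's API; `setattr` is used because a lambda body cannot contain an assignment statement.

```python
RED = "red"


class Button:
    """Hypothetical stand-in for a GUI button."""

    def __init__(self):
        self.colour = "grey"
        self._handler = None

    def on_click(self, handler):
        self._handler = handler

    def click(self):
        self._handler()


buttons = [Button() for _ in range(5)]
for i in range(5):
    # i=i freezes the current index as the handler's default argument.
    buttons[i].on_click(lambda i=i: setattr(buttons[i], "colour", RED))

buttons[1].click()
print([b.colour for b in buttons])  # ['grey', 'red', 'grey', 'grey', 'grey']
```

An alternative with the same effect would be `functools.partial(setattr, buttons[i], "colour", RED)`, which binds the button object itself instead of the index.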

"Structural pattern matching" for Python, part 2

Posted Sep 10, 2020 13:21 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

Ah, I see now what you're doing. Yes, that's a nifty trick (assuming you can guarantee that you're only using rebound variables rather than holding a window into a variable someone else eventually mutates). But it's Python and I consider that kind of thinking just necessary for making robust software in that language.
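
If the captured object might be mutated by someone else later, one option is to copy it when the function is defined. A sketch, assuming a shallow copy is enough for the data in question:

```python
d = []
live = lambda x=d: x            # default shares the list with `d`
snapshot = lambda x=list(d): x  # default is a shallow copy made at definition time

d.append(3)
print(live())      # [3]
print(snapshot())  # []
```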

