|
|
Subscribe / Log in / New account

"Structural pattern matching" for Python, part 2

"Structural pattern matching" for Python, part 2

Posted Sep 8, 2020 16:15 UTC (Tue) by ecree (guest, #95790)
In reply to: "Structural pattern matching" for Python, part 2 by johill
Parent article: "Structural pattern matching" for Python, part 2

> But I believe it can be done.

I agree. And that's a clever idea of yours to use a context manager to get the 'only mention p once' semantics.


to post comments

"Structural pattern matching" for Python, part 2

Posted Sep 8, 2020 17:39 UTC (Tue) by johill (subscriber, #25196) [Link] (3 responses)

I also just realized that I sort of sketched the basics of "algebraic matching", where the original PEP completely bypassed that issue and (x, x) doesn't even do what seems obvious? Or am I reading that wrong?

Of course some more complex expressions would be ... hard. Probably not impossible, but hard. You'd have to make the CapturedVar() be able to track all kinds of things, if you wanted to be able to write

...
    if m := matches((capture.x(int), capture.x(int)**2)):
        print("Integer point on the parabola at %d" % m.x)
...
It still seems possible, but ... tricky to say the least. Effectively, "CapturedVar" would have to be able to be operated on in a symbolic fashion, always returning a new CapturedVar object that encapsulates the expression, and can later evaluate it. So given an instance
  x_square := CapturedVar('x') ** 2
you'd have to be able to calculate
x_square(7)
so you can compare to
capture.x(7)
.

But you can see the ambiguity here! I previously allowed

capture.x(int)
to indicate you wanted a specific type ...

This would be better if the types were to rely on type annotations, but I don't think you can do that in this fashion.

I suppose you could detect if the argument was a type or an instance or something? But tricky to differentiate ...

Oh, I have an idea. The

x_square(7)
calculation is purely internal, so we can do it as
x_square(value=y)
(say the signature is "__call__(self, type=None, value=None)" so you can differentiate more easily. The value passing syntax is only needed internally to evaluate when you have a given thing in hand.

Well, basically ... it get complex, but I haven't hit anything I couldn't do.

Now I'm tempted to actually implement it :-)

"Structural pattern matching" for Python, part 2

Posted Sep 8, 2020 18:33 UTC (Tue) by johill (subscriber, #25196) [Link] (2 responses)

But actually...

while all of that works nicely with tuples because you can build some knowledge of them into the code, it really doesn't work at all with what the PEP calls "Class Patterns".

You can't write "m := matches(Point(capture.x, capture.y))" because you quite probably cannot instantiate a Point() object with two CapturedVar instances.

so you'd have to rewrite that as something like "m := matches_instance(Point, capture.x, capture.y)" at which point (pun intended) new syntax seems to make sense...

"Structural pattern matching" for Python, part 2

Posted Sep 8, 2020 23:39 UTC (Tue) by ecree (guest, #95790) [Link] (1 responses)

because you quite probably cannot instantiate a Point() object with two CapturedVar instances.

But can't you, though?

As long as CapturedVar suitably 'ducks' its type (e.g. for ints, implementing + and **, like you suggested), Point.__init__ should be entirely happy with it unless it's trying to do things that depend on the actual values.

Sure, it's not something that will work for every class, but nor is the rest of the "Match Protocol" in PEP 622. And if the class has a __match_args__, then you could potentially have a matching.Duck that uses that to "play the rôle of" the class in a match expression:

from matching import match, capture, Duck

with match(p) as matches:
    if m := matches(Duck(Point)(capture.x(int), capture.x(int))):
        print("Diagonal at %d" % (m.x, ))
    elif m := matches(capture.p(Point)(capture.x(int), capture.y(int))):
        # the second call on capture.p implies a Duck
        # if the match succeeds, it is replaced with the concrete Point
        print("Other point at %s % (m.p, ))

So you'd use plain Point() for dataclasses and anything else that's not too fussy, Duck(Point)() otherwise.

Still not quite as pretty as the PEP 622 syntax, but it still seems like a good way to get experience with "how this feature would be used in practice" before baking it into the language. Which to my mind is reason enough to do it.

"Structural pattern matching" for Python, part 2

Posted Sep 9, 2020 6:22 UTC (Wed) by johill (subscriber, #25196) [Link]

> As long as CapturedVar suitably 'ducks' its type (e.g. for ints, implementing + and **, like you suggested), Point.__init__ should be entirely happy with it unless it's trying to do things that depend on the actual values.

I don't think that'll work. It has to implement the algebraic operations to build a parse tree to be able to do algebraic matching, but you still can't really call Point.__init__ with it I'd think, it might check the type, or other random things in there. Hence the match protocol though!

> Sure, it's not something that will work for every class, but nor is the rest of the "Match Protocol" in PEP 622. And if the class has a __match_args__, then you could potentially have a matching.Duck that uses that to "play the rôle of" the class in a match expression: [...]

But yeah, you're right, the PEP also cannot do magic wrt. class matching, it has __match_args__ for that, and indeed with something like the "Duck()" you propose - or what I wrote as "m := matches_instance(Point, capture.x, capture.y)" - you can indeed do the same thing.

I neglected to think enough about it I suppose, but (fairly obviously) the only magic in the PEP can be the syntax, making "Point(x, y)" not be an instantiation call but some other kind of expression. Without new syntax we have to open that up manually by doing "Duck(Point)(capture.x, capture.y)" or my "m := matches_instance(Point, capture.x, capture.y)".

The rest ... yeah can probably be done, now that I think about it. In mostly the same ways.

And that shows to some extent where and how the algebraic matching breaks down and probably why they punted on it, though I'm not convinced it's that bad. Obviously, you don't want to get into the territory of writing "Point(x**2, x**3)" and suddenly needing an arbitrary solver to do the match, but I think you could do it if you restrict to having to mention each variable "naked" at least once. I.e. "Point(x, x**3)" would be OK and "Point3d(x, y, x+y)" would also be OK - all of this because once you have values for x/y it's easy to check the remaining expressions. But something like "Point(x+y, x-y)" would not be allowed.


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds