The winding road to PHP 8's match expression

By John Coggeshall
September 2, 2020

New to the forthcoming PHP 8.0 release is a feature called match expressions, which is a construct designed to address several shortcomings in PHP's switch statement. While it took three separate request-for-comment (RFC) proposals in order to be accepted, the new expression eventually received broad support for inclusion.

The match expression story began at the end of March 2020 with an RFC suggesting changes to the switch statement. The proposal, authored by Ilija Tovilo and Michał Brzuchalski, highlighted four shortcomings of switch: the inability to return values from the statement, matches falling through to the next case, "inexhaustiveness", and type coercion.

The problem with switch

The PHP switch statement is one of the oldest constructs in the language, and shares a few common traits with the C variety; the proposal suggests that these common behaviors aren't ideal for PHP. For example, each case within a switch will fall through to the next case unless there is a break (or continue) statement. Also like C, switch is not an expression and therefore does not return any values. If a developer wants to, for example, assign a value from the logic of a switch construct to a variable, they need to do so explicitly as part of the corresponding case statement.
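
The two behaviors described above can be illustrated with a minimal sketch. The describeStatus() function and its status labels are hypothetical, chosen only for illustration; they do not come from the RFC:

```php
<?php
// Hypothetical helper, not from the RFC: maps an HTTP status code to a label.
function describeStatus(int $code): string
{
    // switch is a statement, so the result has to be assigned inside each case.
    switch ($code) {
        case 200:
            $label = 'OK';
            break;
        case 301:
            // No break here: execution falls through into case 302.
        case 302:
            $label = 'Redirect';
            break;
        case 404:
            $label = 'Not found';
            // Forgotten break: falls through and overwrites $label below.
        default:
            $label = 'Unknown';
    }

    return $label;
}

echo describeStatus(301), "\n"; // Redirect
echo describeStatus(404), "\n"; // Unknown (the missing break is the bug)
```

Here the fall-through from case 301 into case 302 is intentional, while the one from case 404 into default is a forgotten break; nothing in the language distinguishes the two.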

The RFC further suggests that the way switch handles types is incorrect for modern PHP. Since PHP is fundamentally a dynamically typed language, switch employs type coercion (called "type juggling" in the PHP manual). This means passing a string "0" into a PHP switch statement will match against an integer case 0: block, which might make less sense as PHP continues to embrace types in modern versions. Finally, the RFC points out that the switch statement is "inexhaustive", meaning that it is not an error if no case block was found to match.
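
A short sketch of both points, assuming a hypothetical classify() helper that is not part of the RFC:

```php
<?php
// Hypothetical helper: classifies a value with a switch statement.
function classify($value): string
{
    $result = 'no case matched';

    // switch compares with loose equality (==), so types are juggled.
    switch ($value) {
        case 0:
            $result = 'integer zero';
            break;
        case 1:
            $result = 'integer one';
            break;
    }

    // No default and no error: an unmatched value silently skips the switch.
    return $result;
}

echo classify("0"), "\n"; // integer zero: the string "0" matches case 0
echo classify(7), "\n";   // no case matched
```

The second call shows the "inexhaustive" behavior: the value 7 matches nothing, and execution simply continues after the switch.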

The path to a solution

To address these concerns, the RFC proposed a new "expression variant" of switch:

    $result = switch($condition) {
        1 => foo(),
        2 => bar(),
        3, 4, 5 => baz(),
    };

This expression would operate in a similar fashion to the switch statement; however, the value of the expression in the matched arm would be returned as an assignable value. The switch expression would also be stricter than its statement counterpart, eliminating fall-throughs and throwing an exception if a match was not found.

When Tovilo introduced the RFC (Brzuchalski did not participate in the discussion), it became clear that, as written, the proposal wasn't going to go far. In a limited discussion, Dan Ackroyd responded with several fundamental problems regarding the approach. While Ackroyd agreed there were issues with switch, he did not feel the RFC was a "good starting point" for that discussion. Ackroyd continued by itemizing his concerns, adding that, since the RFC did not address type coercion, "it's way less interesting to me". In Ackroyd's view, the proposed expression should use a new keyword, which would avoid the risk of confusing developers by reusing switch.

The initial response to the RFC prompted Tovilo to conduct a poll to try to find the best way forward for the concept. When sharing the informal vote with the community, Tovilo explained the reasoning:

There's been a fundamental disagreement on what the switch expression should actually be. Due to the conflicting feedback I no longer know how to proceed.

In response to the poll (and the justification for it), Rowan Tommins said "I think this confusion is there in your proposal, not just in people's responses to it", adding "I think changing the keyword to 'match', and possibly using strict comparison [not type coercion], would make a compelling feature."

In the end, the poll received five responses, not enough to build any consensus around the value of the proposal. The lack of appetite by the internals community for the RFC led Tovilo and Brzuchalski to withdraw it from consideration before a vote, replacing it the next day with a different proposal without Brzuchalski.

The new proposal introduced the match keyword, addressed type coercion, and fixed ambiguities in the original RFC structure to make the idea clearer; it yielded significantly more discussion than the original. Tommins seemed to appreciate the changes, yet still had several concerns. For example, Tovilo had suggested that the proposed match expression "allows you to use a match expression anywhere the switch statement would have been used previously." That was an idea Tommins took issue with:

I don't think it's necessary for the new keyword to be a replacement for every switch statement, any more than switch replaces every if statement or vice versa, and doing so adds a lot of complexity to the proposal.

Tovilo and Tommins, along with other community members, continued the debate of the proposal at length, mostly with regard to the statement blocks being added for match. Here is an example of a statement block from the proposal:

    $y = match ($x) {
        0 => {
            foo();
            bar();
            baz(); // This value is returned
        },
    }

PHP currently has no concept of statement blocks, and many community members found it strange to see them included as part of the proposed match expression syntax. Tommins, in particular, felt strongly that it was a mistake, though he liked the proposal otherwise; he expanded on this problem later in the thread, suggesting that statement blocks could be added in the future with more consideration. Another community member, Larry Garfield, supported Tommins in his suggestion to move statement blocks to a different proposal:

I really feel like this is the best solution. Multi-line expressions are not a logical one-off part of match(). They may or may not be a good idea for PHP generally, in which case they should be proposed and thought through generally. If they do end up being a good move generally, they'll apply to match naturally just like everywhere else; if not, then they wouldn't confuse people by being a one-off part of match() and nowhere else.

Tovilo declined to remove the proposed feature. Tommins also suggested separating the statement block issue into another vote within the RFC. Still, Tovilo didn't agree, saying:

If we were to remove blocks we'd probably also reconsider other things (namely the optional semicolon and break/continue) and many examples in the RFC would become invalid and irrelevant. This would probably lead to even more confusion which is why I will most likely not move blocks to an additional vote.

However, I will definitely include a poll to find out why it failed. I am committed to getting this into the language, in some form or another.

When voting opened for the proposal in late April 2020, it was handily defeated in a 28 to 6 vote. A secondary vote within the RFC, "If you voted no, why?", indicated that one of the primary reasons that the proposal failed was the inclusion of statement blocks.

For a time, it appeared that was the end of the discussion regarding the match expression. However, in late May 2020, Tovilo returned with a third proposal for a match expression. This third RFC importantly did not include statement blocks and, unlike previous proposals, focused solely on the core features the match expression needed to have.

The policy on rejected RFCs within the PHP community states six months must pass between the rejection of a proposal and its re-submission. That is unless the "author(s) make substantial changes to the proposal." Tovilo argued that this applied, as "many people have said without blocks they'd vote yes". This argument was accepted without much challenge from the community, allowing the proposal to move forward.

The final proposal for the match expression addressed all of the fundamental concerns that Tovilo originally laid out, wrapped neatly into a new expression. match expressions compare values strictly (without type coercion), throw an exception if no match is found, and eliminate the concept of "fall through". Perhaps unsurprisingly, this less expansive version of Tovilo's proposal was received positively; most comments and discussion from the community were about using => or : in the syntax. The RFC passed in a 43 to 2 vote in time for inclusion into PHP 8.0. Below is a representative example of how match expressions in PHP 8 are used:

    $foo = getSomeValue();

    try {
        // this uses strict checking of types when looking
        // for a match (0 !== "0").
        $result = match($foo) {
            404 => 'Page not found',
            Response::REDIRECT => 'Redirect',
            $client->getCode() => 'Client error',
        };
    } catch(UnhandledMatchError $ex) {
        print "Did not match!";
    }
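
Since the example above depends on names defined elsewhere (getSomeValue(), Response, and $client), the following self-contained sketch may be easier to try out. It assumes PHP 8.0 or later, and the statusLabel() helper is hypothetical:

```php
<?php
// Hypothetical helper: maps a status code to a label using match (PHP 8.0+).
function statusLabel($code): string
{
    // match is an expression, so its value can be returned directly.
    return match ($code) {
        200 => 'OK',
        301, 302 => 'Redirect', // comma-separated conditions share one arm
        404 => 'Page not found',
    };
}

echo statusLabel(302), "\n"; // Redirect

try {
    // Strict comparison: the string "404" is not identical to the integer 404.
    echo statusLabel("404"), "\n";
} catch (\UnhandledMatchError $e) {
    echo "Did not match\n";
}
```

Adding a default arm would make the expression exhaustive, in which case the exception is never thrown.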

While it took a few tries, the match expression looks to be a useful addition to the language. More examples of the match expression in PHP 8.0 are available for interested readers; it is also available in the latest PHP 8.0beta2 release.




The winding road to PHP 8's match expression

Posted Sep 2, 2020 17:49 UTC (Wed) by professor (subscriber, #63325) [Link] (6 responses)

I often use the "matches falling through to the next case (if no break/continue)" functionality in PHP switch. It is one of its better features in my opinion; lucky me that they created a separate construct for their modern stuff.

What I could have wished for in the old days is that continue behaved differently in a switch statement and also that better type checking would have been possible with === (in some way), but, as always, if you know PHP you still know how to use it safely.

The winding road to PHP 8's match expression

Posted Sep 2, 2020 17:59 UTC (Wed) by professor (subscriber, #63325) [Link] (5 responses)

Well, 'often' was maybe too exaggerated, but when it happens I like it.

The winding road to PHP 8's match expression

Posted Sep 2, 2020 18:50 UTC (Wed) by mathstuf (subscriber, #69389) [Link] (4 responses)

Maybe better wording would be "when I intend it to happen, it is useful"? That's my experience with it in C. There have been numerous times that I've been disappointed to have to add a missed `break;` in a switch statement (C, but that's where PHP got its idea for switch; if C didn't have it, I really doubt PHP would have fallthrough).

The winding road to PHP 8's match expression

Posted Sep 2, 2020 20:10 UTC (Wed) by professor (subscriber, #63325) [Link] (3 responses)

Yes, because I know when I want the behaviour! PHP is, in my opinion, a powerful extension for C.
If I want a break I would put it there, and if not I would not. Today's (-fifteen years) "I want everything on a silver plate and stuff I don't know about is someone else's fault and therefore bad design" attitude is just weird. PHP has always been what it is according to the manual (which is one of the best), but if you don't want to read it and don't know what you are doing, you just don't know. Even if this is "crap", it made it possible to create some awesome (first) results without knowing anything (which made it great but also makes it the black sheep nowadays).

yes, i love PHP :)

The winding road to PHP 8's match expression

Posted Sep 2, 2020 21:41 UTC (Wed) by mathstuf (subscriber, #69389) [Link] (1 responses)

> Yes, because I know when i want the behaviour!

This mentality works for solo projects (and even then, more towards the one-off end of things). In my experience, having such sharp tools lying around with opt-in safety mechanisms makes code review and later maintenance much harder. An explicit `fallthrough;` statement would have been a better solution rather than `break;`. Why? I find fallthrough to be the rarer case. Sure, Duff's device is cool and all, but I've written one maybe once? Seen it only a few other times? Making it have a few more keyword instances would have been fine (especially since your indentation is already all kinds of whack most of the time, doing `fallthrough; case …` would probably have worked).

> "I want everything on a silver plate and stuff I don't know about is someone else's fault and therefore bad design" is just weird

I know about it and I even use it sometimes. I still comment when I explicitly mean for fallthrough to happen because I'm not the only one working on the code in question (even if another person isn't, years-later me might have to figure out what the code in question is doing anyways).

The winding road to PHP 8's match expression

Posted Sep 2, 2020 22:29 UTC (Wed) by professor (subscriber, #63325) [Link]

You are perhaps right in a modern sense indeed.. Not everyone is like you though.

This part I don't understand at all:

"Sure, Duff's device is cool and all, but I've written one maybe once? Seen it only a few other times? Making it have a few more keyword instances would have been fine (especially since your indentation is already all kinds of whack most of the time, doing `fallthrough; case …` would probably have worked)."

The winding road to PHP 8's match expression

Posted Sep 2, 2020 22:56 UTC (Wed) by mjg59 (subscriber, #23239) [Link]

"I want everything on a silver plate and stuff I don't know about is someone else's fault and therefore bad design"

What we want is code that doesn't have bugs. All the evidence suggests that the number of people who can write code that doesn't contain bugs is extremely small, which means designing languages such that it's harder to write bugs is a reasonable choice.

The winding road to PHP 8's match expression

Posted Sep 3, 2020 0:37 UTC (Thu) by Henning (subscriber, #37195) [Link]

I rarely write PHP, but quite often C, so my perception is based on that. I mainly use fallthrough for two things: grouping and parts that share common code tails.

With grouping I mean:

switch (var) {
case 1:
case 2: foo(); break;
case 3:
case 4: bar(); break;
default: foobar();
}

And parts that share common code-tails:

switch (var) {
case 1: fn1();
case 2:
case 3: fn2(); break;
case 4: fn3(); break;
default: fn4();
}

In the grouping part, there are three groups (1,2 and 3,4 and default) and in parts there is sharing between 1,2 and 3.

Both patterns seem to come up from time to time, and the match expression seems to handle grouping but not code that shares a common tail. But I guess gotos are still available in those cases where one wants to deal with these situations.

The winding road to PHP 8's match expression

Posted Sep 3, 2020 16:12 UTC (Thu) by jezuch (subscriber, #52988) [Link] (1 responses)

Match expressions are in vogue right now it seems... Two articles about match in Python and in PHP next to each other - coincidence? ;) And Java also recently overhauled the switch thingie to make it optionally an expression and to eliminate fallthrough... This makes me curious, is a similar overhaul one of the myriad changes that the C++ committee made to the language?

The winding road to PHP 8's match expression

Posted Sep 4, 2020 16:08 UTC (Fri) by mathstuf (subscriber, #69389) [Link]

There are indeed discussions about pattern matching in C++ (using `match`). It is aiming for C++23. However, C++'s `switch` doesn't have anything new over C's `switch`.

The winding road to PHP 8's match expression

Posted Sep 7, 2020 9:51 UTC (Mon) by hholzgra (subscriber, #11737) [Link]

In slightly related news:

This brings back memories of my never implemented plan to have switch/case with an optional comparison function parameter ... e.g.:

switch($filename, "fnmatch") {
case "*.gif": ...; break;
case "*.txt": ...; break;
default: /* unknown file extension */ break;
}


Copyright © 2020, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds