
What is coming in PHP 8

By John Coggeshall
October 21, 2020

Recently, PHP 8 release candidate 2 was posted by the project. A lot of changes are coming with this release, including a just-in-time compiler, a good number of backward-compatibility breaks, and new features that developers have been requesting for years. Now that the dust has settled, and the community is focusing on squashing bugs for the general-availability release scheduled for November 26, it's a good time to look at what to expect.

General impressions, improvements, and changes

To a certain degree, PHP 8 represents a departure from the project's past. Historically, the community has placed a high value on backward compatibility, even between major releases. This doesn't seem to have been as much of a concern for this release, judging by the upgrade notes. With the scope and quantity of backward-incompatible changes, even relatively modern PHP applications will require a little tweaking to bring them up to speed.

The community has expended considerable effort in making PHP 8 a more consistent language, in terms of both behavior and syntax; four separate proposals with that focus have been implemented. These changes generally concern themselves with edge cases or preexisting quirks of the language; there are, however, a few notable changes worth mentioning explicitly.

Starting with the variable syntax tweaks proposal, the new and instanceof operators can now be used with arbitrary expressions. The new operator, common across many languages, creates a new instance of a class; instanceof checks whether an object is of a particular type, accounting for class hierarchies. Prior to PHP 8, these were syntactically limited: new $myClassName, which creates an instance of the class whose name is stored in $myClassName, was valid, but an expression like new ("{$myClassName}{$i}") was not. This has been corrected for PHP 8, allowing expressions to be used in both cases:

    // Create an instance of the built-in stdClass
    $obj = new ("std" . "Class");
    $trueValue = $obj instanceof ("std" . "Class");

The parentheses are important in order for PHP 8 to parse the statements correctly.

The saner numeric strings proposal significantly reworks the way PHP handles string-to-numeric type conversions, producing a more reasonable set of behaviors. This is especially true of numbers padded by white space, which are a frequent occurrence when dealing with user input:

    // PHP 8 type conversion behaviors for numeric values

    function foo(int $i) { return $i; }

    $a = foo(" 123  "); // int(123)
    $a = foo("123abc"); // throws TypeError, int type required
    $a = 123 + " 123  "; // int(246)
    $a = 123 + "123abc"; // int(246) with E_WARNING

    $intVal = "1  ";
    $a = ++$intVal; // int(2)
    $a = ("123" == " 123  "); // bool(true)

The last example is perhaps the most confusing: ("123" == " 123 ") equals true. This is not an intuitive result to many programmers, as they are both string values and it seems reasonable they should be compared as such. When PHP 8 encounters these comparisons, it first examines the strings to determine if they both appear to be numeric values. Since they both meet the conversion criteria, PHP will convert them to their numeric values before comparing them, which evaluates to true; using the strict-comparison operator === would ensure that the literal strings were compared instead.
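
A minimal sketch of the difference between the two operators, assuming PHP 8 semantics; the variable names are only illustrative:

```php
<?php
// Loose comparison: both operands are numeric strings, so PHP 8
// compares them as numbers.
$loose = ("123" == " 123  ");

// Strict comparison: both operands are strings, so they are compared
// as strings and the padding makes them differ.
$strict = ("123" === " 123  ");

// Explicit casts state the intent without relying on coercion rules.
$cast = ((int) "123" === (int) " 123  ");

var_dump($loose, $strict, $cast); // bool(true) bool(false) bool(true)
```

Code that wants a numeric comparison can say so with casts, and code that wants a string comparison can use ===; either way the result no longer depends on what the strings happen to contain.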

In a similar direction, the saner string-to-number comparisons proposal cleans up some long-standing problems when comparing strings that represent numeric values. Consider the following code:

    $validValues = ["foo", "bar", "baz"];
    $value = 0;
    // search for $value in $validValues
    $result = in_array($value, $validValues);

In PHP 7, $result equals true, because in_array() uses loose comparison by default and a non-numeric string such as "foo" is converted to integer zero when it is compared with a number. In PHP 8, behaviors like this have been corrected to return a more intuitive result. Here is a table of how these changes will impact various expressions in PHP 8:

Expression        PHP 7   PHP 8
0 == "0"          true    true
0 == "0.0"        true    true
0 == "foo"        true    false
0 == ""           true    false
42 == " 42"       true    true
42 == "42foo"     true    false

Per the proposal, "an alternative way to view these comparison semantics is that the number operand is cast to a string, and the strings are then compared using the non-strict 'smart' string comparison algorithm."
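
For code that must behave identically on PHP 7 and PHP 8, one option is to avoid the loose comparison altogether. The sketch below uses the optional third (strict) parameter of in_array(); the variable names are illustrative:

```php
<?php
$validValues = ["foo", "bar", "baz"];

// Loose search: the result depends on the PHP version
// (true on PHP 7, false on PHP 8).
$loose = in_array(0, $validValues);

// Strict search: elements must match in both type and value, so the
// integer 0 never matches a string, regardless of version.
$strict = in_array(0, $validValues, true);

// Strings are still found as expected.
$found = in_array("bar", $validValues, true);

var_dump($strict, $found); // bool(false) bool(true)
```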

Finally, precedence changes have been made in PHP 8 for the string concatenation operator. The change ensures an expression like the one below matches the developer's likely intent:

    $a = 5;
    $b = 10;
    echo "sum: " . $a + $b;

In PHP 7, the expression above is evaluated as ("sum: " . $a) + $b, whereas in PHP 8 it is more reasonably evaluated as "sum: " . ($a + $b).
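
Code that has to run on both versions can sidestep the precedence question entirely. A short sketch, with illustrative variable names:

```php
<?php
$a = 5;
$b = 10;

// Explicit parentheses give the same result on PHP 7 and PHP 8.
$explicit = "sum: " . ($a + $b);

// Computing the value first and interpolating it avoids mixing
// the two operators at all.
$total = $a + $b;
$interpolated = "sum: {$total}";

echo $explicit, "\n"; // sum: 15
```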

PHP 8 has made some significant changes to various aspects of error handling, both for built-in functions and in the language itself. One specific change that stands out is the @ operator, which is used (and misused) in PHP to block errors, warnings, and notices for a single expression. Starting with PHP 8, the @ operator will no longer block fatal errors as it has previously done; the new behavior may cause errors to be revealed in existing applications that were previously ignored.

There have also been substantial changes for warnings and errors generated by built-in PHP functions. Per the request for comments (RFC), there are over 30 changes to how PHP responds to a variety of error conditions; this includes over a dozen errors that would previously have caused an ignorable warning, but which will now throw exceptions.
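
As one illustration of a condition that moved from a warning to an exception (a single case picked here, not a summary of the RFC): dividing by zero with the / operator raised a warning in PHP 7 and throws DivisionByZeroError in PHP 8. The safeDivide() helper is hypothetical:

```php
<?php
// Hypothetical helper: returns null instead of letting the
// PHP 8 exception propagate.
function safeDivide(int $numerator, int $denominator): ?float {
    try {
        return $numerator / $denominator;
    } catch (DivisionByZeroError $e) {
        return null;
    }
}

var_dump(safeDivide(6, 3)); // float(2)
var_dump(safeDivide(1, 0)); // NULL on PHP 8
```

Code that previously relied on the warning being ignorable will now stop at the throw unless it catches the exception.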

Lastly, developers should review the deprecation warnings that were introduced to a variety of functions and features in PHP 7. These deprecations can be found in RFCs for versions 7.2, 7.3, and 7.4; they have all been finalized for PHP 8 and will need to be addressed before upgrading.

Taken as a whole, these changes indicate that upgrading to PHP 8 won't be as straightforward a process as the transition from PHP 5 to PHP 7 was; readers will recall the project abandoned PHP 6.

PHP 8 has a lot more than the above to offer; there are also plenty of new features, in fact too many to cover at one time. Two of our previous articles on PHP attributes and match expressions are recommended reading for details on those specific features. Readers are also encouraged to review the PHP 8.0 release notes for bug fixes, and the project's RFC section on PHP 8 for an exhaustive listing of changes and improvements.

New string functionality

String handling in PHP 8.0 received a fair amount of attention, with a variety of new functions and types being added to the language. First up is str_contains(), which determines whether a substring exists within another string; it replaces strpos() when the location of the substring is not needed:

    $str = "Linux is great";

    if (strpos($str, "Linux")) { /* Incorrect usage */
        print "found";
    }

This code represents a common mistake in PHP. strpos() returns the position of a substring found within another string, or false if it is not found. In this case, it returns 0 because "Linux" is at the very start of the string; 0 is equivalent to false under PHP's type-conversion rules, so the branch is never taken. The correct way to implement this using the strpos() function would be to use the strict-comparison operator:

    $str = "Linux is great";

    if (strpos($str, "Linux") !== false) {
        print "found";
    }

As Nikita Popov noted, "this operation is needed often enough that it deserves a dedicated function." str_contains() provides an alternative to strpos() that is less likely to introduce bugs:

    $str = "Linux is great";

    if (str_contains($str, "Linux")) {
        print "found";
    }

While useful, str_contains() only returns a boolean indicating whether the substring was found, not its position; if the position is needed, strpos() must still be used.

In the same vein are the new str_starts_with() and str_ends_with() functions. As their names imply, they allow developers to determine if a string starts (or ends) with another string:

    $str = "Linux is great";

    if (str_starts_with($str, "Linux")) {
        print "found\n";
    }

    if (str_ends_with($str, "great")) {
        print "found\n";
    }
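
Two edge cases of these functions are worth knowing: the comparisons are case-sensitive, and an empty search string always matches. A brief sketch, with illustrative variable names:

```php
<?php
$str = "Linux is great";

// Case-sensitive: "linux" does not match "Linux".
$caseMismatch = str_starts_with($str, "linux");

// An empty needle is considered to be contained in every string.
$emptyNeedle = str_contains($str, "");

// A plain suffix check, for comparison.
$suffix = str_ends_with($str, "great");

var_dump($caseMismatch, $emptyNeedle, $suffix);
// bool(false) bool(true) bool(true)
```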

Lastly, PHP 8 introduces the Stringable built-in interface. This interface can be used as a type-hint for any object that implements the __toString() method:

    function needsString(string | Stringable $str) {
        /* ... */
    }

The declaration of needsString() also makes use of another addition, type unions, which enable the use of multiple type-hints in function or method declarations. The example is verbose to highlight type unions; in reality, a string type-hint accepts both strings and Stringable objects, at least in PHP's default (non-strict) typing mode, where the object is converted by calling __toString(). However, note that the opposite is not true: specifying only Stringable as a type does not permit strings.
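
Classes do not need to declare the interface themselves; in PHP 8, any class that defines __toString() implements Stringable automatically. A small sketch, where the Distro class and describe() function are hypothetical:

```php
<?php
// Hypothetical class; it never mentions Stringable explicitly.
class Distro {
    public function __construct(private string $name) {}

    public function __toString(): string {
        return $this->name;
    }
}

// Accepts plain strings and any object with __toString().
function describe(string|Stringable $value): string {
    return "value: " . $value;
}

$distro = new Distro("Debian");
$isStringable = $distro instanceof Stringable; // true, added implicitly

echo describe($distro), "\n";  // value: Debian
echo describe("plain"), "\n";  // value: plain
```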

New types

On the subject of types, a couple of new ones have been added. The first is the mixed type, which can be best thought of as a union of these internal types: array, bool, callable, int, float, null, object, resource, and string. In previous versions of PHP, the only way to represent a "mixed" type would be to omit typing information completely; a functional but ambiguous solution. The new mixed type provides a better way for developers to explicitly indicate their intent.
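
A minimal sketch of a function that genuinely accepts anything and says so in its signature; describeType() is a hypothetical name:

```php
<?php
// The mixed type documents that any value, including null, is accepted.
function describeType(mixed $value): string {
    return gettype($value);
}

var_dump(describeType(42));      // string(7) "integer"
var_dump(describeType("text"));  // string(6) "string"
var_dump(describeType(null));    // string(4) "NULL"
```

Since mixed already includes null, it cannot be combined with the nullable marker; ?mixed is rejected at compile time.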

The second type added is the static type, which applies only to method return values. In PHP, static is a special class name referring to the class a method was called from, even if the method was inherited; this is known as late static binding. Until now, PHP has supported the self and parent return types for class methods, but those two alone are insufficient to cover all cases; prior to PHP 8 there was no way for a method that was inherited from a parent class to indicate that the return value must be an instance of the child class.
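
The classic use case is an inherited factory method. In the sketch below, Model and User are hypothetical classes:

```php
<?php
class Model {
    // "static" promises an instance of whichever class the method
    // was called on, not merely an instance of Model.
    public static function create(): static {
        return new static();
    }
}

class User extends Model {}

$user = User::create();

var_dump($user instanceof User); // bool(true)
var_dump(get_class($user));      // string(4) "User"
```

Had create() been declared with a self return type, the signature would only have promised a Model, even though a User is what callers actually receive.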

Object-oriented features

A shorthand syntax was added for assigning properties when creating an instance of a class, called constructor property promotion. Essentially, it allows class properties to be implicitly declared and assigned by specifying them as constructor parameters. Consider the following, traditional, example declaring a class in PHP:

    class Foo {
        private string $bar;

        public function __construct(string $bar) {
            $this->bar = $bar;
        }
    }

In PHP 8, the class can be defined with much less code:

    class Foo {
        public function __construct(private string $bar) {}
    }

In the example, the private property $bar is automatically created in the Foo class, being assigned its value when the object is instantiated. The syntax does not prevent developers from adding additional logic to the constructor; properties will automatically be defined and assigned their values first, followed by any custom logic. If a parameter is modified by custom logic, then the corresponding property must be updated manually.
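
The last point is easy to trip over, so here is a sketch of it; the Account class and its normalization logic are made up for illustration:

```php
<?php
class Account {
    public function __construct(private string $email) {
        // At this point $this->email already holds the raw argument.
        // Changing the parameter only changes the local variable...
        $email = strtolower(trim($email));

        // ...so the promoted property has to be updated by hand.
        $this->email = $email;
    }

    public function email(): string {
        return $this->email;
    }
}

$account = new Account("  Joe@Example.COM ");
echo $account->email(), "\n"; // joe@example.com
```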

PHP 7.4 introduced support for weak references, and in PHP 8, that concept was expanded further into a new WeakMap class. To understand what these features do, a slight tangent into PHP garbage collection is necessary. When a data structure, such as an object, is created, the allocated memory remains until all references in the script to that data have been destroyed. Once the object is no longer referenced, PHP will release that memory for something else. This becomes problematic when objects are cached, since the reference to an object in the cache will prevent PHP from freeing the memory — even if the only place that object still exists is in the cache. For a longer-running PHP process, the result is, functionally, a memory leak.

Weak references are implemented using the WeakReference class, which indicates that an object can be destroyed by the garbage collector once the only references to it are "weak" references. An example taken from the PHP documentation is provided below:

    $obj = new stdClass;
    $weakref = WeakReference::create($obj);

    var_dump($weakref->get()); // returns the object
    unset($obj);
    var_dump($weakref->get()); // returns NULL

In PHP 8, weak references have been expanded to include a new WeakMap class; it provides an array-like interface for storing weak object references in a single data structure. Here's an example of how WeakMap might be used in a caching mechanism:

    class StatsGenerator {

        private WeakMap $cache;

        public function __construct() {
            $this->cache = new WeakMap;
        }

        public function getUserStats(User $user) : UserStats {
            return $this->cache[$user] ??= $this->generateUserStats($user);
        }

        protected function generateUserStats(User $user) : UserStats {
            /* Do expensive stats generation */
        }

    }

In the StatsGenerator class, a simplistic caching mechanism is created that uses a WeakMap instance to store the result of a statistics-generating operation in memory. The first time getUserStats() is called, the generateUserStats() method performs an (expensive) operation to build a UserStats object; subsequent calls to the method return the previously-generated UserStats object. This cache will persist as long as $user has references that exist in the application outside of the StatsGenerator class. Once all references to $user are gone, PHP will garbage-collect the UserStats object from $cache as well:

    // Find a user by ID
    $user = getUserById(42);

    $generator = new StatsGenerator;

    // This call generates new statistics based on $user
    $userStats = $generator->getUserStats($user);

    // This returns the previously-generated statistics
    $userStats = $generator->getUserStats($user);

    // Destroying the only real reference to the $user object
    // also destroys the UserStats instance inside of StatsGenerator
    unset($user);

    // Regenerate the statistics
    $user = getUserById(42);
    $userStats = $generator->getUserStats($user);
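
The eviction can be observed directly, since WeakMap is countable. A stripped-down sketch using stdClass as a stand-in for the User object:

```php
<?php
$map = new WeakMap();
$key = new stdClass();          // stand-in for a real User object

$map[$key] = "cached value";
$before = count($map);          // one entry while $key is alive

// Dropping the last real reference destroys the object, and the
// WeakMap entry keyed by it disappears with it.
unset($key);
$after = count($map);

var_dump($before, $after);      // int(1) int(0)
```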

There are also some exception-related improvements to be found in PHP 8. Moving forward, exceptions can be caught without having to create a variable to store the exception in the catch block:

    try {
        throw new Exception("Something went wrong.");
    } catch(Exception) { // no need to provide a variable here in PHP 8
        Log::error("Something broke.");
    }

Additionally, the throw operation has been changed from a statement to an expression in PHP, which enables exceptions to be thrown in a variety of new ways that previously required substantially more code to accomplish:

    // Pre-PHP 8
    if (!isset($user['email'])) {
        throw new UserMissingEmail($user);
    }

    $email = $user['email'];

    // With PHP 8
    $email = $user['email'] ?? throw new UserMissingEmail($user);
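
Because throw is now an expression, it can appear anywhere an expression is allowed, including return statements and arrow functions. A sketch using built-in exception classes; requireEmail() and $notImplemented are hypothetical names:

```php
<?php
// throw on the right-hand side of the null-coalescing operator.
function requireEmail(array $user): string {
    return $user['email'] ?? throw new InvalidArgumentException("email is missing");
}

// throw as the body of an arrow function.
$notImplemented = fn() => throw new LogicException("not implemented");

echo requireEmail(['email' => 'joe@example.com']), "\n"; // joe@example.com

try {
    requireEmail([]);
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), "\n"; // email is missing
}
```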

The nullsafe operator and named parameters

Then, there is the nullsafe operator, which allows null-coalescing-like behavior for read-only operations. For PHP developers like myself, it is a welcome addition to the language. Consider the following example taken from the proposal for the operator, representative of typical error checking in a PHP application:

    $country = null;

    if ($session !== null) {
        $user = $session->user;

        if ($user !== null) {
            $address = $user->getAddress();

            if ($address !== null) {
                $country = $address->country;
            }
        }
    }

With the nullsafe operator, this verbosity can be eliminated entirely:

    $country = $session?->user?->getAddress()?->country;
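
To see the short-circuiting in action, the chain can be run against objects that are missing at different depths. The Session, User, and Address classes below are minimal stand-ins invented for this sketch:

```php
<?php
class Address {
    public function __construct(public ?string $country = null) {}
}

class User {
    public function __construct(private ?Address $address = null) {}

    public function getAddress(): ?Address {
        return $this->address;
    }
}

class Session {
    public function __construct(public ?User $user = null) {}
}

$full  = new Session(new User(new Address("NL")));
$empty = new Session();   // no user attached
$none  = null;            // no session at all

// Evaluation stops at the first null link and yields null.
$fromFull  = $full?->user?->getAddress()?->country;
$fromEmpty = $empty?->user?->getAddress()?->country;
$fromNone  = $none?->user?->getAddress()?->country;

var_dump($fromFull, $fromEmpty, $fromNone); // string(2) "NL" NULL NULL
```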

Last in our look at new features in PHP 8 is named arguments, which expand the way parameters can be passed into methods and functions:

    function register(string $first, string $last, ?string $email = null) {
        /* ... */
    }

    // Call register(), passing in the parameters by their declared names;
    // $email is omitted and takes its default value
    register(
        first: "Joe",
        last: "Example"
    );

Named parameters are a long-desired feature for PHP developers; the proposal for them was first created in 2013. With this implementation, many doors open for the types of code PHP developers can write.
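
One practical benefit is skipping over optional parameters without restating their defaults. A sketch, where connect() and its parameters are hypothetical:

```php
<?php
// Hypothetical function with several optional parameters.
function connect(string $host, int $port = 5432, bool $ssl = false, int $timeout = 30): string {
    return sprintf("%s:%d ssl=%s timeout=%d", $host, $port, $ssl ? "on" : "off", $timeout);
}

// Positional arguments come first; named ones may follow in any order.
// $port and $ssl keep their defaults without being spelled out.
$dsn = connect("db.example.com", timeout: 5);

echo $dsn, "\n"; // db.example.com:5432 ssl=off timeout=5
```

The flip side is that parameter names become part of a function's public interface: renaming one can now break callers.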

Conclusion

PHP 8 looks to be a significant step forward for the language, and it is great to see the project continuing to improve after over 25 years. While it does seem that developers will have their work cut out for them in upgrading their applications, it would appear that all of the backward-compatibility breaks were done for good reasons. For those of us who maintain PHP code bases, now would be a perfect time to start testing and reporting any unexpected issues to the PHP bug tracker. Two more release candidates are expected before the final version 8.0 release; the next, release candidate 3, is scheduled for October 29.




What is coming in PHP 8

Posted Oct 23, 2020 4:01 UTC (Fri) by flussence (guest, #85566)

It still feels very PHP, but it's good to see it pushing this hard to keep up with more modern languages. Things like strpos() are a mistake a lot of other language designers only recently got around to fixing.

What is coming in PHP 8

Posted Oct 23, 2020 21:34 UTC (Fri) by jkingweb (subscriber, #113039)

I'm a bit disappointed "123" == " 123 " is still true. It is an unintuitive result, and one I've found I could only get used to (if you can call it that) by using strict comparison everywhere to avoid it entirely.

PHP's type coercion certainly has value, but I don't find that little rule helpful at all. If I want to compare two strings as integers, I'll cast one or both of them to int myself, thanks.

What is coming in PHP 8

Posted Oct 25, 2020 1:51 UTC (Sun) by sorpigal (guest, #36106)

This is just one of those language features (like the JavaScript 'with' keyword) which exist but are not necessary or wise to use in real programs. I haven't surveyed the world but I like to think that a reflexive strict equality is the norm in PHP. The art of using any language includes knowing which parts of the language are ugly and best avoided; PHP is unusual only in that its ugly surface area is broader than most.

What is coming in PHP 8

Posted Oct 27, 2020 22:13 UTC (Tue) by ThinkRob (guest, #64513)

That may not be a sound justification... Take a look at Javascript: The idea that a sufficiently nimble developer can dodge language footguns shouldn't persist long after perusing the top 50 NPM packages. And Javascript is [somewhat] sane compared to PHP < 8!

I'm glad PHP is getting rid of some of these operator-facing weapons, but they've got a massive armoury and it doesn't look like it'll be a quick disarmament process.

What is coming in PHP 8

Posted Oct 29, 2020 20:52 UTC (Thu) by sorpigal (guest, #36106)

I think we are in vehement agreement. My only point was that changes to the implicit coercion behavior are mostly irrelevant in the real world where strict equality is extraordinarily common. The only useful improvement would be to remove implicit coercion altogether. PHP is improving rapidly, but there is still internal resistance to correctness and the rest of the world is not standing still. It remains an unwise choice. I applaud all of the improvements in PHP8, but more and more rapid progress is also needed. Case sensitivity, multiple dispatch, operator overloading, and improved control of casting are on my short list.

String zero

Posted Oct 27, 2020 16:21 UTC (Tue) by Richard_J_Neill (subscriber, #23093)

I agree. I'd also like to see this: a string should always be false if empty, otherwise true.

"" #false
"0" #true <-- change this, it's currently considered as false.
"0.0" #true

At the moment, the "it might be a disguised zero" rule takes precedence over the "non empty strings are true".
This would mean that
if ($str) ...
can always be used to check for empty strings, as well as becoming consistent with Python and JS.

Given that this does possibly break backward compatibility, maybe it should be a pref in php.ini (like Python 2's "from __future__ import division").

strings should only be equal if they are identical.

Posted Oct 27, 2020 16:34 UTC (Tue) by Richard_J_Neill (subscriber, #23093)

String comparison could get the best of both worlds by checking types, and casting only if one side is not a string.

123 == 123 #true, because both are numeric - not controversial.
123 == "123" #true, because the string is cast to number. This is helpful.
123 == " 123 " #true, because the string is cast. This is helpful, but not necessarily obvious/expected
"aaa" == "bbb" #false, because the strings are not equal - not controversial.
"aaa" == "aaa " #false, because the strings are not equal - not controversial.
"123" == " 123 " #false, because the strings are simply NOT equal. This is the behaviour we should have, NOT the behaviour we currently get

strings should only be equal if they are identical.

Posted Nov 3, 2020 15:50 UTC (Tue) by ledow (guest, #11753)

I would argue that we should be throwing warnings for any intra-type conversion that wasn't explicitly requested by the programmer.

But, hell, I'll just wait for the first compromise because some user-passed string is compared to some hashed value or similar and allows compromise because people didn't realise that it would convert and evaluate in a way they never intended.


Copyright © 2020, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds