LWN: Comments on "PHP and P++"

PHP and P++

mina86 — Thu, 22 Aug 2019 11:36:54 +0000

Just use Python with type hints if you like static typing but find PHP to have its use cases.

PHP and P++

acomjean — Mon, 19 Aug 2019 18:11:47 +0000

I use lots of languages at work (python, java, R), but for websites, I still reach for php. With Symfony framework, twig templating, I really enjoy it.

I haven't seen <? /* PHP code */ ?> style in years. Certainly not in anything I've developed in the past decade. Though we've been spoiled as php has maintained easy backward compatibility.

But PHP is now on a much more aggressive upgrade path. I get why nobody wants to test their old PHP code and make sure it works with the new version, but that's what we're looking at now. The PHP 7 series is so much faster than the 5.x series that I'm surprised there are still laggards. Nobody likes maintaining old versions, but splitting off P++, I feel, just splits the limited development manpower (PHP is not the new hotness that JS is...). This was tried before with the language "Hack", by Facebook. It never caught on.

https://docs.hhvm.com

https://www.php.net/supported-versions.php

PHP and P++

mjblenner — Sun, 18 Aug 2019 23:25:05 +0000

You could try the -b (or -bb) switch. e.g:

python3 -bb

>>> b'true' == 'true'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
BytesWarning: Comparison between bytes and string

PHP and P++

Cyberax — Sun, 18 Aug 2019 17:13:04 +0000

What? I have no idea what you're saying.

PHP and P++

rweikusat2 — Sun, 18 Aug 2019 16:22:10 +0000

Why don't you just repeat the original statement without using a pointless aside sharing a couple of characters with a text of mine to pseudo-connect the repetition to this text?

PHP and P++

k8to — Sun, 18 Aug 2019 03:00:59 +0000

Do you mean an exception on type mismatch?

That sounds probably useful for most code I write, and it would cause huge explosions in most code I have to work on that other people write. Probably a good idea all around.

PHP and P++

felix.s — Sat, 17 Aug 2019 16:50:29 +0000

> There are ~7 C standards that GCC supports, rather than being able to mix and match paragraphs from different standards.

You mean, like you can with -fno-delete-null-pointer-checks, -fstrict-aliasing, -f{w,t}rapv, -fgnu89-inline, -fms-extensions, -f{un,}signed-char, -fno-asm, and so on? And let's not forget __attribute__((optimize))…

Sure, these aren't strictly the same thing; ISO C standards are designed to be largely forward-compatible (portable C89 code free of UB can usually be compiled as C11 with no changes), so you'll have a hard time finding cases where the different versions of the standard flat-out contradict each other, creating incompatible dialects. But these options do change how the compiler handles certain specific situations that are implementation-defined or UB according to the standard, and some code does depend on one dialect or the other.

PHP and P++

dvdeug — Sat, 17 Aug 2019 12:44:54 +0000

Is it 47, or is it '/'? The latter smacks of "one character set to rule them all", because 47 is 'å' in certain dialects of EBCDIC and can be part of a multibyte character in SJIS.

> which - coincidentally - makes everyone bend over backwards to get support for the characters his language is written in except people from the USA

To the extent that's true, it's less true than any of the systems that preceded it, and one character set to rule them all seems to be the best way to reduce that problem. UNIX basically assumes that whatever character set is being used, it's a superset of ASCII, which can hardly be the fault of Unicode that was created 20 years later. Heck, in 1998, simply supporting 8-bit characters was a release goal for Debian Hamm, because many Un*x utilities didn't out of the box. That is, you could have any character set you want, as long as it's ASCII.

On any of the pre-Unicode European solutions, an Estonian named Tõnisson would be out of luck in adding his correct name to a document that French and Germans had already added their names to; one byte worked for Western Europe, and who wanted to waste more space for Estonians with names like Tõnisson? If you were lucky enough to be using something that supported ISO-2022 (i.e. someone from East Asia was probably involved), the Estonian could type his name, but not actually search safely for names, as Päts could be encoded various ways, depending on whether a German or an Estonian entered the name.

And - coincidentally - Unicode was the first and usually only character set for hundreds of languages around the world. Speakers of small, less powerful, languages like Lakota or Greenlandic or Xhosa had to resort to font hacks to get any support for the language at all, whereas now it comes free with a decent-sized Unicode font.

PHP and P++

Freeaqingme — Sat, 17 Aug 2019 00:09:25 +0000

I've used PHP for a very long time and have been using Go for the past 4 years or so. Though I really like using Go because it is statically typed and allows low-level machine access, I think there are still plenty of use cases for PHP. I haven't done any kind of formal analysis, but my gut says that for certain templating or 'simple' web development projects, PHP may be the better choice in terms of implementation speed.

Also, using an IDE that provides static analysis, using OOP, and declaring strict_types=1 in every file, PHP is a pretty decent programming language. Most of the crap it receive(s|d) is based on PHP4 and people who were able to get something to compile, but in no way could be considered programmers or software engineers.

PHP and P++

Cyberax — Fri, 16 Aug 2019 20:14:01 +0000

"$" sign is not an "S-with-a-bar". It can be written as "S" with two smaller bars on top and bottom (like in the font I'm using right now).

But what does this have to do with the mess that are the file names?

PHP and P++

mpr22 — Fri, 16 Aug 2019 19:17:01 +0000

For me, NFKC is the obviously-right way to normalize the names of filesystem entities.

PHP and P++

rweikusat2 — Fri, 16 Aug 2019 16:17:11 +0000

Please wake me when the Unicode Consortium starts to consider S + combining vertical bar aka $ a precomposed character ...

PHP and P++

Deleted user 129183 — Fri, 16 Aug 2019 16:11:10 +0000

> NFC, NFD or broken UTF-8?

Since in Unicode, precomposed characters exist only for compatibility with pre-Unicode encodings, NFD should probably be the way to go.
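To make the NFC/NFD question concrete, here is a minimal sketch using Python's standard `unicodedata` module. The name "Päts" is borrowed from elsewhere in this thread, and `same_name` is a hypothetical helper, not an existing API. It shows that the two forms are different code point sequences (and different bytes on disk) that only compare equal after both sides are normalized to the same form.

```python
import unicodedata

# "Päts" written with a precomposed U+00E4, then normalized both ways.
name_nfc = unicodedata.normalize("NFC", "P\u00e4ts")
name_nfd = unicodedata.normalize("NFD", "P\u00e4ts")

print(len(name_nfc), len(name_nfd))   # 4 5
print(name_nfc == name_nfd)           # False
print(name_nfc.encode("utf-8"))       # b'P\xc3\xa4ts'
print(name_nfd.encode("utf-8"))       # b'Pa\xcc\x88ts'


def same_name(a, b, form="NFC"):
    """Hypothetical helper: compare two names after normalizing both."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


print(same_name(name_nfc, name_nfd))  # True
```

Whichever form is picked, the point is that a filesystem comparing raw bytes will treat the two spellings as two different files.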

PHP and P++

remicardona — Fri, 16 Aug 2019 12:33:29 +0000

well played sir, well played [tips hat]

PHP and P++

ale2018 — Fri, 16 Aug 2019 11:14:40 +0000

That is kind of irrelevant. A system choice. Even with ASCII it has always been possible to create files whose names begin with a minus (-), or contain backspaces (x08), spaces ( ), or other characters that may confuse human and machine interpreters alike. To paraphrase the POTUS, it's not the gun that shoots you in the foot.

PHP and P++

juliank — Fri, 16 Aug 2019 11:03:22 +0000

Yeah, it would be easier if comparison were strictly typed.

PHP and P++

h2g2bob — Fri, 16 Aug 2019 10:39:41 +0000

As NYKevin said, the problem is that comparing bytes and unicode will return False. So you'll find this code the hard way:

enable_foo = b'true'
if enable_foo == u'true':
    ...

Obviously enable_foo is from one or more read() or recv() in a different module. Or from ctypes. Or from users of your library code.
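A minimal sketch of one way to defend against this, assuming the value may arrive as either bytes or str. The helper name `is_enabled` is made up for illustration; the only real behaviour relied on is that Python 3 compares bytes and str as unequal (silently, unless the interpreter was started with -b or -bb).

```python
import sys


def is_enabled(value):
    """Hypothetical helper: accept bytes or str and decode explicitly
    rather than relying on a mixed-type == comparison."""
    if isinstance(value, bytes):
        value = value.decode("utf-8", "replace")
    return value == "true"


enable_foo = b"true"  # e.g. the result of a read() or recv() elsewhere

# Without -b/-bb the mixed comparison is silently False.
# The guard keeps this script usable when those switches are active.
if not sys.flags.bytes_warning:
    print(enable_foo == "true")   # False

print(is_enabled(enable_foo))     # True
print(is_enabled("true"))         # True
```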

PHP and P++

Cyberax — Fri, 16 Aug 2019 07:45:53 +0000

LOL. Walked right into it.

PHP and P++

amacater — Fri, 16 Aug 2019 07:43:58 +0000

Surely: Go2 considered harmful from the outset

PHP and P++

Cyberax — Fri, 16 Aug 2019 06:37:59 +0000

There's no go2, but even when it's finally here it's going to be backwards compatible with go1.

You might dislike Go, but they make great efforts to preserve backwards compatibility.

PHP and P++

da4089 — Fri, 16 Aug 2019 05:49:12 +0000

go1 or go2?

PHP and P++

Cyberax — Fri, 16 Aug 2019 01:14:57 +0000

Any of them would be better than the status quo.

PHP and P++

flussence — Fri, 16 Aug 2019 01:01:42 +0000

NFC, NFD or broken UTF-8?

PHP and P++

roc — Thu, 15 Aug 2019 22:35:34 +0000

Oh, you also kind of want

4) have a tool that reformats source code automatically (especially line breaks) and encourage a culture of using it routinely

so that you can run that tool after applying the automatic updates from point 2. Not a big deal, though you want this for other reasons too.

PHP and P++

roc — Thu, 15 Aug 2019 22:22:17 +0000

Rust's edition system is working pretty well so far. The key points are:
1) have modules explicitly state which edition they use and make sure all your tools respect that.
2) have tools available from day 1 of the new edition that automatically update code to the new edition as much as possible, and emit clear messages where non-automatic changes need to be made.
3) ensure modules from different editions can be used together, and make that as seamless as possible. For example, Rust introduced "raw identifiers" that let you write identifiers that are reserved words, so if a module API uses an identifier that's a reserved word in a later edition, code in the later edition can still use it.

This constrains the kinds of changes you can make between editions, but it does allow you to make a lot of significant backwards-incompatible changes. It works better when your language is strongly statically-typed like Rust.

Python completely failed at 1, 2 and 3, for various reasons.

PHP and P++

roc — Thu, 15 Aug 2019 22:09:56 +0000

Treating everything as bytes is fine for filesystem APIs, but a big problem arises when you want to print path names; if you don't know the encoding, and the path name is not ASCII, you can't print them correctly. A slightly lesser problem is the reverse: when you receive a path name that happens to be in Unicode (because it comes from user input in Unicode, for example), and is non-ASCII.

If you care about those problems then you need to define the encoding of path names, and decide how to handle path names that aren't valid in the encoding.
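As an illustration of "decide how to handle path names that aren't valid in the encoding", here is a small sketch of the choice Python 3 made: assume an encoding (UTF-8 here, as an assumption for the example) and smuggle undecodable bytes through with the `surrogateescape` error handler, which is also what `os.fsdecode` relies on for POSIX systems. The file name is invented.

```python
raw = b"caf\xe9.txt"  # Latin-1 bytes; not valid UTF-8

# surrogateescape maps each undecodable byte to a lone surrogate,
# so the original bytes can be recovered exactly.
text = raw.decode("utf-8", "surrogateescape")
print(ascii(text))                                    # 'caf\udce9.txt'
print(text.encode("utf-8", "surrogateescape") == raw)  # True

# For display, a lossy but safe rendering is needed instead, because
# lone surrogates cannot be encoded to a UTF-8 terminal.
printable = raw.decode("utf-8", "backslashreplace")
print(printable)                                      # caf\xe9.txt
```

The round trip keeps filesystem APIs working, but the printing problem remains: the best the program can do is show an escaped form.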

PHP and P++

Cyberax — Thu, 15 Aug 2019 21:02:18 +0000

> which - coincidentally - makes everyone bend over backwards to get support for the characters his language is written in except people from the USA
How does UTF-8 make everybody bend over backwards?

At this point mandating UTF-8 for file names is pretty much the only sane way.

PHP and P++

rweikusat2 — Thu, 15 Aug 2019 20:56:00 +0000

> or a Unix filesystem path (which is neither text nor bytes but an unholy amalgamation of both)

A UNIX filesystem name is a sequence of bytes whose values are neither 0 nor 47. A UNIX filesystem path is a sequence of UNIX filesystem names separated by non-empty sequences of bytes with value 47 ('/'). The unholy idea that there's one character set to rule them all (which - coincidentally - makes everyone bend over backwards to get support for the characters his language is written in except people from the USA) and that The Character Set Encoding is as dictated to users as The Character Set by some entity selling operating systems is decades newer than this.
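The definition above can be written down almost literally. This is only a sketch: the function names are invented, and it ignores length limits, the special names "." and "..", and the difference between absolute and relative paths.

```python
def is_valid_name(name: bytes) -> bool:
    """A name is a non-empty byte sequence containing neither 0 nor 47 ('/')."""
    return len(name) > 0 and 0 not in name and 47 not in name


def split_path(path: bytes) -> list:
    """Split a path into names; runs of '/' count as a single separator."""
    return [part for part in path.split(b"/") if part]


print(is_valid_name(b"caf\xe9"))         # True: no encoding is assumed
print(is_valid_name(b"a/b"))             # False
print(split_path(b"/usr//lib/\xff"))     # [b'usr', b'lib', b'\xff']
```

Nothing here decodes anything, which is the point being made: at this level a name is bytes, and any character set is a convention layered on top.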

PHP and P++

juliank — Thu, 15 Aug 2019 20:41:04 +0000

Yet in practice, translating was fairly trivial, and a lot of people were simply too lazy and did not bother.

PHP and P++

mathstuf — Thu, 15 Aug 2019 17:29:30 +0000

Not all feature flags actually interact with each other. I'm also not sure that "feature flag" is the correct term here. Instead, something like CMake's policy system might be better (AFAIK, it is similar to Perl's `use 5.20;` statement, but also allows some fine-grained control). Code declares its minimum version (which sets policy settings). CMake then notices when a policy would be triggered by some code and says "hey, newer versions of CMake interpret this code differently, but the old behavior will be used right now". Most policies are orthogonal to each other, but when they do interact, the newer one usually just assumes the new behavior of the old policy (e.g., the one which rewrites the variable expansion code assumes another policy, so it is documented as "policy 10 is not relevant under policy 53; post-10 behavior is used"). Setting a policy setting to use the old behavior is a pretty big code smell and basically indicates that something isn't covered by the new behavior (warranting an issue).

If the PHP VM can warn when it sees code which changes behavior under the new policies, fixing them is much easier because usages get called out. When code is OK with the new behavior it says "I'm ready" and the VM just does the new thing instead of warning and doing the old behavior. CMake has been able to keep very strong backwards compatibility using this pattern and I think that PHP would be able to do so as well if it went down a similar route.
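A toy sketch of the policy pattern described above, written in Python purely for illustration. Everything in it is hypothetical: the `PolicySet` class, the `STRICT_COMPARE` policy name, and the version numbers are invented, and neither CMake nor PHP exposes an API like this. It shows only the control flow: code declares a version, and when a policy is newer than that version the old behaviour is used and a warning calls out the spot.

```python
import warnings


class PolicySet:
    """Hypothetical registry: policy name -> version that introduced it."""

    def __init__(self, introduced_in, declared_version):
        self.introduced_in = dict(introduced_in)
        self.declared_version = declared_version

    def use_new(self, policy):
        # Code that declared a new enough version gets the new behaviour.
        if self.declared_version >= self.introduced_in[policy]:
            return True
        # Otherwise keep the old behaviour, but point out the call site.
        warnings.warn(
            f"policy {policy}: newer versions behave differently here; "
            "old behaviour is being used",
            stacklevel=3,
        )
        return False


def compare(policies, a, b):
    """Toy behaviour change: strict comparison under the new policy."""
    if policies.use_new("STRICT_COMPARE"):
        return type(a) is type(b) and a == b
    return str(a) == str(b)  # stand-in for a legacy loose comparison


POLICIES = {"STRICT_COMPARE": (8, 0)}  # invented version number
legacy = PolicySet(POLICIES, declared_version=(7, 4))
modern = PolicySet(POLICIES, declared_version=(8, 0))

print(compare(legacy, 1, "1"))  # True, plus a warning on stderr
print(compare(modern, 1, "1"))  # False, no warning
```

The design choice worth noting is that the warning fires only where behaviour would actually change, so the list of warnings doubles as a migration to-do list.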

PHP and P++

NYKevin — Thu, 15 Aug 2019 17:20:58 +0000

The brackets around print() were the *least* of Python 3's problems. If that had been the entire change, then both 2to3 and 3to2 would have been completely trivial programs, everyone would have transformed their code once, and then it would have been over and done with. Once every now and then, some ancient code would spit out a "SyntaxError: missing parentheses in call to print," you'd Google it, and StackOverflow would tell you "run it through 2to3," and that would be it.

The real problem was Unicode support. It's basically impossible to determine by static analysis how to transform a string-manipulation program written in Python 2 into Python 3, because you don't know the language-level types of anything, and you also don't know whether any given 8-bit string (Python 2 str) is semantically text, bytes, or a Unix filesystem path (which is neither text nor bytes but an unholy amalgamation of both).

PHP and P++

xnox — Thu, 15 Aug 2019 16:39:33 +0000

Indeed. Lots of people migrated from python2 to golang, instead of python3.

Ideally, I would wish to not have PHP at all, but I also don't see it going away any time soon in practice. I wonder how many generations of developers it will take =/

It is a bit of an existential crisis because if it freezes and doesn't evolve anymore it might die too. However, it seems to work out great for LaTeX2e.

PHP and P++

iabervon — Thu, 15 Aug 2019 16:32:53 +0000

The real problem with feature flags is that you actually have to support (and test) 2^N combinations of flags. Having a linear number of profiles that choose the combinations that are allowed is still not constant effort, but it's (literally) exponentially better.

This is effectively what GCC does with -std=c99 and such. There are ~7 C standards that GCC supports, rather than being able to mix and match paragraphs from different standards.
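The counting argument can be checked in a few lines. The flag names are taken from this thread (`strict_syntax` is only a proposal made in another comment, not an existing PHP directive), and the "one more flag per profile" scheme is just one assumed way to define profiles.

```python
from itertools import product

flags = ["short_open_tag", "strict_types", "strict_syntax"]

# Independent flags: every on/off combination has to be supported.
combos = list(product([False, True], repeat=len(flags)))
print(len(combos))    # 8, i.e. 2**3

# Profiles: each profile enables one more flag than the previous one.
profiles = [
    tuple(i < k for i in range(len(flags)))
    for k in range(len(flags) + 1)
]
print(len(profiles))  # 4, i.e. N + 1
```

With 10 flags the same code gives 1024 combinations against 11 profiles.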

PHP and P++

juliank — Thu, 15 Aug 2019 16:24:53 +0000

Just switch to Go instead. Seriously, why bother with all that crap?

PHP and P++

burki99 — Thu, 15 Aug 2019 14:55:27 +0000

I'm also sceptical, seeing how a seemingly innocent change like brackets around print in Python 3 took years to resolve. The only chance I see to make this happen is a per-file option that lets you freely intermix PHP and P++ in any project, maybe along the lines of the already existing declare(strict_types=1) (e.g. declare(strict_syntax=1)).

PHP and P++

dskoll — Thu, 15 Aug 2019 14:36:39 +0000

Ah, I see the FAQ says you can mix PHP and P++, but color me skeptical on how well that would work in practice for extensions and library code.

PHP and P++

dskoll — Thu, 15 Aug 2019 14:35:35 +0000

Extension/module authors would have to either pick one language and stick with it, or maintain two versions of their extensions or modules. This would be a disaster.

PHP and P++

ju3Ceemi — Thu, 15 Aug 2019 14:28:50 +0000

This looks like a complicated way of implementing feature flags.
The issue with short_open_tag and the like is the requirement to support the associated code: for every legacy feature, you need to support and maintain the code.

This is the issue, which is not resolved by this P++ idea.

If you want those old features, but also a stricter mode, you should use feature flags, which could be "strict" by default: people who want the "legacy-compliant" PHP simply need to change the configuration.
Exactly what is done today with short_open_tag (it is enabled by default, though).