
LWN.net Weekly Edition for August 22, 2019

Welcome to the LWN.net Weekly Edition for August 22, 2019

This edition contains the following feature content:
  • OpenPOWER opens further: IBM opens up the POWER ISA and moves the OpenPOWER Foundation under the Linux Foundation.
  • Making containers safer: the state and future of container security, as presented at LSS-NA.
  • Reconsidering unprivileged BPF: the goal of opening BPF to unprivileged users appears to have been abandoned.
  • On-disk format robustness requirements for new filesystems: how robust must a new filesystem be against corrupted images?

This week's edition also includes these inner pages:

  • Brief items: Brief news items from throughout the community.
  • Announcements: Newsletters, conferences, security updates, patches, and more.

Please enjoy this week's edition, and, as always, thank you for supporting LWN.net.

Comments (none posted)

OpenPOWER opens further

By Jake Edge
August 21, 2019

OpenPOWER Summit

In what was to prove something of a theme throughout the morning, Hugh Blemings said that he had been feeling a bit like a kid waiting for Christmas recently, but that the day when the presents can be unwrapped had finally arrived. He is the executive director of the OpenPOWER Foundation and was kicking off the keynotes for the second day of the 2019 OpenPOWER Summit North America; the keynotes would reveal the "most significant and impressive announcements" in the history of the project, he said. Multiple presentations outlined a major change in the openness of the OpenPOWER instruction set architecture (ISA), along with various related hardware and software pieces; in short, OpenPOWER can be used by compliant products without paying royalties and with a grant of the patents that IBM holds on it. In addition, the foundation will be moving under the aegis of the Linux Foundation.

Blemings also wrote about the changes in a blog post at the foundation web site. To set the stage for the announcements to come, he played a promotional video (which can be found in the post) that gave an overview of the foundation and the accomplishments of the OpenPOWER architecture, which include powering the two most powerful supercomputers in the world today.

Opening the presents

Ken King, general manager of OpenPOWER alliances for IBM, was up next to fill the audience in on what IBM and the foundation were up to. He said that he liked the idea of being Kris Kringle, but that his employees may actually think he is more like Scrooge, at least when they come to him for money, he added with a laugh. The announcements would be "game changing for the open community", he said.

[Hugh Blemings]

The foundation has made significant progress to date; it has more than 350 members at this point. There are more than 40 server products available from 20 or so different vendors. Systems range from the workstation class up through supercomputers, he said. All of that was accomplished by the OpenPOWER community.

IBM has long been involved in open-source software, including its $1 billion commitment to Linux in 1999, but it has more recently moved into the open-hardware world. In 2013 it started the OpenPOWER Foundation and in 2016 it started the OpenCAPI Consortium to foster the Coherent Accelerator Processor Interface (CAPI) standard. It is now time to take the next step, King said.

"First and foremost", he said, the POWER ISA is being licensed to the OpenPOWER Foundation so that anyone can use it to build products royalty free and with the patent rights. Changes to the ISA will be made via an open-governance model within the foundation. A majority vote will govern extensions and changes to the ISA. IBM is also contributing a softcore implementation of the ISA.

[Ken King]

Reference designs for OpenCAPI and the OpenCAPI memory interface (OMI) will be contributed to the community through the consortium. Those specifications will now be community controlled, rather than controlled by IBM. The idea is not only to make them industry-wide standards but also to drive convergence with other standards such as Compute Express Link (CXL), King said.

In addition, because of the importance of moving to open governance, the OpenPOWER Foundation will become a part of the Linux Foundation, he said. It will still have its own board and decision making, but will be an entity under the Linux Foundation. That switch doesn't change anything for the OpenPOWER community, which remains just as critical, he said, but it makes a statement to the industry that OpenPOWER will be a part of the leading organization for open governance in the world.

Various vendors and research institutions reviewed the idea before it was made public, he said; there was lots of endorsement and even excitement. POWER is the only architecture that is completely open from the lowest levels of the hardware and firmware up through the server software stack. It is not a "de-commitment from IBM", he said; it is a "further commitment" to POWER by the company.

Demo time

Later in the morning, IBM distinguished engineer Anton Blanchard took the stage to describe some steps he and his colleagues had taken to show what can be done with the softcore that was being released. He is a software person who created a hardware counter in VHDL in college many years ago; "it probably didn't count", he said to laughter.

[Anton Blanchard]

They only had a few months to pull something together. What they came up with was the "first step in a journey"; it is a simple OpenPOWER-compliant core written in VHDL 2008. It uses an open-source tool called GHDL to simulate the hardware. The processor is called MicroWatt and will be available soon on GitHub, he said. It is "very simple by CPU standards"; it is a single-issue, in-order processor that implements the scalar subset of the OpenPOWER ISA. It also uses components (e.g. a UART) from other open-hardware projects.

But there needs to be something to run on the simulated processor, so they turned to MicroPython, which is a small, embedded version of the Python language. Michael Neuling ported MicroPython to the MicroWatt core, which did not require any generic changes to the code, just the expected platform-specific code. He demonstrated that in a video that showed building and running the simulator, then typing "1 + 2" at the MicroPython prompt. After a noticeable delay of five seconds or more, "3" was displayed; "it is a little slow, obviously", he said to laughter and applause.

The next step was to put the MicroWatt onto a Xilinx FPGA board. He showed a video of the code being loaded onto the board. From the MicroPython prompt, it printed the canonical "hello world" string, but that doesn't really show that much, he said. Code to calculate Fibonacci numbers was then run to produce the first 100 numbers in short order.

There is a bit of "smoke and mirrors" in the demo, Blanchard said. They ran out of time, so there are no hardware divide instructions in the MicroWatt. For now, the toolchain was altered so that it did not emit them. They should be able to fairly easily pull in a hardware-divide implementation from other open-hardware projects, he said. MicroWatt is just a proof of concept, without any optimization, but it does work.

They also put the MicroWatt on a Xilinx Alveo accelerator board, which only used a small fraction of the FPGA on the Alveo. So they decided to put a bunch of MicroWatts on it and were able to fit 41 into roughly 80% of the FPGA. Blanchard said he only had a day to come up with a demo for that; repeatedly writing "hello world" from the MicroPython interpreter on each core produced a jumble of output. "Oops, I forgot the locking", he said.

MicroWatt has been lots of fun to play with, Blanchard said. Next up is to add a simple supervisor state, which would enable it to run some of the Internet of Things operating systems. With a bit more work on supervisor mode, it should be able to run Linux. He would love to see others start to work with it and perhaps create a Verilog implementation.

Compliance and licensing

In a "birds of a feather" (BoF) session held later that morning, Blanchard and OpenPOWER Foundation president Mendy Furmanek answered questions about the announcements. In particular, they filled in some information about the compliance side of the equation. It is important not to fracture the ecosystem with incompatible implementations, Furmanek said. But not all products will want or need to implement the entire ISA.

[Mendy Furmanek]

For now, the foundation has identified four separate tiers of instructions. In order to get the patent rights for a product, it will have to comply with one of those four tiers, but it can add instructions from the higher-level tiers as well. The levels are: scalar fixed point (what MicroWatt is based on), floating point, Linux server, and AIX server. There is a compliance workgroup within the foundation that will determine the compliance levels. In addition, changes to the ISA will only require a majority vote of a workgroup made up of foundation members if they are backward compatible; changes that are not backward compatible require a unanimous vote.

Blanchard said that they did not want to have too many options, like what RISC-V has done with optional instructions and such. The Linux server level is more complicated, he said, but Linux can run on the simpler levels, which may be appropriate for embedded applications. It would not take a huge amount of effort to take something like MicroWatt and turn it into a chip that could run Linux, he said.

Furmanek said that the foundation would like to see contributions from members and non-members alike. The AIX level is effectively set up to give IBM a place to continue building out on its POWER roadmap if the community decides it wants to go in a different direction. But POWER has a robust and unified ecosystem, she said, and IBM wants to ensure that continues.

OpenPOWER designs created based on the foundation's work must be licensed under a Creative Commons (CC) license, she said. Other licenses are not geared toward hardware and, in particular, don't have patent terms that make sense. Those contributions can additionally be licensed under other licenses, as long as the CC terms are passed on. The foundation is still working with its lawyers to figure all of this out.

The patent rights will come when building actual hardware. There is no indemnification for patents that others might assert against OpenPOWER, but there is a defensive termination clause. If someone is using the OpenPOWER patents and asserts other patents against OpenPOWER, their right to use the OpenPOWER patents will terminate. There has been discussion of creating an organization like the Open Invention Network for hardware, but it is not clear that it actually would work, she said.

One common concern that was mentioned multiple times during the morning presentations was the dearth of open-source tools to work with and build actual hardware. Electronic design automation (EDA) tools are generally proprietary and costly. That also came up at the BoF. There are some open-source tools available, Furmanek said, but they have not yet reached commercial grade. Her background is in the EDA space and she is hopeful that there will be more news in that area before too long.

There were several other keynotes in the morning as well as a panel discussion, all of which helped to round out the overall picture. It is clear that IBM, the foundation, and its members are quite enthusiastic about these changes. The impact they will have on the industry is unclear. It may be, as some say, too little and too late, but that remains to be seen.

[I would like to thank LWN's travel sponsor, the Linux Foundation, for funding to travel to San Diego for OpenPOWER Summit NA.]

Comments (15 posted)

Making containers safer

By Jake Edge
August 21, 2019

LSS-NA

On day one of the Linux Security Summit North America (LSS-NA), Stéphane Graber and Christian Brauner gave a presentation on the current state and the future of container security. They both work for Canonical on the LXD project; Graber is the project lead and Brauner is the maintainer. They looked at the different kernel mechanisms that can be used to make containers more secure and provided some recommendations based on what they have learned along the way.

Caring about container safety

Graber began by asking "why do we care about safe containers?" Not everyone does, he said, but the Linux containers project, which LXD and LXC are part of, has been working on containers for over ten years. LXC and LXD are used to create "system containers", which run unmodified Linux distributions, not "application containers" like those created using Docker. The idea is that LXD users will use the same primitives as they would if they were running the distribution in a virtual machine (VM); that they are actually running in a container is not meant to be visible to them.

Administrators of these system containers will often give SSH access to the "host" to their users, who will run whatever they want on them. That is one of the reasons the project cares a lot about security. It uses every trick available, he said, to secure those containers: namespaces, control groups, seccomp filters, Linux security modules (LSMs), and more. The goal is to use these containers just like VMs.

Since the project targets system containers, it builds images for 18 distributions with 77 different versions every day, Graber said. That includes some less-popular distributions in addition to the bigger names; it also builds Android images. Beyond that, LXD is being used as part of the recent Linux desktop on Chromebooks feature of Chrome OS. There are per-user VMs in Chrome OS, but the Linux desktop distribution runs in a container with some persistent storage, he said. It has GPU passthrough and other features to make the desktop seamlessly integrate with Chrome OS.

All of the users of those distribution images built by the project can run any code they want inside those containers, which means that the Linux containers project needs to care a lot about security, Graber said.

Privileged versus unprivileged

[Stéphane Graber]

There are two main types of containers, he said: privileged and unprivileged. But the Linux kernel has no notion of containers; they are purely a user-space construct built from the tools provided by the kernel. Privileged containers are those where root inside the container is the same user as root outside the container (i.e. UID 0). That is not true for unprivileged containers because UID 0 inside the container is mapped to some other, unprivileged user outside of the container via user namespaces.

Sadly, he said, the vast majority of containers that are run today are privileged containers. That includes most Docker containers and most of the containers that are run with Kubernetes. The main problem is that an attacker who can break out of the container now has root privileges on the host; the whole system is compromised. The security of those containers depends on LSMs, Linux capabilities, and seccomp filters; the container's privileges are not isolated enough and the policies for the various security mechanisms tend to "fail open".

The LXD project does not consider privileged containers to be safe to run; it is not a configuration that is supported. The project does what it can to close any of the holes it knows about, but strongly recommends against using privileged containers.

For unprivileged containers, since root in the container does not map to UID 0 in the host system, a container breakout is still serious, but not as damaging as it is for a privileged container. There is also a mode where each LXD container in a system will have its own non-overlapping UID and GID ranges in the host, which limits the damage even further. Any breakout will result in a process with a UID and GID that is not shared with any other process in any other container (or the host system itself).
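The translation described above follows the three-column format of /proc/&lt;pid&gt;/uid_map: each entry maps count consecutive IDs starting at one value inside the container to IDs starting at another value on the host. A rough sketch of that translation, with illustrative ranges (100000 is a common starting point for the default range on Ubuntu, but treat all of the numbers here as examples rather than LXD's actual configuration):

```python
def translate_uid(uid_map, inside_uid):
    """Translate a container UID to a host UID; None if unmapped."""
    # Each entry is (inside_start, outside_start, count), as in the
    # kernel's /proc/<pid>/uid_map file.
    for inside, outside, count in uid_map:
        if inside <= inside_uid < inside + count:
            return outside + (inside_uid - inside)
    return None   # unmapped IDs surface as the overflow UID (65534)

privileged   = [(0, 0,      65536)]   # root inside is root outside
unprivileged = [(0, 100000, 65536)]   # root inside is UID 100000 outside
container_b  = [(0, 165536, 65536)]   # disjoint range: overlaps nothing

print(translate_uid(privileged, 0))    # 0: a breakout lands as host root
print(translate_uid(unprivileged, 0))  # 100000: unprivileged on the host
print(translate_uid(container_b, 0))   # 165536: shared with no other container
```

The non-overlapping mode mentioned above is just the last case generalized: each container's entry starts where the previous one's range ends, so no two containers (and no host process) ever share an ID.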

User namespaces have been around since the 3.12 kernel, but few other container management systems use the feature to isolate their containers. Part of the reason for that is the difficulty in sharing files between containers because of the UID mapping. LXD is currently using shiftfs on Ubuntu systems to translate UIDs between containers and the parts of the host filesystem that are shared with it. Shiftfs is not upstream, however; there are hopes to add similar functionality to the new mount API before long.

The perils of privileges

[Christian Brauner]

After that, Graber turned the floor over to Brauner, who started by rhetorically asking "are privileged containers really that unsafe?" His answer was an unequivocal "yes"; he listed a half-dozen or so "pretty bad" CVEs that have affected privileged containers over the last few years. That list included CVE-2019-5736, which was the runc container-confinement breakout that was disclosed in February; it was a bad way to start the year in terms of container security. As far as he can tell, none of those CVEs would affect unprivileged containers like those created by LXD.

It should be fairly trivial to use all of the available security mechanisms, but it turns out not to be. It is often the case that there is some way to block the problem behavior, but it is not used by the container managers for a variety of reasons. Some of those technologies may not be well documented, he said, which is a problem that the kernel developers should fix.

He began with namespaces, which are not used enough in his view. In the application container world, too few of the namespaces are used, typically just the mount namespace. All of them have some security benefit by isolating some resource from the rest of the system. The most obviously useful namespace is the user namespace, which isolates privileges between containers.

Namespaces have a "clunky API for sure", he said. Kernel developers should find a way to make it "nicer in some way". Properly ordering the creation of the namespaces at container startup time is important. In addition, there is no way to atomically setns() into all of the namespaces for a process. Brauner said he has some ideas on how to make that work better.

Next up was seccomp filters, which are "essential for privileged containers", Brauner said. Allowing privileged containers to call open_by_handle_at(), for example, will lead directly to a compromise. Seccomp filters provide a "useful safety net" for unprivileged containers, but are not truly required. Typically, unprivileged containers can maintain a blacklist of system calls that cannot be called, while privileged containers will need to create a whitelist of safe system calls.
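The whitelist-versus-blacklist distinction is easiest to see in the bytecode itself. The sketch below (pure Python; nothing is actually installed) assembles a whitelist-style filter as the classic-BPF instructions seccomp consumes. The opcode values, SECCOMP_RET_* constants, and sock_filter layout are the standard Linux definitions; the helper names are this sketch's own:

```python
import struct

BPF_LD_W_ABS = 0x20              # load the 32-bit word at a fixed offset
BPF_JEQ_K    = 0x15              # jump if accumulator == constant
BPF_RET_K    = 0x06              # return constant
SECCOMP_RET_ALLOW        = 0x7fff0000
SECCOMP_RET_KILL_PROCESS = 0x80000000
OFF_SYSCALL_NR = 0               # offsetof(struct seccomp_data, nr)

def insn(code, jt, jf, k):
    # struct sock_filter { __u16 code; __u8 jt, jf; __u32 k; }
    return struct.pack('HBBI', code, jt, jf, k)

def whitelist_filter(allowed_nrs):
    """Allow only the given syscall numbers; kill on anything else."""
    n = len(allowed_nrs)
    prog = [insn(BPF_LD_W_ABS, 0, 0, OFF_SYSCALL_NR)]
    for i, nr in enumerate(allowed_nrs):
        # On a match, jump over the remaining checks and the KILL
        # instruction, landing on the final ALLOW return.
        prog.append(insn(BPF_JEQ_K, n - i, 0, nr))
    prog.append(insn(BPF_RET_K, 0, 0, SECCOMP_RET_KILL_PROCESS))
    prog.append(insn(BPF_RET_K, 0, 0, SECCOMP_RET_ALLOW))
    return b''.join(prog)

# read(2), write(2), and exit(2) by x86-64 syscall number
prog = whitelist_filter([0, 1, 60])
print(len(prog) // 8, "instructions")
```

A blacklist filter inverts the pattern: matched syscalls jump to a KILL return and the fall-through case returns ALLOW. That is why privileged containers, which must enumerate everything that is safe, need the whitelist form.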

LSM support is also essential for privileged containers, he said. Access to various files in procfs and sysfs must be blocked or the container can be compromised. The LSMs most frequently used by container managers are SELinux and AppArmor, but other "minor" LSMs (which can stack) are also added into the mix sometimes.

Recent and future features

Brauner then described some security features that had landed in the kernel recently as well as some upcoming features that may be coming or are wished for. The ability to defer seccomp filter decisions to user space was added for the 5.0 kernel. It allows user space to inspect the arguments to the system call in a race-free way, so things like path names can be inspected. LXD uses that new feature to allow the distributions in its containers to successfully call mknod() for certain devices (e.g. /dev/null) but not others that are dangerous to have in the container. The old way of handling that was to bind mount the safe devices from the host filesystem.
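The supervisor's side of that mknod() handling amounts to a policy decision over the device numbers it finds in the intercepted arguments. A sketch of that decision logic follows; the major/minor pairs are the standard Linux numbers for the devices named, but the exact set LXD permits, and the shape of its handler, may well differ:

```python
import errno

# Character devices a container manager might consider safe to let a
# container create; keys are (major, minor) device numbers.
SAFE_CHAR_DEVICES = {
    (1, 3): "/dev/null",
    (1, 5): "/dev/zero",
    (1, 7): "/dev/full",
    (1, 8): "/dev/random",
    (1, 9): "/dev/urandom",
}

def handle_mknod_notification(major, minor):
    """Policy for a seccomp user-space notification reporting mknod().

    Returns 0 if the supervisor should perform the mknod() on the
    container's behalf, or an errno value to fail the call with.
    """
    if (major, minor) in SAFE_CHAR_DEVICES:
        return 0
    return errno.EPERM            # dangerous device: refuse it

print(handle_mknod_notification(1, 3))   # /dev/null: allowed
print(handle_mknod_notification(8, 0))   # a disk device: refused
```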

Deferring to user space is a "nifty feature", he said, but there are some problems with it. For example, it requires that user space handle the system call itself, which means there are some tricky privilege issues that need to be carefully considered. If the system call should be made, it needs to be done in the context of the container user, with its privileges, not those of the container manager.

All of that also makes the feature a bit annoying to use, he said. It would be better if there were a way to tell the kernel to simply resume the system call. There is also a problem with flags passed to some new system calls, such as clone3(), because they are not passed directly as a parameter but are instead inside a structure whose address is passed. But that means the in-kernel seccomp filtering cannot use the flag values as it is restricted to the parameters passed in registers and cannot chase pointers. He sent an email to the ksummit-discuss mailing list about seccomp and hopes to discuss some of those annoyances and possible solutions to them at the Kernel Summit in September.
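The clone3() problem is concrete enough to show with bytes. The field order below is the real start of struct clone_args (all __u64 fields); the point is that the syscall's register arguments are only a pointer to this structure and its size, so a seccomp filter, which can match register values but never dereference memory, cannot see the flags at all:

```python
import struct

CLONE_NEWUSER = 0x10000000   # the sort of flag a filter might care about

def pack_clone_args(flags, pidfd=0, child_tid=0, parent_tid=0,
                    exit_signal=0, stack=0, stack_size=0, tls=0):
    # The leading fields of struct clone_args, each a little-endian __u64.
    return struct.pack('<8Q', flags, pidfd, child_tid, parent_tid,
                       exit_signal, stack, stack_size, tls)

args = pack_clone_args(flags=CLONE_NEWUSER)
# The kernel receives (&args, len(args)); the flag value sits at
# offset 0 *behind* that pointer, out of the filter's reach.
print(len(args), hex(struct.unpack_from('<Q', args, 0)[0]))
```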

Stacking major LSMs (SELinux, AppArmor, and Smack) is something the LXD project would like to see as well. Being able to run containers with their own LSM on a host with a different major LSM, such as an Android container that uses SELinux on an Ubuntu system (which uses AppArmor) or an Ubuntu container on Fedora (which also uses SELinux), would be useful.

The SafeSetID LSM has been merged for Linux 5.3. It restricts UID/GID transitions to only those allowed by a whitelist. It came from Chrome OS and will be quite useful for privileged containers.

The new mount API split the functionality of the mount() system call into a bunch of separate calls that will allow some nice features for container managers. For example, it will allow anonymous mounts, which are mounts that are not attached to any path in the filesystem but will still allow access to the files for the process holding the file descriptor for the mount. There may be a way to add the UID/GID shifting feature to the new API to eliminate the need for shiftfs.

Brauner also mentioned the new process ID (PID) file descriptor (pidfd) feature. Pidfds are file descriptors that refer to a process, so that signals can be sent to the right process without fear of hitting the wrong target if the PID gets reused. It also allows processes to get exit notifications for non-child processes. Pidfds are used by LXD; there may be more features coming for pidfds as well, he said.
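What pidfds buy can be sketched in a few lines: the signal travels through a file descriptor that pins the process's identity, rather than through a raw PID that the kernel may hand to an unrelated process. os.pidfd_open() needs Linux 5.3 or later and Python 3.9 or later, so this sketch falls back to a racy os.kill() elsewhere:

```python
import os
import signal
import time

def terminate(pid):
    try:
        fd = os.pidfd_open(pid)              # fails cleanly if pid is gone
    except (AttributeError, OSError):
        os.kill(pid, signal.SIGTERM)         # racy: the PID may be reused
        return
    try:
        # Delivered to the process the fd refers to, even if the PID
        # has since been recycled for someone else.
        signal.pidfd_send_signal(fd, signal.SIGTERM)
    finally:
        os.close(fd)

pid = os.fork()
if pid == 0:
    time.sleep(60)                           # child: wait to be killed
    os._exit(0)

terminate(pid)
_, status = os.waitpid(pid, 0)
print("child signaled:", os.WIFSIGNALED(status))
```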

In wrapping up, Graber said that other container managers can learn from what the LXD project has done. He thinks it is imperative that they stop using privileged containers and start using user namespaces, but they do not have to figure everything out on their own. He does not believe that containers can ever really contain unless they separate the privileges inside the container from those outside of it.

[I would like to thank LWN's travel sponsor, the Linux Foundation, for funding to travel to San Diego for LSS-NA.]

Comments (26 posted)

Reconsidering unprivileged BPF

By Jonathan Corbet
August 16, 2019
The BPF virtual machine within the kernel has seen a great deal of work over the last few years; as that has happened, its use has expanded to many different kernel subsystems. One of the objectives of that work in the past has been to make it safe to allow unprivileged users to load at least some types of BPF programs into the kernel. A recent discussion has made it clear, though, that the goal of opening up BPF to unprivileged users has been abandoned as unachievable, and that further work in that direction will not be accepted by the BPF maintainer.

The BPF verifier goes to great lengths to ensure that any BPF program presented to the kernel is safe to run. Memory accesses are checked, execution is simulated to ensure that the program will terminate in a bounded period of time, and so on. Many of these checks are useful to ensure that all programs are safe and free of certain types of bugs, but others are aimed specifically at containing a potentially hostile program — an obvious necessity if the kernel is to accept BPF programs from unprivileged users.
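One of those termination checks can be caricatured in a few lines. At the time, the verifier rejected any program containing a backward jump, so every accepted program was a directed acyclic graph of instructions and had to terminate; the real verifier does vastly more (register state tracking, memory bounds, and so on), and this toy instruction encoding is invented for illustration:

```python
def jumps_only_forward(prog):
    """Reject any program with a back edge, guaranteeing termination.

    Instructions are (opcode, jump_offset) pairs; a jump transfers
    control to pc + 1 + offset, as in BPF.
    """
    for pc, (op, off) in enumerate(prog):
        if op == 'jmp' and pc + 1 + off <= pc:
            return False        # back edge: the program could loop forever
    return True

straight_line = [('mov', 0), ('jmp', 1), ('mov', 0), ('exit', 0)]
looping       = [('mov', 0), ('jmp', -2), ('exit', 0)]

print(jumps_only_forward(straight_line))  # True
print(jumps_only_forward(looping))        # False
```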

Much of this work was done in 2015 for the 4.4 kernel; in particular, a great deal of effort went into preventing BPF programs from leaking kernel pointer values to user space. Those pointers could be highly useful to an attacker who is trying to figure out where specific data structures or code are to be found on a target system, so making them easily available to unprivileged processes is clearly a bad idea. "Constant blinding" was added for 4.7. In essence, this mechanism will exclusive-OR constant values in programs with a random number (repeating the operation at run time when the values are actually used), preventing an attacker from sneaking in unverified BPF code disguised as constants. Other patches have been aimed at preventing speculative-execution attacks by BPF programs.
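The idea behind constant blinding is simple even though the JIT machinery around it is not: an attacker who controls the immediate values in a BPF program controls bytes that end up in executable kernel memory, and carefully chosen constants can double as machine instructions. Blinding stores each constant XORed with a random key and undoes the XOR at run time, so the attacker's chosen bytes never appear verbatim in the generated code. A minimal sketch of the arithmetic:

```python
import secrets

def blind(constant, bits=32):
    """Return (key, blinded) such that blinded ^ key == constant."""
    key = secrets.randbits(bits)
    return key, constant ^ key      # only these two values are emitted

def use_blinded(key, blinded):
    # The generated code performs this XOR when the value is needed.
    return blinded ^ key

attacker_constant = 0xDEADBEEF
key, stored = blind(attacker_constant)
print(hex(use_blinded(key, stored)))   # 0xdeadbeef, recovered at "run time"
```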

After all that work, though, there is still only one place where unprivileged users can, administrator willing, load BPF programs: as filters on open sockets. In 2015, BPF maintainer Alexei Starovoitov declared that "I think it is time to liberate eBPF from CAP_SYS_ADMIN". Nearly four years later, that has not happened, and the work that has been done more recently has been focused instead on giving administrators more control over who can load BPF programs; see the (unmerged) /dev/bpf effort for one example.
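That one unprivileged hook is easy to exercise: any process can attach a classic BPF filter to a socket it owns. The sketch below builds a one-instruction filter ("return 0", which tells the kernel to truncate every packet to zero bytes and thus drop it) and attaches it; the sock_filter and sock_fprog layouts are the standard Linux ones, and SO_ATTACH_FILTER is 26 on Linux since the socket module does not export it everywhere:

```python
import ctypes
import socket
import struct

SO_ATTACH_FILTER = 26
BPF_RET_K = 0x06                 # BPF_RET | BPF_K

# struct sock_filter { __u16 code; __u8 jt, jf; __u32 k; }
filt = struct.pack('HBBI', BPF_RET_K, 0, 0, 0)   # "accept 0 bytes": drop all
buf = ctypes.create_string_buffer(filt)

# struct sock_fprog { unsigned short len; struct sock_filter *filter; };
# native alignment ('HP') inserts the padding the kernel expects.
fprog = struct.pack('HP', 1, ctypes.addressof(buf))

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.setsockopt(socket.SOL_SOCKET, SO_ATTACH_FILTER, fprog)
attached = True                  # no privilege was needed for any of this
s.close()
print("filter attached without privilege:", attached)
```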

While Starovoitov has stopped working on unprivileged BPF, others have still been putting some thought in that direction. Andy Lutomirski recently posted a set of patches intended to make BPF a bit more suitable for this use case. It implements access permissions for BPF maps, adds a way to mark specific BPF functions as requiring privilege, and allows the loading of all types of programs by unprivileged users. "This doesn't let you *run* the programs except in test mode, so it should be safe. Famous last words." These patches have received no comments.

In the ongoing discussion about the /dev/bpf work, though, Starovoitov made the perhaps surprising statement that "unprivileged bpf is actually something that can be deprecated". Lutomirski, unsurprisingly, didn't like that idea:

I hope not. There are a couple setsockopt uses right now, and seccomp will surely want it someday. And the bpf-inside-container use case really is unprivileged bpf -- containers are, in many (most?) cases, explicitly not trusted by the host.

Starovoitov responded that "Linux has become a single-user system" where anybody who can run any code at all can break out of containment and obtain root privileges. The whole idea of unprivileged BPF, he said, has been a mistake:

When we say 'unprivileged bpf' we really mean arbitrary malicious bpf program. It's been a constant source of pain. The constant blinding, randomization, verifier speculative analysis, all spectre v1, v2, v4 mitigations are simply not worth it. It's a lot of complex kernel code without users. There is not a single use case to allow arbitrary malicious bpf program to be loaded and executed.

Lutomirski responded (more than once) that some use cases do exist. He mentioned seccomp(), which still uses the old "classic BPF" language rather than the "extended BPF" that has been the target of development work in recent years; there are developers now who would like to have extended BPF features available in seccomp() filters. Per-user systemd instances are another example; systemd makes use of BPF now and could benefit from making that functionality available to unprivileged users as well. There might well be others if the kernel were able to support them, he said: "There aren't major unprivileged eBPF users because the kernel support isn't there".

Starovoitov made it clear that he was not impressed, though: "I'm afraid these proposals won't go anywhere". He reiterated his claim that there are no known use cases for unprivileged BPF. What he would like to see, instead, is "less privileged BPF" where, for example, a process could be given a new CAP_BPF capability (or access to a /dev/bpf file) that would allow the loading of BPF programs without opening the door to other privileged operations. That, he said, would improve the safety of applications that actually exist without the need to expend effort supporting the unprivileged use case which, he claims, does not exist.

And that is the impasse at which the conversation stands now. At its core, it's a fundamental difference of opinion over whether a Linux system can ever be truly hardened against an unprivileged user. If the answer is "no", then there is little point in maintaining a lot of complex code in the BPF subsystem to try to effect that hardening. Accepting that answer, though, is tantamount to saying that the Linux privilege model just doesn't work in the end: the combination of software bugs and hardware vulnerabilities will always undermine it, so we might as well just give up. That would be a discouraging conclusion to say the least.

Comments (42 posted)

On-disk format robustness requirements for new filesystems

By Jonathan Corbet
August 19, 2019
The "Extendable Read-Only File System" (or "EROFS") was first posted by Gao Xiang in May 2018; it was merged into the staging tree for the 4.19 release. There has been a steady stream of work on EROFS since then, and its author now thinks that it is ready to move out of staging and join the other official filesystems in the kernel. It would seem, though, that there is one final hurdle that it may have to clear: robustness in the face of a corrupted on-disk filesystem image. That raises an interesting question: to what extent do new filesystems have to exhibit a level of robustness that is not met by the filesystems that are currently in heavy use?

As suggested by its name (and its acronym), EROFS is a read-only filesystem. It was developed at Huawei, and is intended for use in Android systems. EROFS is meant to differ from existing read-only filesystems in the area of performance; it uses a special compression algorithm that creates fixed-length blocks that, it is claimed, allows random access to compressed data with a minimum of excess I/O and decompression work. Details can be found in this USENIX paper [PDF] published in July.
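EROFS's actual on-disk format is considerably more sophisticated, but the property being claimed is easy to model: if every compressed block on disk has a fixed size and a small index records how much uncompressed data each block holds, then a read at an arbitrary offset decompresses only the blocks it touches. A toy model of that layout (zlib standing in for the real compressor, and the chunking heuristic invented for illustration):

```python
import zlib

BLOCK = 4096   # fixed size of every compressed block "on disk"

def build(data):
    """Compress data into fixed-size blocks plus a small index."""
    blocks, index, pos = [], [], 0
    while pos < len(data):
        take = min(len(data) - pos, BLOCK * 8)       # optimistic chunk
        comp = zlib.compress(data[pos:pos + take])
        while len(comp) > BLOCK:                     # shrink until it fits
            take //= 2
            comp = zlib.compress(data[pos:pos + take])
        blocks.append(comp.ljust(BLOCK, b'\0'))      # pad to the fixed size
        index.append((pos, take))                    # (uncompressed start, length)
        pos += take
    return blocks, index

def read_at(blocks, index, offset, length):
    """Serve a random read by decompressing only the blocks it touches."""
    out = b''
    for (start, ulen), blk in zip(index, blocks):
        if offset < start + ulen and offset + length > start:
            plain = zlib.decompressobj().decompress(blk)  # padding is ignored
            out += plain[max(0, offset - start):
                         min(ulen, offset + length - start)]
    return out

data = bytes(range(256)) * 2000                      # ~500KB of sample data
blocks, index = build(data)
print(read_at(blocks, index, 123456, 16) == data[123456:123472])
```

Because every block is exactly BLOCK bytes, the block containing a given compressed position is found by simple division; no scan of the image is needed, which is the "minimum of excess I/O" the paper claims.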

Gao has made several requests in recent times to move EROFS out of the staging tree; the latest was posted on August 17. It read:

In the past year, EROFS was greatly improved by many people as a staging driver, self-tested, betaed by a large number of our internal users, successfully applied to almost all in-service HUAWEI smartphones as the part of EMUI 9.1 and proven to be stable enough to be moved out of staging.

(EMUI is Huawei's version of Android.)

It would seem that there is little opposition to this move in general. As part of reviewing the code, though, Richard Weinberger noticed that the code generally trusts the data it reads from disks, often failing to check it for reasonableness. He quickly found a way to create a malformed filesystem that would put the kernel into an infinite loop, creating a system that is a bit more read-only than anybody had in mind. The problem was fixed just as quickly, but not before starting a discussion on whether robustness against hostile filesystem images should be a requirement for new filesystems entering the kernel.
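The infinite-loop class of bug comes from trusting block references read straight from the image. The actual EROFS fix was different in its details, but the general shape of the defensive check is the same everywhere: when walking any chain of on-disk pointers, remember what has been visited so a corrupted cycle is rejected instead of followed forever. A sketch:

```python
END = 0xFFFFFFFF   # hypothetical "end of chain" marker

def walk_chain(next_block, start):
    """Follow a chain of on-disk block pointers, refusing to loop.

    next_block maps a block number to the next one in the chain, as
    if read from the image; nothing in it can be trusted.
    """
    seen, chain, cur = set(), [], start
    while cur != END:
        if cur in seen or cur not in next_block:
            raise ValueError("corrupt image: cycle or bad block reference")
        seen.add(cur)
        chain.append(cur)
        cur = next_block[cur]
    return chain

good = {10: 11, 11: 12, 12: END}
bad  = {10: 11, 11: 10}            # 10 -> 11 -> 10 -> ... forever

print(walk_chain(good, 10))        # [10, 11, 12]
try:
    walk_chain(bad, 10)
except ValueError as e:
    print(e)
```

A naive walker, which is essentially what the unfixed code amounted to, would spin forever on the second image; the visited set bounds the walk by the number of blocks instead.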

Nobody disagrees that it would be a good thing if a filesystem implementation would do the right thing when faced with a hostile (or merely corrupt) filesystem image; that would make it possible to allow unprivileged users to mount filesystems without fear of handing over the keys to the entire system, for example. But, as Ted Ts'o pointed out, heavily used, in-kernel filesystems like ext4 and XFS don't meet that standard now, so requiring new filesystems to reach that level of robustness is presenting them with a higher bar:

So holding a file system like EROFS to a higher standard than say, ext4, xfs, or btrfs hardly seems fair. There seems to be a very unfortunate tendency for us to hold new file systems to impossibly high standards, when in fact, adding a file system to Linux should not, in my opinion, be a remarkable event.

In the case of EROFS, as Chao Yu pointed out, the intended use case makes this kind of robustness less important. The Android system images shipped in this filesystem format will be verified with a system like dm-verity, so the filesystem implementation should not be confronted with anything other than signed and verified images. Even so, the EROFS developers agree that this kind of bug should be actively sought out and fixed.

It seems that views about robustness against bad images vary somewhat among filesystem developers. With regard to these bugs in ext4, Ts'o said that "while I try to address them, it is by no means considered a high priority work item". He characterized the approach of the XFS developers as being similar. Christoph Hellwig disagreed strongly with that claim, though, saying that XFS developers work hard to handle corrupt filesystem images, "although there are of course no guarantees". Eric Biggers asserted that dealing with robustness issues should be mandatory, "but I can understand that we don't do a good job at it, so we shouldn't hold a new filesystem to an unfairly high standard relative to other filesystems".

Hellwig arguably took the strongest position with regard to the standards that should be applied to new filesystems:

We can't really force anyone to fix up old file systems. But we can very much hold new ones to (slightly) higher standards. That's the only way to get the average quality up. Same as for things like code style - we can't magically fix up all old stuff, but we can and usually do hold new code to higher standards.

What those higher standards should be was not spelled out. They probably do not extend to absolute robustness against corrupt filesystem images, but it seems that developers would like to see at least an effort made in that direction. As Biggers put it:

If the developers were careful, the code generally looks robust, and they are willing to address such bugs as they are found, realistically that's as good as we can expect to get.

Whether EROFS meets the "looks robust" standard is a bit controversial at the moment. On the other hand, there is little doubt that the EROFS developers are willing and able to fix bugs quickly as they are reported. For the purposes of moving EROFS into the kernel proper, chances are that will be good enough. Unless some other show-stopping issue comes up, this little snag seems unlikely to keep this code from graduating out of the staging tree. Future filesystem developers will want to take notice, though, that reviewers will be paying more attention to robustness against on-disk image corruption than they have in the past.

Comments (23 posted)

PHP and P++

By Jonathan Corbet
August 15, 2019
PHP is the Fortran of the world-wide web: it demonstrated the power of code embedded in web pages, but has since been superseded in many developers' minds by more contemporary technologies. Even so, as with Fortran, there is far more PHP code out there than one might think, and PHP is still chosen for new projects. There is a certain amount of tension in the PHP development community between the need to maintain compatibility for large amounts of ancient code and the need to evolve the language to keep it relevant for current developers. That tension has now come into the open with a proposal to split PHP into two languages.

PHP has been around for a long time; a previous version of the LWN site was implemented in PHP/FI in 1998. For most of its 25 years of existence, PHP has been criticized from multiple directions. Its development community has done a lot of work to address many of those criticisms while resisting others that, it was felt, went against the values of the language. Often these changes have forced code written in PHP to change as well; such changes tend to be the most controversial.

For example, consider the current controversy over "short open tags". PHP code is embedded within an HTML page with a sequence like this:

    <?php /* PHP code */ ?>

Back in the early days, though, the language also understood an abbreviated version:

    <? /* PHP code */ ?>

The latter form is, among other things, not XML compliant (an XML declaration like <?xml would be misinterpreted as the start of PHP code) and has been deprecated for years. If the short_open_tag setting is enabled, though, these tags will still be recognized. The PHP community decided some time ago that it wanted to remove all settings that affect the language globally, and it seems that short_open_tag is the only one remaining. But a proposal to remove it has been through multiple iterations, has elicited strong opposition, and has inspired lengthy discussion threads on the project's mailing lists. The PHP project resolves such issues through voting; as of this writing, there are 28 votes in favor of removal and 24 opposed. Unless things change before the vote closes on August 20, this particular measure will fail to reach the 2/3 majority required to pass.

Bringing peace

This vote highlights the sort of division that can be found in the PHP community. Referring to "a growing sense of polarization", PHP developer Zeev Suraski tried to improve the situation with a proposal titled "Bringing Peace to the Galaxy"; so far, it would appear to have failed to do so, at least in the way that was intended.

Suraski described the disagreement as being between those who want to push the language forward and those who value backward compatibility above almost all else. "To a large degree, these views are diametrically opposed. This made many internals@ discussions turn into literally zero sum games - where when one side 'wins', the other side 'loses'". The answer that he came up with was to give both sides what they want by splitting the language in two:

  • "Classic" PHP would continue to be developed with a strong emphasis on keeping millions of lines of existing code working. The language would not be frozen, but it would be highly resistant to changes that break compatibility.
  • A new language, called "P++" for now, would take the opposite approach, breaking compatibility and adding features (such as strict typing, greater consistency across the language, new types and, naturally, killing off short open tags).

On its surface, this looks like a fork of PHP or, at a minimum, a split like that seen between Python versions 2 and 3. Suraski envisions something a bit different, though. Unlike the Python project, the PHP community would continue to develop (and add features to) its old version, with no plans for leaving it behind at some point. He also envisions supporting both languages from the same code base and a common runtime system. A single binary would implement both PHP and P++. So, rather than creating a fork, Suraski aims to create a single project with two faces.

Suraski's post (along with the subsequently posted P++ FAQ) was intended to provoke discussion; as one might imagine, it succeeded. Some developers like the idea, but many more seem to be concerned about it, for a number of reasons. One of those was expressed by Dan Ackroyd among others:

PHP internals is already lacking programming resources to do everything we want to be doing.

Maintaining two versions at once would be more work, so this idea is not feasible without a dramatic increase in the number of people working on PHP core.

Suraski optimistically responded that his proposal "will take no additional resources", mostly as the result of the use of a single code base. It is not clear that others in the community find this argument persuasive, though.

The idea of "rebranding" PHP with the new language is appealing to Suraski, who sees it as a way of getting away from PHP's not-entirely-positive reputation. Others, such as Nikita Popov, worried that rebranding will leave a valuable name behind without any corresponding benefit. Rebranding is also something that can only be done once, Popov said; it won't be an option five years down the road when the desire to add yet another set of incompatible features arises. Suraski responded that rebranding can bring new life to a project by attracting interest and getting developers who have written PHP off to take another look. The backward-compatibility break in P++ was described as a one-time thing, where all of the changes could be made at once, so Suraski was not worried about having to do it again anytime soon.

To some in the conversation, P++ was reminiscent of the Hack language, a PHP dialect created at Facebook. This concern is addressed in the FAQ, where Suraski essentially said that things will work out differently because it's the PHP community doing the work. Hack is a single-company project, the FAQ reads, and is not as widely distributed as PHP is. The P++ language, instead, would come automatically with a future version of PHP, so it would be there, waiting, whenever new developers wanted to try it.

What polarization?

What may well turn out to be the majority view, though, was well expressed by Arvids Godjuks, who seems to feel that the entire conception of the problem is wrong. The division described by Suraski does not really exist, Godjuks said; instead, almost all developers are interested in both compatibility and language evolution. The right thing to do is to continue, as the community has done so far, to find a balance between those two requirements:

Right now PHP does have somewhat of a plan and direction it is going, it is going at a decent pace - not too slow, not too fast. The community is able to adopt the new features and changes in a timely manner and gracefully introduce their support or requirement without everyone running like headless chickens. So maybe solidify the plan, make it into an actual roadmap? That will allow people to make long term plans and decisions and make [backward compatibility] less of an issue.

Instead, Godjuks said, splitting the language would force developers to choose between two versions of PHP, neither of which has the flexibility found in current PHP. One of those two versions would probably wither and die. This point of view was supported in the only post by Rasmus Lerdorf in the thread; Lerdorf, of course, is the creator of PHP. He said:

Forcing a balance, even if sometimes the arguments get rather heated (and they were just as heated, if not more so 20+ years ago), keeps everyone on the same page and working on the same code-base without the us vs. them situation that is bound to creep in.

The discussion is far from resolved at this point, but perhaps some indication of the community's feeling can be found in this poll asking whether the P++ idea is worth pursuing. As of this writing, there are zero votes in favor and 28 opposed (including Suraski, who described the poll as "a false choice").

All told, the P++ idea would appear to be fighting a strong headwind in the community; unless something changes, it seems that it will be hard to build a critical mass of developers interested in making it happen. The discussion will not be wasted, though, if it helps to focus the community's collective mind on how it wants to see the language develop in the coming years. Supporting existing code and keeping the language relevant into the future are both important goals; if the PHP community can find a way to balance those priorities, the language may well continue to thrive for a long time.

Comments (38 posted)

Page editor: Jonathan Corbet

Inside this week's LWN.net Weekly Edition

  • Briefs: kdevops; Kernel lockdown; distri; Git 2.23; notqmail; Quotes; ...
  • Announcements: Newsletters; events; security updates; kernel patches; ...

Copyright © 2019, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds