Leading items
Welcome to the LWN.net Weekly Edition for July 2, 2020
This edition contains the following feature content:
- The (non-)return of the Python print statement: Python's founder proposes a significant syntax change.
- Four years of Zephyr: an overview of the Zephyr system and its history so far.
- Emulating Windows system calls in Linux: several options for helping Wine handle Windows system calls.
- Stirring things up for Fedora 33: the next Fedora release could have a number of significant changes.
- First PHP 8 alpha released: what's coming in the next major PHP release.
- Managing tasks with todo.txt and Taskwarrior: a tour of a couple of text-oriented to-do list managers.
- Generics for Go: after years, Go may finally get generic types.
This week's edition also includes these inner pages:
- Brief items: Brief news items from throughout the community.
- Announcements: Newsletters, conferences, security updates, patches, and more.
Please enjoy this week's edition, and, as always, thank you for supporting LWN.net.
The (non-)return of the Python print statement
In what may have seemed like an April Fool's
Day joke to some, Python creator Guido van Rossum recently floated
the idea of bringing back the print statement—several months after
Python 2, which had such a statement, reached its end of life. In fact, Van
Rossum acknowledged that readers of his message to the python-ideas mailing
list might be checking the date: "No, it's not April 1st." He
was serious about the idea—at least if others were interested in having the
feature—but he withdrew it fairly quickly when it became clear that there
were few takers. The main reason he brought it up is interesting, though:
the new parser for CPython makes it
easy to bring back print from Python 2 (and before).
Prior to Python 3, the print statement was the usual way to print output to the screen:
    >>> print '1 + 2 = ', 1+2
    1 + 2 =  3
But Python 3 changed print from a statement to the print() function. Of the changes for Python 3, switching to print() was perhaps one of the easiest, but it still led to a fair number of complaints. It was rather straightforward for Python 2 code to adopt the new behavior using:
    from __future__ import print_function
But the change did break a lot of working code, so perhaps it makes sense to bring back the print statement, Van Rossum said.
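As a rough illustration of what the switch to print() bought, the sketch below (Python 3 only) looks up when the print_function feature became available and shows keyword arguments taking over from the statement's special syntax. The render() helper is a hypothetical name introduced for this example.

```python
import __future__
import io

# The feature record says when print_function became optional and mandatory.
feature = __future__.print_function
optional_in = feature.getOptionalRelease()[:2]
mandatory_in = feature.getMandatoryRelease()[:2]


def render(*values, **kwargs):
    """Return the text that print() would write for these arguments."""
    buffer = io.StringIO()
    print(*values, file=buffer, **kwargs)
    return buffer.getvalue()


# The Python 2 statement `print '1 + 2 = ', 1+2` written as a function call.
line = render('1 + 2 = ', 1 + 2)

# Separator and line ending are ordinary keyword arguments.
joined = render('a', 'b', 'c', sep='-', end='!\n')
```

Because print() is an ordinary function, it can also be passed around, wrapped, or replaced, which the statement form did not allow.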
The new parser is based on a parsing
expression grammar (PEG) and it was a fairly simple matter to make the
change: "One thing that the PEG parser makes possible in about 20
lines of code is something not entirely different from the old print
statement." He has a prototype working, but it enables far more
than just print statements:
    >>> len "abc"
    3
Or any method:
    >>> import sys
    >>> sys.getrefcount "abc"
    24
Really, *any* method:
    >>> class C:
    ...     def foo(self, arg): print arg
    ...
    >>> C().foo 2+2
    4
He noted that there are some downsides too, including a "bare" print statement being interpreted as the print() function and not a call to it. Potentially more problematic is the behavior when the first argument to the print statement uses parentheses.
    >>> print (1, 2, 3)
    1 2 3
    [...]
    >>> print (2+2), 42
    4
    (None, 42)
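The parenthesized cases above behave the same way in stock Python 3, where print is already a function, so they can be reproduced without the prototype. The following is a small sketch; run() is a hypothetical helper that evaluates an expression and captures what it prints.

```python
import contextlib
import io


def run(expression):
    """Evaluate an expression and return (value, captured stdout)."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        value = eval(expression)
    return value, buffer.getvalue()


# Parsed as a call with three arguments, not as printing one tuple.
call_value, call_output = run("print (1, 2, 3)")

# Parsed as the tuple (print(2+2), 42): print() runs and returns None.
pair_value, pair_output = run("print (2+2), 42")
```

The second case is the surprising one: the comma builds a tuple around the call, so 42 is never printed and the interactive prompt echoes (None, 42) instead.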
Currently, the parser makes a considerable effort to
recognize code that is trying to use the print statement, so that it
results in a SyntaxError that suggests adding parentheses
("Did you mean print('hello world')"). That
code could be removed if print were resurrected. Van Rossum said
that it was not an "all or nothing" proposal; it could be dialed back
somewhat by restricting what kinds of function calls it would work for or
restricting it only to "print". He also noted that he would
withdraw the idea "if the response is a resounding 'boo, hiss'".
It may not have been resounding, but the response was definitely mostly of
the "boo, hiss" variety (including some that used that phrase directly, of
course). Ethan Furman said
that while too many parentheses made code hard to read for him, too few is
also problematic, so he was not in favor of any change. Naomi Ceder had
mixed feelings; she has struggled to switch to the
print() function over the last five years (after 15 years of
print without parentheses). "As someone who teaches Python
and groans at explaining exceptions, I'm -0 on print without parens and -1
on other calls without parens."
Gregory P. Smith agreed
with Ceder,
but took it further ("-1 overall for me..."), even though
Smith liked the print statement for Python 2 and earlier.
Beyond just print, though, he is concerned that calling functions
without requiring parentheses will just lead to "a whole new world of
typos and misunderstandings". Other languages allow that kind of
thing, but Python is not those languages.
"I love that the new parser allows us to even explore these possibilities.
We should be very cautious about what syntax changes we actually adopt."
Overall, Smith's sentiment seemed popular; there were some who viewed parts of the idea favorably, but the full-blown proposal, including making any kind of call without parentheses, was not well-liked. For his part, Van Rossum said there was an element of "because we can" to the idea; he was pleasantly surprised at how easy it was to make it work with the new parser. He described why the PEG parser made things so much easier in his initial message:
But Van Rossum said that he would "happily withdraw" the idea
since it did not seem to be gaining any real traction. It seems likely
that the idea came out of the blue for many—there were multiple good
reasons to switch to a print() function as part of the
Python 3 transition, after all. But the proposal does serve another purpose: it
allows the CPython core developers to see the scope of the types of changes
the PEG parser is bringing to the table. That may well open up some
interesting features moving forward.
Four years of Zephyr
The Zephyr project is an effort to provide an open-source realtime operating system (RTOS) that is designed to bridge the gap between full-featured operating systems like Linux and bare-metal development environments. It's been over four years since Zephyr was publicly announced and discussed here (apparently to a bit of puzzlement). In this article, we give an update on the project and its community as of its v2.3.0 release in June 2020; we also make some guesses about its near future.
The authors are both Zephyr developers working for Nordic Semiconductor; Cufí was the release manager for the v2.3.0 release.
A Zephyr primer
While Zephyr can scale up to much larger systems, a typical target for the RTOS is a microcontroller without a memory-management unit (MMU) that has a sub-100MHz CPU, 512KB or less of on-chip NOR flash memory, and 32 to 256KB of built-in static RAM. Like the Linux kernel, Zephyr is configurable using the Kconfig language, and uses devicetree to describe hardware.
Unlike other RTOS choices, Zephyr is much more than a kernel. It's an RTOS with "batteries included". The project aims to provide all the software needed to develop, release, and maintain a firmware application. This includes a toolchain with compilers as well as flash and debug tools. The zephyr repository includes the kernel, protocol stacks, drivers, filesystems, and more. Upstream Zephyr also includes code from dozens of other third-party projects pulled in by a tool called "west". These include cryptographic libraries, hardware-abstraction layers (HALs), protocol stacks, and the MCUboot bootloader.
Zephyr is a library operating system. Its build system combines the application, kernel, and any additional code into a single, statically linked executable, all in a single address space (most microcontrollers do not have MMUs anyway). Each image targets a specific board, and is typically executed in-place from flash. Because of this, almost all configuration is done at compile time. Buffers are statically pre-allocated. The entire devicetree is accessible at build time, so a fixed set of devices and drivers are compiled in. This minimizes image size and maximizes information stored in flash, thus saving precious RAM.
Some features are tailored toward the fact that Zephyr can't self-host, so Zephyr developers build the whole "distribution" alongside the application. For starters, it has a cross-platform build and configuration system based on CMake and Python. This runs natively on Linux, macOS, and Windows, thanks in part to the fact that both Kconfig and devicetree are handled in Python 3 instead of the original Unix-only tools. This is critical since Windows is the most widely used OS for microcontroller firmware development.
More details
The Zephyr kernel supports multiple architectures and scheduling algorithms. There are cooperative and preemptive threads, along with facilities for reducing interrupt latencies and guaranteeing the execution of key threads. An optional user mode can use the Memory Protection Units (MPUs) typically present in microcontrollers to isolate and sandbox threads or groups of threads from one another and the kernel.
Zephyr supports six major architectures (x86, Arm, ARC, NIOS II, Xtensa, and RISC-V) and also runs in emulation. Both 32- and 64-bit processor support exists for some architectures. Within the Arm architecture, the emphasis has been on the usual 32-bit Cortex-M cores, but experimental support for Cortex-R and Cortex-A (including 64-bit Cortex-A) exists and continues to improve. Beyond "real hardware," Zephyr runs on QEMU, and as an ELF executable. It supports a simulated radio, which can save time and expense when testing and debugging radio frequency (RF) issues. In all, there are upstream support files for over 200 "boards".
Zephyr has logging and shell subsystems. These have configurable transports, including traditional serial ports (for both) and over the network (for logging). Logging is optionally asynchronous; in this case, a separate thread actually sends log messages. The logging calls themselves post compact messages to a queue, which can be done quickly, so logging can be done even from within interrupt context.
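The deferred mode described above can be modeled in a few lines. This is a Python sketch of the idea only, not Zephyr's C logging API: the names log(), FORMATS, and log_thread() are hypothetical, and a list stands in for the serial or network transport.

```python
import queue
import threading

# Format strings are looked up by ID so that callers post only compact records.
FORMATS = {1: "sensor %d read %d", 2: "link %s"}

log_queue = queue.Queue()
output = []  # stand-in for a serial port or network transport


def log(message_id, *args):
    """Fast path: enqueue a compact record without formatting it."""
    log_queue.put((message_id, args))


def log_thread():
    """Slow path: format and emit records outside the caller's context."""
    while True:
        record = log_queue.get()
        if record is None:  # sentinel: stop the thread
            break
        message_id, args = record
        output.append(FORMATS[message_id] % args)


worker = threading.Thread(target=log_thread)
worker.start()
log(1, 3, 42)
log(2, "up")
log_queue.put(None)
worker.join()
```

The caller's cost is a single queue insertion; the expensive string formatting and I/O happen in the separate thread, which is what makes logging from time-critical code practical.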
Hardware-specific APIs are built around a lightweight device driver model that is tightly integrated with the kernel. It supports a wide range of peripherals and sensors under this common model. Multiple storage options are available. These range from basic key-value storage optimized for NOR flash to filesystems.
Zephyr's batteries also include various communication stacks. Its networking stack has BSD-like socket APIs and supports various protocols. Zephyr has a fully open-source Bluetooth Low Energy (BLE) protocol stack that runs on multiple hardware devices. It also has an 802.15.4 stack and supports the Thread protocol. Controller Area Network (CAN) and USB device communication are supported out of the box. Zephyr additionally supports firmware upgrades over many of these transports.
But why, though?
One common question about Zephyr is "why?" Why was it started, when there are so many RTOS choices already?
Microcontrollers have become more powerful; they include more memory, cores, and security features, and they must support more complex protocols and communication stacks. Firmware complexity has increased markedly over the years. As a result, homegrown frameworks supplied by a single silicon vendor, whether built directly against bare metal or a minimalistic kernel, are increasingly difficult to develop and maintain. Zephyr aims to be a scalable solution to this problem, developed cooperatively in the open, that integrates all its functionality around a common set of APIs and primitives.
Zephyr is being used by various parts of its community at this point. As readers might guess, Zephyr's members are likely to be using it. Some silicon vendors also have Zephyr support. For example, supported hardware is available from members NXP and Nordic Semiconductor. STMicro has also contributed code for some SoCs and sensors, and has been responsive to users. Various other architectures and boards are supported. Finally, there is an active open-source community, which we'll get into a bit more in the next section.
Who's behind all of this?
Zephyr is a Linux Foundation project with a paid membership and governance structure. It is, however, developed openly. Anyone may use and contribute to Zephyr without becoming a member — many do.
The project's members include silicon vendors, device makers, engineering organizations, and others. Funding is used, among other things, to pay for continuous integration, web hosting, marketing, and other infrastructure. Members receive voting seats on the project's Technical Steering Committee (TSC), and its Governing Board consists of members only.
Given that, it might be tempting to look at the project as simply a pay-to-play industry organization, but that wouldn't be a fair assessment. At least where the code and other technical contributions are concerned, Zephyr is a true open-source project. Patches and reviews are open to all. The core Zephyr repository is Apache 2.0 licensed, and contributors retain copyright over their contributions. In the v2.3.0 release, about one in four commits came from emails that weren't from member-owned domains.
The project's meetings, including the TSC's, are largely open for anyone to attend and participate in (alas, using a proprietary videochat platform) and have open minutes that are sent to the mailing lists.
Zephyr development takes place on GitHub. Patches are sent as pull requests and non-security bugs are tracked as GitHub issues. Subsystem maintainers have overall responsibility for their areas. Maintainers must be approved by the TSC, but maintainership is open to all, not just project members, and there are maintainers who are not from member organizations.
The project has open user and developer mailing lists with searchable archives; chat is currently done via Slack. See this help page for more details.
Wrapping up
Beyond expanding and improving what it currently supports, Zephyr will not stay still when it comes to adding features. In particular, the next release, which is currently scheduled for September, is targeted to include the following: Bluetooth advertising extensions, a thorough rework of its device model, improvements for MMU-based systems such as demand paging, initial support for a new TCP stack, and improved toolchain support for both proprietary toolchains and LLVM.
As we've seen, Zephyr's niche lies somewhere between bare-metal and full-featured operating systems. While relatively young in its current form, Zephyr is moving quickly and has accomplished much. The project has broad ambitions, and aims to provide a soup-to-nuts menu for firmware development. With that, we hope to have resolved any remaining puzzlement, and perhaps to see some more LWN readers saying hello and giving Zephyr a try.
Emulating Windows system calls in Linux
The idea of handling system calls differently depending on the origin of each call in the process's address space is not entirely new. OpenBSD, for example, disallows system calls entirely if they are not made from the system's C library as a security-enhancing mechanism. At the end of May, Gabriel Krisman Bertazi proposed a similar mechanism for Linux, but the objective was not security at all; instead, he is working to make Windows games run better under Wine. That involves detecting and emulating Windows system calls; this can be done through origin-based filtering, but that may not be the solution that is merged in the end.
To run with any speed at all, Wine must run Windows code directly on the CPU to the greatest extent possible. That must end, though, once the Windows program makes a system call; trapping into the Linux kernel with the intent of making a Windows system call is highly unlikely to lead to good results. Traditionally, Wine has handled this by supplying its own version of the user-space Windows API that implemented the required functionality using Linux system calls. As explained in the patch posting, though, Windows applications are increasingly executing system calls directly rather than going through the API; that makes Wine unable to intercept them.
The good news is that Linux provides the ability to intercept system calls in the form of seccomp(). The bad news is that this mechanism, as found in current kernels, is not suited to the task of intercepting only system calls made from Windows code running within a larger process. Intercepting every system call would slow things down considerably, an effect that tends to make gamers particularly cranky. Tracking which parts of a process's address space make Linux system calls and which make Windows calls within the (classic) BPF programs used by seccomp() would be awkward at best and, once again, would be slow. So it seems that a new mechanism is called for.
The patch set adds a new memory-protection bit for mmap() called PROT_NOSYSCALL which, by default, does not change the kernel's behavior. If, however, a given process has turned on the new SECCOMP_MODE_MMAP mode in seccomp(), any system calls made from memory regions marked with PROT_NOSYSCALL will be trapped; the handler code can then emulate the attempted system call.
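The decision described above can be modeled as a lookup on the address that issued the system call. The following is a toy Python model, not kernel code: Region and dispatch() are hypothetical names, and treating a call from an unlisted address as native is an assumption made for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    start: int        # first address in the mapping
    end: int          # one past the last address
    no_syscall: bool  # models the proposed PROT_NOSYSCALL bit


def dispatch(regions, origin, mmap_mode_enabled):
    """Return 'trap' if the call should be handed to the emulation handler,
    or 'native' if the kernel should execute it as a Linux system call."""
    if not mmap_mode_enabled:
        # Without SECCOMP_MODE_MMAP, the protection bit changes nothing.
        return "native"
    for region in regions:
        if region.start <= origin < region.end:
            return "trap" if region.no_syscall else "native"
    return "native"  # assumption: unlisted addresses are not filtered


address_space = [
    Region(0x1000, 0x2000, no_syscall=False),  # Wine and Linux libraries
    Region(0x8000, 0x9000, no_syscall=True),   # loaded Windows code
]
```

In this model, dispatch(address_space, 0x8100, True) yields "trap" while a call from 0x1100 stays "native", so only system calls issued from Windows code pay the cost of interception.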
The cover letter notes that one should not rely on this mechanism the way OpenBSD uses its origin verification:
seccomp() is used for this non-security feature, the text continues, because the alternative would be to duplicate much of its functionality.
The patch series generated a fair amount of discussion from developers who
were not entirely comfortable with this mechanism. Kees Cook, for example,
asked
whether it would instead be possible to rewrite the Windows binary code at
load time, replacing
system calls with calls to the emulation functions. The answer, it seems,
is "no". Modifying a game's code is likely to set off checks made to
defeat cheaters, who also would otherwise make code modifications of their
own. Wine developer Paul Gofman added
that, to make such changes, Wine "would need some
way to find those syscalls in the highly obfuscated dynamically
generated code, the whole purpose of which is to prevent disassembling,
debugging and finding things like that in it".
Matthew Wilcox, instead, suggested that the personality() mechanism could be extended to support a Windows personality. This, essentially, would create a new system-call entry point that would emulate the Windows calls. Gofman replied that this approach had been considered, but that the cost of executing the personality() call on each transition between Linux and Windows code would be too high. A possible solution here is to implement a special personality that looks at a flag, stored in user-space memory, to determine how system calls should be handled. Gofman offered to create a Wine patch using such a mechanism if an implementation existed; Krisman said that he would give it a try.
Andy Lutomirski had a couple of other suggestions, the first of which was a prctl() operation that would redirect all system calls through a user-space trampoline. System calls from the trampoline itself would be executed normally. In Wine's case, that trampoline could emulate system calls from Windows code while passing Linux system calls through to the kernel. Krisman indicated interest in this approach, and may implement a version of this idea as well.
Lutomirski's other
idea was to allow a process to establish an (extended) BPF filter
program for all system calls; he later extended
this idea to have it handle all "architectural privilege
transitions" for the process. This approach offers a lot of
flexibility and may be useful far beyond Wine, but it suffers from a
significant flaw: in the absence of unprivileged BPF, it could only be
invoked by a privileged process, which is a show-stopper for Wine. Unless
something changes, unprivileged BPF is an idea
that isn't going anywhere in Linux, so the filter program does not look
like a solution that Wine could use.
The end result of this discussion is that the problem is reasonably well understood and there is a shared desire to solve it. What form that solution will take is far from clear, though; there are a few approaches that need to be experimented with. Expect to see more patches in the future as the developers work to find which idea works best.
Stirring things up for Fedora 33
The next release of the Fedora distribution — Fedora 33 — is currently scheduled for the end of October. Fedora's nature as a fast-moving distribution ensures that each release will contain a number of attention-getting changes, but Fedora 33 is starting to look like it may be a bit more volatile than its immediate predecessors. Several relatively controversial changes are currently under discussion on the project's mailing lists; read on for a summary.
The end of mod_php
For many years, the mod_php Apache module was the preferred way to run PHP code in response to web requests. But that was many years ago; the recommended module now is php-fpm. Reasons for switching include support for threaded modules, the ability to work with nginx, and removal of the PHP interpreter from the web server's address space. Fedora has supported both modules for some time, but is currently planning to remove mod_php and support only php-fpm as of Fedora 33.
Unsurprisingly, a module with as much history as mod_php still has a few users, a couple of whom made themselves heard in the mailing-list discussion. For example, John M. Harris Jr. said:
Neal Gompa responded:
It is absolutely a good idea to take away this choice from our builds when the advice from framework developers, ecosystem experts, and the upstream developers is to *not* use mod_php.
Unless some higher level of outcry manifests, it seems likely that there is nowhere near enough opposition to derail this plan. Fedora users who are running mod_php will want to look at moving over to php-fpm in the near future.
No more swap partition
Fedora normally installs itself with a modest swap partition to take pressure off of memory. This proposal, however, would do away with the swap partition and, instead, set up a virtual swap device using zram. The zram device works by compressing memory contents and storing them back in memory; if the contents of RAM compress well, swapping them to zram can free up considerable amounts of space. Additionally, zram is often faster than swapping to a storage device, even when considering that the CPU must do the compression and decompression.
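How much space zram frees depends entirely on how well the swapped-out pages compress. A quick sketch of that effect follows; it uses Python's zlib purely as a stand-in (an assumption, since zram uses the kernel's own compressors) and a 4096-byte page size.

```python
import os
import zlib

PAGE_SIZE = 4096  # assumed page size for this sketch


def compressed_size(page):
    """Return the number of bytes the page occupies after compression."""
    return len(zlib.compress(page))


# A highly repetitive page, like zero-filled or text-heavy memory.
repetitive_page = (b"fedora " * PAGE_SIZE)[:PAGE_SIZE]

# A page of random bytes, like encrypted or already-compressed data.
random_page = os.urandom(PAGE_SIZE)

repetitive_size = compressed_size(repetitive_page)
random_size = compressed_size(random_page)
```

The repetitive page shrinks to a small fraction of its original size, while the random page does not shrink at all; a workload dominated by the second kind of data would gain little from swapping to zram.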
This proposal created a fairly long discussion thread, most of which made little useful progress toward a considered decision on the proposal. Participants seemed generally in favor, though a few were not fully convinced that the change made sense. One objection that was raised is that a system with no swap file cannot be hibernated; according to Chris Murphy, who is the developer behind this proposal, that is not a big concern:
One other question that came up is what should happen to an existing swap file when an older Fedora system is upgraded. The obvious solution is to just leave that arrangement in place, but Murphy also made the case for removing the swap file and switching the system to zram; to do otherwise, he said, would fragment the user base. How that detail will be worked out remains to be seen, but swap-on-zram as a whole looks set to go forward.
Compiler policies
Longstanding Fedora policy says that the entire distribution is to be built with the GCC compiler, with exceptions only for packages that cannot be built with GCC. The plan for Fedora 33 is to get rid of that requirement. Instead, the compiler used for any given package would be the one preferred by upstream (assuming that the upstream for that package has expressed a preference, of course). That would open the way for building a number of LLVM-preferring packages without maintainers having to struggle to get a working build with GCC.
Compilers tend to evoke emotional reactions, and that was the case here.
Kevin Kofler opposed
the change, saying: "We have a system compiler for a
reason". Some, such as Jakub Jelinek,
pointed to ABI incompatibilities and worried about introducing subtle bugs
into the distribution, but the prevailing view seems to be that such
incompatibilities are rare and should be treated as bugs when they are
found. Gompa stated
that moving OpenMandriva to LLVM hurt both performance and security, but
did not offer specifics.
Jeff Law, who is driving this change, said
that using the upstream-preferred compiler would make life easier for a lot
of Fedora package maintainers. He also asserted that "packages in
Fedora should be as close to upstream as possible", and that the
choice of toolchain is a part of that. This change, too, seems likely to
make it through to the Fedora 33 release.
Default editors
By far the biggest thread (so far) was, entirely predictably, provoked by this proposal to set the default editor on Fedora systems to nano. Not everybody understands the implications of changing which toolchain is used to build a given package, but everybody knows what their favorite editor is and is usually willing to tell others about it. Fedora currently does not set a default editor, which means that users typically end up in vi (or vim), which is not the friendliest choice for people who are not familiar with that editor's quirks. Picking nano would give new users a fighting chance at figuring out how to exit the editor, at least, and maybe even (intentionally) making a change or two first.
The thread was long, but a summary need not be. Opinions varied from strong opposition like this message from Jan Kratochvil:
to this
comment from Adam Williamson: "My only regret is that I have but
one +1 to give to this proposal!". In the end, most participants
seem to recognize that vi is not the friendliest experience for new users,
and that experienced users can easily set the EDITOR environment
variable to get the editor they want.
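The override mentioned above is simple for a program to honor. Below is a minimal sketch of the lookup; pick_editor() is a hypothetical helper, and the "nano" fallback stands in for the proposed distribution default.

```python
import os


def pick_editor(environ, default="nano"):
    """Return the user's preferred editor, falling back to the default."""
    editor = environ.get("EDITOR")
    # An unset or empty variable means the user expressed no preference.
    return editor if editor else default


# Check the real environment of the current process.
current_choice = pick_editor(os.environ)
```

A user who has set EDITOR to vim gets vim from pick_editor(), while a new user with nothing set gets the default.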
Switching to Btrfs
Then, there is this proposal to make Btrfs the default filesystem for new Fedora installations. This is not a new idea; indeed, there was once an approved plan to switch to Btrfs for Fedora 16 in 2011. There are obvious advantages to switching to a modern filesystem like Btrfs, including its storage-management and snapshotting capabilities. Btrfs turned out to be too unstable in 2011, though, and that plan was dropped.
Concerns about stability came up quickly this time as well; Vitaly Zaitsev was quick to assert that:
On the other hand, Btrfs developer (and member of the group pushing this
proposal) Josef Bacik pointed
out that Btrfs is deployed on vast numbers of machines at Facebook, a
choice that has "worked out very well" even though Facebook
deliberately puts it on low-quality storage hardware.
It is too soon to predict whether Btrfs will prove more successful at being adopted by Fedora now than it did nine years ago. But the value to the distribution of having a filesystem like Btrfs available is clear, and there is a long list of capable developers pushing this proposal. Fedora users may yet get Btrfs by default before 2030.
One that didn't make it
As long as they don't actively cause problems, retired packages on Fedora systems tend to go into a sort of limbo state. They remain on the systems where they have been installed even as those systems are upgraded to new releases where those packages are no longer present. Over time, they may develop problems or security issues, but they sit there like that server everybody has forgotten about until something happens to draw somebody's attention.
This proposal envisioned a scheme where retired packages would be explicitly obsoleted by a new metapackage called fedora-retired-packages; that would cause them to be automatically removed one release after they were retired. Some developers welcomed the idea of cleaning up unmaintained packages, while others strongly opposed removing packages that probably still work from systems where they may be needed. A separate subthread got into disagreements about the mechanism used, suggesting that removal should be handled explicitly in the DNF package manager rather than through a magic metapackage.
In the end, Miroslav Suchý, who was pushing this proposal, decided to withdraw it. He will work on a new proposal for a tool that can allow users to remove unmaintained packages if they see fit.
The Fedora 33 schedule currently places a deadline of June 30 for system-wide change proposals and July 21 for proposals for self-contained changes. The above proposals were all of the system-wide variety with the exception of dropping mod_php. There is, thus, time for more changes to be proposed for the next Fedora release. Community members might be forgiven, though, if they thought that there is already enough on the list for this time around.
First PHP 8 alpha released
The PHP project has released the first alpha of PHP 8, which is slated for general availability in November 2020. This initial test release includes many new features such as just-in-time (JIT) compilation, new constructs like Attributes, and more. One of twelve planned releases before the general availability release, it represents a feature set that is still subject to change.
The PHP 8 release is being managed by contributors Sara Golemon and Gabriel Caruso. Dubbed "Alpha 1", this first release of PHP 8 is one of three releases to be done prior to a feature freeze. During this time, more widespread testing of new features is performed by the community and implementation details are worked out. This process will continue until August 4, at which point the feature set will be frozen to coincide with the first beta release scheduled for August 6.
Some interesting features
The release announcement omitted the specifics on new features and other changes that would typically accompany a release. For now, the proposals that have been approved and implemented, as listed in the Request for Comments section of the PHP wiki, are the best source of what is in the release now, and what might still be on the way.
Major-version PHP releases always have at least one significant improvement, and in this case, that is JIT for PHP 8. JIT will enable the engine to compile PHP code — a single PHP function or an entire application — to machine code for better performance. The RFC on the implementation provides a significant amount of detail regarding the JIT implementation for interested readers.
Beyond major improvements like JIT, there are also other new language-level features including Attributes. For those who are unaware, Attributes offer structured syntactic metadata for declarations in PHP code such as classes, functions, methods, and properties. Similar features already exist in other popular languages, such as annotations in Java and decorators in Python. Attributes replace the widely used phpDocumentor syntax for documentation block comments that is often deployed in PHP applications to serve a similar need. Currently in PHP 7, these comments are parsed at run time using PHP's reflection API to extract metadata.
One example of the existing use of this approach is the popular unit-testing framework PHPUnit, which uses documentation block annotations to implement things like order of operations in testing methods. With PHP 8, these same annotations can be defined and formalized into the language itself, eliminating the need for the expensive run-time comment parsing currently required. Attributes also may play a role in defining (or excluding) targets of PHP 8's new JIT compiler. Note that the Attributes feature is still in flux, with significant changes to the behavior still being decided and implemented before the August feature-freeze deadline.
PHP 8 will also support some desirable new language features for typing, continuing the trend of building robust data-type handling into the traditionally dynamically-typed language. Looking at current PHP 7 releases, there is support for typing in function/method declarations. However, this support is limited to two options: either a single data type is specified as part of the declaration, or no data type is specified at all. If no data type is specified, it is up to the developer to implement their own type-checking logic on an otherwise typeless value. This is less than ideal, as sometimes a method could reasonably be written that is given an integer or floating-point number, but not a string. As with annotations, PHP projects currently handle this dilemma with documentation block comments that are intended to specify variable type details, but those details are not enforceable by PHP itself. To address this shortcoming, PHP 8 now supports type unions in declarations as part of its syntax, allowing developers to specify multiple types for functions, methods, and properties:
class Number {
    private int|float $number;
    public function setNumber(int|float $number) : void
    {
        $this->number = $number;
    }
    public function getNumber() : int|float
    {
        return $this->number;
    }
}
In the preceding example, the | operator is used to define the multiple data types that PHP will accept; in this case, an int or a float is allowed for the property, the method parameter, and the return value. These checks are largely performed at runtime, although compile-time checks are also used to catch some cases.
Other changes expected in PHP 8 and implemented in Alpha 1 are the unbundling of legacy extensions like xmlrpc, previously rejected features like catching exceptions without requiring a variable for them, and the new Stringable interface to better handle __toString() implementations consistently in an application. In PHP, __toString() is a "magic method" of an object that, when provided, is used to return a string representation of an object. The Stringable interface is implemented automatically by any class that defines __toString(), so a union such as string|Stringable can be used in type hints to accept either a primitive string or such an object.
More to come
This is only the first public release of the upcoming PHP 8 code base. As PHP 8 releases head toward general availability, future articles will follow the progress. The next release, Alpha 2, is scheduled for July 9. There are still many different discussions happening regarding features that may make it into the PHP 8.0 release, depending on whether they can be finalized in time for the feature-freeze deadline. In the meantime, early adopters can begin testing their code bases and reporting any bugs they might find.
Managing tasks with todo.txt and Taskwarrior
One quote from Douglas Adams has always stayed with me: "I love deadlines. I like the whooshing sound they make as they fly by". We all lead busy lives and few ever see the bottom of our long to-do lists. One of the oldest items on my list, ironically, is to find a better system to manage all my tasks. Can task-management systems make us more productive while, at the same time, reducing the stress caused by the sheer number of outstanding tasks? This article looks at todo.txt and Taskwarrior.
The management of tasks is rather personal and people have completely different approaches and philosophies. This is, of course, reflected in the requirements for, and expectations from, a task manager. Requirements can also change as our interaction with computers changes. For example, while I put a lot of emphasis on managing tasks via the command line in the past, these days I'm more interested in a good mobile app (to add tasks on the go and to receive reminders) and web support (to get an overview of all tasks).
A good way to filter tasks is also essential for me. One of the reasons for using task-management software is so you can stop worrying about tasks until they become relevant. This requires a way to find relevant tasks when needed, such as when the due date is coming up soon or because you're in a relevant setting or place (often called a "context" in task-management systems). Going to the supermarket would be a good time to bring up a shopping list, for example. Task-management systems offer a number of ways to organize information that can be used in filters, such as tags, contexts (often stored as tags in the form of @tag, such as @home), and lists.
In a series of two articles, we'll review four systems for managing tasks and to-do items around which open-source ecosystems have formed.
Simple task management with todo.txt
Todo.txt is a simple plain-text format to specify tasks. Each line describes one task, and tasks can have a priority (e.g. (A)), a project (+LWN), and a context (@home). The specification also defines the tag:value syntax but only mentions due (due dates) specifically. A number of custom tags are in common use, such as t for threshold dates (i.e. start dates) and rec for recurring tasks. Tasks are marked as complete by adding a lowercase x at the beginning of the line. An example might look something like this:
(A) Proofread article +LWN due:2020-06-25
Revisit task managers @home t:2025-01-01
x Provide todo.txt examples +LWN
The todo.txt web site lists a lot of tools built around the file format. Unfortunately, the first impression isn't particularly great since a lot of the tools are out-of-date or unmaintained. Todo.txt Touch, the project's official app for iOS, which is placed prominently on the web site, had its last commit in 2014 and was removed from Apple's App Store in 2017 because of incompatibilities with Dropbox. The Android app was removed from Google Play for the same reason.
![Markor [Markor]](https://static.lwn.net/images/2020/todo-markor-sm.png)
While it would be nice if the web site offered a more curated list of actively developed software, clicking on all the links eventually revealed that there is an active ecosystem around todo.txt. There is support for a wide range of editors, including a Vim plugin that supports syntax highlighting and presents overdue tasks as errors. Additionally, todoTxtWebUi lets you add tasks in your browser; it also supports basic filters, but there's no way to define and store more complex filters.
Simpletask is an actively developed Android app. Adding new tasks is simple and the app makes it possible to create complex filters. There is support for Dropbox and Nextcloud. Using cloud services appears to be the recommended way to sync tasks in the todo.txt ecosystem; the problem of conflicts, which can happen when tasks are edited on multiple devices, is not addressed, however.
Markor (seen at right) is another interesting app for Android in this context. It is not a task manager; instead it is an editor with support for a number of formats including Markdown, YAML, and todo.txt. Adding tasks is a pleasure due to Markor's syntax highlighting, which can be seen in the screen shot. Markor doesn't allow users to group, sort, or search tasks, but improvements are under discussion.
Overall todo.txt is a simple system that aims to get out of your way. The system reflects the philosophy of founder Gina Trapani, who remarked: "To me, todo.txt is a task list, not a reminder tool, or a calendar". While I personally want a task manager that reminds me of upcoming tasks so I can stop thinking about them until I need to, a simple approach has its advantages and will appeal to some.
Fighting tasks with Taskwarrior
Taskwarrior is another task manager around which a healthy community has formed. In contrast to todo.txt, Taskwarrior supports a rich set of features and attributes, including various dates (such as start, end and due dates), dependencies, projects, and tags. User-defined attributes can also be added. Taskwarrior sets virtual tags automatically depending on the situation, such as TODAY, or, maybe more commonly seen, OVERDUE. The project even supports a Document Object Model (DOM) through which data can be accessed.
While tasks are stored in human-readable text files, interaction is through the command-line tool task. It makes adding, editing, and querying tasks easy. Taskwarrior supports filters, automatically calculates priorities, and integrates a calendar view and statistics. It does not dictate the user's workflow or the task-management methodology to be followed, but there is a helpful write-up about implementing the popular Getting Things Done (GTD) system with Taskwarrior.
Many tools build on Taskwarrior. For example, Tasksh is an interactive shell which makes listing and editing tasks easy. It's particularly useful for the periodic review of tasks. VIT, the Visual Interactive Taskwarrior, is a curses-based frontend, which will feel familiar to those who work with Vim and Mutt. With these tools, the Taskwarrior ecosystem offers a range of complementary text-based tools.
For those who prefer managing their tasks in a web browser, TaskwarriorWeb is one option. It has a simple but modern design. Unfortunately, it doesn't expose all of Taskwarrior's functionality (such as dependencies) and has limited capabilities to group and filter tasks. Furthermore, the status of the project isn't clear. While a move to the official GitHub organization for Taskwarrior was agreed to in 2018, the project still hasn't moved; many pull requests remain open, including one to implement some important functionality: filtering by tags.
There are two options for Android. TaskwarriorC2 is a cross-platform GUI client for Taskwarrior available on Google Play. Despite using the Taskwarrior logo, the app does not come from the Taskwarrior project; in addition, TaskwarriorC2 does not have a license in its repository, though the source is available. While the app offers many filters and reports, I didn't find the interface to be intuitive. Foreground is an Android app that is visually more appealing and easier to use. It shows much promise but is quite limited at the moment. For example, you cannot filter by project and there are no notifications, which is a feature some users expect from a task manager on a mobile device.
Of course, the question of syncing data will come up when someone wants to use Taskwarrior on multiple devices. Unlike todo.txt, Taskwarrior offers a solution in the form of Taskserver. For those who don't want to run their own server, there are several hosted alternatives. FreeCinc is an open-source, shared Taskserver where users can store tasks at no charge. Inthe.AM is another open-source online system available at no charge, but it goes beyond merely syncing tasks. It offers several features that extend Taskwarrior, such as RSS and iCalendar feeds, integration with Trello (a proprietary project-management tool), and adding tasks via email or SMS text message. Inthe.AM also offers a web interface to manage tasks with a modern look (seen below), although not all functionality from Taskwarrior is exposed.
Taskwarrior has a healthy ecosystem; there are many other interesting tools that cannot be covered in detail. Bugwarrior enables the import of issues from a number of bug-tracking systems, taskopen is a script for taking notes and opening URLs with Taskwarrior, and kanbanwarrior is a simple script that facilitates a Kanban workflow. There are also extensions for GNOME Shell (Taskwarrior Integration and Taskwhisperer).
Summary
Todo.txt and Taskwarrior show different approaches to task management. While todo.txt follows a simple approach to capturing and dealing with tasks, Taskwarrior offers a feature-rich system that enables different workflows for task management. Both systems are widely used and offer a range of tools. Taskwarrior, in particular, has great text-based tools. For both systems, solutions for the web and mobile devices are more limited at this point. Next up, we'll review tools that use the Org mode file format and iCalendar standard. Stay tuned ...
Generics for Go
The Go programming language was first released in 2009, with its 1.0 release made in March 2012. Even before the 1.0 release, some developers criticized the language as being too simplistic, partly due to its lack of user-defined generic types and functions parameterized by type. Despite this omission, Go is widely used, with an estimated 1-2 million developers worldwide. Over the years there have been several proposals to add some form of generics to the language, but the recent proposal written by core developers Ian Lance Taylor and Robert Griesemer looks likely to be included in a future version of Go.
Background
Go is a statically typed language, so types are specified in the source code (or inferred from it) and checked by the compiler. The compiler produces optimized machine code, so CPU-intensive code is significantly more efficient than in languages like Python or Ruby, which have bytecode compilers and use virtual machines for execution.
Generics, also known as "parameterized types" or "parametric polymorphism", are a way to write code or build data structures that will work for any data type; the code or data structure can be instantiated to process each different data type, without having to duplicate code. They're useful when writing generalized algorithms like sorting and searching, as well as type-independent data structures like trees, thread-safe maps, and so on. For example, a developer might write a generic min() function that works on all integer and floating-point types, or create a binary tree that can associate a key type to a value type (and work with strings, integers, or user-defined types). With generics, you can write this kind of code without any duplication, and the compiler will still statically check the types.
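As an illustration of the duplication that generics are meant to remove, here is a minimal sketch of what a min() helper looks like in Go without them. MinInt and MinFloat64 are hypothetical names chosen for this example; the two bodies are identical apart from the type.

```go
package main

import "fmt"

// MinInt returns the smaller of two int values.
// (Hypothetical helper, written once per type for lack of generics.)
func MinInt(a, b int) int {
    if a < b {
        return a
    }
    return b
}

// MinFloat64 is the same function again, for float64.
func MinFloat64(a, b float64) float64 {
    if a < b {
        return a
    }
    return b
}

func main() {
    fmt.Println(MinInt(3, 7), MinFloat64(2.5, 1.5))
}
```

Every additional numeric type needs another copy, and the compiler offers no help in keeping the copies consistent.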
Like the first versions of Java, Go doesn't ship with user-defined generics. As the Go FAQ notes, generics "may well be added at some point"; it also describes how leaving them out was an intentional trade-off:
Generics are convenient but they come at a cost in complexity in the type system and run-time. We haven't yet found a design that gives value proportionate to the complexity, although we continue to think about it. Meanwhile, Go's built-in maps and slices, plus the ability to use the empty interface to construct containers (with explicit unboxing) mean in many cases it is possible to write code that does what generics would enable, if less smoothly.
Part of the reason actual users of the language don't complain loudly about the lack of generics is that Go does include them for the built-in container types, specifically slices (Go's growable array type), maps (hash tables), and channels (thread-safe communication queues). For example, a developer writing blog software might write a function to fetch a list of articles or a mapping of author ID to author information:
// takes a count, returns "slice of Article" (compiler checks types)
func GetLatestArticles(num int) []Article { ... }
// takes "slice of int" of IDs, returns "map of int IDs to Author"
func GetAuthors(authorIDs []int) map[int]Author { ... }
Built-in functions like len() and append() work on these container types, though there's no way for a developer to define their own equivalents of those generic built-in functions. As many Go developers will attest, having built-in versions of growable arrays and maps that are parameterized by type goes a long way, even without user-defined generic types.
In addition, Go has support for two features that are often used instead of generics or to work around their lack: interfaces and closures. For example, sorting in Go is done using the sort.Interface type, which is an interface requiring three methods:
type Interface interface {
    Len() int           // length of this collection
    Less(i, j int) bool // true if i'th element < j'th element
    Swap(i, j int)      // swap i'th and j'th elements
}
If a user-defined collection implements this interface, it is sortable using the standard library's sort.Sort() function. Since sort.Slice() was added in Go 1.8, developers can use that function and pass in a "less-than closure" rather than implementing the full sorting interface; for example:
// declare a struct for names and ages and a slice of those structs with four entries
people := []struct {
    Name string
    Age  int
}{
    {"Gopher", 7},
    {"Alice", 55},
    {"Vera", 24},
    {"Bob", 75},
}
// sort people using the "less-than closure" specified in the call
sort.Slice(
    people,
    func(i, j int) bool {
        // i and j are the two slice indices
        return people[i].Name < people[j].Name
    },
)
There are other ways to work around Go's lack of generics, such as creating container types that use interface{} (the "empty interface"). This effectively boxes every value inserted into the collection, and requires run-time type assertions, so it is neither particularly efficient nor type-safe. However, it works and even some standard library types like sync.Map use this approach.
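A minimal sketch of such a container shows where the type safety is lost. The Stack type here is hypothetical, written only for this illustration.

```go
package main

import "fmt"

// Stack is a hypothetical container that can hold values of any type.
type Stack struct {
    items []interface{}
}

// Push boxes v into an interface{} value and stores it.
func (s *Stack) Push(v interface{}) { s.items = append(s.items, v) }

// Pop returns the most recently pushed value, or false if the stack is empty.
func (s *Stack) Pop() (interface{}, bool) {
    if len(s.items) == 0 {
        return nil, false
    }
    v := s.items[len(s.items)-1]
    s.items = s.items[:len(s.items)-1]
    return v, true
}

func main() {
    var s Stack
    s.Push(42)
    s.Push("not a number") // compiles: nothing restricts the element type

    v, _ := s.Pop()
    // The caller must recover the concrete type with a run-time assertion.
    if n, ok := v.(int); ok {
        fmt.Println("int:", n)
    } else {
        fmt.Printf("unexpected %T\n", v)
    }
}
```

A mistake like pushing a string onto a stack meant for integers is only discovered when the value is popped and the assertion fails, rather than at compile time.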
Some developers go so far as to argue that generics shouldn't be added to Go at all, since they will bring too much complexity. For example, Greg Hall hopes "that Go never has generics, or if it does, the designers find some way to avoid the complexity and difficulties I have seen in both Java generics and C++ templates".
The Go team takes the complexity issue seriously. As core developer Russ Cox states in his 2009 article "The Generic Dilemma":
It seems like there are three basic approaches to generics:
- (The C approach.) Leave them out. This slows programmers. But it adds no complexity to the language.
- (The C++ approach.) Compile-time specialization or macro expansion. This slows compilation. It generates a lot of code, much of it redundant, and needs a good linker to eliminate duplicate copies. [...]
- (The Java approach.) Box everything implicitly. This slows execution. [...]
The generic dilemma is this: do you want slow programmers, slow compilers and bloated binaries, or slow execution times?
Still, many Go developers are asking for generics, and there has been a huge amount of discussion over the years on the best way to add them in a Go-like way. Several developers have provided thoughtful rationale in "experience reports" from their own usage of Go. Taylor's entry in the official Go blog, "Why Generics?", details what adding generics will bring to Go, and lists the guidelines the Go team is following when adding them:
Most importantly, Go today is a simple language. Go programs are usually clear and easy to understand. A major part of our long process of exploring this space has been trying to understand how to add generics while preserving that clarity and simplicity. We need to find mechanisms that fit well into the existing language, without turning it into something quite different.
These guidelines should apply to any generics implementation in Go. That's the most important message I want to leave you with today: generics can bring a significant benefit to the language, but they are only worth doing if Go still feels like Go.
The recent proposal
Taylor, in particular, has been prolific on the subject of adding generics to Go, having written no fewer than six proposals. The first four, written from 2010 through 2013, are listed at the bottom of his document, "Go should have generics". About them, he notes: "all are flawed in various ways".
In July 2019 he posted the "Why Generics?" blog article mentioned above, which links to the lengthy 2019 proposal written by Taylor and Griesemer for a version of generics based on "contracts".
Almost a year later, in June 2020, Taylor and Griesemer published the current proposal, which avoids adding contracts. In Taylor's words:
An earlier draft design of generics implemented constraints using a new language construct called contracts. Type lists appeared only in contracts, rather than on interface types. However, many people had a hard time understanding the difference between contracts and interface types. It also turned out that contracts could be represented as a set of corresponding interfaces; thus there was no loss in expressive power without contracts. We decided to simplify the approach to use only interface types.
The removal of contracts comes in part based on work by Philip Wadler and his collaborators in their May 2020 paper, "Featherweight Go [PDF]" (video presentation). Wadler is a type theorist who has contributed to the design of Haskell, and was involved in adding generics to Java back in 2004. Rob Pike, one of Go's creators, had asked Wadler if he would "be interested in helping us get polymorphism right (and/or figuring out what 'right' means) for some future version of Go"; this paper is the response to Pike's request.
The 2020 proposal suggests adding optional type parameters to functions and types, allowing generic algorithms and generic container types, respectively. Here is an example of what a generic function looks like under this proposal:
// Stringify calls the String method on each element of s,
// and returns the results.
func Stringify(type T Stringer)(s []T) []string {
    var ret []string
    for _, v := range s {
        ret = append(ret, v.String())
    }
    return ret
}
// Stringer is a type constraint that requires the type argument to have
// a String method and permits the generic function to call String.
// The String method should return a string representation of the value.
type Stringer interface {
    String() string
}
The type parameter is T (an arbitrary name), specified in the extra set of parentheses after the function name, along with the Stringer constraint: type T Stringer. The actual arguments to the function are in the second set of parentheses, s []T. Writing functions like this is not currently possible in Go; it does not allow passing a slice of a concrete type to a function that accepts a slice of an interface type (e.g., Stringer).
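The limitation can be seen in a sketch of the non-generic equivalent, which compiles with current Go. Celsius is a hypothetical concrete type invented for this example; the point is the copying loop that callers need before they can use the function.

```go
package main

import "fmt"

// Stringer is an ordinary interface type, as in current Go.
type Stringer interface {
    String() string
}

// Celsius is a hypothetical concrete type with a String method.
type Celsius float64

func (c Celsius) String() string { return fmt.Sprintf("%.1fC", float64(c)) }

// Stringify is the non-generic version: it accepts only []Stringer.
func Stringify(s []Stringer) []string {
    var ret []string
    for _, v := range s {
        ret = append(ret, v.String())
    }
    return ret
}

func main() {
    temps := []Celsius{21.5, 30}

    // Stringify(temps) would not compile: []Celsius is not []Stringer,
    // even though Celsius implements Stringer. Each element has to be
    // copied into a new slice of the interface type first.
    boxed := make([]Stringer, len(temps))
    for i, t := range temps {
        boxed[i] = t
    }
    fmt.Println(Stringify(boxed))
}
```

Under the proposal, the generic Stringify() could be called with the slice of the concrete type directly, with no intermediate copy.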
In addition to generic functions, the new proposal also supports parameterization of types, to support type-safe collections such as binary trees, graph data structures, and so on. Here is what a generic Vector type might look like:
// Vector is a name for a slice of any element type.
type Vector(type T) []T
// Push adds a value to the end of a vector.
func (v *Vector(T)) Push(x T) {
    *v = append(*v, x)
}
// v is a Vector of Authors
var v Vector(Author)
v.Push(Author{Name: "Ben Hoyt"})
Because Go doesn't support operator overloading or define operators in terms of methods, there's no way to use interface constraints to specify that a type must support the < operator (as an example). In the proposal, this is done using a new feature called "type lists", an example of which is shown below:
// Ordered is a type constraint that matches any ordered type.
// An ordered type is one that supports the <, <=, >, and >= operators.
type Ordered interface {
    type int, int8, int16, int32, int64,
        uint, uint8, uint16, uint32, uint64, uintptr,
        float32, float64,
        string
}
In practice, a constraints package would probably be added to the standard library which pre-defined common constraints like Ordered. Type lists allow developers to write generic functions that use built-in operators:
// Smallest returns the smallest element in a slice of "Ordered" values.
func Smallest(type T Ordered)(s []T) T {
    r := s[0]
    for _, v := range s[1:] {
        if v < r { // works due to the "Ordered" constraint
            r = v
        }
    }
    return r
}
The one constraint that can't be written as a type list is a constraint for the == and != operators, because Go allows comparing structs, arrays, and interface types for equality. To solve this, the proposal suggests adding a built-in comparable constraint to allow equality operators. This would be useful, for example, in a function that finds the index of a value in a slice or array:
// Index returns the index of x in s, or -1 if not found.
func Index(type T comparable)(s []T, x T) int {
    for i, v := range s {
        // v and x are type T, which has the comparable
        // constraint, so we can use == here.
        if v == x {
            return i
        }
    }
    return -1
}
Taylor and Griesemer have developed a tool for experimentation (on the go2go branch) that converts the Go code as specified in this proposal to normal Go code, allowing developers to compile and run generic code today. There's even a version of the Go playground that lets people share and run code written under this proposal online — for example, here is a working example of the Stringify() function above.
The Go team is asking developers to try to solve their own problems with the generics experimentation tool and send detailed feedback in response to the following questions:
First, does generic code make sense? Does it feel like Go? What surprises do people encounter? Are the error messages useful?
Second, we know that many people have said that Go needs generics, but we don't necessarily know exactly what that means. Does this draft design address the problem in a useful way? If there is a problem that makes you think "I could solve this if Go had generics," can you solve the problem when using this tool?
Discussion
There has been a lot of public discussion about generics on the main golang-nuts mailing list since the latest proposal was published, as well as on Hacker News and reddit.com/r/golang threads.
As Pike said [YouTube] last year, "syntax is not the problem, at least not yet". Many of the threads on the mailing list, however, have been immediately critical of the syntax. Admittedly, the syntax is unusual, and it adds another set of (round) parentheses to Go, which is already known for having lots of parentheses (for example, Go's method definitions use one set for the method's receiver type, and another for the method's arguments). The proposal tries to preempt the syntax bikeshedding with an explanation of why they chose parentheses instead of angle brackets:
When parsing code within a function, such as v := F<T>, at the point of seeing the < it's ambiguous whether we are seeing a type instantiation or an expression using the < operator. Resolving that requires effectively unbounded lookahead. In general we strive to keep the Go parser efficient.
Most responders on the mailing list are proposing the use of angle brackets like C++, Java, and C#, for example, using List<T> instead of List(T). Taylor is much more interested in whether the semantics of the new proposal make sense, but has been patiently replying to each of these syntax threads with something like the following:
Let's see what real code looks like with the suggested syntax, before we worry about alternatives. Thanks.
This has happened so many times that one mailing list contributor, Tyler Compton, compiled a helpful list of all the syntax-related threads.
Generics will help eliminate types and functions repeated for multiple types, for example sort.Ints, sort.Float64s, and sort.Strings in the sort package. In a comment on Hacker News, Kyle Conroy showed "a four-line replacement for the various sql.Null* types in the standard library":
type Null(type T) struct {
    Val   T
    Valid bool // Valid is true if Val is not NULL
}
Mailing list contributor Pee Jai wondered whether there's a way to constrain a type to only allow structs, but Taylor indicated that's not possible; he noted that "generics don't solve all problems". Robert Engels said that the reflect package would still be needed for this case anyway.
In one thread, "i3dmaster" asked some questions about custom map types, and Taylor clarified that "custom container types aren't going to support len() or range". Creators of collection types won't have access to this special syntax, but will need to define their own Len() method, and their own way to iterate through the collection.
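One way a collection author might provide those two pieces is sketched below, using current (non-generic) Go. IntSet and its Each() method are hypothetical; a callback is only one of several possible iteration styles, and the proposal does not prescribe any of them.

```go
package main

import "fmt"

// IntSet is a hypothetical collection type backed by a map.
type IntSet struct {
    m map[int]struct{}
}

// NewIntSet returns an empty set ready for use.
func NewIntSet() *IntSet { return &IntSet{m: make(map[int]struct{})} }

// Add inserts v; adding a value twice has no further effect.
func (s *IntSet) Add(v int) { s.m[v] = struct{}{} }

// Len stands in for the built-in len(), which cannot be applied to IntSet.
func (s *IntSet) Len() int { return len(s.m) }

// Each stands in for range: it calls f once per element, in no particular order.
func (s *IntSet) Each(f func(v int)) {
    for v := range s.m {
        f(v)
    }
}

func main() {
    s := NewIntSet()
    s.Add(2)
    s.Add(3)
    s.Add(2)

    sum := 0
    s.Each(func(v int) { sum += v })
    fmt.Println(s.Len(), sum)
}
```

Callers write s.Len() and s.Each(...) where they would write len(m) and a range loop for a built-in map, which is the asymmetry that Taylor's reply describes.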
Go core contributor Bryan Mills has posted insightful replies on a number of threads. He has also created his own repository with various notes and code examples from his experiments with generics, including an explanation about why he considers type lists less than ideal. The repository also includes various attempts at re-implementing the append() built-in using generics as proposed.
Timeline
In their recent blog entry, Taylor and Griesemer are clear that adding generics to the language won't be a quick process — they want to get it right, and take into account community feedback:
We will use the feedback we gather from the Go community to decide how to move forward. If the draft design is well received and doesn't need significant changes, the next step would be a formal language change proposal. To set expectations, if everybody is completely happy with the design draft and it does not require any further adjustments, the earliest that generics could be added to Go would be the Go 1.17 release, scheduled for August 2021. In reality, of course, there may be unforeseen problems, so this is an optimistic timeline; we can't make any definite prediction.
My own guess is that August 2021 (just over a year away) is optimistic for a feature of this size. It's going to take quite a while to solicit feedback, iterate on the design, and implement generics in a production-ready way instead of using the current Go-to-Go translator. But given the number of proposals and the amount of feedback so far, generics are sure to be a much-used (and hopefully little-abused) feature whenever they do arrive.
Page editor: Jonathan Corbet