
LWN.net Weekly Edition for July 19, 2018

Welcome to the LWN.net Weekly Edition for July 19, 2018

This edition contains the following feature content:

  • Deep learning and free software: are applications built on pre-trained neural networks truly free?
  • Six (or seven) new system calls for filesystem mounting: a proposed replacement for mount().
  • Tracking pressure-stall information: a better view of the real utilization state of the system.
  • Kernel symbol namespacing: imposing some order on the kernel's exported symbols.
  • Python post-Guido: how will the project govern itself after Van Rossum's departure?
  • The PEP 572 endgame: where things stand with the contentious assignment-expressions PEP.

This week's edition also includes these inner pages:

  • Brief items: Brief news items from throughout the community.
  • Announcements: Newsletters, conferences, security updates, patches, and more.

Please enjoy this week's edition, and, as always, thank you for supporting LWN.net.

Comments (none posted)

Deep learning and free software

By Jake Edge
July 18, 2018

Deep-learning applications typically rely on a trained neural net to accomplish their goal (e.g. photo recognition, automatic translation, or playing go). That neural net uses what is essentially a large collection of weighting numbers that have been empirically determined as part of its training (which generally uses a huge set of training data). A free-software application could use those weights, but there are a number of barriers for users who might want to tweak them for various reasons. A discussion on the debian-devel mailing list recently looked at whether these deep-learning applications can ever truly be considered "free" (as in freedom) because of these pre-computed weights—and the difficulties inherent in changing them.

The conversation was started by Zhou Mo ("Lumin"); he is concerned that, even if deep-learning application projects release the weights under a free license, there are questions about how much freedom that really provides. In particular, he noted that training these networks is done using NVIDIA's proprietary cuDNN library that only runs on NVIDIA hardware.

Even if upstream releases their pretrained model under GPL license, the freedom to modify, research, reproduce the neural networks, especially "very deep" neural networks is de facto [controlled] by PROPRIETARIES.

While it might be possible to train (or retrain) these networks using only free software, it is prohibitively expensive in terms of CPU time to do so, he said. So, he asked: "Is GPL-[licensed] pretrained neural network REALLY FREE? Is it really DFSG-compatible?" Jonas Smedegaard did not think the "100x slower" argument held much water in terms of free-software licensing. Once Mo had clarified some of his thinking, Smedegaard said:

I believe none of the general public licenses (neither liberal nor copyleft) require non-[ridiculous] cost for the freedoms protected.

I therefore believe there is no license violation, as long as the code is _possible_ to compile without non-free code (e.g. blobs to activate GPUs) - even if ridiculously expensive in either time or hardware.

He did note that if rebuilding the neural network data was required for releases, there would be a practical problem: blocking the build for, say, 100 years would not really be possible. That stretches way beyond even Debian's relatively slow release pace. Theodore Y. Ts'o likened the situation to that of e2fsprogs, which distributes the output from autoconf as well as the input for it; many distributions will simply use the output, because newer versions of autoconf may not generate it correctly.

Ian Jackson stated strongly that, in his opinion, GPL-licensed neural networks are neither truly free nor DFSG-compatible:

Things in Debian main [should] be buildable *from source* using Debian main. In the case of a pretrained neural network, the source code is the training data.

In fact, they are probably not redistributable unless all the training data is supplied, since the GPL's definition of "source code" is the "preferred form for modification". For a pretrained neural network that is the training data.

But there may be other data sets that have similar properties, Russ Allbery said in something of a thought experiment. He hypothesized about a database of astronomical objects where the end product is derived from a huge data set of observations using lots of computation, but the analysis code and perhaps some of the observations are not released. He pointed to genome data as another possible area where this might come up. He wondered whether that kind of data would be compatible with the DFSG. "For a lot of scientific data, reproducing a result data set is not trivial and the concept of 'source' is pretty murky."

Jackson sees things differently, however. The hypothetical NASA database can be changed as needed or wanted, but the weightings of a neural network are not even remotely transparent:

Compare neural networks: a user who uses a pre-trained neural network is subordinated to the people who prepared its training data and set up the training runs.

If the user does not like the results given by the neural network, it is not sensibly possible to diagnose and remedy the problem by modifying the weighting tables directly. The user is rendered helpless.

If training data and training software is not provided, they cannot retrain the network even if they choose to buy or rent the hardware.

That argument convinced Allbery, but Russell Stuart dug a little deeper. He noted that the package that Mo mentioned in his initial message, leela-zero, is a reimplementation of the AlphaGo Zero program that has learned to play go at a level beyond that of the best humans. Stuart said that Debian already accepts chess, backgammon, and go programs that he probably could not sensibly modify even if he completely understood the code.

[...] Debian rejecting the example networks as they "aren't DFSG" free would be a mistake. I view one of our roles as advancing free software, all free software. Rejecting some software because we humans don't understand it doesn't match that goal.

Allbery noted that GNU Backgammon (which he packages for Debian) was built in a similar way to AlphaGo Zero: training a neural network by playing against itself. He thinks the file of weighting information is a reasonable thing to distribute:

I think it's the preferred form of modification in this case because upstream does not have, so far as I know, any special data set or additional information or resources beyond what's included in the source package. They would make any changes exactly the same way any user of the package would: instantiating the net and further training it, or starting over and training a new network.

However, Ximin Luo (who filed the "intent to package" (ITP) bug report for adding leela-zero to Debian) pointed out that there is no weight file that comes with leela-zero. There are efforts to generate such a file in a distributed manner among interested users.

So the source code for everything is in fact FOSS, it's just the fact that the compilation/"training" process can't be run by individuals or small non-profit orgs easily. For the purposes of DFSG packaging everything's fine, we don't distribute any weights as part of Debian, and upstream does not distribute that as part of the FOSS software either. This is not ideal but is the best we can do for now.

He is clearly a bit irritated by the DFSG-suitability question, at least with regard to leela-zero, but it is an important question to (eventually) settle. Deep learning will clearly become more prevalent over time, for good or ill (and Jackson made several points about the ethical problems that can stem from it). How these applications and data sets will be handled by Debian (and other distributions) will have to be worked out, sooner or later.

A separate kind of license for these data sets (training or pre-trained weights), as the Linux Foundation has been working on with the Community Data License Agreement, may help a bit, but won't be any kind of panacea. The license doesn't really change the fundamental computing resources needed to use a covered data set, for example. It is going to come down to a question of what a truly free deep-learning application looks like and what, if anything, users can do to modify it. The application of huge computing resources to problems that have long bedeviled computer scientists is certainly a boon in some areas, but it would seem to be leading away from the democratization of software to a certain extent.

Comments (41 posted)

Six (or seven) new system calls for filesystem mounting

By Jonathan Corbet
July 12, 2018
Mounting filesystems is a complicated business. The kernel supports a wide variety of filesystem types, and each has its own, often extensive set of options. As a result, the mount() system call is complex, and the list of mount options is a rather long read. But even with all of that complexity, mount() does not do everything that users would like. For example, the options for a mount operation must all fit within a single 4096-byte page — the fact that this is a problem for some users is illustrative in its own right. The problems with mount() have come up at various meetings, including at the 2018 Linux Storage, Filesystem, and Memory-Management Summit. A set of patches implementing a new approach is getting closer to being ready, but it features some complexity of its own and there are some remaining concerns about the proposed system-call API.

This patch set, from David Howells, is in its ninth revision. It makes extensive changes within the virtual filesystem layer to create the concept of a "filesystem context" that describes a specific mount operation. The questions about the internal changes have mostly been resolved at this point; things seem about ready to go in at that level. But the patch set also replaces the mount() system call with a rather more complex set of operations. (To be precise, mount() would not go away as long as it is needed, but it is unlikely to gain new functionality after the new system calls go in.)

The new way of mounting

In current kernels, a single mount() call does everything required to mount a filesystem at a specific location in the system hierarchy. With these patches applied, instead, the process would begin with a call to the new fsopen() system call:

    int fsopen(const char *fsname, unsigned int flags);

The fsname parameter identifies the type of the filesystem to be mounted — ext4 or nfs, for example — while flags is either zero or FSOPEN_CLOEXEC. This call doesn't mount anything; it just creates the context in which the mount operation can be described and carried out. The return value is a file descriptor representing that context.

The next step is to provide the details for the mount to be performed; this is done by writing a series of strings to that file descriptor. The first character of the string is either "s" (to specify the source filesystem), "o" (to provide a mount option), or "x" (to execute a command). So a reasonable series of writes could be:

    s /dev/sda1
    o noatime
    x create

Note that these strings are not terminated by newlines; each write() call is supposed to convey exactly one of these strings. In this case, the strings written say that the filesystem found on /dev/sda1 should be mounted with the noatime option. The final line (with the create command) brings the filesystem context into fully formed existence, but does not actually mount it anywhere. There is also a reconfigure command that can be used to change the settings in an existing context.

Things can go wrong at any step, in which case the write() call will return an error. More detailed information about the problem can be had by reading from the file descriptor. This feature addresses one of the other problems with mount(): the inability to communicate the details of a problem to user space.

Assuming all goes well, the next step is to mount the filesystem with a call to:

    int fsmount(int fd, unsigned int flags, unsigned int ms_flags);

The filesystem-context file descriptor created by fsopen() is passed as fd to fsmount(). Once again, the only flag for flags is FSMOUNT_CLOEXEC, while ms_flags describes how the mount is to be performed. Those flags can be used to create an unbindable or slave mount, for example (see this article for details on mount types). Some of them, though, duplicate options like noatime or read-only.

fsmount() returns another file descriptor corresponding to the newly mounted filesystem. Do note, though, that while the filesystem is "mounted", it has not been mounted at any specific location in the filesystem tree, so it will not be visible to users. Actually placing the filesystem into a mount namespace requires yet another system call:

    int move_mount(int from_dfd, const char *from_path,
                   int to_dfd, const char *to_path, unsigned int flags);

To put a mounted filesystem into a spot in the hierarchy, move_mount() would be called with the file descriptor from fsmount() passed as from_dfd (from_path would be NULL). The location where the filesystem should be placed is described by to_dfd and to_path in the usual manner for *at() system calls. Among other things, the to_dfd file descriptor will identify the mount namespace in which the mount appears — something that can be tricky to do currently. The flags argument is used to control behavior like following symbolic links or whether to automount filesystems when determining the source and destination locations.

As might be expected, move_mount() can also be used to relocate a fully mounted filesystem within the tree.
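
Putting the pieces together, the full sequence might look something like the sketch below. It is purely illustrative: none of these system calls has a glibc wrapper (or even an assigned syscall number) yet, so hypothetical syscall() shims with placeholder numbers and assumed flag values stand in for them, and the interface could still change before merging.

    #define _GNU_SOURCE
    #include <fcntl.h>               /* AT_FDCWD */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/syscall.h>

    #define FSOPEN_CLOEXEC   0x00000001   /* assumed values */
    #define FSMOUNT_CLOEXEC  0x00000001

    static int fsopen(const char *fsname, unsigned int flags)
    {
        return syscall(/* __NR_fsopen: placeholder */ 430, fsname, flags);
    }

    static int fsmount(int fd, unsigned int flags, unsigned int ms_flags)
    {
        return syscall(/* __NR_fsmount: placeholder */ 432, fd, flags,
                       ms_flags);
    }

    static int move_mount(int from_dfd, const char *from_path,
                          int to_dfd, const char *to_path,
                          unsigned int flags)
    {
        return syscall(/* __NR_move_mount: placeholder */ 429, from_dfd,
                       from_path, to_dfd, to_path, flags);
    }

    int main(void)
    {
        const char *cmds[] = { "s /dev/sda1", "o noatime", "x create" };
        char err[256];
        int cfd, mfd;
        unsigned int i;

        cfd = fsopen("ext4", FSOPEN_CLOEXEC);
        if (cfd < 0) {
            perror("fsopen");
            return 1;
        }

        /* Each write() conveys exactly one configuration string. */
        for (i = 0; i < 3; i++) {
            if (write(cfd, cmds[i], strlen(cmds[i])) < 0) {
                /* On failure, reading the fd gives a detailed message. */
                ssize_t n = read(cfd, err, sizeof(err) - 1);
                if (n > 0) {
                    err[n] = '\0';
                    fprintf(stderr, "%s\n", err);
                }
                return 1;
            }
        }

        mfd = fsmount(cfd, FSMOUNT_CLOEXEC, 0);
        if (mfd < 0) {
            perror("fsmount");
            return 1;
        }

        /* The filesystem is now mounted, but invisible until it is
         * attached somewhere in the hierarchy. */
        if (move_mount(mfd, NULL, AT_FDCWD, "/mnt", 0) < 0) {
            perror("move_mount");
            return 1;
        }
        return 0;
    }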

Other operations

That is the basic sequence of operations to mount a filesystem in the new order. But, of course, the real world is more complex than that. Users want to query filesystems, remount them into different namespaces, remount them with different options, and more. Three more system calls have been provided to make these actions possible; the first of those is fsinfo():

    int fsinfo(int dfd, const char *filename,
               const struct fsinfo_params *params,
               void *buffer, size_t buf_size);

This call can be used to query just about any attribute of a mounted filesystem. It is somewhat complex; interested readers can see the patch changelog for an overview, or the man page patch for the full details.

If the goal is to create a new mount of an existing filesystem, a more straightforward path is to use open_tree():

    int open_tree(unsigned int dfd, const char *pathname, unsigned int flags);

Without special flags, this call is similar to calling open() on a directory with the O_PATH flag set. It returns a file descriptor corresponding to that directory that can only be used for a small set of operations — move_mount(), for example. But with the OPEN_TREE_CLONE flag, it will make a copy of the filesystem mount that can then be mounted elsewhere; it can thus be used to create a bind mount. Add the AT_RECURSIVE flag, and a whole hierarchy can be cloned and made ready for mounting in a different context.
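
To make that concrete, here is a similarly hypothetical sketch (reusing the move_mount() shim from the sketch above, plus an open_tree() shim with a placeholder syscall number and assumed flag values) that creates a recursive bind mount:

    #define OPEN_TREE_CLONE  0x00000001   /* assumed values */
    #define AT_RECURSIVE     0x00008000

    static int open_tree(unsigned int dfd, const char *pathname,
                         unsigned int flags)
    {
        return syscall(/* __NR_open_tree: placeholder */ 428, dfd,
                       pathname, flags);
    }

    /* Clone the whole subtree at /home and attach the copy at
     * /mnt/home-copy (a recursive bind mount). */
    static int bind_home(void)
    {
        int tfd = open_tree(AT_FDCWD, "/home",
                            OPEN_TREE_CLONE | AT_RECURSIVE);
        if (tfd < 0)
            return -1;
        return move_mount(tfd, NULL, AT_FDCWD, "/mnt/home-copy", 0);
    }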

Finally, there is fspick():

    int fspick(unsigned int dirfd, const char *path, unsigned int flags);

This system call can be thought of as the equivalent of fsopen() for an existing mount point. It returns a file descriptor that can be written to in the same way to change the mount parameters; the "x reconfigure" string at the end creates the equivalent of a remount operation.
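
Again in the form of a hypothetical sketch built on the same shims, a read-only remount of an existing mount point could look like:

    static int fspick(unsigned int dirfd, const char *path,
                      unsigned int flags)
    {
        return syscall(/* __NR_fspick: placeholder */ 433, dirfd, path,
                       flags);
    }

    /* Pick up the existing mount at /mnt and switch it to read-only;
     * "x reconfigure" plays the role of an MS_REMOUNT operation. */
    static int remount_ro(void)
    {
        int cfd = fspick(AT_FDCWD, "/mnt", 0);

        if (cfd < 0)
            return -1;
        if (write(cfd, "o ro", 4) < 0)
            return -1;
        return write(cfd, "x reconfigure", 13) < 0 ? -1 : 0;
    }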

Playing with fire

There is relatively little controversy around most of this work, perhaps because few people have the stamina to plow through a 32-part patch set deep in the virtual filesystem layer. The concerns that have been raised have to do with the configuration API for file descriptors returned by fsopen() and fspick(). Andy Lutomirski was clear about his concerns, saying: "I think you’re seriously playing with fire with the API". His worry, echoed by Linus Torvalds, is that the API based on write() calls could be dangerous.

In particular, Lutomirski worried that an attacker might succeed in getting a setuid program to write to one of these file descriptors, giving that attacker access to files or devices that would otherwise be protected. This problem could be avoided by using the credentials of the process that created the file descriptor for all subsequent operations — something that is supposed to happen already — but that is not seen as a practical possibility; as Torvalds noted, even code that tries to get that right often makes mistakes and ends up using the credentials of the process calling write() instead.

Solving this problem requires changing the API so that a call to write() does not have arbitrary side effects in the kernel. One possibility is to create yet another system call and use it to communicate the mount parameters to the kernel; that would prevent problems resulting from a redirected write. The alternative, which seems likely to be the way things will go in the end, is to add a different system call to replace the "x" operation at the end of that series of writes. It would look something like:

    int fscommit(unsigned int fd, unsigned int cmd);

Here, fd is the file descriptor for the under-construction mount point, and cmd is either FSCOMMIT_CREATE or FSCOMMIT_RECONFIGURE. The CAP_SYS_ADMIN capability would be required to perform this operation. The end result would be that, while an attacker might be able to convince a setuid program to write to the file descriptor, that attacker would not be able to actually make the changes effective without having already gained a high level of privilege.

Regardless of the final conclusion, this patch set will need to go through at least one more round before it can be merged. Torvalds has also complained that the motivation behind this work is not well described: "I sure want to see an explanation for *WHY* it adds 5000+ lines of core code" (a lot of interesting information can be found in Howells's response to that request). There is clearly some work to be done still, so this work will probably not be ready for the next merge window. In the not-too-distant future, though, the mount() system call seems likely to become obsolete.

Comments (15 posted)

Tracking pressure-stall information

By Jonathan Corbet
July 13, 2018
All underutilized systems are essentially the same, but each overutilized system tends to be overloaded in its own way. If one's goal is to maximize the use of the available computing resources, overutilization tends not to be too far away, but when it happens, it can be hard to tell where the problem is. Sometimes, even the fact that there is a problem at all is not immediately apparent. The pressure-stall information patch set from Johannes Weiner may make life easier for system administrators by exposing more information about the real utilization state of the system.

A kernel with this patch set applied will have a new virtual directory called /proc/pressure containing three files. The first, cpu, describes the state of CPU utilization in the system. Reading it will produce a line like this:

    some avg10=2.04 avg60=0.75 avg300=0.40 total=157656722

The avg numbers give the percentage of the time that runnable processes are delayed because the CPU is unavailable to them, accumulated over 10, 60, and 300 seconds. In a system with just one runnable process per CPU, the numbers will all be zero. If those numbers start to increase significantly, that means that processes are running more slowly than they otherwise would due to overloading of the CPUs. Administrators can use this information to determine whether the amount of delay due to CPU contention is within the bounds they can tolerate or whether something must be done to ensure that things run more quickly.

These delay numbers resemble the system load average, in that they both give a sense for how busy the system is. The load average is simply the number of processes waiting for the CPU (along with those in short-term I/O waits), though; it needs to be interpreted relative to the number of available CPUs to have meaning. The stall information, instead, tracks the actual amount of waiting time. It is also tracked over a much shorter time range than the load average.

The final number (total) is the total amount of time (in microseconds) during which processes were stalled. It is there to help with the detection of short-term latency spikes that wouldn't show up in the aggregated numbers. A system where a CPU is nearly always available but where occasional 10ms latency spikes are experienced may be entirely acceptable for some workloads, but not for others. For the latter group, the total count can be monitored to detect when those spikes are happening.
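
As a simple illustration, a monitoring tool could poll the file and parse out those fields. The minimal sketch below assumes the one-line "some" format shown above, which could still change before the patches are merged:

    #include <stdio.h>

    int main(void)
    {
        double avg10, avg60, avg300;
        unsigned long long total;
        FILE *f = fopen("/proc/pressure/cpu", "r");

        if (!f)
            return 1;
        /* Field layout as shown above: three averages plus the
         * cumulative stall time in microseconds. */
        if (fscanf(f, "some avg10=%lf avg60=%lf avg300=%lf total=%llu",
                   &avg10, &avg60, &avg300, &total) == 4)
            printf("CPU stall over 10s: %.2f%%, total stalled: %llu us\n",
                   avg10, total);
        fclose(f);
        return 0;
    }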

The next file is /proc/pressure/memory; as might be expected, it provides information on the time that processes spend waiting due to memory pressure. Its output looks like:

    some avg10=70.24 avg60=68.52 avg300=69.91 total=3559632828
    full avg10=57.59 avg60=58.06 avg300=60.38 total=3300487258

The some line is similar to the CPU information: it tracks the percentage of the time that at least one process could be running if it weren't waiting for memory resources. In particular, the time spent for swapping in, refaulting pages from the page cache, and performing direct reclaim is tracked in this way. It is, thus, a good indicator of when the system is thrashing due to a lack of memory.

The full line is a little different: it tracks the time that nobody is able to use the CPU for actual work due to memory pressure. If all processes are waiting for paging I/O, the CPU may look idle, but that's not because of a lack of work to do. If those processes are performing memory reclaim, the end result is nearly the same; the CPU is busy, but it's not doing the work that the computer is there to do. If the full numbers are much above zero, it's clear that the system lacks the memory it needs to support the current workload.

Some care has been taken to distinguish paging due to thrashing from other sorts of paging. A process that is just starting up will experience a lot of page faults as its working set is brought in, but those are not really indicative of system load. For that reason, refaulted pages — those which were evicted due to memory pressure and subsequently brought back in — are used to calculate these metrics (see this article for a description of how refaults are tracked). Even then, though, there is a twist, in that a process may need different sets of pages during different phases of its execution. To try to detect the transition between different working sets, the patch set adds tracking of whether each page has made it to the active list (was used more than once, essentially) since it was faulted in. Only the pages that are actually used are counted when the stall times are calculated.

The final file is /proc/pressure/io, which tracks the time lost waiting for I/O. This number is likely to be more difficult to make good use of without some sense for what the baseline values should be. The block subsystem isn't able to track the amount of extra time spent waiting due to contention for the device, so the resulting numbers will not be directly related to that contention.

The files in /proc/pressure track the state of the system as a whole. In systems where control groups are in use, there will also be a set of files (cpu.pressure, memory.pressure, and io.pressure) associated with each group. They can be used to ensure that the resource limits for each group make sense; they should also make it easier to determine which processes are thrashing on a busy system.

This functionality has apparently been used within Facebook for some time, and has helped considerably in the optimization of system resources and the diagnosis of problems. "We now log and graph pressure for the containers in our fleet and can trivially link latency spikes and throughput drops to shortages of specific resources after the fact, and fix the job config/scheduling", Weiner said. There is also evidently interest from the Android world, where developers are looking for better ways of detecting out-of-memory situations before system performance is entirely lost. Linus Torvalds has indicated that the idea looks interesting to him. There are still some open questions on how the CPU data is accumulated (see this message for a long explanation), but one assumes that will be worked out before too long. So, in all likelihood, the pressure-stall patches will not be stalled for too long before making it into the mainline.

Comments (10 posted)

Kernel symbol namespacing

By Jonathan Corbet
July 18, 2018
In order to actually do anything, a kernel module must gain access to functions and data structures in the rest of the kernel. Enabling and controlling that access is the job of the symbol-export mechanism. While the enabling certainly happens, the control part is not quite so clear; many developers view the nearly 30,000 symbols in current kernels that are available to all modules as being far too many. The symbol namespaces patch set from Martijn Coenen doesn't reduce that number, but it does provide a mechanism that might help to impose some order on exported symbols in general.

Kernel code can make a symbol (a function or a data structure) available to loadable modules with the EXPORT_SYMBOL() and EXPORT_SYMBOL_GPL() macros; the latter only makes the symbol available to modules that have declared a GPL-compatible license. There is also EXPORT_SYMBOL_GPL_FUTURE(), which is meant to mark symbols that will be changed to a GPL-only export at some future time. The usage of this mechanism is also a matter for the future, though; it has not been employed since just after it was introduced in 2006. On the rare occasions when symbols have been changed to GPL-only exports, it has proved easier to just change them without putting advance notice in the code.

EXPORT_SYMBOL() works by declaring a kernel_symbol structure:

    struct kernel_symbol
    {
        unsigned long value;
        const char *name;
    };

After the link phase, this structure holds a pointer to the name of the symbol and the address corresponding to that symbol. The structures corresponding to all exported symbols are gathered together by the linker into two ELF sections in the kernel (or module) binary: __ksymtab and __ksymtab_gpl. There is no particular ordering of, or separation between, these symbols in either section; they all appear in one big pile.

Not all exported symbols are alike, though. While most of them exist because loadable modules need them to get their job done, that is not universally the case. Some may be exported as a convenient way of debugging kernel code. Others are part of a large subsystem that consists of multiple modules, and should only be used within that particular subsystem. There is no way, beyond code comments, to mark symbols like these.

Coenen's patch set seeks to address this problem by adding a simple namespace concept to exported symbols. While the default behavior will continue to be to put symbols into the unnamed global namespace, the possibility will exist to segregate symbols to a separate space where an explicit effort will be required to use them. There are two new macros for exporting symbols:

    EXPORT_SYMBOL_NS(symbol, namespace);
    EXPORT_SYMBOL_NS_GPL(symbol, namespace);

One might expect these new macros to create new sections for the namespaced symbols, but that's not what was done. Instead, the name of the namespace is appended to the symbol name and the result is placed in the same __ksymtab (or __ksymtab_gpl) section as before. So if kmalloc() were to be exported in a new MM namespace, it would appear in the resulting binary as kmalloc.MM. (Note that, in reality, a core symbol like kmalloc() probably would not be segregated in this way.)

To use symbols from a specific namespace, a module would declare its access to that namespace with:

    MODULE_IMPORT_NS(namespace);

This mechanism does use a new ELF section ("__knsimport") to hold a list of the namespaces that a given module has imported. Listing the imported namespaces is essentially all it does; the mechanism doesn't go much deeper than noting that a module wants access to a particular namespace.
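
By way of illustration, the two sides of the mechanism might look like the sketch below; the THERMAL namespace and the function in it are made up for the example, and the macro usage follows the patch set as described above:

    #include <linux/module.h>

    /* In the exporting module: the symbol goes into the hypothetical
     * THERMAL namespace rather than the global one. */
    int thermal_read_sensor(int id)
    {
        return 0;    /* stub for the sake of the sketch */
    }
    EXPORT_SYMBOL_NS_GPL(thermal_read_sensor, THERMAL);

    /* In a consuming module: declare the import.  Without this line,
     * modpost and the module loader will warn about any use of
     * thermal_read_sensor(), though the call would still work. */
    MODULE_LICENSE("GPL");     /* required to use a _GPL export */
    MODULE_IMPORT_NS(THERMAL);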

The actual enforcement of the namespace mechanism can be described as "light handed". There are no indications at compile time that a namespaced symbol is being used; in the fictional example from above, code could call kmalloc() without having imported the MM namespace, and the compiler would do nothing differently. Things do change in the post-compilation modpost step, where a warning will be issued for the use of symbols from a namespace that has not been imported. Another warning will happen when the module is loaded: the kernel will notice the use of a symbol without a declaration to import its containing namespace, but nothing will prevent the actual use of this symbol.

The patch set only creates one namespace: USB_STORAGE for a set of USB symbols. It includes a mechanism to automatically create a patch for other subsystems, a feature that Greg Kroah-Hartman described as "frickin amazing". Overall, it's a small start for a mechanism that may someday help the kernel community get a handle on its huge pile of unsorted symbols, but the kernel itself started small as well. If it proves useful, it will grow over time and, perhaps, bring some order to a notoriously undisciplined part of the kernel.

Comments (2 posted)

Python post-Guido

By Jake Edge
July 17, 2018

The recent announcement by Guido van Rossum that he was stepping away from his "benevolent dictator for life" (BDFL) role for Python was met with some surprise, but not much shock, at least in the core-developer community. Van Rossum has been telegraphing some kind of change, at some unspecified point, for several years now, though the proximate cause (the "PEP 572 mess") is unfortunate. In the meantime, though, the project needs to figure out how to govern itself moving forward—Van Rossum did not appoint a successor and has left the governance question up to the core developers.

Van Rossum has been burning out over the last few years, at least partly due to keeping up with contentious discussions for PEPs he is interested in. The discussion around PEP 572 ("Assignment Expressions") is quite probably the worst offender in the history of Python. It spanned multiple enormous threads, on two different mailing lists (python-ideas to start, then to python-dev once it was "ready"), spawned two separate polls (neither of which were favorably inclined toward the feature), and seemed, at times, interminable. Perhaps the most irritating part of it was its repetitive nature; the same ideas were brought up time and again, no matter how many times the PEP's authors (originally Chris Angelico, who was joined by Van Rossum and Tim Peters toward the end of the process) and others repeated the arguments against them. It was clear that many were just reacting emotionally (and sometimes histrionically) to the proposal: not reading the PEP or any of the discussion, then loudly proclaiming that their opinion was clearly the only sensible one.

Van Rossum said he would be sticking around "for a while" as a regular core developer, but he left it to the community to determine the governance of the project moving forward. He seems curious to see what develops: "So what are you all going to do? Create a democracy? Anarchy? A dictatorship? A federation?" As many noted in the resignation thread, it was hoped that he would continue as BDFL for some time to come; leaving because of a contentious PEP discussion, rather than a simple retirement decision, is particularly sad. Amid all of the well wishes, many of the replies to Van Rossum's announcement did what the Python community so often does: rolled up its sleeves and got down to work.

New governance

There were two main areas that Van Rossum called out for governance: how PEPs are decided and how new core developers are added. The latter seems to already be based on a vote of the existing core developers. They are the only ones allowed to post to the core-committers mailing list, which is where Van Rossum posted his resignation, presumably to avoid wading through hundreds of messages—nearly all undoubtedly positive and grateful, though surely there would have been some trolls as well.

For PEPs, and any other major language decisions, Christian Heimes suggested either a triumvirate or quintumvirate (a governing body with three or five members) as a ruling body. Victor Stinner thought that the PHP process, where core developers vote on feature proposals, should be considered. Stinner's solution was not particularly popular, though. Brett Cannon put it this way:

For me, I think a key asset that Guido has provided for us as a BDFL is consistency in design/taste. Design by committee through voting does not appeal to me at all as that can too easily lead to shifts in preferences and not have the nice cohesion we have with the language's overall design, especially considering that there will always be subjective choices to make (someone has to eventually choose the colour of the shed). People, including me, have also pointed out that by having Guido to look up to you we have had a very consistent view of how the community should behave and that too has been an asset. IOW I don't like Victor's proposal. ;)

The idea of a triumvirate (or an N-virate for some small, odd N) seems to have gained some traction, though who would be on it, how long members would serve, and other details are still being discussed. There is also the inevitable question of what the name of such a group might be, with various ideas—some perhaps more serious than others—being suggested. But, as Raymond Hettinger said, there is no real rush:

For the time being, I propose that we shift into low gear and defer major language changes for a while -- that will give us time to digest the changes already in motion and it will give the other implementations more of a chance to catch up (we've been out-running them for a while).

Much of what has been discussed is the PEP decision-making process and how that will change. Prior to his resignation, Van Rossum was the final arbiter of PEPs, except where he delegated his power to a BDFL-Delegate. Many see the role of the "Python Council of Elders" (PCOE) or the "design stewards" (two of the more popular names for the governing body) as largely finding the right person to delegate to for the decision on a given PEP. That group would also be the deciding body of last resort if consensus on a decision is not being reached.

But there is also the question of how long people serve on such a body. Some are calling for "lifetime" appointments with an understanding that folks can stand down at any point, while others would like to see people rotate out of those positions over time. Before that can be determined (presumably via a PEP or set of competing PEPs), though, the role of the group has to be determined. Heimes suggested three functions:

  • Primary to delegate responsibilities to domain experts
  • Secondary to provide consistency and trust
  • Lastly to have final word in case of controversial bike shedding

If the main role is to delegate, though, there is less of a need for it to be a lifetime job. As Doug Hellmann put it:

If the primary approach to decision making is to delegate unless an arbiter is absolutely necessary, then long-term consistency and stability comes less from finding individuals to commit to serving for very long terms on the N-virate as it does from everyone having a good understanding of the history of discussions and from a willingness to keep the status quo in situations where consensus isn't reached (note "consensus" rather than "unanimous agreement").

Building the system to support and encourage turnover, like we do with release managers, lowers the level of effort someone is signing up for when they agree to serve. Given the *many* discussions of burnout in the Python community and open source in general, that seems like an important feature.

How decisions would be made and communicated also came up. There were suggestions of requiring a unanimous vote by the body, but that may be too restrictive. Barry Warsaw suggested not publicizing the individual votes of the members, just the outcome, but Larry Hastings and others saw it differently:

I prefer more transparency in governance generally, and as a member of the community governed by this body I'd prefer more rather than less insight into the process and the thinking that went into the decision. I don't think it's a requirement for the PCOE to present as a unified front or to work in secret for them to be supportive of each other and of the body's decision.

Sunlight, not darkness.

Hastings and others see the PCOE as being akin to the US Supreme Court—a body that only makes decisions when there are disputes that can't be resolved any other way. But Łukasz Langa wondered why having three members was so popular:

I see a bunch of problems with such a low number, like the ability for a single corporation to take over the design process of Python by employing just two of the three members (consistently voting over the third one). 3 also has high likelihood of ties if one of the members abstains. And so on.

Constitution

He also was concerned with how the role of the design stewards will be determined: "Python needs a 'constitution' which will codify what the council is and how it functions." Many are calling that document "PEP 2", but how it would be accepted given the situation is completely up in the air. Langa had a suggestion, but one that might not be popular with Van Rossum: "Ideally Guido would accept the PEP but I'm not sure if he is willing to. If that is indeed the case then how should this be done so that the document is universally accepted by all committers?"

That sentiment was shared by many in the thread; it is clear that there is nearly universal hope that Van Rossum will still have an active role—perhaps even as a BDFL-Delegate on some PEPs. Carol Willing likely summed up the views of many on Van Rossum's participation: "mostly I want Guido to do whatever rocks his world". Cannon had a concrete idea if Van Rossum is willing: "In my ideal scenario, people write up PEPs proposing a governance model and Guido chooses one, making it PEP 2."

For his part, Van Rossum did briefly pop into the thread to help clarify his role in deciding on governance: "I’m still here, but I would like to be out of the debate and out of the decision loop. I’m also still President of the PSF [Python Software Foundation]. But this is not for the PSF to decide. You all are doing fine."

So "divine intervention" of a sort is probably not in the cards. The core developers are going to need to figure this out for themselves. Willing suggested that there be two guiding principles in determining a governance model: "If what evolves embraces the Zen of Python [PEP 20] and 'I came for the language and stayed for the community', I am confident that the language will benefit technically." Indeed, the Python community is a strong one, which is a testament to Van Rossum's leadership over the last 28 years or so.

As part of the process of coming up with a governance plan, Nathaniel Smith is organizing an informational PEP to survey the governance of other open-source projects. The idea would be to see if there are parts and pieces that could be used for Python. Another effort, some of which even predates Van Rossum's resignation, is to figure out a better way to discuss PEPs and to try to reach consensus on them. Hettinger suggested one possibility:

For the bigger decisions (and there aren't many coming up), I have some suggestions on ways to improve the discussions so that the interested parties can have a more equal say in the outcome and so that the discussions can be more time efficient (it takes too much time to keep-up with long-running, active threads).

Essentially the idea would be have a wiki/faq editable by all the participants. It would include the key examples, arguments for and against, and rebuttals which can be collected into a current-state-of-the-conversation. This would be somewhat different than the current PEP process because currently PEP authors dominate the conversation and others can get drowned out too easily. (This idea is modeled on the California Legislative Analyst Voters Guide which summarizes proposals and has statements and rebuttals from both proponents and opponents).

Neil Schemenauer put it in economic terms:

Perhaps this can be seen as a kind of economic problem. What is the cost of posting to a PEP discussion thread vs the cost of everyone reading that post? Or, what is the value of the comment vs what is cost for everyone to read it?

With the current discussion method, the costs are often disproportionate. You have hundreds of people reading the thread. So that cost is pretty high. Posting a half-baked comment is too easy. Starting a new thread with a new subject line is too easy.

He suggested a separate mailing list for PEP discussions once they had finished their run on the "free-wheeling wild west" of the python-ideas mailing list. The PEP-discussion list would have some ground rules to try to maximize the use of everyone's time. Disproportionate cost for fully engaged participants versus a Python user or developer who just wanted to vent likely played a big role in the burnout that led to Van Rossum's resignation.

It's clear that it will take some time for the dust to settle and for concrete plans to be formulated, but one gets the sense that the Python community is ready—even if not entirely willing—for self-governance. The process will play out in the open, though, which is likely to be helpful to other projects that go through similar, or even dissimilar, transitions. In the open-source world, projects can learn a great deal from each other, from a technical perspective, of course, but also in areas like governance and community.

We would be remiss not to add our own "thank you Guido" to the pile. Our site depends on Python and has for 16 years or more. Van Rossum has done the world a great service with his efforts—that seems unlikely to change even after all of this is behind us. In many, many ways, the Python community is a reflection of its BDFL; its generally pleasant tone and overall friendliness to everyone is something that plenty of other projects should try to emulate.

Comments (29 posted)

The PEP 572 endgame

By Jake Edge
July 18, 2018

Over the last few months, it became clear that the battle over PEP 572 would be consequential; its scale and vehemence were largely unprecedented in the history of Python. The announcement by Guido van Rossum that he was stepping down from his role as benevolent dictator for life (BDFL), due in part to that battle, underscored its importance. While the Python project charts its course in the wake of his resignation, it makes sense to catch up on where things stand with this contentious PEP, which has now been accepted for Python 3.8.

We first looked at the discussion around PEP 572 back in March, when the second version of the PEP was posted to the python-ideas mailing list. The idea is to allow variable assignment inline, so that certain constructs can be written more easily. That way, an if or while, for example, could have a variable assignment in the statement and the value of the variable could be used elsewhere in the block (and, perhaps, beyond). The scope of those assignments is one of the areas that has evolved most since the PEP was first introduced.

Discussing PEPs

Even early on, some chunk of the discussion was really a "meta-discussion" about the medium for debating PEPs—whether a mailing list is truly better than, say, a forum of some kind. That debate somewhat foreshadowed later developments, as there are now various thoughts on changing the PEP-discussion process so that people can follow the arguments and chime in with their own without having to commit to constant monitoring of the mailing list posts. As seen with this PEP, those posts are often redundant or repetitive, evince no familiarity with either the PEP or the preceding discussion, and generally are simply a waste of participants' time—but, of course, not all of them are.

Discussions on python-ideas are meant to be more freewheeling but to eventually converge on something concrete and potentially agreeable before moving to python-dev. It has been hoped that the python-dev discussions will be more focused and less rancorous, but that clearly did not play out for PEP 572. Van Rossum brought up the "PEP 572 mess" in a session at the 2018 Python Language Summit; he wanted to explore other ways to discuss PEPs as part of that session. Shortly after that session, he posted an idea about better PEP discussions to the python-committers mailing list, where only core developers can post. He suggested moving PEP discussions to GitHub repositories (or those of another online Git provider):

This way the discussion is still public: when the PEP-specific repo is created the author(s) can notify python-ideas, and when they are closer to submitting they can notify python-dev, but the discussion doesn't attract uninformed outsiders as much as python-{dev,ideas} discussions do, and it's much easier for outsiders who want to learn more about the proposal to find all relevant discussion.

There were some questions about the suitability of GitHub for PEP discussion, but most seemed inclined to try it and see how it worked out. Mark Shannon did try the idea out for PEP 576, but it did not really work, at least in that case, as it "didn't reduce the amount of email traffic on python-dev".

Meanwhile back at the PEP itself, much of the complexity was removed or filed down and the title changed from "Syntax for Statement-Local Name Bindings" to "Assignment Expressions". The underlying motivation was much the same, however. PEP 572 allows for assigning values to variables in places where only expressions are allowed. The ":=" operator was chosen (among a wide array of alternative spellings) to indicate that, which could be read as "becomes". One of the canonical examples of its use is to replace a so-called "loop and a half":

    line = f.readline()
    while line:
        ...  # process line
        line = f.readline()

or:

    while True:
        line = f.readline()
        if not line:
            break
        ...  # process line

could be replaced with:

    while line := f.readline():
        ...  # process line

The main argument for the feature is that it is more readable and makes the programmer's intent clearer. But most of the traffic in the threads, from core developers and others, was against adding it. Even the original PEP author, Chris Angelico, was not wildly in favor ("hey, I'm no more than +0.5 on it myself"), at least in the early going.

While the PEP was still being discussed on python-ideas, Van Rossum got more involved and Tim Peters posted some of his own code that he thought could benefit from assignment expressions. As the "Rationale" section of the PEP notes, "toy" examples are not necessarily useful in trying to evaluate a language feature:

However, in order to be compelling, examples should be rooted in real code, i.e. code that was written without any thought of this PEP, as part of a useful application, however large or small. Tim Peters has been extremely helpful by going over his own personal code repository and picking examples of code he had written that (in his view) would have been clearer if rewritten with (sparing) use of assignment expressions. His conclusion: the current proposal would have allowed a modest but clear improvement in quite a few bits of code.

That section also describes some code that Van Rossum found in the Dropbox code base, where it was clear that programmers valued conciseness even to the point of repeating a potentially expensive operation to express it on a single line. In addition, an essay from Peters giving some real world examples is Appendix A of the PEP. While it was still not a foregone conclusion that Van Rossum would accept the PEP, his interest and participation seemed to point that way.

To python-dev

After "four rounds in the boxing ring at python-ideas", Angelico posted the PEP to the more widely read python-dev mailing list in mid-April. That set off a firestorm of replies, alternatives, a survey of how other languages handle this feature (if at all), appeals to PEP 20 ("The Zen of Python"), and so on. As Van Rossum said in his Language Summit talk, the opposition may have picked up once people realized he was seriously considering accepting the PEP.

Angelico seemed less confident of acceptance. In his second posting of the PEP to python-dev, he said: "So here's the PEP again, simplified. I'm fairly sure it's just going to be another on a growing list of rejected PEPs to my name". That posting predates much of the firestorm, however. As it turns out, he was a bit premature with his pessimism.

Many of the objections (those that were not bikeshedding over the syntax anyway) seem to revolve around the idea that PEP 572 would add a new way to assign to a variable that would be confusing to new programmers. Several core developers were concerned about how to teach the new operator and how to distinguish it from the "normal" assignments using =. Van Rossum admitted the benefits of the feature are "moderate", so it perhaps should not be a huge surprise that there was not wild acclaim for it. In fact, an informal poll of core developers right after the summit found few in favor of the change.

Another huge thread was spawned from Antoine Pitrou's thoughts on the LWN writeup of the summit session on PEP 572. It went on in much the same vein as many of the other discussions, though there are real efforts to improve the PEP in the thread instead of simply opposing it or rehashing syntax questions that had been long resolved. On July 2, however, Van Rossum posted his decision in the middle of the thread: "Thank you all. I will accept the PEP as is." As much as anything, it may have been his way of simply ending the interminable arguments to hopefully focus on tightening up the PEP's language and to fix any corner cases.

To a large extent, that's what most of the core developers still participating started doing. For example, Steve Dower started looking more closely at the "Syntax and Semantics" section of the PEP, with an eye toward clarifying the language (and his understanding). Victor Stinner started a thread to discuss possible places to use the feature in the standard library—with an eye toward which would actually be helpful to make the code clearer.

There were, naturally, a few last-gasp attempts to head off PEP 572, but Van Rossum was having none of that. He and Peters had joined with Angelico as authors of the PEP in mid-May and multiple changes were made, though they were not posted to the list until Van Rossum's more formal "intention to accept" was posted on July 9. In it, he asked that replies and changes stick to making the PEP better:

I know it is inevitable that there will be replies attempting to convince us that a different syntax is better or that we shouldn't do this at all. Please save your breath. [I've] seen every possible such response before, and they haven't convinced me. At this point we're just looking for feedback on the text of the document, e.g. pointing out ambiguities or requests for more clarity in some part of the spec, *not* for alternative designs or passionate appeals to PEP 20.

Acceptance ... and resignation

He then accepted the PEP on July 11; the next morning he posted his resignation as BDFL, saying: "Now that PEP 572 is done, I don't ever want to have to fight so hard for a PEP and find that so many people despise my decisions." Obviously the whole experience was painful and frustrating for him, which is truly sad. It is clear that many were loudly and sometimes annoyingly opposed to the feature, and some seriously unhappy that it was accepted, but "despise" seems rather strong—that sounds like frustration talking as much as anything else.

So Python will "soon" have assignment expressions, where soon means in Python 3.8, currently scheduled for October 2019. Whatever else can be said of the feature, it has been hashed out over many months, which likely helped eliminate any major issues with it. It is spelled a bit weirdly for Python, as the language has generally eschewed multi-character operators where it can, but that was a conscious choice. Van Rossum tried to suggest using = early on, which would be syntactically possible, but "the negative reaction to that version was way stronger". Even the biggest opponent of := can likely see uses for it even if they may avoid it on principle.

The consequences of the PEP and the battle are likely to be with the project for a long time to come. Obviously, losing Van Rossum as the final arbiter of all things Python is a big blow, but the PEP-discussion process is also going to change. It is truly the end of an era for the language, but it is also the start of a new era—Python seems likely to survive and probably thrive once it gets its feet back under it.

Comments (25 posted)

Page editor: Jonathan Corbet

Inside this week's LWN.net Weekly Edition

  • Briefs: Guido van Rossum steps down as BDFL; Quotes; ...
  • Announcements: Newsletters; events; security updates; kernel patches; ...

Copyright © 2018, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds