
LWN.net Weekly Edition for April 30, 2020

Welcome to the LWN.net Weekly Edition for April 30, 2020

This edition contains the following feature content:

  • Fedora security response time: how quickly should the distribution get security fixes to its users?
  • Controlling realtime priorities in kernel threads: a proposal to stop the kernel from picking specific realtime priorities for its own threads.
  • Dumping kernel data structures with BPF: two patch sets that use BPF to export kernel data structures to user space.
  • Bringing openSUSE Leap and SLE closer: a SUSE proposal to build openSUSE Leap from SUSE Linux Enterprise binaries.
  • Improving Python's SimpleNamespace: a discussion on adding dictionary-style access to the SimpleNamespace class.

This week's edition also includes these inner pages:

  • Brief items: Brief news items from throughout the community.
  • Announcements: Newsletters, conferences, security updates, patches, and more.

Please enjoy this week's edition, and, as always, thank you for supporting LWN.net.

Comments (none posted)

Fedora security response time

By Jake Edge
April 29, 2020

A call for faster Fedora updates in response to security vulnerabilities was recently posted to the Fedora devel mailing list; it urgently advocated changes to the process so that updates, in general, and to the kernel and packages based on web browsers, in particular, are handled more expeditiously. While Fedora developers are sympathetic to that, there is only so much the distribution can do as there are logistical and other hurdles between Fedora and its users. It turns out that, to a great extent, Fedora can already move quickly when it needs to.

In mid-April, Demi M. Obenour posted a message with the subject "Getting security updates out to users sooner". There were several aspects of the current situation that she wanted to see addressed, starting with simply providing updates more quickly:

Currently, security updates can take days to get to users. In particular, Firefox and Thunderbird often take a day or more, even though virtually every single update contains security fixes.

We need to ensure that security updates reach stable within hours of an upstream advisory. [...]

Beyond that, the metadata hashes for the old packages need to be invalidated quickly, throughout the entire update network (i.e. mirrors) "within an hour or less, preferably minutes"; that would mean that the older, vulnerable packages would no longer be installable. In the past, there have been few regressions with security updates, she said, so those should bypass most of the usual QA. For some package types, namely the kernel and those based on web-browser libraries (e.g. Thunderbird, Chromium, webkit2gtk), all updates should be considered to be security updates, she said, so that they can get to users more quickly.

Adam Williamson, who is part of the Fedora QA team, wondered about the "within hours" timeline. Gerald Henriksen noted that Firefox sometimes says that the problems an update is fixing are already being exploited in the wild.

Whether that poses a security risk on Linux isn't indicated, but users shouldn't have to wonder why the media is telling them that Firefox needs an immediate update and it isn't being provided by their distribution.

But in-the-wild Linux-specific web-engine exploits are vanishingly rare, Michael Catanzaro said:

I've yet to see a Linux exploit developed for a web engine vulnerability and deployed against users in the wild. Are you aware of any instance of this happening, ever? Only a very tiny minority of web engine vulnerabilities ever have exploits developed for any platform. The usual workflow is: fuzzer finds HTML that triggers an asan [AddressSanitizer] complaint, the end, you have a CVE. Now, that doesn't mean Linux exploits don't exist (they surely do). And it doesn't mean the vulnerabilities don't need to be fixed (they do). But let's be reasonable here. Most users are not at risk because we take some time to get the update out to users. Not unless a nation state is out to get you....

There are other types of browser vulnerabilities that are more dangerous; "I know this happens with Firefox occasionally, but when it does, I don't think a next-day response is so bad". There is definitely a need to get updates out in a timely manner, he said, but he thinks "a couple weeks" is timely enough in most cases. "At least with WebKit, regressions are not uncommon and a few days of testing is important to ensure quality user experience."

Kernel updates are also not good candidates for blanket "security update fast-track" treatment, Michel Alexandre Salim said. Kernel updates often do introduce regressions, so it is important to distinguish those that address a critical vulnerability from others that can get additional needed testing, he said. While it is not mentioned in the thread, the upstream kernel policy that obscures the security nature of patches, as described in this article and elsewhere, may make that harder than it needs to be. On the flipside, though, Fedora regularly updates from the stable kernel stream, which fixes lots of security problems—often before they are even known to be vulnerabilities.

In any case, Fedora kernel team member Justin Forbes said that there is already a process in place to handle the exceptional kernel update that needs to go out quickly. The process is followed two or three times a year, and it can all complete pretty quickly:

This can all happen within a matter of hours (kernels take 3 hours just to build if everything goes well). Often it means someone in releng [release engineering] is staying up late to help me deal with it, and it is greatly appreciated. But the system is not abused, it only happens on critical updates. A majority of CVEs are *not* critical updates. It is important they are patched for a variety of reasons, but the risk of going a day or 2 without a patch is approaching zero.

Petr Pisar said that even if there were a separate repository where fixes were immediately built, composed, and published, there would still be barriers to getting them into users' hands. The Fedora mirrors use a pull model, so the mirrors poll for updates at some frequency. In order to ensure that users get access immediately, the updates would need to come directly from Fedora servers, which is something that the infrastructure may not be able to handle.

Fedora project leader Matthew Miller pointed out that the subject has come up before. He filed a releng bug six years ago seeking a way to get urgent fixes out to users more quickly. In the years since, it was discussed in the bug and in other forums, put on the wiki, and eventually closed in 2018 because the overall process had gotten much faster in the meantime. Instead of a two-day wait, urgent updates could be ready in two or three hours.

A short vulnerability response time for a distribution is important to try to keep users as secure as possible, but there are obviously limits too. Distributions need to find the right balance and it would seem that Fedora has generally done so. It is also important to review those policies and procedures now and again to ensure that they are still functioning as they should.

Comments (17 posted)

Controlling realtime priorities in kernel threads

By Jonathan Corbet
April 23, 2020
The realtime scheduler classes are intended to allow a developer to state which tasks have the highest priorities with the assurance that, at any given time, the highest-priority task will have unimpeded access to the CPU. The kernel itself carries out a number of tasks that have tight time constraints, so it is natural to want to assign realtime priorities to kernel threads carrying out those tasks. But, as Peter Zijlstra argues in a new patch set, it makes little sense for the kernel to be assigning such priorities; to put an end to that practice, he is proposing to take away most of the kernel's ability to prioritize its own threads.

In the classic realtime model, there are two scheduling classes: SCHED_FIFO and SCHED_RR. Processes in either class have a simple integer priority. SCHED_FIFO processes run until they voluntarily give up the CPU, with the highest-priority process going first. SCHED_RR, instead, rotates through all runnable processes at the highest priority level, giving each a fixed time slice. In either class, processes with a lower realtime priority will be completely blocked until all higher-priority processes are blocked, and processes in either class will, regardless of priority level, run ahead of normal, non-realtime work in the SCHED_NORMAL class.
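
As a reminder of how these classes look from user space, a process requests one of them with sched_setscheduler(); here is a minimal sketch (the priority value of 50 is arbitrary, and the call requires CAP_SYS_NICE or a suitable RLIMIT_RTPRIO limit):

    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
        /* Request SCHED_FIFO at priority 50; valid priorities are 1-99. */
        struct sched_param sp = { .sched_priority = 50 };

        if (sched_setscheduler(0 /* this thread */, SCHED_FIFO, &sp) == -1) {
            perror("sched_setscheduler");
            return 1;
        }
        /* Latency-sensitive work here runs ahead of all SCHED_NORMAL tasks. */
        return 0;
    }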

The kernel pushes a large (and increasing) amount of work out into kernel threads, which are special processes running within the kernel's address space. This is done to allow that work to happen independently of any other thread of execution, under the control of the system scheduler. Most kernel threads run in the SCHED_NORMAL class and must contend with ordinary user-space processes for CPU time. Others, though, are deemed special enough that they should run ahead of user-space work; one way to make that happen is to put those threads into the SCHED_FIFO class.

But then a question arises: which priority should any given thread have? Answering that question requires judging the importance of a given thread relative to all of the other threads running at realtime priority — and relative to any user-space realtime work as well. That is going to be a difficult question to answer, even if the answer turns out to be the same for every system and workload, which seems unlikely. In general, kernel developers don't even try; they just pick something.

Zijlstra believes that this exercise is pointless: "the kernel has no clue what actual priority it should use for various things, so it is useless (or worse, counter productive) to even try". So he has changed the kernel's internal interfaces to take away the ability to run at a specific SCHED_FIFO priority. What remains is a set of three functions:

    void sched_set_fifo(struct task_struct *p);
    void sched_set_fifo_low(struct task_struct *p);
    void sched_set_normal(struct task_struct *p, int nice);

For loadable modules, these become the only functions available for manipulating a thread's scheduling information. All three functions are exported only to modules with GPL-compatible licenses. A call to sched_set_fifo() puts the given process into the SCHED_FIFO class at priority 50 — halfway between the minimum and maximum values. For threads with less pressing requirements, sched_set_fifo_low() sets the priority to the lowest value (one) instead. Calling sched_set_normal() returns the thread to the SCHED_NORMAL class with the given nice value.
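
As an illustration of how a driver might use this interface, consider the sketch below; the kernel thread and module boilerplate are hypothetical, but the sched_set_fifo() call is used as described above (note the "GPL" license declaration, which the export requires):

    #include <linux/err.h>
    #include <linux/kthread.h>
    #include <linux/module.h>
    #include <linux/sched.h>

    static struct task_struct *worker;

    static int worker_fn(void *unused)
    {
        set_current_state(TASK_INTERRUPTIBLE);
        while (!kthread_should_stop()) {
            /* Latency-sensitive work would be handled here, after which
               the thread sleeps until it is woken again. */
            schedule();
            set_current_state(TASK_INTERRUPTIBLE);
        }
        __set_current_state(TASK_RUNNING);
        return 0;
    }

    static int __init example_init(void)
    {
        worker = kthread_run(worker_fn, NULL, "example-rt-worker");
        if (IS_ERR(worker))
            return PTR_ERR(worker);

        /* SCHED_FIFO at the fixed priority (50) chosen by the scheduler core */
        sched_set_fifo(worker);
        return 0;
    }

    static void __exit example_exit(void)
    {
        kthread_stop(worker);
    }

    module_init(example_init);
    module_exit(example_exit);
    MODULE_LICENSE("GPL");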

The bulk of the patch set consists of changes to specific subsystems to make them use the new API; it gives a picture of how current kernels are handling SCHED_FIFO threads now. Here's what turns up:

Subsystem        Priority  Description
Arm bL switcher     1      The Arm big.LITTLE switcher thread
crypto             50      Crypto engine worker thread
ACPI                1      ACPI processor aggregator driver
drbd                2      Distributed, replicated block device request handling
PSCI checker       99      PSCI firmware hotplug/suspend functionality checker
msm                16      MSM GPU driver
DRM                 1      Direct rendering request scheduler
ivtv               99      Conexant cx23416/cx23415 MPEG encoder/decoder driver
mmc                 1      MultiMediaCard drivers
cros_ec_spi        50      ChromeOS embedded controller SPI driver
powercap           50      "Powercap" idle-injection driver
powerclamp         50      Intel powerclamp thermal management subsystem
sc16is7xx          50      NXP SC16IS7xx serial port driver
watchdog           99      Watchdog timer driver subsystem
irq                50      Threaded interrupt handling
locktorture        99      Locking torture-testing module
rcuperf             1      Read-copy-update performance tester
rcutorture          1      Read-copy-update torture tester
sched/psi           1      Pressure-stall information data gathering

As one can see, there is indeed a fair amount of variety in the priority values chosen by kernel developers for their threads. Additionally, the drbd driver was using the SCHED_RR class for reasons that weren't entirely clear. After Zijlstra's patch set is applied, all of the subsystems using a priority of one have been converted to use sched_set_fifo_low(), while the rest use sched_set_fifo(), giving them all a priority of 50.

There have been responses to a number of the patches thus far, mostly offering Reviewed-by tags or similar. It seems that few, if any, kernel developers are strongly attached to the SCHED_FIFO priority values that they chose when they had to come up with a number to put into that structure field. It is thus unlikely that there is going to be any sort of serious opposition to this patch set going in.

The end result is not limited to a rationalization of SCHED_FIFO values inside the kernel, though. One of the objections Zijlstra raises about SCHED_FIFO in general is that, even if a developer is able to choose perfect priority values for their workload, all that work goes by the wayside if that workload has to be combined with another, which will have its own set of priority values. The chances of those two sets of values combining into a coherent whole are relatively small.

In current kernels, every realtime workload using SCHED_FIFO faces this problem, since the priority choices made for that workload have to be combined with the choices made for kernel threads — choices that have not really been thought through and which are not documented anywhere. Making the kernel's configuration for SCHED_FIFO priorities predictable should make life easier for realtime system designers, who are unlikely to mind having fewer variables to worry about.

Comments (3 posted)

Dumping kernel data structures with BPF

By Jonathan Corbet
April 27, 2020
For as long as operating systems have had kernels, there has been a need to extract information from data structures stored within those kernels. Over the years, a wide range of approaches have been taken to make that information available. In current times, it has become natural to reach for BPF as the tool of choice for a variety of problems, and getting information from kernel data structures is no exception. There are two patches in circulation that take rather different approaches to using BPF to dump information from kernel data structures to user space.

When your editor first encountered paleolithic Unix systems, tools like ps would obtain their information by opening /dev/kmem and rooting around directly in the kernel's memory space. This approach had the advantage of requiring no direct kernel support, but there were also some disadvantages, including security issues, lack of atomicity in the collection of complex data, and occasionally returning random garbage. This behavior was perhaps acceptable in the early days, but contemporary users have become strangely less tolerant of it. So digging around in kernel memory has long since fallen out of favor.

In current Linux systems, this problem is solved with a collection of system calls and virtual files in /proc, sysfs, debugfs, and beyond. This approach works, but has some challenges of its own. The kernel must be modified whenever the information to be output changes, "debugging" information in debugfs ends up being needed for normal system operations (where debugfs should not be enabled), and changes can be hard to make without breaking existing applications. So there is a natural desire for something more flexible and adaptable.

Structure dumpers

One approach, posted by Yonghong Song, is aimed directly at the virtual-file case. In short, it allows the attachment of BPF programs to implement /proc-style files for any supported data structure.

More specifically, it creates a new virtual filesystem that is expected to be mounted at /sys/kernel/bpfdump. It is a singleton filesystem, in that it will provide the same contents regardless of how many times (or in how many different namespaces) it is mounted. Kernel subsystems can then create subdirectories in that filesystem to make specific data structures available. For example, in the patch series, the task subdirectory is created to export the active task_struct structures from the kernel, bpf_map will allow traversal through the list of BPF maps, and netlink provides information on active netlink connections.

Then the patch series adds a new type of BPF program called BPF_TRACE_DUMP. A program of this type will be called with a pointer to a structure, and is expected to generate the output for user space, which is written using the seq_file interface. To that end, two new helper functions — bpf_seq_printf() and bpf_seq_write() — have been added. These programs are loaded into the kernel with the bpf() system call in the usual manner.
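
As a rough sketch of what such a dumper program might look like, consider the following; the section name, the layout of the context structure, and the use of a BTF-generated vmlinux.h header are assumptions based on the description above, not a confirmed interface from the patch set:

    // SPDX-License-Identifier: GPL-2.0
    /* Hypothetical BPF_TRACE_DUMP program for /sys/kernel/bpfdump/task. */
    #include "vmlinux.h"              /* kernel types generated from BTF */
    #include <bpf/bpf_helpers.h>

    /* Assumed context: the seq_file to write to and the object being dumped. */
    struct bpfdump_ctx {
        struct seq_file *seq;
        struct task_struct *task;
    };

    SEC("dump/task")                  /* hypothetical section name */
    int myps(struct bpfdump_ctx *ctx)
    {
        struct task_struct *task = ctx->task;
        char fmt[] = "%d\t%s\n";
        __u64 args[2];

        if (!task)
            return 0;

        args[0] = task->pid;
        args[1] = (__u64)(unsigned long)task->comm;
        /* bpf_seq_printf() takes the format string plus an array of arguments. */
        bpf_seq_printf(ctx->seq, fmt, sizeof(fmt), args, sizeof(args));
        return 0;
    }

    char LICENSE[] SEC("license") = "GPL";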

Finally, the meaning of the BPF_OBJ_PIN command, which was originally added to support programs and maps that persist after the file descriptors referring to them are closed, is extended. With this command, a BPF_TRACE_DUMP program can be "pinned" to a file created inside a /sys/kernel/bpfdump directory. So, for example, if one wanted to create a new process dumper called "myps", one could load a BPF program to generate the desired output from the task structure, then "pin" it to a file named myps under /sys/kernel/bpfdump/task.
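
Pinning itself would be done with the usual libbpf call once the dumper program has been loaded; a brief sketch (the path and the prog_fd variable are hypothetical, following the layout described above):

    #include <stdio.h>
    #include <bpf/bpf.h>

    void pin_dumper(int prog_fd)
    {
        /* prog_fd is the descriptor returned when the program was loaded. */
        if (bpf_obj_pin(prog_fd, "/sys/kernel/bpfdump/task/myps") < 0)
            perror("bpf_obj_pin");
    }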

The patch set includes a few sample programs to demonstrate the mechanism and for self testing; as an example, one can be pinned under /sys/kernel/bpfdump/netlink that generates output identical to that from /proc/net/netlink. Of course, replicating existing interfaces is not particularly interesting, but it does show how new interfaces can be created. With this capability, users can create interfaces that provide exactly the information they need in a relatively efficient manner. If new information is needed, it can be had without changing the kernel.

That said, there is some setup required; each structure type that is to be made available in this way requires a certain amount of support code to iterate through active structures and pass them to the relevant BPF program. But that is a one-time effort for each type; after that, in theory, kernel developers need never worry about exporting information from that structure type to user space again. At least, as long as nobody worries that some of the data that is being made available should, instead, be kept secret within the kernel.

printk()

The other approach, posted by Alan Maguire, is oriented more toward debugging needs. When addressing that particular use case, it's only natural to fall back on printk() to get information out to user space.

When debugging a problem, one commonly needs to look at various fields within a kernel data structure. Rebuilding the kernel with a printk() call in the right place is usually sufficient to learn something about the issue; often what is learned is that not enough fields were printed and the process needs to start over again. A nice feature to have would be the ability to simply print an arbitrary structure in its entirety; that is often easy to do in interpreted languages like Python, but it is not normally available in C.

The ability to print specific structure types has existed in the kernel for some time; for example, an rtc_time structure can be printed directly using the %ptR format directive. A relatively small number of structures is supported, though; each new one requires adding more code to printk() and that support must be updated whenever the structure is modified. So this feature is far from a capability to print an arbitrary structure.
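
For reference, using one of those existing type-specific directives looks like this; a minimal sketch built on the kernel's rtc_time conversion helper:

    #include <linux/printk.h>
    #include <linux/rtc.h>
    #include <linux/timekeeping.h>

    static void log_wall_clock(void)
    {
        struct rtc_time tm;

        /* Convert the current wall-clock time and print it via %ptR. */
        rtc_time64_to_tm(ktime_get_real_seconds(), &tm);
        pr_info("current time: %ptR\n", &tm);
    }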

What Maguire realized is that, with the addition of BPF type format (BTF) data to the kernel, it is possible to do something better. BTF was originally added to solve the problem of BPF program binary portability between systems. The layout of any given data structure can vary from one kernel configuration to the next, making it hard to create BPF programs that can run universally across all configurations. BTF describes the types used in the kernel as it was actually built; user-space tools can then use that information to "relocate" references within structures to the correct offsets prior to loading a BPF program into the kernel.

But, once you have a description of a structure's layout available within the kernel, you can use it to print out that structure's data. So Maguire added a new format directive to do so. The format is "%pT<type>", where type is the type of the structure pointer being passed. Making it "%pTN<type>" adds the field names as well. An example in the patch set prints an sk_buff structure (used in the networking layer to hold a packet) with a line like:

    pr_info("%pTN<struct sk_buff>", skb);

The resulting output looks like this:

    {{{.next=00000000c7916e9c,.prev=00000000c7916e9c,
      {.dev=00000000c7916e9c|.dev_scratch=0}}|
      .rbnode={.__rb_parent_color=0,.rb_right=00000000c7916e9c,.rb_left=00000000c7916e9c}|
      .list={.next=00000000c7916e9c,.prev=00000000c7916e9c}},
      {.sk=00000000c7916e9c|.ip_defrag_offset=0},{.tstamp=0|.skb_mstamp_ns=0},
      .cb=['\0'],{{._skb_refdst=0,.destructor=00000000c7916e9c}|
      .tcp_tsorted_anchor={.next=00000000c7916e9c,.prev=00000000c7916e9c}},
      ._nfct=0,.len=0,.data_len=0,.mac_len=0,.hdr_len=0,.queue_mapping=0,
      .__cloned_offset=[],.cloned=0x0,.nohdr=0x0,.fclone=0x0,.peeked=0x0,
      .head_frag=0x0,.pfmemalloc=0x0,.active_extensions=0,.headers_start=[],
      .__pkt_type_offset=[],.pkt_type=0x0,.ignore_df=0x0,.nf_trace=0x0,
      .ip_summed=0x0,.ooo_okay=0x0,.l4_hash=0x0,.sw_hash=0x0,.wifi_acked_valid=0x0,
      .wifi_acked=0x0,.no_fcs=0x0,.encapsulation=0x0,.encap_hdr_csum=0x0,
      .csum_valid=0x0,.__pkt_vlan_present_offset=[],.vlan_present=0x0,
      .csum_complete_sw=0x0,.csum_level=0x0,.csum_not_inet=0x0,.dst_pending_co

Here, the original "all on one line" format has been broken up a bit for readability. Output is limited to 1024 characters, which explains the rather abrupt ending seen above. In cases where that limit proves to be a problem, omitting the "N" qualifier will allow more fields to be output but without names. Arnaldo Carvalho de Melo suggested that an additional "z" option could suppress the printing of fields whose value is zero, making the output much more compact; that suggestion seems likely to be implemented in the next version of the patch series.

While printk() is the immediate application for this feature, Maguire suggested that it could be used in other settings as well. Ftrace could use it to print out structure contents at tracepoints, for example, or the kernel could use it to enhance the information available in oops listings.

These patch sets show two different approaches to using the kernel's BPF infrastructure to format information in kernel data structures for use outside of the kernel. They address sufficiently different use cases that it is not a question of which of the two might be accepted; there would appear to be room for both. Each makes it easier to look inside the kernel in its own way.

Comments (4 posted)

Bringing openSUSE Leap and SLE closer

April 24, 2020

This article was contributed by Marta Rybczyńska

OpenSUSE Leap is a community distribution built on top of source packages from SUSE Linux Enterprise (SLE). Recently, Gerald Pfeifer, chair of the openSUSE board, posted an announcement describing a proposal from SUSE to unify some packages between SLE and openSUSE Leap. Here we analyze the proposal and the community's reaction to it.

SUSE and openSUSE

SUSE Linux is one of the oldest Linux distributions still in existence today, with a history that starts in 1994. Today it exists in a few forms, including the commercial SLE offering, which mainly targets the server market.

The openSUSE project creates a community version of the SUSE distribution; its work is largely sponsored by SUSE (the company). OpenSUSE produces two main variants, the relatively stable openSUSE Leap and a rolling version called openSUSE Tumbleweed. Leap is built on packages from SLE, which form a stable base made up of relatively old software releases. For example, the Leap kernel version is the same as in the corresponding SLE version. The openSUSE team adds some changes to the SLE packages, then rebuilds them for Leap; the team also adds newer versions of some packages, such as desktop environments, from Tumbleweed. The current version of Leap is 15.1 (as of April 2020).

The proposal

As stated in the announcement, the main goal of this unification is to bring the SLE and openSUSE Leap distributions closer together. The proposal has been presented to the SUSE management and to the openSUSE board. SUSE's exact motivation is not clearly stated, but two main themes have shown up in the discussion: allowing SLE to easily take advantage of the work in openSUSE, and easing the transition between Leap systems and SLE. The first point, in particular, led to comments from the community; for example, Stasiek Michalski wrote:

Leap did become "the better SLE" over time, so we should have seen this coming a mile away, but I do not know how to feel about SLE basically using all of that work that the community did to achieve this.

Among responses to this statement, John Paul Adrian Glaubitz reminded participants of some basic free-software principles: "The whole point of releasing your software under a free license is to allow it to be re-used by other projects and commercial products". The benefits for the community, instead, are said to be better quality through common testing, bug reporting, and code cleanup in the packages.

The core of the proposal is to include SLE binaries directly into openSUSE Leap. The process toward this goal is expected to involve three steps. The first is to perform a code cleanup in the packages that are currently shared between openSUSE Leap 15 and SLE 15; this process has already started from the SUSE side and will delay the next SUSE release (SLE 15 SP2) to July 2020. The next Leap release will also probably be delayed to accommodate these changes; the openSUSE Leap 15.2 release is now expected in July 2020 as well.

Then, the plan is to provide two releases of Leap in parallel around October 2020: one (called "Jump" for now) will contain the SLE binaries, while the other will have those packages rebuilt by the openSUSE team (as has been done until now). This will be the time to discuss the results so far and for the community to decide whether to go further with the project. If everything goes smoothly, the Leap release after 15.2 should include the SLE binary packages by default; it should be available around July 2021.

To reach the goal, changes in the build systems will be necessary. Currently SUSE is using its own instance of the build system (called "Internal Build Service"), and openSUSE uses the public openSUSE Build Service (OBS). A large part of the work will be in the setup of the projects in OBS and the migration scripts for synchronization of packages between the two distributions.

Another important part of the puzzle will be synchronizing the changes openSUSE makes to the SLE packages it uses. The plan proposes to adapt the spec files and unify the versions of those packages. That should have limited impact on openSUSE Leap, except for some package-version changes. The exact procedure for handling the differences is not defined yet. Multiple ideas have been proposed, including implementing the openSUSE changes in SLE, splitting the packages when needed, and using the /etc/os-release file during installation to perform different actions depending on the distribution to avoid hard-coding the conditions in the spec files.

SLE and openSUSE support different sets of architectures. Those supported by SLE (aarch64, x86-64, ppc64le, and s390x) are also available in Leap. However, the community will have to find a way to integrate the SLE binary packages in the build system in such a way that still allows building packages for the ARMv7 and RISC-V architectures, which are supported by Leap, but not SLE.

The licensing (GPL) of openSUSE Leap and the maintenance process are going to stay as they are today. The proposal FAQ clearly mentions that no registration will be required to download openSUSE Leap.

Community reaction

The proposal has been posted to the openSUSE project list and to the Factory development list for discussion on both technical and project levels. The discussion resulted in mostly positive reactions. However, some doubts and issues have been expressed.

Michalski asked about the differences in the content of SLE and Leap, especially in the branding packages. Adrian Schröter explained that the two distributions will have different configurations that will be used to select distribution-specific options like branding packages.

Another discussion concerned the signatures on Leap packages. "Cunix" asked why binary packages can go from SLE into openSUSE, but it isn't possible to import openSUSE packages into SLE. This requires openSUSE users to accept the SLE signing key. Robert Schweikert responded that this is driven by security certifications that require SLE binaries to be built on the SUSE-controlled internal build system. Following that, cunix suggested that a solution should be found to allow a package to have two signatures, one from openSUSE and one from SUSE. Unfortunately, that is not currently possible, and no other developers spoke up in favor of that approach.

SUSE also expects to set up an easier process for updating a Leap package that comes from SLE, but the details are not known yet. This will replace the current process, where somebody has to contact the openSUSE package maintainer, who then turns that request into a SLE feature request. Matwey V. Kornilov commented that the ability to directly file a SLE feature request will be necessary, as the current process may take a long time. The same type of issue was mentioned by Wolfgang Rosenauer, who said that, in some cases, such updates were rejected without a clear reason. Lubos Kocman promised that these problems will be addressed by the new process.

Fedora Enterprise Linux Next

A similar idea is being developed in the Fedora project under the name "Enterprise Linux Next buildroot and compose" (or just ELN). "Buildroot" in this context means a chroot environment used to install packages in the build system. The idea is to use this buildroot, which is configured like the commercial Red Hat Enterprise Linux (RHEL) distribution, to create a build of the unstable Fedora Rawhide offering. Or, as the proposal puts it: "The main goal of ELN is to rebuild Fedora Rawhide as if it were RHEL". This will make it easier for Fedora developers to see how their changes will affect RHEL and, thus, ease the process of moving Fedora work into the enterprise distribution.

The addition of the ELN buildroot has just been approved by the Fedora Engineering Steering Committee (FESCo).

As with SUSE's proposal, the ELN buildroot will allow easier integration of Fedora's changes into RHEL. One difference is that ELN builds do not include RHEL binaries, so there seems to be no question about the package signatures, as in the openSUSE case. Additionally, Fedora decided to allow differences in the spec files (different variables for ELN, Fedora, and RHEL), while openSUSE seems to be targeting more unification.

Next steps

Synchronizing sources between related distributions clearly has the potential to reduce the amount of work done by both teams and to allow commercial distributions to benefit more easily from work done in their community versions. The question remains how the community and company teams will be able to work together.

On the openSUSE side, the comments are generally positive; it seems that the project will go forward. Both the SUSE and openSUSE teams will have work to do to resolve the remaining issues. The next big step will happen in late 2020, when the prototype Leap version with the SLE packages is expected to be available and we will see what conclusions both teams draw. That will also probably be the time to compare the experiences of openSUSE and Fedora's ELN.

Comments (2 posted)

Improving Python's SimpleNamespace

By Jake Edge
April 29, 2020

Python's SimpleNamespace class provides an easy way for a programmer to create an object to store values as attributes without creating their own (almost empty) class. While it is useful (and used) in its present form, Raymond Hettinger thinks it could be better. He would like to see the hooks used by mappings (e.g. dictionaries) added to the class, so that attributes can be added and removed using either x.a or x['a']. It would bring benefits for JSON handling and more in the language.

A SimpleNamespace provides a mechanism to instantiate an object that can hold attributes and nothing else. It is, in effect, an empty class with a fancier __init__() and a helpful __repr__():

    >>> from types import SimpleNamespace
    >>> sn = SimpleNamespace(x = 1, y = 2)
    >>> sn
    namespace(x=1, y=2)
    >>> sn.z = 'foo'
    >>> del(sn.x)
    >>> sn
    namespace(y=2, z='foo')

Hettinger proposed his idea to the python-dev mailing list in mid-April. He described it as follows:

SimpleNamespace() is really good at giving attribute style-access. I would like to make that functionality available to the JSON module (or just about anything else that accepts a custom dict) by adding the magic methods for mappings so that this works:
     catalog = json.load(f, object_hook=SimpleNamespace)
     print(catalog['clothing']['mens']['shoes']['extra_wide']['quantity']) # currently possible with dict()
     print(catalog.clothing.mens.shoes.extra_wide.quantity)                # proposed with SimpleNamespace()
     print(catalog.clothing.boys['3t'].tops.quantity)                      # would also be supported

The json.load() function will use the object_hook to create SimpleNamespace objects rather than dictionaries. Then a mixture of operations can be used to retrieve information from the data structure. In effect, json.load() will be using dictionary-style access to store things into the data structure and Hettinger wants the ability to work with it using attribute notation.
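
Passing SimpleNamespace directly as the object_hook already gives attribute-style access today; supporting both notations requires a small custom subclass along the lines of this sketch (an illustration, not code from the thread):

    import json
    from types import SimpleNamespace

    class Namespace(SimpleNamespace):
        """SimpleNamespace plus the mapping hooks Hettinger is proposing."""
        def __getitem__(self, key):
            return getattr(self, key)
        def __setitem__(self, key, value):
            setattr(self, key, value)
        def __delitem__(self, key):
            delattr(self, key)

    catalog = json.loads('{"clothing": {"mens": {"shoes": {"quantity": 4}}}}',
                         object_hook=lambda d: Namespace(**d))
    print(catalog.clothing.mens.shoes.quantity)         # attribute access
    print(catalog['clothing'].mens['shoes'].quantity)   # mixed access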

There are examples of production code that does this sort of thing, he said, but each user needs to reinvent the wheel: "This is kind of [a] bummer because the custom subclasses are a pain to write, are non-standard, and are generally somewhat slow." He had started with a feature request in the Python bug tracker, but responses there suggested adding a new class.

[...] but I don't see the point in substantially duplicating everything SimpleNamespace already does just so we can add some supporting dunder methods. Please add more commentary so we can figure-out the best way to offer this powerful functionality.

Guido van Rossum thought that kind of usage was not particularly Pythonic, and was not really in favor of propagating it:

I've seen this pattern a lot at a past employer, and despite the obvious convenience I've come to see it as an anti-pattern: for people expecting Python semantics it's quite surprising to read code that writes foo.bar and then reads back foo['bar']. We should not try to import JavaScript's object model into Python.

Kyle Stanley wondered if it made sense for the feature to reside in the json module; "that seems like the most useful and intuitive location for the dot notation". He thought that JSON users would not be surprised by that style of usage, but Van Rossum disagreed:

Well, as a user of JSON in Python I *would* be surprised by it, since the actual JSON notation uses dicts, and most Python code I've seen that access raw JSON data directly uses dict notation. Where you see dot notation is if the raw JSON dict is verified and converted to a regular object (usually with the help of some schema library), but there dict notation is questionable.

Several others agreed that the duality of object and dictionary access was not a good fit for Python, but there is still a problem to be solved, as Hettinger noted: "working with heavily nested dictionaries (typical for JSON) is no fun with square brackets and quotation marks". Victor Stinner listed a handful of different projects from the Python Package Index (PyPI) that provide some or all of the features that are desired, but he did not see that any of those had "been battle-tested and gained enough popularity" that they should be considered for the standard library.

Stinner (and others in the thread) pointed to the glom library as one that might be of use in working with deeply nested JSON data. But the "AttrDict" pattern is rather popular, as Hettinger pointed out. glom can do lots more things, but it is not able to freely mix and match the two access types as Hettinger wants.
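
For those unfamiliar with it, glom's core call takes a target and a dotted-path specification; a quick sketch (assuming the glom package from PyPI is installed):

    >>> from glom import glom
    >>> data = {"clothing": {"mens": {"shoes": {"quantity": 4}}}}
    >>> glom(data, "clothing.mens.shoes.quantity")
    4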

There were some who thought it might be reasonable for the json module to provide the functionality, as Stanley had suggested, including Van Rossum who seemed to come around to the idea. Glenn Linderman supported adding the feature in a bug report comment; he thinks it is useful well beyond just JSON. "Such a feature is just too practical not to be Pythonic." Similarly, Cameron Simpson thought it would make a good addition:

I'm with Raymond here. I think my position is that unlike most classes, SimpleNamespace has very simple semantics, and no __getitem__ facility at all, so making __getitem__ map to __getattr__ seems low impact.

It is true that adding dictionary-like functionality to SimpleNamespace should not affect existing code, but most in the thread still seem to be against adding the feature to that class. Eric Snow put it this way:

Keep in mind that I added SimpleNamespace when implementing PEP [Python Enhancement Proposal] 421, to use for the new "sys.implementation". The whole point was to keep it simple, as the docs suggest.

Perhaps the most radical suggestion came from Rob Cliffe. He thought it might make sense to add a new operator to the language (perhaps "..") with no default implementation. That would allow classes to define the operator for themselves:

Then in a specific class you could implement x..y to mean x['y'] and then you could write
    obj..abc..def..ghi
Still fairly concise, but warns that what is happening is not normal attribute lookup.

As Stinner pointed out, though, that and some of the other more speculative posts probably belonged in a python-ideas thread instead. It does not seem particularly likely that SimpleNamespace will be getting this added feature anytime soon—or at all. There is enough opposition to making that change, but there is recognition of the problem, so some other solution might come about. It would, presumably, need the PEP treatment, though; a visit to python-ideas might be in the offing as well.

Comments (30 posted)

Page editor: Jonathan Corbet

Inside this week's LWN.net Weekly Edition

  • Briefs: Fedora 32; Fedora on Lenovo; Ubuntu 20.04 LTS; Kdenlive 20.04; Panfrost; Help Wanted; Quote; ...
  • Announcements: Newsletters; conferences; security updates; kernel patches; ...

Copyright © 2020, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds