LWN.net Weekly Edition for August 19, 2021
Welcome to the LWN.net Weekly Edition for August 19, 2021
This edition contains the following feature content:
- PEP 649 revisited: the Python developers reconsider the proper uses of annotations.
- STARTTLS considered harmful: just because a client asks for TLS security doesn't guarantee that it will be provided.
- A firewall for device drivers: an attempt to avoid the intractable problem of hardening all of the device drivers in the kernel.
- Short subjects: Realtime, Futexes, and ntfs3: realtime is getting closer to the mainline, the futex2 effort takes a step back, and ntfs3 waits in the wings.
- PostgreSQL's commitfest clog: what can be done to address PostgreSQL's review deficit?
This week's edition also includes these inner pages:
- Brief items: Brief news items from throughout the community.
- Announcements: Newsletters, conferences, security updates, patches, and more.
Please enjoy this week's edition, and, as always, thank you for supporting LWN.net.
PEP 649 revisited
Back in June, we looked at a change to Python annotations, which provide a way to associate metadata, such as type information, with functions. That change was planned for the upcoming Python 3.10 release, but was deferred due to questions about it and its impact on run-time uses of the feature. The Python steering council felt that more time was needed to consider all of the different aspects of the problem before deciding on the right approach; the feature freeze for Python 3.10 was only around two weeks off when the decision was announced on April 20. But now, there is most of a year before another feature freeze, which gives the council (and the greater Python development community) some time to discuss it at a more leisurely pace.
To that end, Eric V. Smith raised the issue on the python-dev mailing list on August 9. He did so in the context of PEP 649 ("Deferred Evaluation Of Annotations Using Descriptors"), which was the late-breaking proposal that caused the original plan to be put on hold. That plan was embodied in PEP 563 ("Postponed Evaluation of Annotations"), which was accepted back in 2017 and was set to become the default—and only—behavior for annotations starting in Python 3.10. The council decided to defer the change in the default until Python 3.11 at the earliest and there is the possibility of switching to the behavior described in PEP 649 instead. Smith wanted to see if the issue could be resolved at this point.
Backstory
The history was described at some length in our earlier article, but a capsule summary is probably in order. Annotations were added to Python as a general feature, long before the static-typing features that use annotations came along. As originally envisioned in 2006, annotations were simply meant as a way to attach metadata to a function's arguments and return value—and to make the information available at run time in the __annotations__ dictionary associated with the function. Syntax was added so that programmers could optionally add this information to function definitions; interpretation of the metadata was left up to whatever was processing it. Type information was certainly one of the possibilities for that metadata, but there was no standard on how to represent the types of arguments or return values.
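As a concrete illustration of that original, type-agnostic design, here is a minimal sketch (the function and its metadata strings are invented for the example): the interpreter simply evaluates whatever expressions appear after the colons and stores the results in __annotations__, without interpreting them in any way.

    def scale(value: "anything numeric", factor: float = 2.0) -> "the scaled result":
        return value * factor

    print(scale.__annotations__)
    # {'value': 'anything numeric', 'factor': <class 'float'>,
    #  'return': 'the scaled result'}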
When type hints came about, the type information was naturally stored using the existing annotations mechanism, but eventually ran into some snags, with forward references to types being particularly problematic. So PEP 563 was adopted to defer the evaluation of annotations until they were actually needed; that also removed the cost of computing the annotations every time a module was imported, which was a cost that provided almost no benefit. For static type checkers, which were the main driving force behind the type hints effort, it would not change anything; those checks are not done at run time, so the values that got stored in __annotations__ were not used.
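A minimal sketch of what PEP 563 changes (the class name here is invented for the example): with the __future__ import, annotation expressions are stored as strings rather than being evaluated at definition time, so forward references work and nothing is computed at import time; run-time users then call typing.get_type_hints() to turn the strings back into objects.

    from __future__ import annotations
    import typing

    class Tree:
        def add(self, child: Tree) -> Tree:    # forward reference; no quotes needed
            return self

    print(Tree.add.__annotations__)            # {'child': 'Tree', 'return': 'Tree'}
    print(typing.get_type_hints(Tree.add))     # {'child': <class 'Tree'>, 'return': <class 'Tree'>}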
On the other hand, there are users of the type annotations at run time and there may be users of other kinds of annotations at run time as well. It seemed to come as a bit of a surprise when developers of the pydantic data-validation tool explained the problems they had using PEP 563, but other examples came to light as well. It would not be a huge surprise to discover that annotations are being used at run time for some non-type-related purpose—somewhere out there in the enormous Python ecosystem. Annotations were added to the language as a generalized feature so it is not a huge stretch to imagine that developers have used them in unexpected ways.
For some, including Python creator Guido van Rossum, restricting annotations to type information is a reasonable way forward. But that still does not solve the problems for the run-time users of __annotations__, who ran aground on some of the scoping issues that are inherent in the PEP 563 approach. But switching to PEP 563 is a one-way door; if a project like pydantic wanted the old behavior, it would not have been able to get it. That meant that even those who were using the annotations for type information could not do so, at least reasonably easily, at run time.
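The scoping problem can be reproduced without pydantic at all; this contrived sketch (all names invented) shows the failure mode that run-time users hit under PEP 563 semantics when a type is defined in a local scope and cannot be recovered from its stringified annotation.

    from __future__ import annotations
    import typing

    def make_model():
        class Inner:
            pass

        class Model:
            field: Inner                       # stored as the string "Inner"

        return Model

    typing.get_type_hints(make_model())
    # NameError: name 'Inner' is not defined -- the string can only be
    # re-evaluated against the module's globals, not the function's locals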
PEP 649 took a different approach that was meant to resolve the forward-reference problems and to maintain the performance boost that came from not evaluating the annotations until they were actually needed. Instead of keeping the annotations around in string form, as PEP 563 does, it turns them into functions that get called the first time __annotations__ is consulted. Those functions can properly handle the different local and global scopes that bedeviled the earlier approach, which effectively used eval() (or typing.get_type_hints()) to turn the annotation strings into Python objects.
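PEP 649's actual mechanism lives in the compiler and the function object, but the core idea can be hand-simulated in a few lines (the helper function below is purely illustrative, not the PEP's implementation): wrap the annotation expressions in a function and only call it when the metadata is first needed, by which time the forward-referenced names exist.

    def _add_annotations():                    # what the compiler would generate
        return {"child": Node, "return": Node}

    class Node:
        def add(self, child):
            return self

    # Nothing has been evaluated yet; the expressions only run on demand,
    # when Node is already defined:
    Node.add.__annotations__ = _add_annotations()
    print(Node.add.__annotations__["child"] is Node)   # True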
Resurrection
But PEP 649 came about rather late in the development cycle for Python 3.10 and seemingly only barely headed off the switch to PEP 563 by default for that release. Instead of waiting until April 2022, just before the Python 3.11 feature freeze, as PEP 563 author Łukasz Langa jokingly warned against, Smith raised the question. He would like to see PEP 649 adopted, which was a fairly popular option back in April when it was discussed. He wondered what the next steps would be:
My understanding is that PEP 649 is currently in front of the SC [steering council]. But do we need to have any additional discussion here? My recollection is that we backed out the PEP 563 change because we didn't feel we had enough time to come to a good decision one way or the other before 3.10.
Council member Barry Warsaw filled in some of the considerations that led the council to defer the question, beyond just the problem of running out of time. These are "decisions we’d have to live with essentially forever", he said. One of those considerations is the question of compile-time versus run-time use of the annotations; he thinks that the two groups need to cooperate:
[...] we have to be very careful that the folks who use type annotation at compile/static checking time (e.g. mypy and friends) explicitly consider the existing use cases and needs of the runtime type community. These two constituents have to work together to avoid backward incompatible changes.
Luciano Ramalho noted that the large companies that have invested a lot of effort into Python typing features tend to be focused on the static-typing case and are concerned by its costs, so it is up to the rest of the community to ensure that the other use cases are still well-supported:
[...] static checking only happens in developer workstations and CI servers, but *imports* happen all the time in code running in production, on countless servers. That became a real issue for those very same companies operating at Web scale. So they have a strong incentive to focus on the use of annotations for static checking only, while many of us also want type hints to address use cases where Python is used as a *dynamic* language, which is its nature, and likely a strong reason for its popularity in the first place—despite the inherent runtime costs of being a dynamic language.
Another steering council member, Brett Cannon, confirmed that PEP 649 was currently under consideration, but he did think that additional discussion was needed: "I think the question is whether we have general consensus around PEP 649?" In addition, Inada Naoki said that there was a need to evaluate the memory and performance impact of PEP 649 before deciding on it.
But PEP 649 author Larry Hastings does not see things that way. As he pointed out, PEP 563 does not mention performance or memory consumption; it is focused on solving a problem in the language. PEP 649 should be treated similarly:
I think PEP 649 should be considered in the same way. In my opinion, the important thing is to figure out what semantics we want for the language. Once we figure out what semantics we want, we should implement them, and only then should we start worrying about performance. Fretting about performance at this point is premature and a distraction. I assert PEP 649's performance and memory use is already acceptable, particularly for a prototype. And I'm confident that if PEP 649 is accepted, the core dev community will find endless ways to optimize the implementation.
Finding the right balance
There is, to a certain extent, a struggle going on between those who want to further enshrine type features into annotations and, by extension, the Python language itself, and those who are perfectly happy to see the typing features, but do not want them to preclude other uses of annotations. That echoes Ramalho's observations about the companies behind the static-typing feature somewhat. For example, PEP 646 ("Variadic Generics") proposes "syntax for type annotations that may or may not be useful or desired for regular Python", as Warsaw put it. But having the syntax of the language and that of its type annotations diverge is not something that he believes the council will allow.
There are aspects of PEP 649 (or some, as yet unwritten, successor) where typing proponents would like to push annotation support in directions that might wall off other uses. Hastings is particularly concerned by that:
Annotations aren't special enough to break the rules. I worry about Python-the-language enshrining design choices made by the typing module. Python is now on its fourth string interpolation technology, and it ships with three command-line argument parsing libraries; in each of these cases, we were adding a New Thing that was viewed at the time as an improvement over the existing thing(s). It'd be an act of hubris to assert that the current "typing" module is the ultimate, final library for expressing type information in Python. But if we tie the language too strongly to the typing module, I fear we could strangle its successors in their cribs.
Steve Holden agreed with that concern, noting that "optional" may be slowly getting elbowed aside. "Which would be unfortunate given the (explicit?) assurances that annotations would be optional; they are casting their shadow over the whole language." For some, it is about finding a balance that meets the needs of the new feature without precluding older uses; others, including Van Rossum, seem more willing to fully embrace annotations only for types and mostly only for static analysis.
One gets the feeling that this particular debate would have played out quite a bit differently if Van Rossum were still the benevolent dictator for the language. But he voluntarily relinquished that role and has fully embraced the steering council model that the community adopted. That model purposely provides for multiple voices that can try to find the balance in a disagreement of this sort. Van Rossum credited the council with "the wisdom of Solomon" in his reaction to the deferral decision back in April, but it may well be that the council in fact has a wisdom of a different sort entirely. Python seems likely to benefit from its multi-headed wisdom going forward.
While the fate of PEP 649 itself is somewhat unclear at this point, it does seem like there are efforts being made to enhance it to cover the problem areas that have been brought up. Given the known use cases for annotations at run time (e.g. pydantic), though, queuing up PEP 563 as the default for 3.11 seems highly unlikely. There is still lots of time to discuss, further prototype, and revise the idea well before hitting Langa's "deadline"; it may well all be resolved before 2021 ends—stay tuned.
STARTTLS considered harmful
The use of Transport Layer Security (TLS) encryption is ubiquitous on today's internet, though that has largely happened over the last 20 years or so; the first public version of its predecessor, Secure Sockets Layer (SSL), appeared in 1995. Before then, internet protocols were generally not encrypted, thus providing fertile ground for various types of "meddler-in-the-middle" (MitM) attacks. Later on, the STARTTLS command was added to some protocols as a backward-compatible way to add TLS support, but the mechanism has suffered from a number of flaws and vulnerabilities over the years. Some recent research, going by the name "NO STARTTLS", describes more, similar vulnerabilities and concludes that it is probably time to avoid using STARTTLS altogether.
Opportunistic TLS
Normally, protocol messages are either encrypted or not, but STARTTLS allows for a kind of middle ground. It is the command used to invoke TLS for an existing plaintext connection in what is known as opportunistic TLS. Servers can advertise their ability to handle TLS connections; for example, an email (SMTP/ESMTP) server specifies whether it will accept the STARTTLS command in its reply to the client's initial message (EHLO). If desired, the client can then request encryption using the STARTTLS command; a TLS handshake will then be performed and subsequent traffic will be encrypted. This contrasts with implicit TLS, where the communication channel, typically indicated by a specific port number, only operates in the encrypted mode.
As might be guessed, it is the switch from one mode to the other that is most vulnerable to MitM attacks. In the most basic attack, known as STARTTLS stripping, an attacker who can intercept and change the traffic can simply stop any STARTTLS command from being sent between the participants, ensuring that the conversation proceeds in plaintext form. Failure to establish an encrypted session could be treated as an error by clients, but sometimes is not. The next step after trying to establish the encryption is often some kind of authentication, which might effectively be performed in plaintext if the session is not encrypted.
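A client-side sketch of that opportunistic upgrade, using Python's standard smtplib (host, port, and credentials are placeholders): the session starts in plaintext, the client checks the EHLO response for the STARTTLS capability, and treats its absence as a hard error rather than quietly continuing, which is exactly the failure mode that STARTTLS stripping relies on.

    import smtplib
    import ssl

    context = ssl.create_default_context()

    with smtplib.SMTP("mail.example.org", 587) as smtp:
        smtp.ehlo()
        if not smtp.has_extn("starttls"):
            # A stripped or missing STARTTLS offer must be fatal; falling back
            # to plaintext would expose the credentials sent below.
            raise RuntimeError("server did not offer STARTTLS")
        smtp.starttls(context=context)    # TLS handshake happens here
        smtp.ehlo()                       # re-issue EHLO over the encrypted channel
        smtp.login("user", "secret")      # credentials now travel encrypted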
As reported by one of the researchers, Hanno Böck, on the oss-security mailing list, a STARTTLS flaw found in 2011 was the jumping-off point for the research. That flaw was found by Postfix creator Wietse Venema in multiple SMTP servers, including Postfix; it allowed MitM attackers to "inject plaintext content into the TCP packet of a STARTTLS command and a server would interpret it as if it was part of the TLS session", Böck said. But the researchers found that this ten-year-old vulnerability was still unfixed in some servers; in its most severe form, "it can be used for credential stealing".
The other researchers, Damian Poddebniak, Fabian Ising, and Sebastian Schinzel, are from Münster University of Applied Sciences, while Böck is an independent researcher. They presented their paper at the 30th USENIX Security Symposium in August. As part of that work, they developed a testing toolkit called EAST that was used to analyze 28 email clients and 23 servers; only three of the clients and seven of the servers were completely unaffected by the 40 separate STARTTLS problems they uncovered. In addition, it turns out that 15 servers are still vulnerable to the same flaw found by Venema in 2011; scans found that 2% of mail servers on the internet exhibit the flaw. Both the paper and the web site have more details on all of the flaws, including which servers and clients are affected.
The researchers also looked at the POP3 and IMAP message-retrieval protocols, both of which have STARTTLS commands. Like what Venema saw for SMTP, they found that some servers will process the plaintext sent with the command as if it were part of the encrypted session. Instead of discarding any buffered input, the servers end up processing it after the TLS handshake is done—and the state of the connection has changed.
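The underlying mistake is easy to see in a simplified, made-up server fragment (this is not code from any real server): whatever followed the STARTTLS command in the read buffer must be thrown away before the encrypted session begins, not replayed into it.

    import ssl

    def handle_starttls(conn, buffered, tls_context):
        # "buffered" holds any bytes that arrived in the same TCP segment as
        # the STARTTLS command; they are attacker-controllable plaintext.
        conn.sendall(b"220 Ready to start TLS\r\n")
        tls_conn = tls_context.wrap_socket(conn, server_side=True)

        # Vulnerable servers kept "buffered" around and processed it as if it
        # had been received over TLS; the fix is simply to discard it.
        buffered = b""
        return tls_conn, buffered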
The most severe attacks exploit this behavior to exfiltrate the user's login credentials. They require that the attacker also has a valid account on the SMTP or IMAP server in question, though, which reduces the scope of the problem somewhat.
The attacker can inject commands that authenticate them and then start sending (SMTP) or storing (IMAP) an email. The login credentials sent by the victim will be stored in the email that the attacker can access.
Going in the other direction, mailbox contents can be forged by adding commands to the STARTTLS response from the server. Once again, the data is buffered and interpreted in the context of the encrypted session, this time by the email client, even though it was sent (and received) before the session was established.
This bug affected many popular mail clients, including Apple Mail, Mozilla Thunderbird, Claws Mail, and Mutt. By injecting additional content to the server message in response to the STARTTLS command before the TLS handshake, we can inject server commands that the client will process as if they were part of the encrypted connection. This can be used to forge mailbox content.
On the web page for the NO STARTTLS flaws, the researchers called out a third vulnerability type that was found. In IMAP connections, the PREAUTH command can be sent by a server to indicate that the client is already authenticated, but it also prevents the client from using STARTTLS to transition into an encrypted state. The IMAP protocol does not allow STARTTLS after PREAUTH, which turns PREAUTH into a way to prevent encryption that is somewhat similar to STARTTLS stripping. Clients should reject that type of connection when encryption has been requested by the user, but some do not. The flaw was found in the Trojitá email client in 2014, but the researchers discovered that other clients are vulnerable to it.
When coupled with the little-used IMAP referral features (for logins and for mailboxes), an attacker could cause a client to send its credentials directly to the attacker:
By using PREAUTH to prevent an encrypted connection, an attacker can use referrals to force a client to send credentials to an attacker-controlled server. Fortunately, the referral features are not supported by many clients. We found only one client - Alpine - vulnerable to this combination of PREAUTH and referrals.
Recommendations
The researchers recommend that users stick to implicit TLS ports (465 for SMTP submission, 993 for IMAP, and 995 for POP3) to avoid STARTTLS altogether, though some service providers do not give that option for email submission. Application developers should strongly consider only offering support for implicit TLS; if that is not possible, testing with EAST or something similar is needed to ensure that plaintext is not processed as if it were encrypted. Meanwhile, mail-server administrators should consider disabling STARTTLS for the email-handling protocols.
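For clients, following that recommendation with Python's standard library looks like this minimal sketch (host names are placeholders); the implicit-TLS classes connect to the TLS-only ports, so there is no plaintext phase for an attacker to tamper with.

    import imaplib
    import poplib
    import smtplib
    import ssl

    context = ssl.create_default_context()

    smtp = smtplib.SMTP_SSL("mail.example.org", 465, context=context)        # submission
    imap = imaplib.IMAP4_SSL("imap.example.org", 993, ssl_context=context)   # IMAP
    pop = poplib.POP3_SSL("pop.example.org", 995, context=context)           # POP3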
As noted in the FAQ section, STARTTLS is the only way for mail servers (mail transfer agents or MTAs) to encrypt the traffic between themselves, as there is no support for implicit TLS for SMTP when it is used to transfer email between MTAs. Those STARTTLS transactions are already vulnerable to STARTTLS stripping attacks, because servers do not know whether the other endpoint accepts TLS or not; they cannot refuse to transfer mail because the connection is not encrypted. That means there is no real advantage for attackers to adopt the NO STARTTLS techniques. Adding authentication to the connections between servers, which is being worked on, would change the equation, however, so server code needs to be analyzed for buffering and other types of STARTTLS problems.
MitM vulnerabilities that can only be exploited with the ability to alter messages between the endpoints are sometimes seen as "lesser" flaws—and they are. But the requirements for an active meddler in between the participants are sometimes misjudged. For example, any WiFi router, say at the local coffee shop, or internet service provider being used, certainly has the capability of performing these kinds of attacks—it does not require some nation-state attack against the internet backbone by any means. WiFi routers, especially those in busy locations that handle lots of users, would make prime targets for a full compromise by an attacker. That compromise would provide a perfect platform for MitM attacks of various sorts against all of its users.
While the NO STARTTLS vulnerabilities are important and should be fixed, they are not generally huge problems in their own right. Their impact is fairly limited as the researchers noted:
The demonstrated attacks require an active attacker and may be recognized when used against an email client that tries to enforce the transition to TLS. We have informed all popular email client and server vendors and most issues are already fixed. We think that the demonstrated attacks would be difficult to execute on a large scale and we primarily expect them to be used in targeted attacks.
One thing the vulnerabilities do highlight is that projects are not generally all that good at scrutinizing their code for the same kinds of problems found in other, similar tools. Vulnerabilities found years ago should quite plausibly have led other email server and client projects to ferret out their own manifestations of those bugs long before now. Instead, it took some security researchers who were curious about the wart that is STARTTLS.
It is clear that retrofitting security into existing protocols is difficult to get right. Mixing plaintext and encrypted traffic over the same connection without these kinds of botches is evidently difficult as well. Those general principles will be important to keep in mind as we move forward; backward compatibility is most certainly a "nice to have", but secure protocols, without warts and hard-to-get-right pieces, are increasingly becoming a "must have".
A firewall for device drivers
Device drivers, along with the hardware they control, have long been considered to be a trusted part of the system. This faith has been under assault for some time, though, and it fails entirely in some situations, including virtual machines that do not trust the host system they are running under. The recently covered virtio-hardening work is one response to this situation, but that only addresses a small portion of the drivers built into a typical kernel. What is to be done about the rest? The driver-filter patch from Kuppuswamy Sathyanarayanan demonstrates one possible approach: disable them altogether.
Virtual machines typically have direct access to little or no physical hardware; instead, they interact with the world by way of emulated devices provided by the host. That puts the host in a position of power, since it is in total control over how those virtual devices work. If a driver has not been written with the idea that the devices it manages could be hostile, chances are good that said driver can be exploited to compromise the guest and exfiltrate data — even when the guest is running with encrypted memory that is normally inaccessible to the host.
The virtio work hardens a handful of virtio drivers to prevent them from misbehaving if the host decides to not play by the rules. Getting there was a lot of work (which still has not reached the point of being merged), and there is a decidedly non-zero chance that vulnerabilities remain. Even if the virtio work is perfect, though, the kernel contains thousands of other drivers, most of which have not received anything close to the same amount of attention; few of them can be expected to be sufficiently robust to stand up to a malicious device. If the host can convince a guest to load the driver for such a device, the security game may well be over.
One possible solution to this problem is to methodically go through and harden all those thousands of until-now neglected drivers. The result would surely be a better kernel, but holding one's breath for this outcome would be ill-advised. Even if the developer effort for such a project can be found, there is a lot of code that would have to be tested with a large array of devices, a significant number of which stopped being widely available many years ago. Any realistic plan must accept that many drivers will never be hardened in this way.
The alternative is to simply make those drivers unavailable; a driver that cannot run at all is unlikely to compromise the system. Most virtual machines only need a handful of drivers; the rest are just dangerous dead weight. The obvious thing to do is to build a kernel with only the needed drivers, yielding a result that is not only safer, it will also be much smaller. The problem with this idea is that distributors hate the idea of shipping multiple kernels with different configurations. Each one adds to the build, test, and support loads, and it only takes a few configuration options to create a large array of kernel images. Distributors are thus highly motivated to ship a single kernel image if possible.
This is where Sathyanarayanan's patch set comes in; it provides a way for the system administrator to control which drivers are allowed to run. It adds two new command-line options — filter_allow_drivers and filter_deny_drivers — for that purpose; specific drivers can be added to either list using a "bus:driver" notation. The string "ALL" matches anything. So, for example, booting a system with:
filter_allow_drivers=ALL:ALL
will allow all drivers to run — the default situation. The allow list is applied first and overrides the deny list, so a configuration like this:
filter_allow_drivers=pci:e1000 filter_deny_drivers=ALL:ALL
will allow the e1000 network adapter driver to run, but will block everything else. There is also a new driver attribute in sysfs (called allowed) that can be used to change a driver's status at run time.
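If the patch set were merged in this form, toggling a driver at run time would presumably look something like the following sketch; the sysfs path and the accepted values ("0"/"1") are assumptions based on the usual /sys/bus/<bus>/drivers/<driver>/ layout, not details taken from the patch.

    from pathlib import Path

    def set_driver_allowed(bus: str, driver: str, allowed: bool) -> None:
        # Hypothetical path; assumes the "allowed" attribute lives in the
        # driver's usual sysfs directory and accepts "0" or "1".
        attr = Path("/sys/bus") / bus / "drivers" / driver / "allowed"
        attr.write_text("1" if allowed else "0")

    # e.g. set_driver_allowed("pci", "e1000", False) to block the driver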
Driver subsystem maintainer Greg Kroah-Hartman was not impressed with this submission; he suggested either building a special kernel image or using the existing mechanisms to block unwanted device drivers instead. These could include denying them in the system's modprobe configuration or using the knobs in sysfs to unbind drivers from their devices. As Andi Kleen explained, though, these mechanisms do not quite satisfy the requirements. Configuring modprobe does not help with built-in drivers and, in any case, the intent is to prevent untrusted drivers from running at all. By the time user space can manually unbind a driver, it has already set itself up in the kernel and may already be trying to drive a malicious device.
Another way of looking at the situation, Kleen added, is to see a guest running on a potentially hostile host as being like a server on the Internet. The server almost certainly runs a firewall to restrict access to ports that are known (or at least hoped) to be safe; the driver filter is the equivalent of the firewall for the guest. That simplifies the hardening problem to the point that it might be feasible.
Whether these arguments will convince Kroah-Hartman remains to be seen; the conversation went quiet without reaching any sort of definitive conclusion. The problem that is driving this work seems real, though; if the current solution does not make the cut, we are likely to see other attempts to do something similar in the future. Devices have gone from hiding behind the kernel to being a part of the kernel's attack surface; security-focused developers will naturally want to reduce that surface as much as possible.
Short subjects: Realtime, Futexes, and ntfs3
Even in the dog days of (northern-hemisphere) summer, the kernel community is a busy place. There are many developments that show up on your editor's radar, but which, for whatever reason, do not find their way into a full-length feature article. The time has come to catch up with a few of those topics; read on for updates on the realtime patch set, the effort to reinvent futexes, and the ntfs3 filesystem.
Realtime
The realtime preemption story is a long one; it first showed up on LWN in 2004. Over the years, this work has had a significant impact on kernel development as a whole; much of what is just seen as part of the core kernel now had its origins in the realtime tree. The code around which the realtime work was initially built — the preemptible locking infrastructure — remains out of the mainline, though. Without the locking changes, the mainline is not able to offer the sort of response-time guarantees that realtime users need.
The locking infrastructure makes almost all locks, spinlocks included, into sleeping locks; that ensures that a higher-priority task can always take over the processor quickly. It is the sort of change that makes kernel developers nervous, since mistakes in this area can lead to all sorts of subtle problems. For that reason, predicting when the locking code will be merged into the mainline is a fool's game. Your editor knows this well, having confidently predicted that it would be merged within a year — in 2007.
Still, one might be tempted to think that the end might be getting closer. Realtime developer Thomas Gleixner has brought the locking infrastructure back to the mailing lists for consideration; the fifth revision of the 72-part patch set was posted on August 15. Normally configured kernels should behave about the same with these patches applied, but those configured for realtime operation will have realtime-specific versions of mutexes, wait/wound mutexes, reader/writer semaphores, spinlocks, and reader/writer locks.
Commentary on this work has slowed; there does not appear to be much in the way of objections at this point — though it must be noted that Linus Torvalds has not yet made his feelings known on the subject. Unless something surprising comes up, it might just be that the core realtime code will finally find its way into the mainline. Your editor, however, is too old, wise, and cowardly to venture a guess as to when that will happen.
A smaller step for futex2
Perhaps the number of comments on the realtime changes is low because most developers fear the prospect of digging into code of that complexity. There are, however, places in the kernel that are even more frightening; the futex subsystem is surely one of them. Futexes provide fast mutexes for user space; they started out as a simple subsystem but failed to remain that way. Over time, it has become clear that futexes could do with a number of improvements to make them better suited for current workloads and, at the same time, to move beyond the multiplexer futex() system call.
For some time now, André Almeida has been pushing in that direction with the futex2 proposal. This work would split the futex functionality into several single-purpose system calls, support multiple lock sizes, and more. While there has been interest in this work, progress has been slow (to put it charitably); it seems as if the kernel is no closer to a new futex subsystem than it was a year or two ago.
In an attempt to push this project forward, Almeida has posted a new patch set with significantly reduced ambitions. Rather than introduce a whole new subsystem with its own system calls, this series adds exactly one system call that works with existing futexes:
    struct futex_waitv {
        uint64_t val;
        void *uaddr;
        unsigned int flags;
    };

    int futex_waitv(struct futex_waitv *waiters, unsigned int nr_futexes,
                    unsigned int flags, struct timespec *timo);
This function will cause the calling process to wait on several futexes simultaneously, returning when one or more of them can be acquired (or the timeout expires). That functionality is not supported by the current futex API, but it turns out to be especially useful for game engines, which perform significantly better when using the new system call. This documentation patch describes the new API in more detail.
This patch set has drawn no comments in the week since it was posted. Assuming that silence implies a lack of objections rather than a lack of interest, this piece of the futex2 work might make it into a mainline release before too long. Whether the rest of the futex2 work will follow depends on how strong the use cases driving it are; if futex_waitv() solves the worst problems, there might not be much motivation to push the other changes.
Waiting for ntfs3
The kernel has long had an implementation of the NTFS filesystem, but it has always suffered from performance and functionality problems; the user community would gladly trade it for something better. By all accounts, the ntfs3 implementation posted by Konstantin Komarov is indeed something better, but it is still not clear when it will be merged; this work was first posted one year ago, and version 27 of the patch set was posted on July 29.
The delay in accepting this work is proving frustrating to users; this complaint from Neal Gompa is typical:
I know that compared to all you awesome folks, I'm just a lowly user, but it's been frustrating to see nothing happen for months with something that has a seriously high impact for a lot of people. It's a shame, because the ntfs3 driver is miles better than the current ntfs one, and is a solid replacement for the unmaintained ntfs-3g FUSE implementation.
Torvalds has said that maybe it is time to merge this code, but that still may not happen right away.
The biggest holdup for ntfs3 at the moment would appear to be concerns about the level of development effort behind it. From the public evidence, it seems that ntfs3 is a one-person project, and that makes other filesystem developers nervous. Those developers have been reporting test failures for ntfs3 that have gone unfixed. Meanwhile, Komarov is sometimes unresponsive to questions; various comments on the version 26 posting (from early April) got no answers, for example. This sort of silence gives the impression that ntfs3 does not have a lot of effort behind it. (It's worth noting that some other developers have been happy with the level of response from Komarov).
Unsurprisingly, the filesystem developers are unenthusiastic about the prospect of taking on a new NTFS implementation that may turn out to have serious problems and which does not come with a promise of reliable support. For ntfs3 to be merged, those fears will need to be addressed somehow. One way for that to happen, as suggested by Ted Ts'o, would be for other developers, perhaps representing one or more distributors that would like to see a better NTFS implementation in the kernel, to start contributing patches to ntfs3 and commit to helping with its maintenance going forward.
PostgreSQL's commitfest clog
While it may seem like the number of developers would be the limiting factor in a free-software project, the truth of the matter is that, for all but the smallest of projects, the scarcest resource is reviewer time. Lots of people like to crank out code; rather fewer can find the time to take a close look at somebody else's patches. Free-software projects have taken a number of different approaches to address the review problem; the PostgreSQL developer community is currently struggling with its review load and considering changes to its commitfest process in response.
Part of the review problem is clerical in nature: patches must be tracked along with their review status. Some projects, like the Linux kernel, take a distributed approach; review status is tracked in the patches themselves and subsystem maintainers are expected to keep up with which patches are ready to be merged. PostgreSQL developers, naturally, prefer to keep that information in a central database. Roughly every other month, outstanding patches are gathered for a month-long commitfest, during which the project makes a decision on the fate of each one of them. Each commitfest has a designated manager who is responsible for ensuring that all patches have been dealt with by the end of the commitfest.
That is the intended result, anyway.
What actually happens, as Simon Riggs recently pointed out on the PostgreSQL Hackers mailing list, is that a lot of patches languish in the queue with no firm decision being made; this can happen as the result of a lack of reviews or a failure of the author to respond, among other reasons. Riggs noted that the 2021-09 commitfest, which is scheduled for September, has 273 patches queued (since increased to 279): "Of those, about 50 items have been waiting more than one year, and about 25 entries waiting for more than two years". The community has been working hard to clear the queue during each commitfest, Riggs said, but still "it’s overflowing".
A look at past commitfests (which can be viewed at commitfest.postgresql.org) shows that a great many patches are "dealt with" by deferring them to the next commitfest. The recently concluded 2021-07 commitfest considered 342 patches; of those, 233 (just over 2/3) were deferred to the next commitfest. When the punting rate is that high, actually clearing out the commitfest queue becomes a distant prospect at best.
There is a longstanding expectation within the PostgreSQL project that anybody submitting a patch for consideration in a given commitfest should take the time to review somebody else's patch, preferably one of similar complexity. In theory, that would balance the numbers of submitters and reviewers; in practice it does not seem to be getting the job done. Part of the problem, almost certainly, is that some submitters just never quite get around to fulfilling that side of the bargain; life is busy after all. One of the commitfest manager's jobs is to encourage developers to do reviews; that is, needless to say, a task that is even less fun than patch review. In the discussion, Noah Misch suggested an obvious technical solution: track each submitter's review balance in the database to make it clear who is not living up to expectations. But, as Tomas Vondra pointed out, there are a lot of subjective questions about what constitutes a review, equivalent complexity, and more.
Even if the rule were fully observed, though, it seems unlikely that the problem could go away. Many patches need input from multiple reviewers; they may also go through the review process many times with changes in response to feedback in between. Thus, the number of needed reviews is sure to exceed the number of submitted patches by a significant margin.
Bruce Momjian suggested that part of the problem, in the last year at least, is the complete lack of in-person developer meetings. Greg Stark agreed: "Every year there are some especially contentious patches that don't get dealt with until the in-person meeting which pushes people to make a call". He also noted, though, that the number of patches in this category isn't sufficient to explain the size of the backlog. Michael Banck suggested holding virtual meetings to make decisions on patches; there are few developers out there who are clamoring for more online meetings, but enough might be convinced to attend one to make some progress possible.
One thing that almost everybody seemed to agree on is that many of the patches that slide from one commitfest to another simply should not be there. According to Tom Lane:
As a community, we don't really have the strength of will to flat-out reject patches. I think the dynamic is that individual committers look at something, think "I don't like that, I'll go work on some better-designed patch", and it just keeps slipping to the next CF.
Robert Haas expressed a similar point of view:
I think [commitfest managers] have been pretty timid about bouncing stuff that isn't really showing signs of progress. If a patch has been reviewed at some point in the past, and say a month has gone by, and now we're beginning or ending a CommitFest, the patch should just be closed. Otherwise the CommitFest just fills up with clutter.
Improving the project's ability to close dead-end patches might not be easy. Arguably, it is up to the commitfest manager to make that decision; Riggs suggested that the manager only be allowed to defer consideration on 50% of the submitted patches. But Lane (in the message quoted above) noted that only managers who are "assertive enough and senior enough" have been able to "kill off patches that didn't look like they were going to go anywhere" and said that perhaps this should not be the commitfest manager's job in any case. Certainly disappointing dozens of submitters over the course of a month — and, undoubtedly, hearing their thoughts on the matter — is not going to make the job of commitfest manager more appealing.
One other idea that came up a few times was to place a limit on the number of commitfests that a patch could be allowed to slide through before being removed. This approach has the advantage of being relatively automatic and objective; nobody would have to step up as the evil maintainer who decided to reject a whole pile of patches. It would also bring a natural end to patches that nobody can find a way to care about.
The conversation wound down without reaching any solid conclusions. Perhaps this issue, too, will be deferred to the next commitfest for a decision. But the discussion may at least motivate some developers to put more time into cleaning out the queue and reducing the backlog — in the short term, at least. In the longer term, the PostgreSQL project will have to continue coping with a shortage of reviewers, just like many other projects.
Page editor: Jonathan Corbet
Inside this week's LWN.net Weekly Edition
- Briefs: Asahi Linux; Debian 11; Git 2.33; Go 1.17; KDE Gear 21.08; eBPF Foundation; Quotes; ...
- Announcements: Newsletters; conferences; security updates; kernel patches; ...