
Leading items

Welcome to the LWN.net Weekly Edition for July 1, 2021

This edition contains the following feature content:

  • Mozilla Rally: trading privacy for the "public good": a new Mozilla platform for donating browsing data to academic research.
  • An unpleasant surprise for My Book Live owners: end-of-life Western Digital NAS devices are remotely wiped.
  • Spectre revisits BPF: a new speculative-execution vulnerability, fixed in 5.13-rc7.
  • Suppressing SIGBUS signals: the proposed MAP_NOSIGBUS mmap() option.
  • Some 5.13 development statistics: where the code merged for 5.13 came from.

This week's edition also includes these inner pages:

  • Brief items: Brief news items from throughout the community.
  • Announcements: Newsletters, conferences, security updates, patches, and more.

Please enjoy this week's edition, and, as always, thank you for supporting LWN.net.

Comments (none posted)

Mozilla Rally: trading privacy for the "public good"

By Jake Edge
June 30, 2021

A new project from Mozilla, which is meant to help researchers collect browsing data, but only with the informed consent of the browser-user, is taking a lot of heat, perhaps in part because the company can never seem to do anything right, at least in the eyes of some. Mozilla Rally was announced on June 25 as a joint venture between the company and researchers at Princeton University "to enable crowdsourced science for public good". The idea is that users can volunteer to give academic studies access to the same kinds of browser data that are being tracked in some browsers today. Whether the privacy safeguards are strong enough—and if there is sufficient reason for users to sign up—remains to be seen.

Studies

The underlying theme of being able to control who is able to access your data, coupled with using it for the "public good", may well resonate with some people. The initial study that is available for participation by Rally users is "Political and COVID-19 News". It is being run by Princeton University’s Center for Information Technology Policy under the auspices of professor Jonathan Mayer, who also helped develop the Rally platform. The goals of the study are interesting and any conclusions that it draws could potentially be quite helpful for fighting the problems of misinformation on the net:

This study will help us understand how web users encounter, consume, and share news online about politics and COVID-19. There are a variety of sources for information on these topics: some authoritative and trustworthy, and some not. We hope the study can inform efforts to help users distinguish trustworthy and untrustworthy content.

As might be expected for a Mozilla project, Rally is integrated with the Firefox browser; it is an optional add-on that can easily be installed from a button on the Rally home page. Doing so brings up a list of various permissions the add-on needs in order to function, which can be reviewed in the "about:addons" page after it is installed. There is also an extensive privacy policy that must be agreed to; it outlines the kinds of data that can be collected (which includes things like geographic location, demographic information, and the hardware/software platform), how it can be used, and who it can be shared with. The policy outlines the big picture, while each individual study will further narrow down its data needs and plans.

Rally is only available to Firefox users in the US who are at least 19 years of age. Neither the age nor location requirements are enforced in any obvious way; no problem was encountered installing the add-on from a non-US location. There is a page that asks for demographic information (age, gender, race, education, income, and US zip code), but it is optional. It may be that certain studies will not use any data shared without corresponding demographic information, however.

The announcement talks about academic studies, mentioning the COVID-19 study and an upcoming "Beyond the Paywall" study in conjunction with the Stanford University Graduate School of Business. The latter sounds like it could also be useful, especially to newspapers, magazines, and internet news outlets:

It aims to better understand news consumption, what people value in news and the economics that could build a more sustainable ecosystem for newspapers in the online marketplace.

Beyond those two academic studies, the Rally team has its own "Your Time Online and 'Doomscrolling'" study that is currently running. It is meant to better understand "how our community browses the internet, and how these browsing dynamics differ across segments of people". The descriptions of the kinds of information that will be gathered, and of what might be done with it, are particularly concerning from a privacy perspective, however. In truth, those who are highly privacy-conscious are likely to find much that is worrisome in the descriptions of both of the current studies.

The study descriptions do try to allay some of the fears that potential volunteers might have, though. The "How We Protect You" section in the information pages of the studies is meant to clarify what is being done with the data and the protections being placed on it. The "How Rally Works" page is also geared toward reassuring the privacy-conscious user. By the sound of it, Rally is taking extraordinary care; it is only collecting what it needs (and has specified), is encrypting the data from the browser all the way to an offline analysis environment, and is limiting access to the data to only those who are working on the study.

On the other hand, the Rally add-on can run in private-browsing windows, which was not apparent when it was being installed. The two current studies explicitly state that they will not collect data from private-browsing sessions, which leaves open the possibility that others will collect that data down the road. That may also be of concern to the privacy-conscious, though, of course, anyone using the private-browsing feature is pretty obviously conscious of their privacy to some extent.

In general, though, Rally is protecting its data far more carefully than the advertising networks and other user-tracking organizations are doing with their data. Many who have commented about the project, here and at other sites, seem to mistrust Mozilla's commitment to privacy and some see Rally as a mechanism for the company to generate income. Research organizations might be willing to pay for the privilege of using the platform, but it does not seem likely to be a huge income stream, especially once Rally rolls out on other browsers (as is mentioned in the FAQs and elsewhere).

Filling in details

One question might be: why would anyone want to sign up? Those who are privacy-conscious may well not be interested in allowing any access to their data, while those who are not seem rather unlikely to go out of their way to install the add-on—if they are even using Firefox at all. It essentially comes down to the whole "research for the public good" theme, and whether enough people, whatever their privacy inclinations, will care enough to forgo some of their privacy and install the add-on in order to foster it.

Over at Hacker News, Mayer has been commenting to try to answer some of the questions and concerns posted in a thread about the announcement. The "why?" question came up there, and Mayer tried to clarify what he and others are after:

The motivation is enabling crowdsourced scientific research that benefits society. [...] There are many research questions at the intersection of technology and society where conventional methods like web crawls, surveys, and social media feeds aren't sufficient. That's especially true for platform accountability research; the major platforms have generally refused to facilitate independent research that might identify problems, and platform problems often involve targeting and personalization that other methods can't meaningfully examine.

One topic in particular that might be of interest to potential volunteers is "platform accountability". The large social-media platforms have often come under scrutiny for their behavior—and its effect on users—but there is no way to gather data on that except from within the browsers of users of those sites. As with many other commenters, "Yaina" lamented that the announcement did not specify the problem being solved very well. Yaina noted that the big internet companies can already do these kinds of studies, but that others are left out:

This is a luxury many researchers that work outside of these big tech companies don't have, which creates a scientific power imbalance. Mozilla Rally is meant to give these capabilities to everyone, and the platform is meant to ensure that you always know what you sign up for and what data is being used.

If I understand the Princeton example correctly: They want to figure out how people consume and spread misinformation. Social networks like Facebook have all that data but won't share it. Now you can opt-in to a Rally study where independent researchers can examine the data.

Mayer largely agreed with that characterization, though the imbalance is more far-reaching:

The power imbalance goes far beyond science. Independent research is foundational for platform accountability. An example: when I was working on the Senate staff, before I started teaching at Princeton, a recurring challenge was the lack of rigorous independent research on platform problems. We were mostly compelled to rely on anecdotes, which made oversight and building a factual record for legislation difficult.

There is, of course, something of a self-selection bias at work among Rally users. If all of the participants have to know about the project, believe in its goals, and be willing to donate their data even though it reduces their privacy to a certain extent, they may well not reflect a cross-section of the browser-using public. Mayer addressed that issue as well:

The Rally participant population is not representative of the U.S. population—these are users who run Firefox (other browsers coming soon), choose to join Rally, and choose to join a study. In research jargon, there's significant sampling bias.

For some studies, that's OK, because the research doesn't depend on a representative sample. For other studies, researchers can approximate U.S. population demographics. When a user joins Rally, they can optionally provide demographic information. Researchers can then use the demographics with reweighting, matching, subsampling, and similar methods to approximate a representative population. Those methods already appear throughout social science; whether they're sufficient also depends on the study.

Part of the difficulty in the messaging around a project like Rally is that it has so many moving parts, and that different kinds of users will need different areas of emphasis before it becomes clear to them. It is a project that sits at a particularly uncomfortable intersection of concerns—or the lack thereof. The lack of any real tangible benefit from joining up is problematic as well. "For the good of society" has a nice ring to it, but it is terribly difficult to quantify.

If Mozilla were a different kind of company, one could imagine it gathering this kind of information without any kind of uproar from the social-network-using folks who seem utterly unconcerned with the massive privacy invasions those kinds of sites routinely perform. But Mozilla is not that kind of organization, so it needs to convince those who do not really seem to care about privacy much to care enough to install the add-on, while not excessively irritating the more tech-savvy users who get up in arms about even the smallest loss of private data. It is a hard balance to find.

Given all that, it is a little hard to see Rally being a huge success. There are certainly perfectly reasonable concerns about gathering this kind of data, storing it, dealing with governments that want access to it, and so on. The privacy-savvy may well skip over Rally for its real or perceived shortcomings, while the vast majority of folks may either never hear of it or pay it no attention whatsoever. That is somewhat sad, perhaps, at least to those who can see value in the kinds of studies (and platform oversight) that Rally data-gathering would enable. It will be interesting to see what comes of it.

Comments (54 posted)

An unpleasant surprise for My Book Live owners

By Jake Edge
June 29, 2021

Embedded devices need regular software updates in order to even be minimally safe on today's internet. Products that have reached their "end of life", thus are no longer being updated, are essentially ticking time bombs—it is only a matter of time before they are vulnerable to attack. That situation played out in June for owners of Western Digital (WD) My Book Live network-attached storage (NAS) devices; what was meant to be a disk for home users accessible via the internet turned into a black hole when a remote command-execution flaw was used to delete all of the data stored there. Or so it seemed at first.

Missing data

The first indication of the problem came in a June 23 post to the WD support forums by user "sunpeak" about a now-empty My Book Live device ("somehow all the data on it is gone today"), though the 2TB device had been nearly full before that. Sunpeak also reported that the administrative password had been changed so they could not log into the device. It was not long before others added their stories of woe to the thread. In the early going, there was concern that WD had released some kind of firmware update that caused this behavior, but it turns out that those devices have had no updates for quite some time at this point.

Various posters in the thread dug out the logs from their devices to see what they could determine. There were reports that some of the devices had been reset to the factory settings via the factoryRestore.sh script, for unknown reasons, but those reports also said that the default "admin" username (with the same password) did not work. Eventually, "t4thfavor" strongly suggested removing My Book Live devices from the internet by way of a firewall—or simply pulling the Ethernet cable entirely. That good advice was echoed by sunpeak and others in the thread.

Not long after that, WD posted a security bulletin to the support forum with effectively the same advice. Both that post and the more formal WDC-21008 security bulletin were quick to point out that these devices were introduced in 2010 and stopped receiving updates in 2015. The WDC-21008 bulletin pointed to CVE-2018-18472, though no context was given. Looking at the CVE provides some missing context, though:

Western Digital WD My Book Live and WD My Book Live Duo (all versions) have a root Remote Command Execution bug via shell metacharacters in the /api/1.0/rest/language_configuration language parameter. It can be triggered by anyone who knows the IP address of the affected device, as exploited in the wild in June 2021 for factory reset commands.

Clearly the CVE description has been recently updated. But the 2018 date in the CVE number is telling; this flaw has been known for three years or so at this point. It was originally reported in a blog post at WizCase that offered much the same advice about removing the device from the internet. As shown by the proof of concept (PoC) in that post (and this report with a clearer PoC), simply tacking a command in backticks (e.g. `whoami`) onto the data sent with an HTTP PUT request to the configuration URL will cause the command to be executed with root privileges. Backticks are used in various languages (e.g. Unix shells, PHP, Perl) to execute operating-system commands; the device's interface is written in PHP and shell scripts, so it seems clear that the input provided in the PUT is not being sanitized correctly.
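As a rough illustration of the bug class, here is a sketch in C (rather than the device's actual PHP and shell code) with entirely hypothetical names: an interface that pastes an unsanitized request parameter into a shell command will execute anything wrapped in backticks.

    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical sketch of the vulnerability class: "lang" arrives
       straight from an HTTP PUT and is never sanitized. */
    static void set_language(const char *lang)
    {
        char cmd[256];

        /* BUG: attacker-controlled text is pasted into a shell command;
           the shell expands `whoami` (or anything else) with the
           privileges of the server process. */
        snprintf(cmd, sizeof(cmd), "set-language.sh %s", lang);
        system(cmd);
    }

    int main(void)
    {
        set_language("en_US`whoami`");  /* simulated malicious payload */
        return 0;
    }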

But wait ...

On June 29, the picture got rather murkier. Ars Technica reported on some research it had done on the attacks in collaboration with Derek Abdine, CTO at security firm Censys. It turns out that there is a second, previously unknown flaw in the NAS devices: there is a way to do a factory reset through the configuration interface without providing a password. In fact, the relevant source code has the password checks commented out; anyone who knows how to format the XML-based request can wipe any My Book Live just by knowing its IP address.

There is also evidence that at least two attackers are at work here—and that they aren't working together. As Abdine described, it would seem that CVE-2018-18472 had been in use for some time, adding the devices to a botnet (possibly Linux.Ngioweb). The just-discovered factory-reset flaw (which does not yet have a CVE number) was only recently used, perhaps as a way to destroy or disrupt the botnet. Whatever the reason, though, exploiting that flaw and wiping the user data on the NAS is what brought the whole situation to light.

[Update: WD has put out more information about the factory-reset flaw, which it said is due to a botched refactoring effort. The bug has been assigned CVE-2021-35941. In addition, WD is offering data-recovery services for those who lost data.]

The configuration endpoint that was vulnerable to the original command-execution flaw (language_configuration.php) was being modified on devices that were being attacked that way. A password test was added so that only the original attacker could further exploit that particular flaw; a SHA-1 hash of the password is used in the modified version of language_configuration.php that has been recovered. As noted in both reports, though, the attacker apparently did not know that the parameters sent to the device's interface can be logged, so at least one of the "secret" passwords used by the attacker is now known. It was written, in plaintext, to a log on the device.

While "rival attackers" is only a theory, it makes sense that the botnet controller would have no need (or interest) in causing the factory reset. After all, they had full control of the system and could make it do whatever they wanted (including wiping the disks if that was somehow useful to them). All that the factory reset did was draw attention to the devices, leading to the exposure of the flaws and, thus, curtailing future My Book Live exploits.

Original response

At some point after the WizCase post in 2018, WD responded to it with much the same information as was in its recent responses. But in part of its response, which seems geared toward covering its ass more than anything else, it described the products in a way that may seriously irritate the owners of these NAS devices:

We encourage users who wish to continue operating these legacy products to configure their firewall to prevent remote access to these devices, and to take measures to ensure that only trusted devices on the local network have access to the device.

Calling these devices "legacy products" obviously reflects WD's level of interest in them, but it probably does not mirror the opinions of most folks who bought them. Turning off the security-update spigot around a year after the product was discontinued seems fairly short-sighted, especially for a system that was touted as one that can be connected to the internet in order to "securely access your media and files anywhere in the world". At some point, the company realized that those devices should not be connected to the internet, but did not make an update, nor, seemingly, raise the profile of this problem so that users could protect themselves.

While the product life cycle may be long finished from the perspective of WD, the devices are still available from outlets like Amazon. Anyone who buys one today might be forgiven for thinking it is still supported. A NAS device is not a cell phone or other consumer-electronics gizmo that might be shunted aside for the latest thing in relatively short order; one might well expect to set up a home NAS and have it running for years—or even a decade or more. One hopes that those who do set up such a device also have another backup strategy to go with it, however.

It is, as a number of people have observed, fairly surprising that it took this long for the CVE-2018-18472 vulnerability to be exploited destructively; indeed, the recent research shows that it was being quietly used well before the June attacks. The exploit is trivially easy to perform and it provides full access to what would seem to be fairly high-value data. These devices would make for prime ransomware targets, one would think, even if the most recent attackers were perhaps just digital vandals.

One way to route around device makers and their arbitrary life-cycle decisions would be to create and maintain an alternate firmware for the device. It is, after all, simply a Linux system under the covers. There is some information on the WD support site about how to build and install custom firmware, but there does not seem to be an active existing project for My Book Live. Firmware based on free software would at least be possible to fix, of course, even in the absence of a project keeping things up to date.

Device owners need to be extremely careful with the internet access they provide to the gadgets that they buy. That's easy to say, but can be hard (or impossible) to do in a world where everything from shoes to light bulbs comes equipped with some kind of whiz-bang feature that requires internet access. Makers of devices that are attacked rarely suffer anything more than a bit of negative press—and that only briefly. Under those conditions, is it any real surprise that people can lose all of their important data, possibly via a vulnerability that has been public for years?

Comments (70 posted)

Spectre revisits BPF

By Jonathan Corbet
June 24, 2021
It has been well over three years now since the Spectre hardware vulnerabilities were disclosed, but Spectre is truly a gift that keeps on giving. Writing correct and secure code is hard enough when the hardware behaves in predictable ways; the problem gets far worse when processors can do random and crazy things. For an illustration of the challenges involved, one need look no further than the BPF vulnerability described in this advisory, which was fixed in the 5.13-rc7 release.

Attacks on Spectre vulnerabilities generally rely on convincing the processor to execute, in a speculative mode, a sequence of operations that cannot happen in real execution. A classic example is an out-of-range array reference that is executed speculatively even though the code performs a proper bounds check first. The erroneous access will be backed out once the processor figures out that it mispredicted the result of the bounds check, but the speculative access will leave traces in the memory caches that can be used to exfiltrate data.
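The canonical form of such a gadget, rendered here as a purely illustrative C sketch with made-up array names, might look like this:

    #include <stddef.h>
    #include <stdint.h>

    uint8_t array1[16];
    uint8_t array2[256 * 512];
    size_t array1_size = 16;

    uint8_t victim(size_t x)
    {
        if (x < array1_size)                 /* proper bounds check */
            return array2[array1[x] * 512];  /* may still execute
                                                speculatively with an
                                                out-of-range x */
        return 0;
    }

If the branch predictor guesses that the check will pass, both loads can execute before the comparison resolves; the cache line touched in array2 then encodes the value of array1[x], which can be recovered with timing measurements.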

The BPF virtual machine has always been an area of special concern when it comes to defending against speculative-execution attacks. Most such attacks rely on finding a fragment of kernel code that can be made to do surprising things when the CPU is executing speculatively; kernel developers duly have made a concerted effort to eliminate such fragments. But BPF exists to enable the loading of code from user space that runs within the kernel context; that allows attackers to craft their own code fragments and avoid the tedious task of combing through the kernel code.

Much work has been done in the BPF community to frustrate those attackers. For example, array indexes are ANDed with a bitmask so that they cannot reach outside of the array even speculatively, regardless of what value they may contain. But it can be hard to anticipate every case where the processor may do something surprising.
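A sketch of that masking idea, assuming a power-of-two array size (the verifier's real transformation is more involved than this):

    #include <stddef.h>
    #include <stdint.h>

    #define ARRAY_SIZE 16  /* must be a power of two for masking to work */

    uint8_t safe_load(const uint8_t *array, size_t index)
    {
        /* Even a speculatively executed load cannot escape the array:
           the AND confines the index no matter what value it held. */
        index &= ARRAY_SIZE - 1;
        return array[index];
    }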

The vulnerability

Consider, for example, the following fragment of code, taken directly from this commit by Daniel Borkmann fixing this vulnerability:

    // r0 = pointer to a map array entry
    // r6 = pointer to readable stack slot
    // r9 = scalar controlled by attacker
    1: r0 = *(u64 *)(r0) // cache miss
    2: if r0 != 0x0 goto line 4
    3: r6 = r9
    4: if r0 != 0x1 goto line 6
    5: r9 = *(u8 *)(r6)
    6: // leak r9

Incidentally, the changelog for this patch is an outstanding example of how to document a vulnerability and its fix; it's worth reading in full.

In normal (non-speculative) execution, the above code looks like it has a problem. The register r9 contains an attacker-supplied value; that value is assigned to r6 in line 3, which is then used as a pointer in line 5. That value could point anywhere in the kernel's address space; this is just the sort of unconstrained access that the BPF verifier was designed to prevent, so one might think that this code would never be accepted by the kernel in the first place.

The verifier, though, works by exploring all of the possible paths that execution of a BPF program could take. In this case, there is no possible path that executes both lines 3 and 5. The assignment of the attacker-supplied pointer only happens if r0 contains zero, but that value will prevent the execution of line 5. The verifier thus concludes that there is no path that can result in the dereferencing of a user-supplied pointer and allows the program to be loaded.

But that verification runs in the real world; different rules apply in the speculative world.

Line 1 in the above code fragment references memory that an attacker will have taken pains to ensure is not currently cached, forcing a cache miss. Rather than wait for memory to fetch the value, though, the processor will continue speculatively, making guesses about how any conditional statements involving r0 will play out. And those guesses, as it turns out, could well be that neither if condition (in line 2 or 4) will evaluate true and, thus, neither jump will be taken.

How can that be? Branch prediction doesn't work by guessing a value for r0 and checking the result; it is, instead, based on what the recent history of that particular branch has been. That history is stored in the CPU's "pattern history table" (PHT). But the CPU cannot possibly track every branch instruction in a large program, so the PHT takes the form of a hash table. An attacker can locate code in such a way that its branches land in the same PHT entries as the branches in the crafted BPF program, then use that code to train the branch predictor to make the desired guesses.

Once the attacker has loaded the code, cleared out the caches, and fooled the branch predictor into doing silly things, the battle is over; the CPU will speculatively reference the attacker-supplied address. Then it's just a matter of leaking the results in any of the usual ways. It is a bit of a tedious process — but computers are good at following such processes without complaining.

It is worth noting that this is not a hypothetical attack. According to the advisory, multiple proofs-of-concept were sent to the security@kernel.org list when this problem was reported. Some of them do not require the step of training the branch predictor (one such is provided in the above-linked commit). These attacks can read out any memory in the kernel's address space; given that all of physical memory is contained therein, there are no real limits to what can be exfiltrated. Since unprivileged users can load a few types of BPF programs, root access is not needed to carry out this attack. This is, in other words, a serious vulnerability.

Closing the hole

The fix in this case is relatively straightforward. Rather than prune paths that the verifier "knows" will not be executed, the verifier will simulate them speculatively. So, for example, when checking the path where r0 is zero, the unfixed verifier would simply conclude that the test in line 4 must be true and not consider the alternative. With the fix, the verifier will look at the false path (which includes line 5), conclude that an unknown pointer is being used, and prevent the loading of the program.

This change has the potential to block the loading of correct programs that could be run before, though it is hard to imagine real-world, non-malicious code that would include this kind of pattern. It will, of course, slow the verification process by forcing it to examine paths that cannot occur in normal program execution, but that's the speculative world we live in.

This fix was merged into the mainline and can be found in the 5.13-rc7 release. It has since found its way into the 5.12.13 and 5.10.46 stable updates, but not (yet) into any of the earlier stable releases. With this change, those kernels are protected against yet another Spectre vulnerability, but it would be foolhardy to assume that this is the last one.

Comments (13 posted)

Suppressing SIGBUS signals

By Jonathan Corbet
June 25, 2021
The mmap() system call creates a mapping for a range of virtual addresses; it has a long list of options controlling just how that mapping should work. Ming Lin is proposing the addition of yet another option, called MAP_NOSIGBUS, which changes the kernel's response when a process accesses an unmapped address. What this option does is relatively easy to understand; why it is useful takes a bit more explanation.

Normally, when a process performs an operation involving memory, it expects the desired data to be read from or written to the requested location. Sometimes, though, things can go wrong, resulting in the delivery of a fatal (by default) signal to the process. A "segmentation violation" (SIGSEGV) signal is generated in response to an attempt to access a valid memory address in a way that is contrary to its protection — writing to read-only memory, for example. Attempting to access an address that is invalid, instead, results in a "bus error" (SIGBUS). Bus errors can be provoked in a number of ways, including using an improperly aligned address or an address that is not mapped at all. If a process uses mmap() to create a mapping that extends beyond the end of the backing file, attempts to access the pages past the end of the file will result in SIGBUS signals.

If, however, a memory range has been mapped with the proposed MAP_NOSIGBUS flag, SIGBUS signals will no longer be generated in response to an invalid address that lies within the mapped area. Instead, the guilty process will get a new page filled with zeroes. If the mapped area is backed by a file on disk, the new page will not be added to that file. To a first approximation, the new option simply makes SIGBUS signals go away, with the process never even knowing that it had tried to access an invalid address.

OK...but why?

This behavior may seem like a strange thing to want. One would not normally expect a mapped area to contain invalid addresses within it, and one ordinarily wants to know if a program is generating and using invalid addresses. As it happens, mapped areas can contain invalid addresses in one normal use case: if that area is mapping a file, and it extends beyond the end of the file on disk. Attempts to access pages beyond the end of the file will generate a SIGBUS signal; this situation can be avoided by extending the file before attempting to access it through the mapping.

MAP_NOSIGBUS is explicitly incompatible with that way of working, though; since the zero-filled pages that it creates in response to invalid addresses are not connected to the backing file, it makes extending the file without redoing the mapping impossible. Instead, this option exists to address another problem: graphical clients that can, accidentally or intentionally, cause a compositor to crash.

Graphical applications often have to communicate large amounts of data to the compositor. An efficient way of doing this can be to map a file and pass a descriptor to the compositor; that file (which can live in a memory-only filesystem) becomes a shared-memory segment between the two processes. If, however, the client process then calls ftruncate() to shorten the file, the result is a mapping (in the compositor) that extends beyond the end of that file. If the compositor tries to access the shared-memory segment beyond the new end of the file, it will get a SIGBUS signal; in the absence of measures taken to the contrary, that will cause the compositor to crash, which is the sort of thing that user-experience developers usually make at least a modest effort to avoid. The SIGBUS signal can be caught and handled in the compositor, but that can be complex and hard to get right.
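A minimal sketch of the hazard, with an arbitrary file name and no error checking: the "client" shrinks a shared file after the "compositor" has mapped it, and the next access past the new end of the file raises SIGBUS.

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
        long page = sysconf(_SC_PAGESIZE);
        int fd = open("shared.dat", O_RDWR | O_CREAT | O_TRUNC, 0600);

        ftruncate(fd, 2 * page);      /* file covers the whole mapping */
        char *map = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE,
                         MAP_SHARED, fd, 0);

        ftruncate(fd, page);          /* the "client" shrinks the file */
        printf("%d\n", map[page]);    /* second page is gone: SIGBUS */
        /* With the proposed MAP_NOSIGBUS (which, in the current patches,
           requires MAP_PRIVATE), the access would instead see a
           zero-filled page. */
        return 0;
    }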

As Simon Ser, who works on Wayland compositors, noted back in April, there is another mechanism for passing data between the two processes: the memfd abstraction. A memfd can be "sealed", meaning that the creator cannot shrink it as described above (or, indeed, change it at all); the recipient, knowing that the segment will not change unexpectedly, can access it safely. But, as Ser points out, no compositor requires the use of sealed memfds because there are clients that are unwilling or unable to use them. So compositors must either jump through the SIGBUS-handling hoops or risk filling the disk with embarrassing core dumps.
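For comparison, a sketch of the sealing approach on the client side; the buffer name is illustrative. Once F_SEAL_SHRINK has been applied, the recipient can check the seals with fcntl(fd, F_GET_SEALS) and map the file knowing that it cannot shrink underneath the mapping.

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int make_sealed_buffer(size_t size)
    {
        int fd = memfd_create("shm-buffer", MFD_ALLOW_SEALING);

        ftruncate(fd, size);
        /* Forbid shrinking the file, and forbid removing that seal. */
        fcntl(fd, F_ADD_SEALS, F_SEAL_SHRINK | F_SEAL_SEAL);
        return fd;  /* hand this descriptor to the compositor */
    }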

But if the compositor could map a segment in a way that wouldn't create SIGBUS signals on invalid addresses, this whole problem would go away. Ser suggested looking at the __MAP_NOFAULT flag supported by OpenBSD as a possible solution. At the beginning of June, Lin responded with an implementation of MAP_NOSIGBUS, which differs from __MAP_NOFAULT in a number of ways. The initial implementation only worked for the in-memory tmpfs filesystem, but Hugh Dickins objected, saying that it should apply to any mapping; the second (and current) revision reflects that criticism and works regardless of the backing store behind a mapping.

Limitations

One significant limitation of the current implementation is that it only works for MAP_PRIVATE mappings — that seems like it could be a fatal flaw for a mechanism that is meant for use with mappings shared between clients and a compositor. But, as Ser explained, private mappings will work in almost all cases; since the data transfer is one-way from the client to the compositor, the mapping can be read-only on the compositor side. The big exception is screen capture, which will still have to be handled specially as long as shared mappings are not supported. So the solution is not complete, but 90% is a big step in the right direction.

The second version of the patch set has seen relatively little discussion; it seems that the developers who care about it are relatively happy with its current condition (though Kirill Shutemov was heard to grumble a bit about "one-user features"). There are never any guarantees, but there does seem to be a reasonable chance that this change could be merged as early as the 5.14 release.

Comments (54 posted)

Some 5.13 development statistics

By Jonathan Corbet
June 28, 2021
As expected, the 5.13 development cycle turned out to be a busy one, with 16,030 non-merge changesets being pulled into the mainline over a period of nine weeks. The 5.13 release happened on June 27, meaning that it must be time for our traditional look at the provenance of the code that was merged for this kernel.

In terms of changeset counts, 5.13 was not the busiest development cycle ever; that record still belongs to 5.8, with 16,306 changesets merged; indeed, 5.10 (16,174) was also busier. But 5.13 did set a record by including the work of 2,062 developers — the first time more than 2,000 developers have participated in a single release cycle. Of those developers, 329 contributed their first patch to the kernel in this cycle, a number that just matches the previous record set by 4.12.

The most active developers this time were:

Most active 5.13 developers

  By changesets
    Lee Jones                 259   1.6%
    Fabio Aiuto               196   1.2%
    Marco Cesati              190   1.2%
    Sean Christopherson       184   1.1%
    Pierre-Louis Bossart      180   1.1%
    Bhaskar Chowdhury         175   1.1%
    Christoph Hellwig         146   0.9%
    Johan Hovold              142   0.9%
    Christophe Leroy          142   0.9%
    Pavel Begunkov            135   0.8%
    Andy Shevchenko           131   0.8%
    Colin Ian King            117   0.7%
    Masahiro Yamada           105   0.7%
    Jiapeng Chong              99   0.6%
    Krzysztof Kozlowski        96   0.6%
    Laurent Pinchart           96   0.6%
    Chuck Lever                93   0.6%
    Vladimir Oltean            90   0.6%
    Hans de Goede              89   0.6%
    Arnd Bergmann              89   0.6%

  By changed lines
    Hawking Zhang          125087  15.7%
    Greg Kroah-Hartman      22500   2.8%
    Jiri Slaby              12082   1.5%
    Fabio Aiuto             10375   1.3%
    Dmitry Baryshkov         9561   1.2%
    Robert Foss              8126   1.0%
    Christoph Hellwig        7406   0.9%
    Thomas Zimmermann        7335   0.9%
    Mickaël Salaün           6912   0.9%
    Álvaro Fernández Rojas   6597   0.8%
    Steen Hegelund           6438   0.8%
    Christophe Leroy         6336   0.8%
    Thomas Bogendoerfer      6280   0.8%
    Dexuan Cui               6170   0.8%
    Wu XiangCheng            6064   0.8%
    Ido Schimmel             5662   0.7%
    Dave Airlie              5550   0.7%
    Maximilian Luz           5392   0.7%
    Qi Zhang                 5381   0.7%
    Sean Christopherson      5348   0.7%

Lee Jones, once again, contributed more changesets than anybody else; that work continues to focus on cleanups and removal of warnings. Fabio Aiuto and Marco Cesati (among others) were part of what appears to be an organized effort to get the rtl8723bs wireless network driver out of staging; no fewer than 26 developers contributed 450 patches to this driver for 5.13. Sean Christopherson continues to massively rework the KVM subsystem, and Pierre-Louis Bossart made a lot of cleanups to the sound subsystem.

The 125,000 lines of code added to the kernel by Hawking Zhang are, of course, more amdgpu header files; there are now almost 2.4 million lines of code under drivers/gpu/drm/amd/include. Greg Kroah-Hartman removed an unloved staging driver and reverted a lot of patches as the result of the UMN patch review. Jiri Slaby removed a number of old TTY drivers, and Dmitry Baryshkov refactored a number of clock and DRM drivers.

Work on 5.13 was supported by a minimum of 232 employers, the most active of which were:

Most active 5.13 employers

  By changesets
    Intel                    1602  10.0%
    (Unknown)                1163   7.3%
    Huawei Technologies      1038   6.5%
    Red Hat                   951   5.9%
    (None)                    943   5.9%
    Linaro                    919   5.7%
    Google                    785   4.9%
    AMD                       774   4.8%
    NVIDIA                    492   3.1%
    (Consultant)              463   2.9%
    Facebook                  444   2.8%
    SUSE                      374   2.3%
    IBM                       333   2.1%
    NXP Semiconductors        310   1.9%
    Oracle                    305   1.9%
    Arm                       240   1.5%
    Code Aurora Forum         224   1.4%
    Canonical                 218   1.4%
    (Academia)                215   1.3%
    Renesas Electronics       211   1.3%

  By lines changed
    AMD                    160545  20.2%
    Intel                   62939   7.9%
    (None)                  41379   5.2%
    Linaro                  41015   5.2%
    Red Hat                 39393   4.9%
    SUSE                    29597   3.7%
    (Unknown)               29161   3.7%
    Google                  25565   3.2%
    NVIDIA                  25088   3.2%
    Linux Foundation        23455   2.9%
    NXP Semiconductors      18165   2.3%
    Huawei Technologies     18069   2.3%
    Facebook                17410   2.2%
    (Consultant)            16776   2.1%
    Microsoft               15653   2.0%
    IBM                     14341   1.8%
    Realtek                 12709   1.6%
    MediaTek                12238   1.5%
    Microchip Technology    10593   1.3%
    Arm                      9464   1.2%

As usual, there are few surprises here.

Of course, companies don't write patches, developers do. Many companies put significant effort into hiring community developers, but where do those developers come from in the first place? A little bit of light can be cast onto this question by looking at who developers are working for when they get their first patch into the kernel. One might expect that developers start as volunteers, proving that they can do kernel work before being paid to do it, and indeed many kernel developers begin that way. But others are already on the job when that first patch lands.

In the case of 5.13, 150 of the 329 first-time contributors were on the job from the beginning. The companies and other organizations that employed at least two first-time kernel contributors were:

Employers of first-time contributors

    Company                    Developers
    Huawei Technologies            30
    AMD                            16
    Intel                          12
    Google                         11
    Samsung                         6
    MediaTek                        5
    Code Aurora Forum               4
    IBM                             4
    Microchip Technology Inc.       3
    Microsoft                       3
    Cirrus Logic                    2
    Red Hat                         2
    Habana Labs                     2
    Facebook                        2
    NXP Semiconductors              2
    NVIDIA                          2
    ZTE Corporation                 2

That leaves 179 first-time contributors, two of whom were Outreachy interns and two of whom were known to be working on their own time. If one assumes that most (but not all) of the rest of the unknowns are also volunteers, the logical conclusion is that at least half of our first-time contributors did their work as part of their job. That suggests that some companies, at least, are working to bring new developers into the kernel community.

As for what those first-time developers were doing, these are the directories most often touched by first-time patches:

    Directory          Patches
    drivers/staging       49
    drivers/net           27
    Documentation         21
    drivers/gpu           21
    net                   17
    include               16
    sound                 15
    tools                 13
    arch/arm              11
    drivers/hid           10

The staging tree is the most popular place for a first-time patch, unsurprisingly. The networking core or GPU drivers are less obvious places for an aspiring kernel developer to start, though; that may well be the sort of place where developers who are learning on the job make their start.

In summary: the kernel community continues to merge patches and make releases at an impressive pace. For all of the challenges that new developers must overcome, the community is gaining more developers than ever before. Things, it seems, are not going all that badly. As of this writing, there are nearly 12,500 patches waiting in linux-next; that is a big pile, but still 1,000 fewer than were queued for 5.13. So the 5.14 cycle may be slower than 5.13 — but only a little bit.

Comments (3 posted)

Page editor: Jonathan Corbet


Copyright © 2021, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds