
Leading items

Welcome to the LWN.net Weekly Edition for May 16, 2024

This edition contains the following feature content:

This week's edition also includes these inner pages:

  • Brief items: Brief news items from throughout the community.
  • Announcements: Newsletters, conferences, security updates, patches, and more.

Please enjoy this week's edition, and, as always, thank you for supporting LWN.net.

Comments (none posted)

Portable LLMs with llamafile

By Daroc Alden
May 14, 2024

Large language models (LLMs) have been the subject of much discussion and scrutiny recently. Of particular interest to open-source enthusiasts are the problems with running LLMs on one's own hardware — especially when doing so requires NVIDIA's proprietary CUDA toolkit, which remains unavailable in many environments. Mozilla has developed llamafile as a potential solution to these problems. Llamafile can compile LLM weights into portable, native executables for easy integration, archival, or distribution. These executables can take advantage of supported GPUs when present, but do not require them.

Portability

Llamafile is based on the MIT-licensed llama.cpp library, a C/C++ implementation of the code necessary to evaluate LLMs of several different architectures. Llamafile maintains its own copy of llama.cpp, with some additional MIT-licensed changes. The code for llamafile itself is made available under the Apache 2.0 license. Llamafile's value comes from building llama.cpp in a way that works seamlessly across many different environments. On April 25, Mozilla posted an update about llamafile's progress that called it "the easiest and fastest way to run a wide range of open large language models".

The lead developer of llamafile, Justine Tunney, chose to re-use her previous work on Cosmopolitan Libc. That project implements a C standard library and associated compiler infrastructure to compile programs written in C as multi-format executables that can run on Linux, macOS, Windows, several BSDs, or without an operating system.

In the blog post announcing the start of Cosmopolitan Libc, Tunney said:

I like the idea of having the freedom to write software without restrictions that transcends traditional boundaries. My goal has been helping C become a build-once run-anywhere language, suitable for greenfield development, while avoiding any assumptions that would prevent software from being shared between tech communities.

In service of that goal, Cosmopolitan Libc uses clever techniques to create executable files that can be simultaneously interpreted as several different formats. The executables it produces start with a shell script that behaves differently on different operating systems, which the program exploits to select a pre-compiled static binary suitable to the operating system and architecture at run time. These files (along with the weights of the LLM, in llamafile's case), are bundled into a zip file. Since zip files store their metadata at the end of the file, the same file can serve as shell script, executable, and zip archive.
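The "zip metadata at the end" property is easy to demonstrate. The following Python sketch (file and member names are invented for illustration) prepends a stand-in shell script to a zip archive and shows that the result is still a readable archive, because zip readers locate the central directory by scanning from the end of the file:

```python
import zipfile

# Stand-in for the polyglot's leading shell-script bytes; the real
# llamafile launcher script is far more elaborate.
with open("demo.llamafile", "wb") as f:
    f.write(b"#!/bin/sh\necho 'not really a launcher'\n")

# Opening a non-zip file in mode "a" appends a new archive after the
# existing bytes, the same trick used by self-extracting executables.
with zipfile.ZipFile("demo.llamafile", "a") as zf:
    zf.writestr(".args", "-m\nmodel.gguf\n")

# The file still starts with the script...
with open("demo.llamafile", "rb") as f:
    assert f.read(9) == b"#!/bin/sh"

# ...yet is simultaneously a valid zip archive.
with zipfile.ZipFile("demo.llamafile") as zf:
    print(zf.namelist())
```

The same file would also need valid executable headers to actually run, which is where Cosmopolitan's multi-format magic comes in; the sketch above only shows the zip half of the trick.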

This approach is of questionable use for most programs, but for Mozilla, it represents an important way to democratize access to LLMs. There are an increasing number of repositories, most prominently HuggingFace, that distribute raw LLM weights — but raw weights aren't actually enough to use the models. Users also rely on inference code and software that provides an API to actually access the results of the model. To make things worse, that inference code is often specific to a particular brand of GPU or machine-learning toolchain, which makes LLMs hard to run except in specific environments.

Llamafile applies Cosmopolitan Libc's philosophy of selecting an appropriate implementation at run time to machine learning. It automatically detects whether the user has AMD or NVIDIA's GPU toolchains available, and if so, uses those. If not, it uses a new open-source linear-algebra library called tinyBLAS. The library supports using the APIs made available by existing graphics drivers to take advantage of GPU acceleration without requiring an installed toolchain. This is less performant than letting NVIDIA's CUDA or AMD's ROCm compile a native program for a specific model of graphics card, but still useful for users who don't have the GPU SDKs but do have the hardware.

TinyBLAS doesn't work with all drivers, however. If no GPU acceleration is available, llamafile falls back to CPU implementations of the core linear algebra libraries — versions that are specialized for particular microarchitectures and specific hardware. On March 31, Tunney published a detailed blog post discussing how she improved CPU inference performance across a wide variety of hardware, often by hand-writing a matrix multiplication kernel tuned for that exact hardware.

There's another trick that llamafile uses to speed up matrix multiplication, though, which is much more specific to its purpose as a platform for running LLMs. Generic linear algebra libraries like BLAS need to be able to multiply arbitrary matrices with unknown dimensions, possibly transposed or weighted in some way. LLM inference, because it proceeds one token at a time, spends a lot of time doing matrix-vector multiplications that can be written in a simpler form.
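The simplification is easy to see in a toy Python version (pure Python standing in for the optimized C++ kernels): when the right-hand operand is a single column, the general product collapses into a plain matrix-vector loop.

```python
def matmul(a, b):
    """Generic dense product of a (m x k) and b (k x n) as nested lists."""
    m, k, n = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]

def matvec(a, x):
    """Specialized product of an (m x k) matrix and a length-k vector;
    no inner loop over output columns is needed."""
    return [sum(row[p] * x[p] for p in range(len(x))) for row in a]

a = [[1, 2], [3, 4]]
x = [5, 6]
print(matvec(a, x))  # [17, 39]
```

`matvec(a, x)` produces the single column that `matmul(a, [[5], [6]])` would, with a simpler and more easily optimized inner loop.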

Even when LLMs do generalized matrix multiplications (during initialization), the models are architected such that the matrices are usually of a known size — often a multiple of 64. This lets a hand-unrolled implementation specific to those sizes outperform a more generic algorithm. Tunney benchmarked the multiplication of a 513x512 matrix with a 512x512 one (a size llamafile uses frequently), finding that her code outperformed Intel's proprietary Math Kernel Library (MKL) — on that specific size. The MKL is still faster on other sizes. Since llamafile controls the size of the batches used during LLM initialization, however, that's still a clear performance improvement.
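The benefit of known sizes can be illustrated with a toy version of the unrolling technique, again with Python standing in for the hand-written kernels. When the vector length is known to be a multiple of four, the inner loop can be unrolled with independent accumulators, which lets the CPU overlap the multiplies:

```python
def dot_unrolled4(u, v):
    """Dot product with the inner loop unrolled by four. Valid only when
    len(u) is a multiple of 4, the kind of assumption llamafile can make
    because it controls its matrix sizes (typically multiples of 64)."""
    acc0 = acc1 = acc2 = acc3 = 0.0
    for i in range(0, len(u), 4):
        acc0 += u[i] * v[i]
        acc1 += u[i + 1] * v[i + 1]
        acc2 += u[i + 2] * v[i + 2]
        acc3 += u[i + 3] * v[i + 3]
    # Four independent accumulators avoid a serial dependency chain.
    return acc0 + acc1 + acc2 + acc3

print(dot_unrolled4([1.0] * 8, [2.0] * 8))  # 16.0
```

A generic library must handle arbitrary lengths, alignments, and strides, and so cannot make this assumption unconditionally; a kernel that only ever sees multiple-of-64 dimensions can.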

Using llamafile

Using an LLM packaged by llamafile is fairly straightforward. The project's README links to several examples of different sizes. Downloading a file, marking it as executable, and running it is all that should be required in the vast majority of cases. Users who have binfmt_misc registrations for WINE might need to add a more specific rule to prevent WINE from being used as the program's interpreter. Running the program with no arguments will open llama.cpp's simple chat interface.

The built-in web server also offers an OpenAI-compatible API, so tools that expect to talk to the proprietary service can be seamlessly redirected, as can tools that use OpenAI's API design as a de facto standard for LLM inference. Users who are more comfortable on the command line can pass parameters and instructions as arguments instead.
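As a sketch of what that compatibility means in practice, the snippet below talks to a locally running llamafile using only the Python standard library. The port (llamafile's server typically listens on 8080), the path, and the model name are assumptions for illustration; the exact defaults depend on the llamafile being run.

```python
import json
import urllib.request

# Assumed local endpoint, following OpenAI's chat-completions API shape.
URL = "http://localhost:8080/v1/chat/completions"

def build_request(prompt, model="local-model"):
    """Build an OpenAI-style chat-completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def ask(prompt):
    """Send the prompt to the local server and return the reply text."""
    data = json.dumps(build_request(prompt)).encode()
    req = urllib.request.Request(
        URL, data=data, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Because the request and response bodies follow OpenAI's schema, existing client libraries can usually be pointed at the local server just by overriding their base URL.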

Parameters can also be baked into a llamafile executable. As mentioned above, the files are actually valid zip files; adding a file named .args to the executable will make it treat those arguments as additional command-line parameters. The procedure for turning the llamafile binary produced by building the project into an LLM-specific llamafile for distribution is actually the same: add the weights and any required arguments to the zip file.

For performance reasons, however, it's important to add the weights without compression, and ideally aligned to a 4K boundary. This allows llamafile to map the weights directly into memory, which is substantially faster than decompressing them into non-disk-backed memory. For this purpose, the project also provides a utility called zipalign that adds files to a zip archive in the correct way.
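A minimal Python sketch of the no-compression half of that requirement follows (the 4K alignment itself is what zipalign adds; the archive and member names here are invented). An entry stored with ZIP_STORED keeps its bytes verbatim in the archive, which is what makes direct memory-mapping possible:

```python
import zipfile

# Stand-in for model weights; a real llamafile holds gigabytes here.
weights = bytes(8192)

with zipfile.ZipFile("model.zip", "w") as zf:
    # ZIP_STORED means no compression: the member's bytes appear
    # verbatim in the archive and can be mapped into memory in place.
    zf.writestr("weights.gguf", weights,
                compress_type=zipfile.ZIP_STORED)

with zipfile.ZipFile("model.zip") as zf:
    info = zf.getinfo("weights.gguf")
    # Stored entries keep their exact size: file_size == compress_size.
    print(info.compress_size == info.file_size)  # True
```

The standard zip tools do not guarantee any particular alignment of the stored data within the archive, which is why the project ships zipalign rather than relying on zip(1).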

On my laptop, which lacks any relevant GPUs but does have a spiffy 12th-generation Intel i7 processor, the Meta-Llama-3-8B-Instruct.Q5_K_M.llamafile download provided as an example evaluates provided prompts at a rate of about 16 tokens per second. The answer itself is generated at about 3.5 tokens per second. The difference is attributable to the fact that, during prompt evaluation, the model can use matrix-matrix multiplications instead of matrix-vector multiplications. But this level of performance — while perhaps too slow to process many large documents — seems entirely adequate for local use.

With LLMs becoming increasingly integrated into other software, efforts to make them easy to run on existing, consumer hardware are an important part of making sure users can benefit from the technology without sending their data to third parties. The ultimate test of whether a format is suitable for widespread use is whether it is actually adopted; with llamafile being only a few months old, it's too soon to say for sure whether the project has achieved its goals. It does, however, seem to be well on the way.

Comments (18 posted)

Another push for sched_ext

By Jonathan Corbet
May 9, 2024
The extensible scheduler class (or "sched_ext") is a comprehensive framework that enables the implementation of CPU schedulers as a set of BPF programs that can be loaded at run time. Despite having attracted a fair amount of interest from the development community, sched_ext has run into considerable opposition and seems far from acceptance into the mainline. The posting by Tejun Heo of a new version of the sched_ext series at the beginning of May has restarted this long-running discussion, but it is not clear what the end result will be.

As a quick refresher: sched_ext allows the creation of BPF programs that handle almost every aspect of the scheduling problem; these programs can be loaded (and unloaded) at run time. Sched_ext is designed to safely fall back to the completely fair scheduler should something go wrong (if a process fails to be run within a time limit, for example). It has been used to create a number of special-purpose schedulers, often with impressive performance benefits for the intended workload. See this 2023 article for a more detailed overview of this work.

Heo lists a number of changes that have been made to sched_ext since the previous version was posted in November. For the most part, these appear to be adjustments to the BPF API to make the writing of schedulers easier. There is also a new shutdown mechanism that, among other things, disables the BPF scheduler during power-management events like system suspend. There is now support for CPU-frequency scaling, and some debugging interfaces have been added to make developing schedulers easier. The core design of sched_ext appears to have stabilized, though.

Increasing interest

Even before getting to the changes, though, Heo called attention to the increasing interest in sched_ext that is being shown across the community and beyond. Valve is planning to use sched_ext for better game scheduling on the Steam Deck. Ubuntu is considering shipping it in the 24.10 release. Meta and Google are increasing their use of it in their production fleets. There is also evidently interest in using it in ChromeOS, and Oculus is looking at it as well. Heo concluded that section with:

Given that there already is substantial adoption which continues to grow and sched_ext doesn't affect the built-in schedulers or the rest of kernel in an invasive manner, I believe it's reasonable to consider sched_ext for inclusion.

Whether that inclusion will happen remains an open question, though. The posting of version 4 of the patch set in July 2023 led to a slow-burning discussion on the merits of this development. Scheduler maintainer Peter Zijlstra rejected the patches outright, saying:

There is not a single doubt in my mind that if I were to merge this, there will be Enterprise software out there that will mandate its own BPF sched thing, or else it won't work.

They will not care, they will not contribute, they might even pull a RedHat and only share the code to customers.

He added that he saw no value in merging the code, and dropped out of the conversation. Mel Gorman also expressed his opposition to merging sched_ext, echoing Zijlstra's concern that enterprise software would start requiring the use of special-purpose schedulers. He later added that, in his opinion (one shared with Zijlstra), sched_ext would work actively against the improvement of the current scheduler:

I generally worry that certain things may not have existed in the shipped scheduler if plugging was an option including EAS, throttling control, schedutil integration, big.Little, adapting to chiplets and picking preferred SMT siblings for turbo boost. In each case, integrating support was time consuming painful and a pluggable scheduler would have been a relatively easy out that would ultimately cost us if it was never properly integrated.

Heo, naturally, disagreed with a lot of the concerns that had been raised. There are, he said, scheduling problems that cannot be addressed with tweaks to the current scheduler, especially in "hyperscaling" environments like Meta. He disagreed that sched_ext would impose a maintenance burden, arguing that the intrusion of BPF into other parts of the kernel has not had that result. Making it possible for users to do something new is beneficial, even if there will inevitably be "stupid cases" resulting from how some choose to use the new feature. In summary, he said, opponents are focused on the potential (and, in his opinion, overstated) costs of sched_ext without taking into account the benefits it would bring.

Restarting the conversation

That message, in October, was the end of the conversation at the time. Heo is clearly hoping for a better result this time around, but Zijlstra's response was not encouraging:

I fundamentally believe the approach to be detrimental to the scheduler eco-system. Witness the metric ton of toy schedulers written for it, that's all effort not put into improving the existing code.

He said that he would not accept any part of this patch series until "the cgroup situation" has been resolved. That "situation" is a performance problem that affects certain workloads when a number of control groups are in use. Rik van Riel had put together a patch series to address this problem in 2019, but it never reached the point of being merged; Zijlstra seems to be insisting that this work be completed before sched_ext can be considered, and he gave little encouragement that it would be more favorably considered even afterward.

Heo expressed a willingness (albeit reluctantly) to work on the control-group problem if it would clear the way for sched_ext. He strongly disagreed with Zijlstra's characterization of sched_ext schedulers as "toy schedulers" and the claim that working on sched_ext will take effort away from the mainline scheduler, though. There is, he said, no perfect CPU scheduler, so the mainline scheduler has to settle for being good enough for all users. That makes it almost impossible to experiment with "radical ideas", and severely limits the pool of people who can work on the scheduler. Much of the energy that goes into sched_ext schedulers, he said, is otherwise unavailable for scheduler development at all.

There is, he said, value in some of those radical ideas:

Yet, the many different ways that even simple schedulers can demonstrates sometimes significant behavior and performance benefits for specific workloads suggest that there are a lot of low hanging fruits in the area. Low hanging fruits that we can't easily reach from our current local optimum. A single implementation which has to satisfy all users all the time is unlikely to be an effective vehicle for mapping out such landscape.

Igalia developer Changwoo Min, who is working with Valve on gaming-oriented scheduling, supported Heo's argument, saying that: "The successful implementation of sched_ext enriches the scheduler community with fresh insights, ideas, and code". That, as of this writing, is where this conversation stands.

What next?

Sched_ext is on the schedule for the BPF track of the Linux Storage, Filesystem, Memory-Management, and BPF Summit, which begins on May 13. That discussion will cover the future development of sched_ext but, most likely, will not be able to address the question of whether this work should be merged at all. That discussion could continue, on the mailing lists and elsewhere, for some time yet.

Sometimes, when a significant kernel development stalls in this way, distributors that see value in it will ship the patches anyway, as Ubuntu, Valve, and ChromeOS are considering doing. While shipping out-of-tree code is often discouraged, it can also serve to demonstrate interest in a feature and flush out any early problems that result from its inclusion. If things go well, this practice can strengthen the argument for merging the code into the mainline, albeit with the ever-present possibility of changes that create pain for the early adopters.

Whether that will be the path taken for sched_ext remains to be seen. What is certain is that this work has attracted a lot of interest and is unlikely to go away anytime soon. Sched_ext has the potential to enable a new level of creativity in scheduler development, even if it remains out of the mainline — but that potential will be stronger if it does end up being merged. Significant scheduler patches are not merged quickly even when they are uncontroversial; this one will be slower than most if it is accepted at all.

Comments (40 posted)

Some 6.9 development statistics

By Jonathan Corbet
May 13, 2024
The 6.9 kernel was released on May 12 after a typical nine-week development cycle. Once again, this is a major release containing a lot of changes and new features. Our merge-window summaries (part 1, part 2) covered those changes; now that the development cycle is complete, the time has come to look at where all that work came from — and to introduce a new and experimental LWN feature for readers interested in this kind of information.

A total of 2,028 developers contributed to the 6.9 kernel; 285 of them made their first kernel contribution during this cycle. The most active contributors to 6.9 were:

Most active 6.9 developers

By changesets
  Uwe Kleine-König        344   2.4%
  Kent Overstreet         259   1.8%
  Christoph Hellwig       206   1.4%
  Krzysztof Kozlowski     201   1.4%
  Johannes Berg           175   1.2%
  Ricardo B. Marliere     172   1.2%
  Eric Dumazet            161   1.1%
  Andy Shevchenko         127   0.9%
  Dmitry Baryshkov        123   0.8%
  Thomas Gleixner         116   0.8%
  Andrew Davis            108   0.7%
  Jiri Slaby              100   0.7%
  Jani Nikula              99   0.7%
  Sean Christopherson      97   0.7%
  Darrick J. Wong          97   0.7%
  Randy Dunlap             96   0.7%
  Ard Biesheuvel           93   0.6%
  Masahiro Yamada          88   0.6%
  Takashi Iwai             81   0.6%
  Matthew Wilcox           80   0.6%

By changed lines
  Hamza Mahfooz         72144   9.1%
  Hawking Zhang         66997   8.5%
  Matthew Sakai         58713   7.4%
  Matthew Wilcox        31192   3.9%
  Ian Rogers            18456   2.3%
  Darrick J. Wong       12356   1.6%
  Neil Armstrong         9707   1.2%
  Dmitry Baryshkov       8300   1.0%
  Kent Overstreet        8087   1.0%
  Johannes Berg          7779   1.0%
  Ping-Ke Shih           6889   0.9%
  Mike Snitzer           6547   0.8%
  Rob Clark              5654   0.7%
  Christoph Hellwig      5589   0.7%
  Geert Uytterhoeven     5535   0.7%
  Shinas Rasheed         5310   0.7%
  Krzysztof Kozlowski    5218   0.7%
  Rajendra Nayak         5211   0.7%
  Stefan Herdler         5017   0.6%
  Yazen Ghannam          4995   0.6%

Following what has become a longstanding tradition, Uwe Kleine-König was the biggest contributor of changesets this time around. This work, which is mostly focused on low-level device-driver refactoring, has brought about 2,500 changesets into the kernel since 6.3 was released in April, 2023. Kent Overstreet continued the work of completing and stabilizing the bcachefs filesystem. Christoph Hellwig kept on with his extensive refactoring work in the block layer and XFS filesystem. Krzysztof Kozlowski worked extensively with drivers and devicetrees for mobile systems, and Johannes Berg did a lot of work within the kernel's WiFi subsystem.

In the "lines changed" column, Hamza Mahfooz and Hawking Zhang kept up another apparent tradition: adding huge files with lots of amdgpu register definitions. Matthew Sakai, instead, added the new dm-vdo device-mapper target. Matthew Wilcox removed the old NTFS filesystem implementation, and Ian Rogers added event definitions for Intel CPUs.

The top testers and reviewers this time around were:

Test and review credits in 6.9

Tested-by
  Daniel Wheeler            119   7.5%
  Michael Kelley             78   4.9%
  Sohil Mehta                71   4.5%
  Helge Deller               47   2.9%
  Philipp Hortmann           34   2.1%
  Shan Kang                  34   2.1%
  Pucha Himasekhar Reddy     32   2.0%
  Dapeng Mi                  31   1.9%
  Carl Worth                 24   1.5%
  Babu Moger                 23   1.4%
  Dietmar Eggemann           23   1.4%
  Shaopeng Tan               23   1.4%
  Peter Newman               23   1.4%
  Geert Uytterhoeven         22   1.4%
  Randy Dunlap               22   1.4%
  Guenter Roeck              21   1.3%
  Nicolin Chen               21   1.3%
  Juergen Gross              20   1.3%
  K Prateek Nayak            20   1.3%
  Zhang Rui                  19   1.2%

Reviewed-by
  Simon Horman                  200   2.2%
  Christoph Hellwig             171   1.9%
  Krzysztof Kozlowski           161   1.8%
  Konrad Dybcio                 143   1.6%
  AngeloGioacchino Del Regno    129   1.4%
  Andy Shevchenko               115   1.3%
  Ilpo Järvinen                 112   1.2%
  Andrew Lunn                   112   1.2%
  Darrick J. Wong                98   1.1%
  Dmitry Baryshkov               98   1.1%
  Kees Cook                      95   1.0%
  Linus Walleij                  92   1.0%
  Geert Uytterhoeven             89   1.0%
  Neil Armstrong                 88   1.0%
  Jiri Pirko                     88   1.0%
  Rob Herring                    87   1.0%
  Greg Kroah-Hartman             85   0.9%
  Gregory Greenman               78   0.9%
  Hawking Zhang                  77   0.8%
  David Sterba                   69   0.8%

The top testers continue, by all appearances, to be people who do that work as a primary job focus. On the review side, there are 19 developers who reviewed at least one patch every day during this development cycle, and five of those reviewed more than two each day.

There are 227 companies that were identified as having supported work on the 6.9 kernel, the highest number (by a small margin) since 6.4 was released. The most active employers were:

Most active 6.9 employers

By changesets
  Intel                  1867  12.9%
  (Unknown)              1072   7.4%
  Google                 1031   7.1%
  (None)                  979   6.8%
  Linaro                  924   6.4%
  AMD                     820   5.7%
  Red Hat                 807   5.6%
  SUSE                    468   3.2%
  Meta                    413   2.9%
  Pengutronix             372   2.6%
  Huawei Technologies     345   2.4%
  Oracle                  313   2.2%
  Qualcomm                311   2.1%
  IBM                     301   2.1%
  (Consultant)            287   2.0%
  Renesas Electronics     247   1.7%
  NVIDIA                  241   1.7%
  Texas Instruments       210   1.5%
  Arm                     176   1.2%
  Microsoft               159   1.1%

By lines changed
  AMD                  171877  21.7%
  Red Hat               91448  11.5%
  Intel                 70800   8.9%
  Google                51104   6.5%
  Oracle                47906   6.0%
  (Unknown)             44300   5.6%
  Linaro                41492   5.2%
  (None)                28388   3.6%
  Qualcomm              17812   2.2%
  Meta                  17388   2.2%
  Renesas Electronics   17051   2.2%
  Realtek               13862   1.7%
  SUSE                  11953   1.5%
  NVIDIA                10162   1.3%
  Huawei Technologies    9100   1.1%
  (Consultant)           7140   0.9%
  IBM                    6777   0.9%
  Collabora              6760   0.9%
  Arm                    6712   0.8%
  Marvell                6587   0.8%

As usual, there are not a lot of surprises here; these results do not change greatly from one release to the next — or even from one year to the next.

One last note

It has been over 17 years since Who wrote 2.6.20? was published here. Back in 2007, it was still widely said that the kernel was mostly developed and maintained by volunteers; by taking the time to map commits in the kernel repository to employers, we showed that the reality was rather different, and that most kernel developers were paid for their work.

After all these years, it sometimes seems that these articles contain about as much news as a tide table. The information found there might be useful, but it is not generally surprising. There is still interest in these articles, though, as we found out when we skipped a few development cycles some years back. Given the ongoing interest and the generally mechanical nature of putting this information together, it perhaps makes sense to delegate more of the work to a machine.

Thus, we are happy to launch the LWN Kernel Source Database as an experimental, subscriber-only feature. Much of the information found in these articles is available there, along with quite a bit more. We encourage readers to play with the system and to let us know what they think. To be clear: there is no plan to stop publishing these articles anytime soon, but now there is a resource for readers who would like to dig deeper.

Comments (11 posted)

The state of the page in 2024

By Jonathan Corbet
May 15, 2024

LSFMM+BPF
The advent of the folio structure to describe groups of pages has been one of the most fundamental transformations within the kernel in recent years. Since the folio transition affects many subsystems, it is fitting that the subject was covered at the beginning of the 2024 Linux Storage, Filesystem, Memory Management, and BPF Summit in a joint session of the storage, filesystem, and memory-management tracks. Matthew Wilcox used the session to review the work that has been done in this area and to discuss what comes next.

The first step of this transition, he began, was moving much of the information traditionally stored in the kernel's page structure into folios instead, then converting users of struct page to use the new structure. The initial goal was to provide a type-safe representation for compound pages, but the scope has expanded greatly since then. That has led to a bit of ambiguity: what, exactly, is a folio in current kernels? For now, a folio is still defined as "the first page of a compound page".

By the end of the next phase, the plan is for struct page to shrink down to a single, eight-byte memory descriptor, the bottom few bits of which describe what type of page is being described. The descriptor itself will be specific to the page type; slab pages will have different descriptors than anonymous folios or pages full of page-table entries, for example.

Among other motivations, a key objective behind the move to descriptors is reducing the size of the memory map — the large array of page structures describing every physical page in the system. Currently, the memory map consumes about 1.6% of the memory it describes, which is seen as too high. On systems where virtualization is used, each guest keeps its own copy of the memory map, doubling that consumption. Moving to descriptors reduces the overhead to 0.2% of memory, which can save multiple gigabytes on larger systems.
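Those percentages follow directly from the structure sizes: a 64-byte struct page for every 4096-byte base page is about 1.6%, while an eight-byte descriptor is about 0.2%. A quick check:

```python
PAGE = 4096        # bytes in a base page
STRUCT_PAGE = 64   # current size of struct page on 64-bit systems
DESCRIPTOR = 8     # planned size of the per-page memory descriptor

print(f"today:  {STRUCT_PAGE / PAGE:.1%}")   # today:  1.6%
print(f"future: {DESCRIPTOR / PAGE:.1%}")    # future: 0.2%
```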

Getting there, though, requires moving more information into the folio structure. Along the way, concepts like the pin count for a page can be clarified, cleaning up some longstanding problems in the memory-management subsystem. This move will, naturally, increase the size of the folio structure, to a point where it will be larger than struct page. The advantage, though, is that only one folio structure is needed for all of the base pages that make up the folio. For two-page folios, the total memory use is about the same; for folios of four pages or more, the usage is reduced. If the kernel is caching the contents of a 1GB file, it currently needs 16MB of page structures. If that caching is done entirely with base pages, that overhead will increase to 23MB in the future. But, if four-page folios are used instead, it drops to 9MB total.

Some types of descriptors, including those for slab pages and page-table entries, have already been introduced. The page-table descriptors are quite a bit smaller than folios, since there are a number of fields that are not needed. For example, these pages cannot be mapped into user space, so there is no need for a mapping count.

Wilcox put up a plot showing how many times struct page and struct folio are mentioned in the kernel since 2021. On the order of 30% of the page mentions have gone away over that time. He emphasized that the end goal is not to get rid of struct page entirely; it will always have its uses. Pages are, for example, the granularity with which memory is mapped into user space.

Since last year's update, quite a lot of work has happened within the memory-management subsystem. Many kernel subsystems have been converted to folios. There is also now a reliable way to determine whether a folio is part of hugetlbfs, the absence of which turned out to be a bit of a surprising problem. The adoption of large anonymous folios has been a welcome improvement.

The virtual filesystem layer has also seen a lot of folio-related work. The sendpage() callback has been removed in favor of a better API. The fs-verity subsystem now supports large folios. The conversion of the buffer cache is proceeding, but has run into a surprise: Wilcox had proceeded with the assumption that buffer heads are always attached to folios, but it turns out that the ext4 filesystem allocates slab memory and attaches that instead. That usage isn't wrong, Wilcox said, but he is "about to make it wrong" and does not want to introduce bugs in the process.

Avoiding problems will require leaving some information in struct page that might have otherwise come out. In general, he said, he would not have taken this direction with buffer heads had he known where it would lead, but he does not want to back it out now. All is well for now, he said; the ext4 code is careful not to call any functions on non-folio-backed buffer heads that might bring the system down. But there is nothing preventing that from happening in the future, and that is a bit frightening.

The virtual filesystem layer is now allocating and using large folios through the entire write path; this has led to a large performance improvement. Wilcox has also added an internal function, folio_end_read(), that he seemed rather proud of. It sets the up-to-date bit, clears the lock bit, checks for others waiting on the folio, and serves as a memory barrier — all with a single instruction on x86 systems. Various other helpers have been added and callbacks updated. There is also a new writeback iterator that replaces the old callback-based interface; among other things, this helps to recover some of the performance that was taken away by Spectre mitigations.

With regard to individual filesystems, many have been converted to folios over the last year. Filesystems as a whole are being moved away from the writepage() API; it was seen as harmful, so no folio version was created. The bcachefs filesystem can now handle large folios — something that almost no other filesystems can do. The old NTFS filesystem was removed rather than being converted. The "netfs" layer has been created to support network filesystems. Wilcox put up a chart showing the status of many filesystems, showing that a lot of work remained to be done for most. "XFS is green", he told the assembled developers, "your filesystem could be green too".

The next step for folios is to move the mapping and index fields out of struct page. These fields could create trouble in the filesystems that do not yet support large folios, which is almost all of them. Rather than risk introducing bugs when those filesystems are converted, it is better to get those fields out of the way now. A number of page flags are also being moved; flags like PageDirty and PageReferenced refer to the folio as a whole rather than to individual pages within it, and thus should be kept there. There are plans to replace the write_begin() and write_end() address-space operations, which still use bare pages.

Beyond that, there is still the task of converting a lot of filesystems, many of which are "pseudo-maintained" at best. The hugetlbfs subsystem needs to be modernized. The shmem and tmpfs in-memory filesystems should be enhanced to use intermediate-size large folios. There is also a desire to eliminate all higher-order memory allocations that do not use compound pages, and thus cannot be immediately changed over to folios; the crypto layer has a lot of those allocations.

Then, there is the "phyr" concept. A phyr is meant to refer to a physical range of pages, and is "what needs to happen to the block layer". That will allow block I/O operations to work directly on physical pages, eliminating the need for the memory map to cover all of physical memory.

It seems that there will be a need for a khugepaged kernel thread that will collapse mid-size folios into larger ones. Other types of memory need to have special-purpose memory descriptors created for them. Then there is the KMSAN kernel-memory sanitizer, which hasn't really even been thought about. KMSAN adds its own special bits to struct page, a usage that will need to be rethought for the folio-based future.

An important task is adding large-folio support to more filesystems. In the conversions that Wilcox has done, he has avoided adding that support except in the case of XFS. It is not an easy job and needs expertise in the specific filesystem type. But, as the overhead for single-page folios grows, the need to use larger folios will grow with it. Large folios also help to reduce the size of the memory-management subsystem's LRU list, making reclaim more efficient.

Ted Ts'o asked how important this conversion is for little-used filesystems; does VFAT need to be converted? Wilcox answered that it should be done for any filesystem where somebody cares about performance. Dave Chinner added that any filesystem that works on an NVMe solid-state device will need large folios to perform well. Wilcox closed by saying that switching to large folios makes compiling the kernel 5% faster, and is also needed to support other desired features, so the developers in the room should want to do the conversion sooner rather than later.

Comments (11 posted)

Debian dismisses AI-contributions policy

By Joe Brockmeier
May 10, 2024

In April, the Gentoo Linux project banned the use of generative AI/ML tools due to copyright, ethical, and quality concerns. This means contributors cannot use tools like ChatGPT or GitHub Copilot to create content for the distribution, such as code, documentation, bug reports, and forum posts. A proposal for Debian to adopt a similar policy revealed a distinct lack of love for those kinds of tools, though it would also seem that few contributors support banning them outright.

Tiago Bortoletto Vaz started the discussion on the Debian project mailing list on May 2, with the suggestion that the project should consider adopting a policy on the use of AI/ML tools to generate content. Vaz said that he feared that Debian was "already facing negative consequences in some areas" as a result of this type of content, or it would be in a short time. He referenced the Gentoo AI policy, and Michał Górny's arguments against AI tools on copyright, quality, and ethical grounds. He said he was in agreement with Górny, but wanted to know how other Debian contributors felt.

Ansgar Burchardt wrote that generative AI is "just another tool". He noted that Debian doesn't ban Tor, even though it can be used to violate copyright or for unethical things, and it doesn't ban human contributions due to quality concerns: "I don't see why AI as yet another tool should be different."

Others saw it differently. Charles Plessy responded that he would probably vote for a general resolution against "the use of the current commercial AI for generating Debian packaging, native, or infrastructure code". He specified "commercial AI" because "these systems are copyright laundering machines" that abuse free software, and found the idea that other Debian developers would use them discouraging. He was not against generative AI technology itself, however, as long as it was trained on content that the copyright holders gave consent to use for that purpose.

Russ Allbery was skeptical of Gentoo's approach of an outright ban, since "it is (as they admit) unenforceable". He also agreed with Burchardt, "we don't make policies against what tools people use locally for developing software". He acknowledged that there are potential problems for Debian if output from AI tools infringes copyright. Even so, banning the use of those tools would not make much difference: "we're going to be facing that problem with upstreams as well, so the scope of that problem goes far beyond" direct contributions to Debian. The project should "plan to be reactive [rather] than attempt to be proactive". If there are reports that AI-generated content is a copyright violation, he said, then the project should deal with it as it would with any Debian Free Software Guidelines (DFSG) violation. The project may need to make judgment calls about the legal issues then, but "hopefully this will have settled out a bit in broader society before we're forced to make a decision on a specific case".

Allbery said his primary concern about the impact of AI is its practical impact:

Most of the output is low-quality garbage and, because it's now automated, the volume of that low-quality garbage can be quite high. (I am repeatedly assured by AI advocates that this will improve rapidly. I suppose we will see. So far, the evidence that I've seen has just led me to question the standards and taste of AI advocates.)

Ultimately, Allbery said he saw no need for new policies. If there is a deluge of junk, "we have adequate mechanisms to complain and ask that it stop without making new policy". The only statement he wanted to convey so far is that "anyone relying on AI to summarize important project resources like Debian Policy or the Developers Guide or whatnot is taking full responsibility for any resulting failures".

A sense of urgency

In reply to Allbery, Vaz conceded that Gentoo's policy was not perfect but, despite the difficulty in enforcing it, he maintained there was a need to do something quickly.

Vaz, who is an application manager (AM) for the Debian new maintainer process, suggested that Debian was already seeing problems with AI output submitted during the new maintainer (NM) process and as DebConf submissions, but declined to provide examples. "So far we can't [prove] anything, and even if we could, of course we wouldn't bring any of the involved to the public arena". He did, however, agree that a statement was a more appropriate tool than a policy.

Jose-Luis Rivas replied that Vaz had more context than the rest of the participants in the discussion and that "others do not have this same information and can't share this sense of urgency". He inferred that an NM applicant might be using a large-language model (LLM) tool during the NM process, but in that scenario there was "even less point" in making policy or a statement about the use of such tools. It would be hard to prove that an LLM was in use, and "ultimately [it] is in the hands of those judging" to make the decisions. "I can't see the point of 'something needs to be done' without a clear reasoning of the expectations out of that being done".

Vaz argued that having a policy or statement would be useful, even in the absence of proof that an LLM was in use. He made a comparison to Debian's code of conduct and its diversity statement: "They might seem quite obvious to some, and less so to others." Having an explicit position on the use of LLMs would be useful to educate those who are "getting to use LLMs in their daily life in a quite mindless way" and "could help us both avoid and mitigate possible problems in the future".

The NM scenario Vaz gave was not convincing to Sam Hartman, who replied that the process would not benefit from a policy. It is up to candidates to prove to their AM, advocates, and reviewers that they can be trusted and have the technical skills to be a Debian Developer:

I as an AM would find an applicant using an LLM as more than a possibly incorrect man page without telling me would violate trust. I don't need a policy to come to that conclusion.

He said he did not mind if a candidate used an LLM to refresh their memory, and saw no need for them to cite the use of the LLM. But if the candidate didn't know the material well enough to catch bad information from an LLM, then it's clear they are not to be trusted to choose good sources of information.

On May 8, after the conversation had died down, Vaz wrote that it was apparent "we are far from a consensus on an official Debian position regarding the use of generative AI as a whole in the project". He thanked those who had commented, and said that he hoped the debate would surface again "at a time when we better understand the consequences of all this".

It is not surprising to see Debian take a conservative, wait-and-see approach. If Debian is experiencing real problems from AI-generated content, they are not yet painful or widespread enough to motivate support for a ban or specific policy shift. A flood of AI gibberish, or a successful legal challenge to LLM-generated content, might turn the tide.

Comments (149 posted)

Managing expectations with a contributions and credit policy

May 13, 2024

This article was contributed by Valerie Aurora

Maintainers of open-source projects sometimes have disagreements with contributors over how contributions are reviewed, modified, merged, and credited. A written policy describing how contributions are handled can help maintainers set reasonable expectations for potential contributors. In turn, that can make the maintainer's job easier because it can help reduce a source of friction in the project. A guide to help create this kind of policy for a project has recently been developed.

People sometimes have rather different expectations about how open-source projects function with regard to contributions. For example, a recent discussion about how to credit a Linux kernel patch that had two authors attracted more than 600 comments, covering a wide range of opinions from "the original author should have sole credit" to "the original author should get no credit at all". Another kind of disagreement is over which types of contributions are welcome: some projects don't want external contributions, or any new features, but contributors keep sending them anyway.

In the absence of a written policy, contributors will make assumptions about the way a project operates—and may start an argument if they don't get the expected response. A written policy describing how contributions are processed and credited can prevent conflicts from even starting. To help maintainers create their own policies, I co-authored a credits and contribution policy development guide with Maria Matějka, Martin Winter, Marcos Sanz, and other members of the RIPE Open Source Working Group.

Coverage

A contributions and credit policy is most useful when it focuses on areas where maintainers and contributors frequently have different, incompatible expectations, such as: what contributions are welcome, how reviews and changes will be handled, and how credit will be assigned. A policy can also include step-by-step instructions for maintainers and contributors on how to handle complicated situations, such as when a contributor won't make requested changes.

A policy can and should cover all kinds of contributions. Melissa Mendonça, a maintainer for NumPy, SciPy, and napari, said: "Often, things like docs, community, design work are not credited as contributions in the usual sense (i.e., don't give you green squares on GitHub)." A policy can help surface these contributions and create a standard process for acknowledging them.

A policy can help resolve three common dilemmas: assigning credit for multiple direct contributors, revising contributions, and attracting the right contributions.

Deciding who deserves credit for a work is a difficult and unsolved philosophical question. Practically speaking, people assign credit for many reasons, such as rewarding people for contributing, establishing ownership of intellectual property, or tracking down the source of bugs or backdoors.

As the authors of the policy development guide, we took a practical approach to this question with our own contributions and credit policy. We asked ourselves what kind of behaviors we wanted to encourage in our project, and then assigned credit in ways that make those outcomes more likely. Since we want to attract new contributors, we give primary author credit to the new contributor even if a maintainer had to completely rewrite the contribution. Other projects can make similarly pragmatic decisions without solving the general problem of who gets credit.

Sometimes people will try to take advantage of a formal credit policy. Mendonça says that credit for open-source work is "always a system that you can game (for example, making just enough contributions to get a certain credit/position and then moving on)". However, she says that she is "willing to err on the side of giving credit", and take corrective action afterward. Any policy can have an explicit exception for the maintainer to take whatever action they think necessary.

Multi-contributor credit

Some policies for assigning credit are easy to implement, such as the current de facto policy for many projects: "whoever merges the code decides who gets credit". However, for more complex credit policies, some difficult situations quickly arise. Matějka, team leader for the BIRD routing daemon (and a co-author of the policy guide) said:

We sometimes get contributions with good ideas and poor quality. It's always a discussion how much the specific patch is more an idea (to be thanked for in the commit message) or a patch (where we keep the author but add a note that it was updated by the committer).

Conversely, credit for a multi-author contribution can also turn into blame; if the original author cannot review the change, crediting the changes to them can end up blaming them for the editor's mistakes. Solving complicated questions of credit can consume hours of a maintainer's time or leave a contribution hanging in limbo for months. A standard policy lets maintainers decide how to handle these situations once and then simply refer to the policy in the future.

Many contributions need some revision before they can be merged into the main project. This can go wrong in many ways: maybe no one has time to review, contributors don't respond to requests, or the contribution no longer merges by the time it is approved. A policy can help with these situations in two ways: it can give the maintainer a playbook for how to handle difficult situations (e.g., after two weeks of no response, make the edits and merge the contribution), and it can give contributors guidelines for when to ping the maintainers (e.g., after two weeks of no response, email the mailing list again).

Attracting the right contributions

Many people assume that "open source" means "open to outside contributions and willing to mentor new contributors". In reality, some projects don't want new contributors or features, others will accept outside contributions but only after they have been totally rewritten, and only some maintainers have the time and interest to mentor new contributors. An explicit policy about what contributions are desired helps maintainers avoid conflict from would-be contributors. It also helps both potential contributors and potential users decide whether they want to depend on a project with that particular contribution policy.

One of my early open-source contributions was guided by a contributions policy of sorts. The xjack screensaver prints "All work and no play makes Jack a dull boy" over and over, with a variety of mistakes and typos, inspired by a scene in The Shining. A comment in the source code tells would-be contributors not to bother sending in a patch to change the words it prints. However, while I was doing exactly that, by changing it to print "All work and no play makes Val a dull girl", I found a bug in the code that generated typos. I sent in a patch to fix that bug, which was merged (without credit).

Policy resources

The credits and contribution policy development guide includes several variations on each section of the policy, as well as a list of policies in use by open-source projects. We request (but do not require) giving the authors credit on any derivative work. The policy guide has its own contributions and credit policy and new contributors are welcome. Currently we are especially interested in additional examples of choices for each section of the policy, as well as links to existing policies.

Comments (13 posted)

Page editor: Jonathan Corbet
Next page: Brief items>>


Copyright © 2024, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds