
Oxidizing Ubuntu: adopting Rust utilities by default

By Joe Brockmeier
March 18, 2025

If all goes according to plan, the Ubuntu project will soon be replacing many of the traditional GNU utilities with implementations written in Rust, such as those created by the uutils project, which we covered in February. Wholesale replacement of core utilities at the heart of a Linux distribution is no small matter, which is why Canonical's VP of engineering, Jon Seager, has released oxidizr. It is a command-line utility that helps users easily enable or disable the Rust-based utilities to test their suitability. Seager is calling for help with testing and for users to provide feedback with their experiences ahead of a possible switch for Ubuntu 25.10, an interim release scheduled for October 2025. So far, responses from the Ubuntu community seem positive if slightly skeptical of such a major change.

Next 20 years of Ubuntu

Ubuntu celebrated the 20th anniversary of its first release in 2024. Seager reflected on that milestone and published his vision for the next 20 years of Ubuntu in February. One of his themes for the future is modernization, calling on the project to constantly assess the foundations of the distribution against the needs of its users:

We should look deeply at the tools we ship with Ubuntu by default - selecting for tools that have resilience, performance and maintainability at their core. There are countless examples in the open source community of tools being re-engineered, and re-imagined using tools and practices that have only relatively recently become available. Some of my personal favourites include command-line utilities such as eza, bat, and helix, the new ghostty terminal emulator, and more foundational projects such as the uutils rewrite of coreutils in Rust. Each of these projects are at varying levels of maturity, but have demonstrated a vision for a more modern Unix-like experience that emphasizes resilience, performance and usability.

On March 12, Seager published a follow-up to introduce his plan to start adopting some of the tools as defaults—with an eye to having them in place for the next Ubuntu long-term support (LTS) release, 26.04. The rationale for the switch is primarily "the enhanced resilience and safety that is more easily achieved with Rust ports". He cited a blog post by Rust core developer Niko Matsakis. The post, in a nutshell, is about Matsakis's vision for using Rust to write (or rewrite) foundational software; that is, "the software that underlies everything else".

Those who have been following the continuing debates and discussions about using Rust will find familiar themes in Matsakis's arguments in its favor: Rust provides the performance of C/C++ without demanding perfection from developers, it provides reliability, and it makes developers more productive regardless of experience level. Its reliability makes it particularly suitable for foundational software because "when foundations fail, everything on top fails also". Given Ubuntu's widespread adoption, Seager wrote, "it behooves us to be absolutely certain we're shipping the most resilient and trustworthy software we can".

Seager also thinks that embracing Rust will help meet another of his goals for Ubuntu, increasing the number of contributors. Not because Rust is necessarily easier to use than C, but because it provides a framework that makes it harder for contributors to commit potentially unsafe code. Presumably, though it was unsaid, that would make Rust a more attractive language for those interested in contributing but not interested in programming in C for whatever reason.

oxidizr

The abstract possibility that Rust utilities would be better, or even feasible, for Ubuntu is no substitute for hands-on experience. To that end, Seager created oxidizr as a way to quickly swap in (and out) Rust utilities in place of the traditional counterparts with relatively low risk. He released the first version, 1.0.0, on March 7. It is available under the Apache 2.0 license and, as one might expect, written in Rust.

The project is not yet packaged for Ubuntu, nor does Seager have a personal package archive (PPA) set up for users to install oxidizr with APT. There are binary releases on GitHub, or users can install the tool using cargo:

    $ cargo install --git https://github.com/jnsgruk/oxidizr

The binary releases may be the easiest way to get started, as oxidizr requires the exists function in the std::fs module, which was added in Rust 1.81.0 (released in September 2024), while the Rust version in Ubuntu 24.10 is still at 1.80.1. I used rustup to install the most recent stable version of Rust, and then used cargo to install oxidizr.
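For those following along, the full sequence looks roughly like this (assuming the standard rustup installer; the prompts and the exact version installed will vary):

    $ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
    $ rustup default stable
    $ cargo install --git https://github.com/jnsgruk/oxidizr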

The oxidizr utility calls each set of utilities that can be independently replaced an "experiment". Experiments are Rust modules that define the packages to be installed (or removed) and handle renaming of the utilities to enable or disable use of the Rust versions. The current set of experiments includes replacing GNU coreutils, findutils, or diffutils with their uutils counterparts, as well as replacing the traditional sudo with the Rust-based sudo-rs.

For instance, to try out sudo-rs a user would run this command:

    # oxidizr enable --experiments sudo-rs

That will install the sudo-rs package from the Ubuntu package repository, back up the sudo binary, and create a /usr/bin/sudo symbolic link that targets the Rust binary (/usr/lib/cargo/bin/sudo). To enable all experiments, a user would use the --all flag instead:

    # oxidizr enable --all
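
In either case, one can confirm that the swap took effect by resolving the symbolic link that oxidizr creates; for the sudo-rs experiment above, it should point at the Rust binary named earlier (a hypothetical check based on the paths described in this article, not taken from oxidizr's documentation):

    $ readlink -f /usr/bin/sudo
    /usr/lib/cargo/bin/sudo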

Finally, to revert the system to the traditional utilities and remove the replacement packages from the system:

    # oxidizr disable --all

According to Seager, oxidizr works on Ubuntu 24.04 LTS and later, though the uutils diffutils experiment is only supported on Ubuntu 24.10 or later. For safety's sake, he urged users to start testing on a virtual machine or some other machine that is not their production workstation or server. Seager reported that he hasn't had many problems, but he has run into one incompatibility: the uutils cp, mv, and ls replacements don't yet support the -Z flag, which is used to set the SELinux context of a file or (in the case of ls) print a file's security context.

In my brief testing, I did not run into any problems with the uutils versions of the utilities or the changes oxidizr made to the system in order to swap them in. However, I did note that oxidizr does not make any changes to the system's man pages. Even when the GNU utilities have been replaced with the uutils versions, the GNU man pages are left in place, so "man cp" still displays the GNU version. It would be good to switch the man pages too in order to expose users to any gaps in the uutils documentation as well as the utilities themselves.

Reactions

Fern Dziadulewicz asked if the move toward uutils meant that "Ubuntu is actually kind of heading towards GNUlessness", as with some other Linux distributions that shy away from GNU components. Seager responded that people should not read too much into the change:

This is not symbolic of any pointed move away from GNU components - it's literally just about replacing coreutils with a more modern equivalent.

Sure, the license is different, and it's a consideration, but it's by no means a driver in the decision making.

That response did not satisfy Joseph Erdosy, who wrote that he would migrate to Fedora or Rocky Linux if Ubuntu goes through with the change. He said that he liked Rust and the idea of better, memory-safe alternatives, but that he was unhappy that the biggest "oxidized" project was an MIT-licensed rewrite of GPL-licensed code.

This decision seems to align with a broader trend of companies deprecating GPL software in favour of more permissively licensed alternatives, often under the guise of "modernization." However, the real-world impact is clear: free software is increasingly co-opted into proprietary ecosystems, weakening the principles that made Linux successful.

A few other users quickly agreed with Erdosy, then Ian Weisser announced that he was putting the topic into "Slow Mode" to "prevent piling-on until the developers have a chance to respond and keep this topic constructive". Shortly after, Seager responded that he did not agree that this potential move posed a threat to Ubuntu, or its community. He reiterated that it was not indicative of a political agenda or wider move away from GPL'ed software, and said that most of Canonical's own software is and would continue to be GPL'ed.

Ubuntu is a collection of software that we curate to build a distribution. It's a project dedicated to shipping the latest, and best open source we can find. There is no evidence of foul play, bad practice or poor intentions from the uutils maintainers - they're a thoughtful, dedicated community who are building their own software, and even contributing back to GNU coreutils in some cases. They are achieving things I think we should aspire to with Ubuntu in the coming years, and I remain committed to giving this a chance at success - noting that we and others will need to work closely with them to resolve issues with locales, selinux support and other issues.

If the current situation changes and we believe that the interests of the uutils project are no longer aligned with those of Ubuntu, we can change the coreutils package we choose to ship with Ubuntu.

Sergey Davidoff wondered why the Debian alternatives system, which is used to designate default applications when multiple programs with the same function are installed, was not sufficient for experimenting with Rust utilities. Julian Andres Klode replied that the alternatives system would not be suitable because the existing package would need to cooperate. He also responded to another user, "rain", who had floated the idea of allowing users to switch out individual commands. Klode said that it was a bad idea to allow users to select between Rust and non-Rust implementations on a per-command level, as it would make the resulting systems hard to support.

Liam Proven asked about support on versions of Ubuntu for architectures other than x86_64 and Arm, such as s390 and ppc64le, since "the LLVM Rust toolchain is still a little immature and code generation for other architectures is lacking". Uutils project founder and Ubuntu developer Sylvestre Ledru asked if Proven had any bug reports to share, since Mozilla had been using the LLVM Rust toolchain to ship Firefox on those architectures for years. He pointed out that uutils had been successfully building on Debian and Ubuntu with those architectures as targets for a few years as well.

Next steps

Seager said that he had met with Ledru to discuss the idea of making uutils coreutils the default in Ubuntu 25.10, and Ledru felt that the project was ready for that level of exposure. Now it is just a matter of specifics, he said, and the Ubuntu Foundations team is already working up a plan to implement this in the next release cycle. He did acknowledge that there was a need for caution and was open to the possibility that he would need to "scale back on the ambition" if making the switch meant compromising stability or reliability in an Ubuntu LTS release. If the switch doesn't work out, it should be easy enough to revert in time for next year's LTS release.

To date, Ubuntu seems to be the first major Linux distribution that has seriously considered a switch to uutils. If Ubuntu 25.10 ships with uutils coreutils, it will be a significant win for the uutils project that grants exposure to a much larger user base than it has enjoyed so far. The "oxidize Ubuntu" experiment has the potential to accelerate Rust's adoption and inspire further attempts to replace C-based utilities with Rust, or it might have a chilling effect if Ubuntu runs into serious problems. Either way, the project should be instructive for the larger community.




resource usage concerns

Posted Mar 18, 2025 17:00 UTC (Tue) by arachnist (subscriber, #94626) [Link] (29 responses)

a friend of mine has stumbled upon excessive memory usage in more[0], and i wonder how many more issues like that will start popping up when more people start using these.

worst case scenario, we're going to see one of the funnier ubuntu releases in recent memory. ;)

[0]: https://github.com/uutils/coreutils/issues/6397

resource usage concerns

Posted Mar 18, 2025 17:17 UTC (Tue) by jzb (editor, #7867) [Link]

> I wonder how many more issues like that will start popping up when more people start using these.

It will be interesting to see, won't it? Definitely the kind of testing uutils needs to be a legit replacement. Whatever warts the GNU utilities may have, they've gotten a lot of use over the years and have been road-tested quite well. I am eager to see the results.

resource usage concerns

Posted Mar 18, 2025 18:11 UTC (Tue) by LtWorf (subscriber, #124958) [Link] (24 responses)

Memory leaks are not considered problematic in Rust; they do not cause compilation errors like other memory bugs do.

resource usage concerns

Posted Mar 18, 2025 19:01 UTC (Tue) by khim (subscriber, #9252) [Link] (2 responses)

I think one of the insights that actually made Rust possible is an observation about the complete infeasibility of declaring memory leaks as errors. Tracing GC “doesn't have memory leaks”, but only if you define “memory leaks” in an extremely perverse fashion: why would I care that my program “doesn't have memory leaks” but instead wastes gigabytes of memory in “inactive caches”? If it looks like a memory leak, acts like a memory leak, and makes life painful like a memory leak, then it probably is a memory leak – and proponents of tracing GC wouldn't convince me otherwise.

And once you realize that the elimination of memory leaks in the layman's sense is impossible, you immediately realize that tracing GC is not needed, and you can think about how to live in a world without one.

And then you end up with Rust – not by design but by accident and/or observation.

resource usage concerns

Posted Mar 18, 2025 23:49 UTC (Tue) by shahms (subscriber, #8877) [Link] (1 responses)

The Java documentation notoriously had to come up with another term after Sun marketed so heavily on GC eliminating memory leaks, so instead of "memory leaks" Java has lots and lots of "unintentional object retention".

resource usage concerns

Posted Mar 19, 2025 8:57 UTC (Wed) by taladar (subscriber, #68407) [Link]

A slightly different type is the space leak due to lazy evaluation in Haskell (while we are on the topic of alternate terminology for similar problems).

resource usage concerns

Posted Mar 18, 2025 19:33 UTC (Tue) by mb (subscriber, #50428) [Link] (10 responses)

>Memory leaks are not considered problematic in Rust

Of course they are considered problematic in Rust.
And leaks are hard to code by accident in Rust.

One basically can't simply forget to destroy an object, thanks to automatic drops ("forget" is an explicit operation).
Of course it's possible to program memory leaks by creating cyclic references (although cyclic references are harder in Rust than in most other common languages) or lists that are never freed.

But Rust makes it much harder than, say, C to just accidentally forget to free something in an obscure error path.
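
For illustration, here is a minimal sketch of the cyclic-reference case (hypothetical code, not from any real project): entirely safe Rust, no forget() in sight, yet neither allocation is ever freed:

    use std::cell::RefCell;
    use std::rc::Rc;

    // A node that may point at another node, which allows reference cycles.
    struct Node {
        next: Option<Rc<RefCell<Node>>>,
    }

    fn main() {
        let a = Rc::new(RefCell::new(Node { next: None }));
        let b = Rc::new(RefCell::new(Node { next: Some(Rc::clone(&a)) }));
        // Close the cycle: a -> b -> a. Both strong counts are now 2, so
        // neither count reaches zero when `a` and `b` go out of scope, and
        // both heap allocations are leaked.
        a.borrow_mut().next = Some(Rc::clone(&b));
    }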

resource usage concerns

Posted Mar 19, 2025 14:08 UTC (Wed) by cultpony (subscriber, #167240) [Link] (9 responses)

Usually the more common case of "memory leak" in Rust is that the compiler has pushed the deallocation so far back that it's effectively happening on program termination. Sometimes you have to be explicit about when you'd like memory to be freed rather than letting the compiler figure out that it is in fact valid to drop the data by waiting for program termination (which Rust considers to be fine too).

resource usage concerns

Posted Mar 19, 2025 23:05 UTC (Wed) by NYKevin (subscriber, #129325) [Link] (8 responses)

Just to clarify for those less familiar with Rust:

Rust is a systems language with manual memory management, just like C, but drenched in a thick layer of syntactic sugar (to automatically free things when you're done using them) and static analysis (to detect when you free something before you're done using it). It does not make arbitrary decisions about when to deallocate things (contrast with a GC'd language, which does make such decisions). If the compiler did not deallocate something for you, it means that you have (knowingly or not) asked the compiler to keep that thing alive.

In most cases, if something no longer needs to exist, you can std::mem::drop() it, or just return from whichever scope owns the allocation. drop() is a safe function, meaning the compiler will not let you use anything that has potentially been dropped (by either means), and in fact drop() is really just a convenience function that takes ownership, does nothing, and immediately returns. You can't drop static variables or anything that you don't own.
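
A minimal (hypothetical) illustration of that point:

    fn main() {
        let s = String::from("hello");
        std::mem::drop(s); // takes ownership; the String is freed here
        // println!("{}", s); // error[E0382]: borrow of moved value: `s`
    }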

There are objects with "more complicated" ownership models than that (a simple example being Rc/Arc), but those objects still have some notion of dropping (you can std::mem::drop() any variable, but if that variable is participating in some shared ownership chicanery, the shared allocation might outlive it).

There is also one other catch: Stack allocations always last until the function returns. That's not a Rust limitation, it's just how the stack works (at least, in any language that has a call stack). If a stack variable is moved from (or dropped early, but that's equivalent to moving it), what actually happens is that the variable's contents are memcpy'd into the new location, the drop flags are updated to indicate that the variable is now uninitialized garbage (and must not be dropped or otherwise used again), and the variable binding is deleted from the current namespace (so you can't use it again). But the stack allocation is still physically occupied until the function returns. This is rarely a problem because we usually allocate large objects on the heap (plus, the optimizer can do all sorts of things with the physical stack layout anyway).

resource usage concerns

Posted Mar 20, 2025 1:06 UTC (Thu) by wahern (guest, #37304) [Link] (6 responses)

> Stack allocations always last until the function returns. That's not a Rust limitation, it's just how the stack works (at least, in any language that has a call stack)

That's not how C works. Automatic variables, *including* VLAs, are scoped to blocks. If they weren't, then you'd have problems with loops and stack overflow. Allocations using the common "alloca" builtin do last for the entire function, but VLAs were deliberately given different semantics.

resource usage concerns

Posted Mar 20, 2025 4:13 UTC (Thu) by NYKevin (subscriber, #129325) [Link]

Yes, sure, you can allocate and deallocate multiple separate blocks per function, but the point is that you *cannot* point to an arbitrary stack allocation and say "just deallocate that right now, without touching anything else." The physical structure of the stack is incapable of representing such an operation.

resource usage concerns

Posted Mar 20, 2025 13:40 UTC (Thu) by tialaramex (subscriber, #21167) [Link] (4 responses)

IIUC, although the _scope_ ends, the _allocation_ does not. The loop re-uses the allocation, and that's how some of the GC'd languages get that design mistake where they re-assign a single variable for each iteration rather than destroying that variable and conjuring a new one into existence with the same name. In C, because it doesn't have RAII or GC, the behaviour looks like it could be either, and so it's harder to realise that one of these approaches is wrong.

[If only one language had that mistake, or even if several did but didn't regard it as a mistake worth fixing, that would be a different matter; but in fact this mistake has happened several times and has been fixed in, IIRC, at least C# and Go.]

resource usage concerns

Posted Mar 20, 2025 22:39 UTC (Thu) by wahern (guest, #37304) [Link] (3 responses)

The stack does shrink. Example program:

#include <stdio.h>
#include <stdint.h>
#include <string.h>

__attribute__((noinline))
static void showfp(unsigned n, intptr_t otop) {
	intptr_t top = (intptr_t)__builtin_stack_address();
	printf("n:%u off:%td\n", n, (otop > top)? otop - top : top - otop);
}

int main(void) {
	unsigned n = 0;
	intptr_t top = (intptr_t)__builtin_stack_address();
	while (1 == scanf("%u", &n)) {
		char buf[n];
		memset_s(buf, n, 0, n);
		showfp(n, top);
	}
	return 0;
}

For `echo 200 100 5 | ./a.out` I get:

n:200 off:272
n:100 off:176
n:5 off:80

As the size of successive stack allocations decrease, so does the frame size.

resource usage concerns

Posted Mar 21, 2025 13:39 UTC (Fri) by tialaramex (subscriber, #21167) [Link] (2 responses)

Ah, a VLA, yes it makes sense that the VLA has to actually allocate. I hadn't considered VLAs in what I wrote.

I assume, since your example is a VLA that if you write a conventional C89 array or any other type it is not in fact creating and destroying the allocation.

resource usage concerns

Posted Mar 21, 2025 22:42 UTC (Fri) by wahern (guest, #37304) [Link] (1 responses)

Yes, tweaking that example program it seems a statically sized array, even if declared within a runtime conditional block at the end of the routine (after the VLA loop), is indeed allocated at the start. Not surprising (I didn't disbelieve in that respect), but now I wonder if, in the days before GCC and clang implemented stack probing, that behavior posed a security issue for functions that attempted to conditionally use a [non-VLA] stack allocation based on its own stack size check. Perhaps still something to keep in mind for some other compilers, both C and non-C.

For posterity: I've been using gcc version 14.2.0 (MacPorts gcc14 14.2.0_3+stdlib_flag) on an ARM M1 with these test cases. (__builtin_stack_address was too convenient, but not supported by the installed Apple clang toolchain, though it seems it is supported by the latest upstream clang release.)

resource usage concerns

Posted Mar 21, 2025 23:29 UTC (Fri) by tialaramex (subscriber, #21167) [Link]

Re-reading your original comment I observe that it is pretty emphatic about VLAs and yet I somehow ended up thinking only about the ordinary cases (the VLAs probably existed by the time I started getting paid to write C but I think I was still writing more or less C89 well into this century).

So that's on me. It did cause me to go find out what the current status is of formally supported (rather than hack-based) VLA-like Rust objects (i.e., a runtime-sized object living on the stack), and it seems like they're not close.

resource usage concerns

Posted Mar 21, 2025 17:18 UTC (Fri) by anton (subscriber, #25547) [Link]

> Stack allocations always last until the function returns. That's not a Rust limitation, it's just how the stack works (at least, in any language that has a call stack).
No. In every language with guaranteed tail-call optimization, all stack allocation ends at the latest before the tail-call. In Prolog, compilers sort the variables by lifetime, and stack-deallocate before every call, such that only live variables consume stack memory on the call. This reduces the memory consumption for recursive predicates that are not tail-recursive; I expect that there are other implementations of languages where recursion is important that use the same technique.

resource usage concerns

Posted Mar 18, 2025 20:38 UTC (Tue) by mbiebl (subscriber, #41876) [Link]

This is not a memory leak but an inefficient approach to read large files.

resource usage concerns

Posted Mar 18, 2025 21:01 UTC (Tue) by Phantom_Hoover (subscriber, #167627) [Link] (8 responses)

This is a pretty huge misunderstanding. Memory leaks are considered *safe* in Rust, because their behaviour is entirely well-defined, and so leaking memory is an explicitly permitted operation in safe Rust code. That is not the same as ‘not considering them problematic’, in the same way as using bubblesort on giant arrays is completely safe but obviously a bad idea.

resource usage concerns

Posted Mar 19, 2025 12:41 UTC (Wed) by Baughn (subscriber, #124425) [Link] (7 responses)

Safe and useful. Box::leak gives you an immutable, shared reference to any value that's valid forever, so I do that to the config file for my application — it's a memory leak, yes, technically, but I'd need to update it a hundred times to leak even a megabyte and it means I can easily store references to it.
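
A minimal sketch of that pattern (the Config type here is made up for illustration):

    struct Config {
        verbose: bool,
    }

    fn main() {
        // Box::leak trades one never-freed allocation for a &'static
        // reference that can be stored anywhere, with no lifetime juggling.
        let config: &'static Config = Box::leak(Box::new(Config { verbose: true }));
        println!("verbose: {}", config.verbose);
    }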

resource usage concerns

Posted Mar 19, 2025 20:28 UTC (Wed) by Phantom_Hoover (subscriber, #167627) [Link] (6 responses)

Yes, although that gets into an issue I passed over: the slippery nature of what a ‘memory leak’ really is. If you try to define it formally you’ll probably end up with ‘allocated memory with no reachable references’, and in informal practice it means ‘allocated memory which ought to be freed but never will’; Box::leak doesn’t leak memory in either sense, but it makes it trivial to do so by dropping the reference it returns. I believe Rust officially gave up on ‘leak safety’ as a goal shortly before 1.0 because there was no good way of defining ‘reachable references’ in situations like Rc cycles without a garbage collector.

resource usage concerns

Posted Mar 19, 2025 22:08 UTC (Wed) by excors (subscriber, #95769) [Link] (5 responses)

> ... what a ‘memory leak’ really is. If you try to define it formally you’ll probably end up with ‘allocated memory with no reachable references’

I think a more useful formal definition is "allocated memory that will not be accessed in the future". (Formal definitions are happy to rely on oracles that can see the future). "Unreachable" is just an approximation with the (very useful) property of being a computable function, so that's what practical GCs use.

But there are many variations of "reachable": referenced by another allocated object (cycles won't be collected), reachable from a root set (cycles will be collected), reachable from some integer on the stack that happens to look like a pointer (conservative vs precise), reachable even if you ignore weak references, etc. Those details are quality-of-implementation issues, they're not a fundamental part of what a memory leak is.

"Not accessed in the future" is much more fundamental. It's uncomputable in general, but a human (or sophisticated algorithm) can sometimes determine that a reachable object will never be used, and I think it's fair to call that a memory leak. Then you can say e.g. "A cache with a bad policy is another name for a memory leak" - it doesn't matter that the cache contents are technically reachable (https://devblogs.microsoft.com/oldnewthing/20060502-07/?p...)

(https://inside.java/2024/11/22/mark-scavenge-gc/ expresses the same idea: "An object is said to be live if it will be accessed at some time in the future execution of the mutator" and "GCs typically approximate liveness using pointer reachability". And e.g. https://people.cs.umass.edu/~emery/pubs/gcvsmalloc.pdf implements a "liveness-based oracle" in Java, by recording every allocation and memory access and then replaying the program, to test how a GC implementation compares against a theoretically optimal freeing of memory.)

Informally you'd add "...and is large enough and long-lived enough to care about" to the definition of memory leak, but that's very subjective. Neither GC nor RAII can completely save you from wasting memory on non-live objects, so you'll always end up having to profile and debug to find the ones worth caring about. (They'll save you a lot of effort compared to manual memory management, though.)

resource usage concerns

Posted Mar 20, 2025 4:19 UTC (Thu) by NYKevin (subscriber, #129325) [Link]

A cache with a bad policy might not be a memory leak even under the uncomputable definition. For example, it might be the case that every element is eventually accessed, but most of them are only accessed incredibly rarely (so rarely that it would be cheaper to evict them and recreate them as needed).

The definition I use (at my day job as an SRE) is even more pragmatic: A program is leaking memory if, when you graph its memory usage over the last (e.g.) 12 hours, it's roughly a straight line going up and to the right. But that requires you to actually have real monitoring, which some people apparently don't.

resource usage concerns

Posted Mar 20, 2025 9:13 UTC (Thu) by taladar (subscriber, #68407) [Link]

Technically most applications have a few pieces of data that won't be accessed in the future but are still kept around. For example, if you have an object that stores all your command-line options, including the listen IP and port, and no restart mechanism, those values likely won't be needed after the initial bind but will still be kept around.

A more elaborate example might be a work queue where a priority field is only used on enqueuing but still kept around until the task has been processed to completion.

Mostly that falls under your "is large enough to care about" but in general it is just a trade-off between being worth restructuring your entire application data structures to be able to free pieces you won't need independently and the amount of extra memory used.

resource usage concerns

Posted Mar 20, 2025 12:12 UTC (Thu) by farnz (subscriber, #17727) [Link] (2 responses)

You need a good definition of accessed for this, too - it's not just "accesses" in the sense of reads and writes, but also deallocations count as accesses (otherwise all memory is leaked by this definition, since there's a period between the last read/write and the deallocation, even if the program is careful to keep this small). It also needs to focus on the "right" set of accesses - you want, for example, to not always count main's stack as leaked since it's not freed until the end of the program, but you also don't want to count something as "not leaked" just because it happens that RAII will free it before the end of the program.

The details around "deallocation" are, IMO, the hard chunk of defining "will not be accessed in the future". We wouldn't consider let mut foo = Foo::new(); foo.do_the_thing(); /* 1 */ drop(foo); as having a leak just because at point /* 1 */ there's an allocated object that will not be accessed again, but you might want to define the program as having a leak if, at /* 1 */, it spawned a thread that did all the rest of the program's work apart from freeing foo.

resource usage concerns

Posted Mar 21, 2025 13:27 UTC (Fri) by khim (subscriber, #9252) [Link] (1 responses)

> but also deallocations count as accesses

If you count deallocation as access then we are back to square one, with tracing-GC-based languages never having any leaks while real-world Java programs waste gigabytes on stuff they will never need.

Not a very useful definition.

But if you look at the issue of memory leaks from a layman's perspective, more precisely a CFO's perspective, then the situation is much simpler: we don't care about bounded memory leaks at all. They don't raise our bill of materials unpredictably.

What we do care about are unbounded leaks: situations where the ratio between memory spent on "useful work" and "memory leaks" goes to zero.

And it's much easier to define what an unbounded memory leak is. Imagine that your program runs alongside an oracle that tells it whether a certain object would be touched in the future or not (without counting destructors/deallocators). Count the amount of memory it needs. Now run the real program with the same inputs. How much memory does that run need? The smaller the ratio the better, and if it's not bounded by anything then you have an unbounded memory leak.

P.S. Note that most real-world programs use more memory than they theoretically need to. Tracing-GC-based ones are especially egregious, since they usually need at least 2x more than the theoretical minimum (a simple, naïve mark-and-sweep algorithm requires 2x more to even be usable, while modern approaches can work with less, at drastically reduced efficiency). But as long as the ratio is bounded (you need to pay for 16GiB of memory if you plan to process 1GiB files, or something like that) the CFO can easily adjust the bill of materials. An unbounded leak, on the other hand, means you have no idea how much you would need to pay. And that is the critical difference.

resource usage concerns

Posted Mar 21, 2025 16:35 UTC (Fri) by farnz (subscriber, #17727) [Link]

> If you count deallocation as access then we are back to square one, with tracing-GC-based languages never having any leaks while real-world Java programs waste gigabytes on stuff they will never need.

This is why I said The details around "deallocation" are, IMO, the hard chunk of defining "will not be accessed in the future".

At one extreme, all allocated memory is leaked, because there's always a time when it's still allocated but has not yet been deallocated; at the other extreme, no allocated memory is leaked because all memory is implicitly freed by program exit.

You need to find a definition of "deallocated" that is useful for the case you're considering; for example, memory is considered deallocated for leak purposes if you use the language level facility to release it before the next language level memory allocation, function return, or function call. That way, you've allowed for RAII (you say that destructors run just before function return), but you've ensured that any route by which memory usage can grow unbounded is considered to be a leak, as are bounded leaks where you're "merely" late deallocating (such as your tracing GC example of having gigabytes allocated but never used).

A language level facility could be something like C's free for heap objects and leaving the scope for stack objects, but for a language like Python, you could define it as "set the last reference to None or a different object, and ensure that there are no cyclical references" to get a useful definition.

resource usage concerns

Posted Mar 19, 2025 13:17 UTC (Wed) by jhe (subscriber, #164815) [Link] (2 responses)

I did a checkout of the uutils more(1) source code and ended up here: https://github.com/uutils/coreutils/blob/d34eb2525195fa14...

Isn't it bad practice to do the checks for existence or inode type before open()? TOCTOU?

I wasn't able to check whether GNU coreutils does the same, as coreutils does not contain more(1). Which throws up another bunch of questions, but I will stop here.

resource usage concerns

Posted Mar 19, 2025 13:30 UTC (Wed) by intelfx (subscriber, #130118) [Link] (1 responses)

> Isn't it bad practice to do the checks for existence or inode type before open()? TOCTOU?

It's not a security boundary. It is only there to give a better error message (although the second check for the file existence looks like it could be safely dropped as it doesn't even provide a better error). Any TOU errors will be caught in File::open() a few lines below.

resource usage concerns

Posted Mar 19, 2025 13:51 UTC (Wed) by intelfx (subscriber, #130118) [Link]

>> Isn't it bad practice to do the checks for existence or inode type before open()? TOCTOU?

Although, of course, nothing stops that code from matching on `ErrorKind::IsADirectory` in the same match{} statement instead of duplicating the error handling blurb.

So you're right that the code _is_ sloppy. But I'd say not exactly to the level of a security issue.
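
A sketch of that shape (hypothetical code, not the actual uutils implementation; ErrorKind::IsADirectory is stable as of Rust 1.83):

    use std::fs::File;
    use std::io::ErrorKind;

    fn open_input(path: &str) -> Option<File> {
        // Classify the failure from open() itself rather than checking
        // metadata beforehand, which avoids the TOCTOU window entirely.
        match File::open(path) {
            Ok(file) => Some(file),
            Err(e) if e.kind() == ErrorKind::IsADirectory => {
                eprintln!("more: {}: Is a directory", path);
                None
            }
            Err(e) => {
                eprintln!("more: {}: {}", path, e);
                None
            }
        }
    }

    fn main() {
        open_input("/etc");
    }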

Weakened license protection

Posted Mar 18, 2025 17:21 UTC (Tue) by lmb (subscriber, #39048) [Link] (21 responses)

I'm also one of the people who're concerned about the weakened protection that comes with the MIT license over a GPL/copyleft variant, which over time erodes the shared commons and is much more open to exploitation.

That's saddening, because otherwise, I'm a huge fan of Rust vs C(++), static linking aside.

Weakened license protection

Posted Mar 18, 2025 22:03 UTC (Tue) by tchernobog (guest, #73595) [Link] (19 responses)

I am with you in the general case.

However, in the case of such base utilities, you basically have to provide bug-for-bug compatibility with GNU coreutils by now.

I kinda doubt a company will take these utils, close-source them, and resell them without redistributing sources. It would bring only marginal benefit.

I am much more worried about new, innovative implementations with a higher degree of complexity. For instance, rsync or the new ripgrep implementation are much more sophisticated and would be more worrisome without copyleft.

But most of these tools are painful to write mostly because of wrapping the POSIX or Windows APIs correctly, not because they are inherently hard to code.

Weakened license protection

Posted Mar 19, 2025 0:06 UTC (Wed) by parametricpoly (subscriber, #143903) [Link] (13 responses)

Those who value copyleft should realize that C isn't the state of the art anymore.

Yes, there's C23, but the ML family of languages is much better if correctness is valued. Algebraic types are more expressive, dependent typing allows specifying useful invariants, and automatic memory management makes a lot of sense now that systems have 64+ gigabytes of RAM. Parsing those languages is easier because the grammar is more straightforward. C has some nasty limitations: the lack of modules and the need for a complex preprocessor make incremental and efficient parsing almost impossible.

If people are going to switch to new apps, they will be making the switch based on the technical merits. If a Rust app is more secure, less crash prone, and faster to develop, it's a big win for the users. Now, to avoid being replaced by non-copyleft clones, new copyleft apps are needed. This means GNU needs to come up with new languages. I don't think Guile helps here.

Weakened license protection

Posted Mar 19, 2025 0:57 UTC (Wed) by Paf (subscriber, #91811) [Link] (11 responses)

Sorry, why can’t GNU use Rust?

Weakened license protection

Posted Mar 19, 2025 8:52 UTC (Wed) by anselm (subscriber, #2796) [Link] (9 responses)

I obviously don't speak for the GNU project, but I would assume they prefer code they can compile with gcc, and the gcc-based Rust compiler isn't quite there yet.

Weakened license protection

Posted Mar 19, 2025 11:42 UTC (Wed) by excors (subscriber, #95769) [Link] (8 responses)

The gcc-based Rust compiler is still a long way ahead of the gcc-based compiler for a hypothetical GNU language that hasn't been invented yet.

It seems typical for a successful new language to take 10-15 years to reach a reasonable level of maturity and acceptance. If there's an urgent need to defend copyleft, you can't afford to pause and build a whole new language first.

Weakened license protection

Posted Mar 19, 2025 12:18 UTC (Wed) by anselm (subscriber, #2796) [Link] (7 responses)

> The gcc-based Rust compiler is still a long way ahead of the gcc-based compiler for a hypothetical GNU language that hasn't been invented yet.

Which is, if anything, an argument for finishing the gcc-based Rust compiler, rather than coming up with an entirely new language from scratch.

I don't believe that the GNU project has a problem in principle with Rust, the language. The fact that a Rust frontend for gcc is in the works seems to suggest otherwise.

Of course if you're a “GPL maximalist” it kinda sucks if people who used to use the GPL'ed coreutils in C are jumping ship to a different package which is technically superior, coincidentally written in Rust, and unfortunately happens to be more liberally licensed. Having said that, if the GNU project is primarily interested in a more modern coreutils replacement for the mythical “GNU operating system”, then once gcc-rs can compile uutils it can simply declare that uutils is now “part of the GNU operating system” much like, e.g., X11 or TeX (neither of which were GPL-licensed, nor part of the GNU project) were stipulated to be “part of the GNU operating system” back when the idea was new.

In any case there is certainly no urgent need for the GNU project to come up with an entirely new “GNU language” just to be able to implement a new version of the GPL coreutils. The GNU project could always write their own version, under the GPL, in Rust, to be compiled with gcc-rs once that is ready. It's just that right now the GNU project may perhaps be excused for not doing development in Rust while their own compiler can't deal with it yet.

Weakened license protection

Posted Mar 19, 2025 13:38 UTC (Wed) by ceplm (subscriber, #41334) [Link] (4 responses)

> I don't believe that the GNU project has a problem in principle with Rust, the language

Actually, I am not sure about that, and I am not even sure we shouldn't have a problem.

https://softwarefreedom.org/podcast/2009/jul/07/0x11/

Weakened license protection

Posted Mar 19, 2025 14:38 UTC (Wed) by rahulsundaram (subscriber, #21946) [Link] (3 responses)

> Actually, I am not sure about, and I am not even sure we shouldn't have a problem.

> https://softwarefreedom.org/podcast/2009/jul/07/0x11/

I don't see any relevance of this podcast to Rust. Why would FSF/GNU have any problems at all with Rust, and if they have a problem, have they explained it?

Weakened license protection

Posted Mar 19, 2025 15:01 UTC (Wed) by ceplm (subscriber, #41334) [Link] (2 responses)

The unfortunate point of the podcast is that any language which is not sufficiently old (and Rust certainly isn't) is at risk of being attacked by patent trolls. What if somebody manages to patent a borrow checker?

Weakened license protection

Posted Mar 19, 2025 15:08 UTC (Wed) by daroc (editor, #160859) [Link] (1 responses)

Obviously patent trolls are a huge problem for small and independent projects. But in this case, there are plenty of companies using Rust that have the resources to contend with them. Rust is pretty clearly prior art, but it's not even the first language to use a borrow checker. Cyclone is older, and there's academic research going back a bit before that.

Cyclone was released in 2001, so even if someone had a patent before that which they could argue covered borrow checking, it has pretty clearly expired by now.

There are absolutely risks to using newer programming languages, but I'm not convinced that patent encumbrance is a particular problem in Rust's case.

Weakened license protection

Posted Mar 20, 2025 9:06 UTC (Thu) by taladar (subscriber, #68407) [Link]

Also, if using GNU means always being a whole patent expiry behind everyone else they might as well shut down the project now.

A new "GNU language"

Posted Mar 19, 2025 14:07 UTC (Wed) by farnz (subscriber, #17727) [Link]

Against that, we're at about the right time in Rust's lifecycle for the Next Big Synthesis of Ideas (NBSoI) in programming language design to come together and produce something that's practically useful and academically interesting. If someone's going to do that under the GNU umbrella, that'd be great.

Note, though, that the NBSoI is not the only way to end up with a new language - you can also have languages that are basically the same combination of ideas as existing languages, but with a different syntax or emphasis (e.g. the huge family of Lisp-like languages). It's just that the NBSoI is where things get interesting, since it's where techniques move from "great in theory, lousy in practice" to "this is usable now".

Weakened license protection

Posted Mar 20, 2025 23:22 UTC (Thu) by jwakely (subscriber, #60262) [Link]

>I don't believe that the GNU project has a problem in principle with Rust, the language. The fact that a Rust frontend for gcc is in the works seems to suggest otherwise.

The GNU project doesn't control GCC, so I don't think you can draw any conclusions about GNU's view on Rust from the existence of gccrs.

Weakened license protection

Posted Mar 19, 2025 11:46 UTC (Wed) by tux3 (subscriber, #101245) [Link]

Perhaps gcc-rs feels like a prerequisite?
There is no technical issue with shipping GPL software that depends on an MIT compiler, but I imagine this wouldn't feel very satisfying from the GNU point of view.

Weakened license protection

Posted Mar 19, 2025 21:38 UTC (Wed) by jmalcolm (subscriber, #8876) [Link]

I guess the "safe" language in the current GCC suite is Ada (GNAT).

Weakened license protection

Posted Mar 19, 2025 10:53 UTC (Wed) by mathstuf (subscriber, #69389) [Link]

> For instance rsync

`rclone` already exists: <https://rclone.org/>. It's not a drop-in, but it also supports way more remote storage protocols (e.g., I `rclone` my `restic` backups to Google Drive and Backblaze (S3-ish) with it).

Weakened license protection

Posted Mar 21, 2025 22:39 UTC (Fri) by ndiddy (subscriber, #167868) [Link] (3 responses)

> I kinda doubt a company will take these utils, close source them, and resell them without redistributing sources. It would bring only marginal benefit.

There was a podcast interview here: https://youtu.be/5qTyyMyU2hQ?t=1270 with the lead uutils maintainer where he brought up that some car manufacturers had already started using uutils in their products instead of the GNU core utils because it means they don't have to comply with the GPL. From a corporate standpoint, when you have one set of tools where you have to comply with the GPL, and then a drop-in replacement for them where you don't, of course you'll use the tools that don't require GPL compliance.

Weakened license protection

Posted Mar 22, 2025 9:18 UTC (Sat) by Wol (subscriber, #4433) [Link]

> > I kinda doubt a company will take these utils, close source them, and resell them without redistributing sources. It would bring only marginal benefit.

What it DOES bring them is a big reduction in pain. If I can ship a product, based on a publicly available tree, without all the hassle of tracking, responding to requests, etc etc, then that's a big attraction.

And regardless of whether you're an engineer, a programmer, an analyst, people at the sharp end like to collaborate. It's bean counters who all too often don't see the benefit of collaboration, but they do see the cost of getting sued.

What we need is a GPL-lite, that contains all the downstream protections, and rather than saying "you have to share the source" replaces it with "you must develop in public, and tell your customers where to find it". Basically, it has to be publicly readable, 3rd-party hosted, and advertised to upstream and downstream alike.

At the end of the day, engineers want to share, but they don't want all the GPL Administrative Hassle that comes with the GPL. All bean counters can see is the cost. The GPL is making the wrong person pay! There's a good chance I will push my changes upstream because I can see the benefit. If I don't, upstream may (or may not) mine my respository because they see a benefit. And any customer who wants the source may have a bit of grief working out exactly which source they've got, but they have got it (and if I can't tell them, that may well be a cost to me). (Programming in Excel it's costing me dear at the moment!)

Cheers.
Wol

Weakened license protection

Posted Mar 23, 2025 1:48 UTC (Sun) by himi (subscriber, #340) [Link] (1 responses)

That doesn't make much sense, though - the GPL in this case applies specifically to the coreutils code and derivatives, not to any higher level aggregation. Unless the car manufacturers are modifying the code, the only requirement for GPL compliance is documenting that they got it from upstream; given the MIT license requires copyright attribution to persist, the practical difference is zero - a little bit of text listing copyright attributions and pointing at the upstream source, or a little bit of text that only lists copyright attributions.

Unless they're actually modifying the code, of course. Which . . . well, for coreutils? I'd have to assume that's just going to be compilation support for whatever platform they're using, in which case it'd make far more sense to submit patches upstream than to maintain their own fork in-house, and the same logic would apply whether they're using GNU coreutils or uutils.

It sounds like either the companies in question don't actually understand the way the GPL works (which shouldn't be an issue if they have competent lawyers), or they're pulling an Apple and avoiding any GPLed code on ideological grounds.

Weakened license protection

Posted Mar 23, 2025 7:09 UTC (Sun) by Wol (subscriber, #4433) [Link]

> Unless the car manufacturers are modifying the code, the only requirement for GPL compliance is documenting that they got it from upstream;

"6. Conveying Non-Source Forms.

You may convey a covered work in object code form under the terms of sections 4 and 5, provided that you also convey the machine-readable Corresponding Source under the terms of this License, in one of these ways:

a) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by the Corresponding Source fixed on a durable physical medium customarily used for software interchange."

Have you read the GPL? Have you understood it? I haven't quoted the entirety of section 6, but if you are a business there is a hell of a lot more than just "documenting you got it from upstream". You - CORPORATELY - are on the hook for making sure your customer can get the source. And that is expensive administrative hassle companies would much rather avoid.

There are "sort of" getouts, 6c, and 6e, but they're not aimed at corporates, and they still come with grief companies don't want. I've only just noticed 6e, but unless the company controls that location, they're probably not complying with it, and if they do control it it's more hassle that again they don't want.

Cheers,
Wol

Weakened license protection

Posted Mar 19, 2025 11:16 UTC (Wed) by ballombe (subscriber, #9523) [Link]

Also that means that uutils cannot simply be a port of coreutils to Rust but needs to be a clean-room new implementation. A straight port would be less likely to introduce regressions.

Performance concerns when heavily used in scripts ?

Posted Mar 18, 2025 17:23 UTC (Tue) by wtarreau (subscriber, #51152) [Link] (37 responses)

One should be careful: many of these tools are heavily used in scripts, and the distro will run much slower with the replacements, based on a quick test here after installing the rust-coreutils package:

$ time for i in {1..2000}; do /bin/true;done
real 0m0.993s
user 0m0.568s
sys 0m0.479s

$ time for i in {1..2000}; do /usr/lib/cargo/bin/coreutils/true;done
real 0m4.623s
user 0m0.784s
sys 0m2.530s

That's 4.6 times slower. Same for other tools, wc is 3 times slower:

$ time for i in {1..2000}; do /bin/wc /dev/null >/dev/null;done
real 0m1.294s
user 0m0.627s
sys 0m0.719s

$ time for i in {1..2000}; do /usr/lib/cargo/bin/coreutils/wc /dev/null >/dev/null;done
real 0m3.847s
user 0m0.916s
sys 0m2.983s

and "expr" almost 3 times as well:

$ time for i in {1..2000}; do /bin/expr $RANDOM + $RANDOM >/dev/null;done
real 0m1.316s
user 0m0.631s
sys 0m0.750s

$ time for i in {1..2000}; do /usr/lib/cargo/bin/coreutils/expr $RANDOM + $RANDOM >/dev/null;done
real 0m3.796s
user 0m0.873s
sys 0m2.998s

I suspect that it's caused by the multi-call binary, which is enormous (23 MB):

$ ls -la /bin/true
-rwxr-xr-x 1 root root 26936 Apr 5 2024 /bin/true

$ ls -la /usr/lib/cargo/bin/coreutils/true
lrwxrwxrwx 1 root root 25 Feb 25 2024 /usr/lib/cargo/bin/coreutils/true -> ../../../../bin/coreutils
$ ls -la /usr/bin/coreutils
-rwxr-xr-x 1 root root 23361264 Feb 25 2024 /usr/bin/coreutils

That's definitely something to take into consideration! Loading (and dynamically linking) a 23MB executable thousands of times per second in scripts is going to cost a lot for certain usages. I've seen numerous utilities in the field that used to be limited by fork-exec speed calling /bin/echo, /bin/expr and even /bin/true, and here the extra cost might make some users particularly unhappy and want to roll back or migrate to another distro.

Also, one question concerns the fact that, on an Ubuntu 24.04 system, the sum of all the individual tools reportedly supported by the coreutils binary totals 5.6 MB. That's a 4x inflation to reach 23 MB. Are there *that* many more features to inflate it that much?

Performance concerns when heavily used in scripts ?

Posted Mar 18, 2025 17:38 UTC (Tue) by tzafrir (subscriber, #11501) [Link] (4 responses)

Speaking of multi-call binaries:

$ time for i in {1..2000}; do /bin/true;done

real 0m1.158s
user 0m0.803s
sys 0m0.408s

$ time for i in {1..2000}; do busybox true;done

real 0m0.816s
user 0m0.559s
sys 0m0.302s

Performance concerns when heavily used in scripts ?

Posted Mar 18, 2025 17:50 UTC (Tue) by adobriyan (subscriber, #30858) [Link]

You all need to post baselines:

$ strace -f ./true
execve("./true", ["./true"], 0x7ffe15f24de8 /* 73 vars */) = 0
exit(0) = ?
+++ exited with 0 +++

$ time for i in $(seq 2000); do ./true ; done
real 0m0.514s
user 0m0.289s
sys 0m0.292s

It's a fun exercise to write a minimal assembly program which re-execs itself N times and then exits.
$ time ./a.out 69000
real 0m1.000s
user 0m0.017s
sys 0m0.973s

Performance concerns when heavily used in scripts ?

Posted Mar 18, 2025 17:56 UTC (Tue) by wtarreau (subscriber, #51152) [Link]

Yes but your busybox binary is certainly quite far from 23 MB dynamically linked :-)

Performance concerns when heavily used in scripts ?

Posted Mar 18, 2025 17:59 UTC (Tue) by dkg (subscriber, #55359) [Link]

Your insanely fast busybox doesn't match what i'm seeing here. On debian amd64, on a semi-recent Framework laptop, with a hot cache for all these binaries:
0 dkg@bob:~$ time for i in {1..2000}; do /usr/bin/coreutils true; done

real	0m3.020s
user	0m1.946s
sys	0m1.017s
0 dkg@bob:~$ time for i in {1..2000}; do /usr/bin/true; done

real	0m1.410s
user	0m0.996s
sys	0m0.404s
0 dkg@bob:~$ time for i in {1..2000}; do /usr/bin/busybox true; done

real	0m1.864s
user	0m1.285s
sys	0m0.558s
0 dkg@bob:~$ 
To be fair, busybox's dynamic linking is limited to libc and libresolv, while uutils' multicall coreutils binary links to libc, libgcc_s, libselinux, libm, and libpcre2-8. (/usr/bin/true only links to libc.)

If the performance costs come from the dynamic linking itself (as they surely would for a trivial main() like true), it's no surprise that the binary with the heavier load of libraries to link to would cost more.

Performance concerns when heavily used in scripts ?

Posted Mar 18, 2025 19:11 UTC (Tue) by ma4ris8 (subscriber, #170509) [Link]

There is room for performance optimization on the Rust side.
Doing a naive "exit(0)" in main in Rust yields a better result using the "origin" crate:
https://crates.io/crates/origin

time for i in {1..2000}; do /bin/true;done

real 0m4.322s
user 0m0.406s
sys 0m4.068s

~/rust/true-bin$ time for i in {1..2000}; do ./target/release/true-bin;done

real 0m2.681s
user 0m0.233s
sys 0m2.625s

------------ main.rs --------------
#![no_std]
#![no_main]

use core::panic::PanicInfo;
use origin::program;

#[panic_handler]
fn sleeping(_x: &PanicInfo) -> ! {
    loop {}
}

#[unsafe(no_mangle)]
unsafe fn origin_main(_argc: usize, _argv: *mut *mut u8, _envp: *mut *mut u8) -> i32 {
    program::exit(0)
}
---- main.rs ends ----

--------- Cargo.toml is --------------
[package]
name = "true-bin"
version = "0.1.0"
edition = "2024"

[dependencies]
# Origin can be depended on just like any other crate. For no_std, disable
# the default features, and add the desired features.
origin = { version = "0.25.1", default-features = false, features = ["origin-start"] }
panic-halt = "1.0.0"

# Let's optimize for small size!
[profile.release]
# Give the optimizer more lattitude to optimize and delete unneeded code.
lto = true
# "abort" is smaller than "unwind".
panic = "abort"
# Tell the optimizer to optimize for size.
opt-level = "z"
# Delete the symbol table from the executable.
strip = true
---------- Cargo.toml ends ------------

------------ build.rs is ----------------------
fn main() {
    // Pass -nostartfiles to the linker. In the future this could be obviated
    // by a `no_entry` feature: <https://github.com/rust-lang/rfcs/pull/2735>
    println!("cargo:rustc-link-arg=-nostartfiles");

    // The following options optimize for code size!

    // Tell the linker to exclude the .eh_frame_hdr section.
    println!("cargo:rustc-link-arg=-Wl,--no-eh-frame-hdr");
    // Tell the linker to make the text and data readable and writable. This
    // allows them to occupy the same page.
    println!("cargo:rustc-link-arg=-Wl,-N");
    // Tell the linker to exclude the `.note.gnu.build-id` section.
    println!("cargo:rustc-link-arg=-Wl,--build-id=none");
    // Disable PIE, which adds some code size.
    println!("cargo:rustc-link-arg=-Wl,--no-pie");
    // Disable the `GNU-stack` segment, if we're using lld.
    println!("cargo:rustc-link-arg=-Wl,-z,nognustack");
}
---------------- build.rs ends ------------------------

Compilation: cargo build --release

Performance concerns when heavily used in scripts ?

Posted Mar 18, 2025 17:59 UTC (Tue) by barryascott (subscriber, #80640) [Link]

Does uutils have sleep?
If so, try sleep 1000000 in one terminal and then retest.
Does having the coreutils binary already in memory optimise your tests, via the kernel sharing the memory?
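uutils does ship a sleep, so the suggested experiment would look something like this (keeping one instance of the multicall binary resident while timing another):

$ /usr/bin/coreutils sleep 1000000 &
$ time for i in {1..2000}; do /usr/bin/coreutils true; done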

Performance concerns when heavily used in scripts ?

Posted Mar 18, 2025 18:13 UTC (Tue) by eru (subscriber, #2753) [Link] (1 responses)

Some old distributions and unix versions implemented /bin/true as simply an empty file that was marked executable. On my machine this implementation takes 3x the time of the normal /bin/true, using your test loop.

Performance concerns when heavily used in scripts ?

Posted Mar 19, 2025 1:28 UTC (Wed) by lkundrak (subscriber, #43452) [Link]

Yes. That is because shell scripts predate shebangs -- a file that lacks the binary executable header as well as a shebang is considered a /bin/sh script. Therefore, whatever your distro has for /bin/sh gets executed to interpret the zero-length file. Whether it's dash or bash, it's still going to be slower than a C implementation of /bin/true.
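This is easy to demonstrate (a minimal sketch; the file name is arbitrary):

$ touch empty-true && chmod +x empty-true
$ ./empty-true; echo $?
0

execve() fails with ENOEXEC on the headerless file, and the calling shell falls back to interpreting it as a shell script, which runs zero commands and exits 0.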

/bin/true (was Performance concerns when heavily used in scripts ?)

Posted Mar 18, 2025 18:34 UTC (Tue) by dskoll (subscriber, #1630) [Link] (16 responses)

OK, I understand the desire for Rust wrt safety, but really... is it necessary to rewrite true (or false) in Rust rather than C? Really?

/bin/true (was Performance concerns when heavily used in scripts ?)

Posted Mar 18, 2025 18:39 UTC (Tue) by atai (subscriber, #10977) [Link] (1 responses)

Rust is truer

/bin/true (was Performance concerns when heavily used in scripts ?)

Posted Mar 18, 2025 19:35 UTC (Tue) by ma4ris8 (subscriber, #170509) [Link]

Following origin's tiny-binary instructions, I got the Rust binary slightly faster:

objcopy -R .eh_frame -R .comment ./target/release/true-bin even-smaller-true-bin

./target/release/true-bin: 712 bytes
even-smaller-true-bin: 504 bytes

~/rust/true-bin$ time for i in {1..2000}; do /bin/true;done
real 0m4.393s
user 0m0.407s
sys 0m4.166s

~/rust/true-bin$ time for i in {1..2000}; do ./target/release/true-bin;done
real 0m2.688s
user 0m0.289s
sys 0m2.594s

~/rust/true-bin$ time for i in {1..2000}; do ./even-smaller-true-bin ; done
real 0m2.226s
user 0m0.430s
sys 0m1.977s

objdump --disassemble-all ./even-smaller-true-bin

./even-smaller-true-bin: file format elf64-x86-64

Disassembly of section .interp:

0000000000200078 <.interp>:
200078: 2f (bad)
200079: 6c insb (%dx),%es:(%rdi)
20007a: 69 62 36 34 2f 6c 64 imul $0x646c2f34,0x36(%rdx),%esp
200081: 2d 6c 69 6e 75 sub $0x756e696c,%eax
200086: 78 2d js 0x2000b5
200088: 78 38 js 0x2000c2
20008a: 36 2d 36 34 2e 73 ss sub $0x732e3436,%eax
200090: 6f outsl %ds:(%rsi),(%dx)
200091: 2e 32 00 cs xor (%rax),%al

Disassembly of section .text:

00000000002000c8 <.text>:
2000c8: 48 89 e7 mov %rsp,%rdi
2000cb: 55 push %rbp
2000cc: e9 00 00 00 00 jmp 0x2000d1
2000d1: b8 e7 00 00 00 mov $0xe7,%eax
2000d6: 31 ff xor %edi,%edi
2000d8: 0f 05 syscall
2000da: 0f 0b ud2

/bin/true (was Performance concerns when heavily used in scripts ?)

Posted Mar 18, 2025 19:32 UTC (Tue) by jkingweb (subscriber, #113039) [Link] (12 responses)

I'm sure in the case of true and false it's just done for completeness: there's far more value in a complete set of utilities than "here are the interesting ones; go somewhere else for the rest".

/bin/true (was Performance concerns when heavily used in scripts ?)

Posted Mar 18, 2025 19:51 UTC (Tue) by ma4ris8 (subscriber, #170509) [Link] (11 responses)

It is good to have all the binaries first, and optimize later.
It was quite interesting to see that the original

/bin/true is 32544 bytes and links three libraries.
Rust's binary went down to 712 bytes,
and then to 504 bytes after removing two unnecessary ELF sections.

In Rust, no memory allocator, libc, or vDSO reference was needed.
All that needed to be done was to use the CPU registers and the stack, and call the kernel's exit() syscall with a zero argument.

In this case "unsafe" needed to be used, since this doesn't use the std library,
but the safety is still near-obvious.
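For the curious, the whole program boils down to one raw syscall; here is a minimal x86-64 sketch of the same idea (hypothetical code, not what the origin crate actually does internally, and it needs the same -nostartfiles link flag as above):

#![no_std]
#![no_main]

use core::arch::asm;
use core::panic::PanicInfo;

#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
    loop {}
}

#[no_mangle]
extern "C" fn _start() -> ! {
    unsafe {
        // exit_group(0): syscall number 231 on x86-64, status in %rdi.
        asm!("syscall", in("rax") 231usize, in("rdi") 0usize, options(noreturn));
    }
}

This matches the disassembly in the earlier comment: load 0xe7 (231) into %eax, zero %edi, syscall.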

/bin/true (was Performance concerns when heavily used in scripts ?)

Posted Mar 18, 2025 22:30 UTC (Tue) by willy (subscriber, #9762) [Link] (10 responses)

But does it implement true --help, true --version and true --usage?

(This is something that has always bugged me about GNU; tools shouldn't be forced to have those options)

/bin/true (was Performance concerns when heavily used in scripts ?)

Posted Mar 19, 2025 0:58 UTC (Wed) by josh (subscriber, #17465) [Link]

I haven't ever seen --usage, and /bin/true doesn't seem to support it.

/bin/true bloat, and /bin/cat

Posted Mar 19, 2025 16:47 UTC (Wed) by hmh (subscriber, #3838) [Link] (6 responses)

Huh, it goes further. The output of GNU /bin/true --help is fully i18n'd and l10n'd by the distros, for example...

It is still fast when you use it as you're supposed to (i.e. no parameters), and small enough on anything that isn't embedded (which would use busybox or toybox instead). And it is not going to create security issues that are not present everywhere else (because it just links to glibc, and it is using glibc's i18n). But yeah, it *is* bloated for no good reason: there's a man page, so /bin/cat really doesn't need or benefit in any way from --help or --version, and a minimal /bin/true and /bin/false would be a lot smaller.

That said, on anything worthy of note, true and false are going to be shell builtins.

Now, GNU /bin/cat is optimized all to heck; that would be a more interesting one to compare with the Rust version...

Anyway, this particular Rust project explicitly opted into the dependency-hell pattern, and thus IMO it is too much of a dependency-chain vulnerability for something that I'd run :-(

/bin/true bloat, and /bin/cat

Posted Mar 20, 2025 11:25 UTC (Thu) by chris_se (subscriber, #99706) [Link] (5 responses)

> Anyway, this particular Rust project explicitly opted into the dependency-hell pattern, and thus IMO it is too much of a dependency-chain vulnerability for something that I'd run :-(

Yes, that's my main issue with the current state of affairs w.r.t. Rust. I rather like the language itself, but I'm utterly baffled that many Rust people saw what was going on with npm and thought "sure, let's do more of that". (Ok, it's not quite as bad yet as leftpad, but still...)

/bin/true bloat, and /bin/cat

Posted Mar 21, 2025 8:36 UTC (Fri) by taladar (subscriber, #68407) [Link] (4 responses)

I will never understand the people who complain about the number of dependencies without taking into account the size of those dependencies. Sure, C or C++ have a lower number, but that is mostly because each dependency is artificially inflated to a huge size, because the build tooling is so bad that nobody wants to split them up into separate libraries.

I'd much rather have a hundred small Rust dependencies than one Qt or OpenSSL that comes with hundreds of critical bugs and security holes that do not even affect the part of it I am using, but whose related upgrades and CVEs I have to deal with anyway.

/bin/true bloat, and /bin/cat

Posted Mar 21, 2025 11:26 UTC (Fri) by excors (subscriber, #95769) [Link] (1 responses)

One of the significant concerns about number of dependencies is the vulnerability to supply chain attacks, and I think small dependencies actually make that worse, even if the number remains constant.

In C++, if I want something very simple like a circular buffer class, I might find it as part of Boost. That's a huge dependency for such a little feature, which does have some drawbacks. But because it's huge I can be confident there are many developers working on the project. There are review processes, and if one developer tries to slip in something naughty then there's a reasonable chance another developer will spot it before it's released. Security researchers will be running their tools over it. If a vulnerability is reported, there are responsible maintainers who will respond promptly.

If I want the same in Rust, I'll probably find a library that is just one random guy on GitHub. A lot of the code has probably been reviewed by exactly zero other people. There is nothing to mitigate against that developer being malicious, or having their GitHub account compromised, or carelessly accepting a pull request from another random user. They might ignore a vulnerability report for months. They're lacking all the processes and shared responsibility that comes from being in a large project.

I'd agree the huge dependencies will probably have more accidental vulnerabilities, because the sheer quantity of code will outweigh the improved review processes - but Rust's memory safety should already mitigate a lot of that risk, compared to C/C++. That means deliberate backdoors are a relatively greater risk, even before attackers realise there aren't enough buffer overflows and use-after-frees left for them to exploit and they'll have to shift towards more supply chain attacks.

/bin/true bloat, and /bin/cat

Posted Mar 24, 2025 10:13 UTC (Mon) by taladar (subscriber, #68407) [Link]

On the other hand Rust's small dependencies have regular "unmaintained" notifications while the large dependency probably has a good percentage of code that nobody looked at in years. In fact I think I still have a Qt Widget bug open from 10 years ago somewhere that has been migrated through 2-3 different issue trackers by now.

/bin/true bloat, and /bin/cat

Posted Mar 23, 2025 15:53 UTC (Sun) by surajm (subscriber, #135863) [Link] (1 responses)

I think the benefit of the C++ approach is that maintenance of the libraries is generally less concerning. Group ownership of libraries feels a lot safer than tons of tenuously owned and maintained libraries. You don't need to put everything in a single repo or dependency to make this work, of course, but if you do put everything in one repo then that ownership structure is forced. And to be clear, there are many examples of the above approach in the Rust ecosystem as well. It's just difficult to ensure all of your deps originate from such entities, as there are such deep layers of transitive dependencies, which is again less likely in the C++ ecosystem.

I hope to see this situation improve over time as larger organizations continue to adopt rust and place more strict rules on allowable dependencies.

/bin/true bloat, and /bin/cat

Posted Mar 23, 2025 17:11 UTC (Sun) by farnz (subscriber, #17727) [Link]

I suspect, though, that this is more "vibes" than reality; sure, a big dependency with several maintainers looks healthier from the outside, but in practice, it's not that rare for a big dependency to internally be several small fiefdoms, each of which has just one maintainer. You thus have something that actually hasn't seen maintenance for years, but it "looks" maintained because it's part of a bigger thing where the other bits are well-maintained.

/bin/true (was Performance concerns when heavily used in scripts ?)

Posted Mar 20, 2025 12:51 UTC (Thu) by MortenSickel (subscriber, #3238) [Link] (1 responses)

Just tried true --help, true --version and true --usage on my Rocky Linux 9 box. No output from any of them.

/bin/true (was Performance concerns when heavily used in scripts ?)

Posted Mar 20, 2025 12:54 UTC (Thu) by MortenSickel (subscriber, #3238) [Link]

Sorry, reading a bit further down I realised that in bash, too, true is a shell builtin. Running /usr/bin/true definitely returns text on --help and --version, but not --usage.

/bin/true (was Performance concerns when heavily used in scripts ?)

Posted Mar 18, 2025 19:46 UTC (Tue) by mathstuf (subscriber, #69389) [Link]

It's probably for completeness' sake. But even if it were left as "trivial", replacing `autotools` with `cargo` when the former is "just" for `true` and tools of similar complexity… one is probably dropping more complexity with the build-tool migration at that point.

Performance concerns when heavily used in scripts ?

Posted Mar 18, 2025 22:34 UTC (Tue) by willy (subscriber, #9762) [Link] (9 responses)

What (non-trivial and/or non-pedantic) shells don't implement "true" as a built-in? It's about the first thing you implement as soon as you start to care about the performance of executing shell scripts.

Performance concerns when heavily used in scripts ?

Posted Mar 18, 2025 23:10 UTC (Tue) by PeeWee (guest, #175777) [Link] (8 responses)

Try dash, the default /bin/sh provider in Ubuntu. true as an actual binary does have its uses outside of the shell context too. I remember using it to override some program shortcuts in desktop environments. Don't like Ctrl-Q for quitting Firefox? Just set a keyboard shortcut in GNOME, for instance, and point it to the executable /bin/true; it would be overkill and rather counterintuitive to start a shell only to have it exit, i.e. sh -c :. Or try prototyping systemd service units; a
[Service]
ExecStart=true
is the simplest way to get started. true is, or rather should be, the cheapest-to-run binary there is.
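A complete prototype unit is only a few lines more (a sketch; the description is arbitrary, and bare ExecStart=true without an absolute path requires a reasonably recent systemd):

[Unit]
Description=prototyping placeholder that does nothing

[Service]
Type=oneshot
ExecStart=true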

Performance concerns when heavily used in scripts ?

Posted Mar 18, 2025 23:14 UTC (Tue) by PeeWee (guest, #175777) [Link]

I just realized that dash does implement true as a builtin; I just did the wrong check (which true instead of type true).
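The difference is easy to see: which only searches $PATH, while type reports what the shell will actually run (the path shown here assumes a merged-/usr system):

$ which true
/usr/bin/true
$ type true
true is a shell builtin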

Performance concerns when heavily used in scripts ?

Posted Mar 18, 2025 23:15 UTC (Tue) by willy (subscriber, #9762) [Link] (6 responses)

Yes, dash was what I had in mind when I said "pedantic". Nobody cares how fast dash is. You use it to verify that your script really only relies on POSIX and didn't introduce any bash-isms.

And yes, having true as a separate binary is great for the cases you point out, and I never said it shouldn't exist.

My point, which I was quite explicit about, is that none of this matters for the performance of executing a shell script. Be it bash, ksh or some other shell of your choice, if you care about performance you already implemented true as a built-in.

Performance concerns when heavily used in scripts ?

Posted Mar 18, 2025 23:36 UTC (Tue) by PeeWee (guest, #175777) [Link] (3 responses)

But all the other coreutils, findutils, what have you, play a big part in the performance of a shell script, since the shell is basically just the glue. And when true is already rather slow what else is in store? Don't get too hung up on true is all I am saying. I'm sure the above examples were chosen because it is such a trivial executable and thus lends itself to measuring the overhead of running uu-coreutils.

Performance concerns when heavily used in scripts ?

Posted Mar 19, 2025 5:06 UTC (Wed) by joib (subscriber, #8541) [Link] (2 responses)

> And when true is already rather slow what else is in store?

I don't think you can extrapolate from the startup overhead to the performance of other utilities doing more work.

Performance concerns when heavily used in scripts ?

Posted Mar 19, 2025 10:26 UTC (Wed) by PeeWee (guest, #175777) [Link] (1 responses)

Yes and no: if every call to a uutil incurs that kind of overhead, existing scripts will be noticeably slower. But of course things like find and sort may end up being faster, depending on the size of their working set.

At least this is worth having an eye on. If it turns out that the benefits outweigh the downsides, I'd be the last to insist on a fast true in a showstopper kind of way.

Performance concerns when heavily used in scripts ?

Posted Mar 19, 2025 15:13 UTC (Wed) by joib (subscriber, #8541) [Link]

> if every call to a uutil incurs that kind of overhead, existing scripts will be noticeably slower.

Will they? Per the original post in this subthread, uutils has an invocation overhead of 4.6s/2000=0.0023s (minus the shell looping overhead). Keep in mind that more complex coreutils utilities will have higher overhead than /usr/bin/true as they need to link in more libraries and map more pages, reducing the relative penalty of uutils. And of course most uses of these utilities actually do more work, amortizing the startup overhead.

> But of course things like find and sort may end up being faster, depending on the size of their working set.

I think things like find, sort, cp, etc. will be faster or slower depending on the implementation and tuning choices, the algorithms used, and so on. None of which is impacted by the overhead of launching the binary in the first place.

Performance concerns when heavily used in scripts ?

Posted Mar 19, 2025 1:00 UTC (Wed) by josh (subscriber, #17465) [Link]

> Nobody cares how fast dash is.

Performance is regularly cited as a reason to keep /bin/sh pointing to dash rather than bash. If bash were faster, I think it's quite likely many distributions (including Debian) would have pointed /bin/sh to bash.

Performance concerns when heavily used in scripts ?

Posted Mar 19, 2025 4:40 UTC (Wed) by interalia (subscriber, #26615) [Link]

Yes, back in pre-systemd days, Debian and other distros explicitly switched /bin/sh to dash because it improved boot time when running the System V init scripts; the speed of dash definitely mattered, in that it was notably faster than bash once you were running all those hundreds of shells during bootup.

It'll be interesting to know how much of a difference uutils would make for running shell scripts if the startup time for the utilities is slower, e.g. if a script runs cp, ls, diff etc. in a loop. But I imagine with more use there'll be more focus on performance and startup time rather than just feature parity, so it'll improve over what it is now.

Performance concerns when heavily used in scripts ?

Posted Mar 19, 2025 12:21 UTC (Wed) by ferringb (subscriber, #20752) [Link]

Your stats don't make much sense to me. The file size looks like a debug build, but the runtimes are in the range of a release build, at least compared against my system. In short, I'm confused. File size stats:

gnu coreutils-9.5 gcc Gentoo Hardened 14.2.1_p20241221 p7
     8189040  unstripped
     6573656  stripped

rust 1.85.0 from upstream, coreutils from main (sha: fc46a041f80dc)
rust coreutils profile=debug 
   127032616  unstripped
    25930968  stripped # this should look familiar

rust coreutils profile=release 
    12510480  unstripped
    11487080  stripped

rust coreutils profile=release-small (this profile only produces a stripped binary; I added it because they had it)
     7849832  stripped

The odd part here is that your file sizes look like a debug build, but the timings align roughly with a release build. There's noise between your run and mine, but I'm using hyperfine for this.
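For reference, a typical hyperfine comparison looks something like this (the binary paths are whatever your system has; -N skips the intermediate shell and needs a reasonably recent hyperfine):

$ hyperfine --warmup 10 -N /usr/bin/true 'target/release/coreutils true'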

Integrate Ash/Dash?

Posted Mar 20, 2025 22:24 UTC (Thu) by gmatht (guest, #58961) [Link]

I understand that BusyBox has an integrated shell. Presumably a uutils integrated shell could route calls to the integrated utilities as simple function calls, which should be pretty fast?
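The multicall pattern itself is only a few lines; here is a hypothetical Rust sketch of argv[0]-based dispatch (the applet functions are made up, not uutils' real entry points):

use std::env;
use std::path::Path;
use std::process::exit;

// Hypothetical applets; uutils' real entry points differ.
fn applet_true(_args: &[String]) -> i32 { 0 }
fn applet_false(_args: &[String]) -> i32 { 1 }

fn main() {
    let args: Vec<String> = env::args().collect();
    // Dispatch on the name the binary was invoked as (e.g. via symlink).
    let name = Path::new(&args[0])
        .file_name()
        .and_then(|n| n.to_str())
        .unwrap_or("");
    let status = match name {
        "true" => applet_true(&args[1..]),
        "false" => applet_false(&args[1..]),
        other => {
            eprintln!("unknown applet: {other}");
            127
        }
    };
    exit(status);
}

An integrated shell gets its speed by skipping fork()/exec() entirely and calling such a function directly as a builtin.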

More robust oxidizr behavior?

Posted Mar 18, 2025 19:26 UTC (Tue) by mathstuf (subscriber, #69389) [Link] (19 responses)

That "replace `/usr/bin/XYZ`" pattern seems…quite dangerous. Why not instead make `/usr/libexec/oxidizr/bin` with symlinks to the `cargo/bin` tools using shadowing names and put it ahead of `/usr/bin` in `PATH`? That lets the package not get confused when it tries to update the underlying tool. I suppose going to the `oxidizr` upstream to suggest this is probably more effective…

More robust oxidizr behavior?

Posted Mar 18, 2025 19:46 UTC (Tue) by mathstuf (subscriber, #69389) [Link]

More robust oxidizr behavior?

Posted Mar 18, 2025 21:13 UTC (Tue) by wtarreau (subscriber, #51152) [Link] (1 responses)

I must confess that the example above about how to replace sudo made me think "how things could go wrong" ;-)

More robust oxidizr behavior?

Posted Mar 18, 2025 22:28 UTC (Tue) by PeeWee (guest, #175777) [Link]

And that's why one does NOT violate FHS for no good reason. /usr/bin is off limits and should only be written to by distro package managers like dpkg.

More robust oxidizr behavior?

Posted Mar 18, 2025 21:17 UTC (Tue) by jrtc27 (subscriber, #107748) [Link] (15 responses)

Code that calls various binaries by their absolute path does exist (some of them are even good citizens and detect the location at configure time), so putting the Rust versions elsewhere wouldn't exercise those uses.

One could at least use dpkg-divert to move the non-Rust versions out of the way in a more proper manner, which oxidizr doesn't seem to do, from a cursory grep of the source.
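For example, a sketch of the divert-then-symlink approach (the .distrib suffix is just the common convention, date is an arbitrary pick, and the uutils path is a placeholder):

$ sudo dpkg-divert --add --rename --divert /usr/bin/date.distrib /usr/bin/date
$ sudo ln -s /path/to/uutils/date /usr/bin/date

To undo it, remove the symlink before removing the diversion:

$ sudo rm /usr/bin/date
$ sudo dpkg-divert --rename --remove /usr/bin/date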

More robust oxidizr behavior?

Posted Mar 18, 2025 22:22 UTC (Tue) by PeeWee (guest, #175777) [Link] (14 responses)

> Code that calls various binaries by their absolute path does exist

Such code is brittle by definition, in that it requires the respective binary to always be in that place. OK, there is the omnipresent #!/bin/sh, but nobody is talking about replacing shells or other interpreters, just some of the foundational utilities. Plus, the better approach, which I learned from some IBM coding examples, believe it or not, is #!/usr/bin/env sh, or whatever interpreter is desired.

> some of them are even good citizens and detect the location at configure time

I don't get how that makes a good citizen. Configure time may well be spent in a different environment, i.e. with a different $PATH that may not even exist at runtime. If anything, one should call which <command> or command -v <command> at runtime, or whatever equivalent lookup method the programming framework provides.

> One could at least use dpkg-divert to move the non-Rust versions out of the way in a more proper manner, which oxidizr doesn't seem to do, from a cursory grep of the source.

That approach, and the suggestion of using update-alternatives, has been rejected already, because both require cooperation from the respective package, which is not feasible for this kind of temporary experimentation. I still maintain that update-alternatives could be used - bent, really - to do what oxidizr does, by using a different $DPKG_ROOT, i.e. DPKG_ROOT=/usr/local, and some crafting of paths; but, having seen @mathstuf's suggestion, that seems like cracking nuts with a sledgehammer. IOW: KISS. Maybe there is some added value in using the alternatives system, because somebody on the Discourse thread pointed out that the man pages remain the same, e.g. the GNU coreutils man pages are shown when in fact the uu-coreutils are in use. But for now that's all in my head and I haven't tried that approach yet; I may be missing something that could be a showstopper - it's been a while since I did anything remotely serious with it.

More robust oxidizr behavior?

Posted Mar 18, 2025 22:32 UTC (Tue) by willy (subscriber, #9762) [Link] (6 responses)

Using "env" really isn't best practice. /bin/sh is guaranteed to exist by POSIX.

More robust oxidizr behavior?

Posted Mar 18, 2025 22:47 UTC (Tue) by PeeWee (guest, #175777) [Link] (3 responses)

But it is pretty much the only path guaranteed to exist, give or take a few I am too lazy to look up right now; the gist should be clear. Try #!/usr/bin/env python then. That way you can have your own local version in /usr/local/bin/ and don't need to change your script only to try a different iteration of the interpreter.

I knew I shouldn't have used sh in that example. *sigh*

More robust oxidizr behavior?

Posted Mar 19, 2025 9:12 UTC (Wed) by taladar (subscriber, #68407) [Link] (2 responses)

At the very least env itself needs to be guaranteed to be in a fixed place for env use to make any sense at all.

More robust oxidizr behavior?

Posted Mar 19, 2025 10:54 UTC (Wed) by PeeWee (guest, #175777) [Link] (1 responses)

Yes, as I have alluded to elsewhere in this thread. But that would be the only fixed and absolute path. BTW, why does $PATH exist if people insist on calling by absolute paths? Unless there is a very good reason, one should just not do that.

More robust oxidizr behavior?

Posted Mar 21, 2025 2:47 UTC (Fri) by raven667 (subscriber, #5198) [Link]

PATH exists for interactive user convenience, but robust scripts don't operate in the same environment, and it's reasonable either to sanitize $PATH to a known quantity or to skip relying on it at all and hardcode the paths to system binaries, which rarely change on the platform/version you support. Scripts have to take a whole bunch of defensive measures - pervasive quoting, quoted arrays for arguments, explicit exit checking/set -e, and other techniques - that aren't at all like someone using a shell interactively. The two use cases regularly conflict in their wants and needs, which is why stuff gets reimplemented in Perl or Python sometimes, and why things like suid shell scripts are impossible.

More robust oxidizr behavior?

Posted Mar 19, 2025 0:37 UTC (Wed) by PeeWee (guest, #175777) [Link] (1 responses)

BTW, that guarantee does not exist in POSIX; on the contrary:

> Applications should note that the standard PATH to the shell cannot be assumed to be either /bin/sh or /usr/bin/sh, and should be determined by interrogation of the PATH returned by getconf PATH, ensuring that the returned pathname is an absolute pathname and not a shell built-in.

And it is quite involved to install scripts with the correct absolute path to sh, see further down in the spec - I bet that no current Linux distro does it that way. Given all that, and that env is also defined by POSIX, one may just as well use that installation routine to get the absolute path of env and install any script (not just sh ones) using that approach - or just assume that /bin/env or /usr/bin/env will always exist, and YOLO. I'll leave it at that, since this is getting more off topic. The point was, and still is, that one should not rely (too heavily) on absolute paths; it's bad practice. I consider the need for absolute paths in shebangs to be a historical relic we have to live with, but anything beyond that should be fine with plain relative (to $PATH) executable names.

More robust oxidizr behavior?

Posted Mar 20, 2025 8:33 UTC (Thu) by riking (guest, #95706) [Link]

NixOS takes the position that env, sh, and ld-linux are in fact the only absolute paths to binaries you get:
$ ls /bin
sh
$ ls /usr/bin
env
$ ls /lib64
ld-linux-x86-64.so.2

More robust oxidizr behavior?

Posted Mar 19, 2025 6:05 UTC (Wed) by jrtc27 (subscriber, #107748) [Link] (4 responses)

dpkg-divert does not require cooperation; it is separate from alternatives. It is a powerful tool that lets you move arbitrary packaged files out of the way permanently.

More robust oxidizr behavior?

Posted Mar 19, 2025 10:46 UTC (Wed) by PeeWee (guest, #175777) [Link] (1 responses)

And what happens if coreutils gets an upgrade? The only context in which I have ever come across diversions is as part of preinst/postrm hooks in package install scripts. And oxidizr, as of now, is just a "3rd party" tool that is unrelated to package management and hence should not mess with files under the control of the package manager (see FHS: must not be written to), i.e. /usr/bin; that is just like tickling the dragon.

If they want people to do testing, they should provide safer ways to do it than some quick-and-dirty hacks, which will only result in some - very few - people setting up test environments, and in results of rather synthetic tests, which are not the same as being road-tested in real scenarios.

More robust oxidizr behavior?

Posted Mar 20, 2025 19:48 UTC (Thu) by jrtc27 (subscriber, #107748) [Link]

If coreutils gets an upgrade then the diverted files remain diverted and the Rust versions don't get overwritten. That's the whole point of dpkg-divert.

More robust oxidizr behavior?

Posted Mar 19, 2025 11:11 UTC (Wed) by bluca (subscriber, #118303) [Link] (1 responses)

It does require cooperation, by policy rather than by construction/functionality.

More robust oxidizr behavior?

Posted Mar 20, 2025 19:50 UTC (Thu) by jrtc27 (subscriber, #107748) [Link]

Yes, distributions don't want the archive to be a Wild West of diverting each other, and in an ideal world coreutils would cooperate with other providers of the same tools. But absent that cooperation, dpkg-divert is at least more robust than just moving the files out of the way with no package manager knowledge, and does not require inherent cooperation from the package for dpkg-divert to be used, unlike alternatives, so it's no worse in that regard than just moving the files directly.

More robust oxidizr behavior?

Posted Mar 19, 2025 9:13 UTC (Wed) by gdt (subscriber, #6284) [Link] (1 responses)

> Such code is brittle by definition

Application programmers should be able to rely on the Filesystem Hierarchy Standard, which requires some basic utilities to be in /bin and has a section headed "/usr/bin : Most user commands".

Ref: https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch03s04....

More robust oxidizr behavior?

Posted Mar 19, 2025 11:19 UTC (Wed) by PeeWee (guest, #175777) [Link]

No, application programmers should do away with onerous assumptions like that, or not make them to begin with. Some Apple devs said as much in a talk about launchd, IIRC, and I can only concur. Nowhere does the FHS say that one should call those programs by absolute path. $PATH is all you need to find the executable in question, and how that is set up is regulated elsewhere, which I am too lazy to look up now. Essentially, getconf provides the bare minimum that is guaranteed, and lo and behold:
$ lsb_release -i
Distributor ID: Ubuntu
$ getconf PATH
/bin:/usr/bin
And I am pretty certain that is also true for any distro worth its salt. Also, if you have missed it, see my other post on the matter; not even /bin/sh should be assumed to exist in that exact location.

Manpages are important

Posted Mar 18, 2025 21:38 UTC (Tue) by fraetor (subscriber, #161147) [Link] (3 responses)

This will need some serious testing, but as long as they do update the man pages I'll be happy.

Though as an amateur software archaeologist, it'll be a shame to lose some of the historical context in the coreutils manpages.

Manpages are important

Posted Mar 18, 2025 23:44 UTC (Tue) by PeeWee (guest, #175777) [Link] (2 responses)

Or maybe it's a good thing to keep the GNU manpages in place? If anything does not work as described in those, it's a bug and needs to be fixed. And once there are no more such bugs, one could just start with a copy of the GNU manpages, though I guess that may raise some licensing issues. OTOH, how many ways are there to describe well-defined behaviour in a user-friendly form?

Manpages are important

Posted Mar 18, 2025 23:48 UTC (Tue) by interalia (subscriber, #26615) [Link]

Well, during the testing phase the GNU manpages should still be installed, but renamed like the binaries were. But if I were running the experiment, I would want "man ls" to show the manual page for uutils ls, with the GNU manpage available as "man coreutils-ls" or something similar.

Manpages are important

Posted Mar 27, 2025 23:49 UTC (Thu) by raindog308 (guest, #176490) [Link]

manpages?

“See the info page for more details.”

Manpage reuse

Posted Mar 19, 2025 12:54 UTC (Wed) by jhe (subscriber, #164815) [Link]

The GNU coreutils manpages feature "AUTHORS" and "REPORTING BUGS" sections. Those should at least be removed before the pages are presented as documentation for a different project.

OIL RIG

Posted Mar 19, 2025 16:55 UTC (Wed) by antiphase (subscriber, #111993) [Link] (1 responses)

It would be nicer if oxidation involved loss of Electron, but maybe that's too much to hope for

OIL RIG

Posted Mar 20, 2025 16:24 UTC (Thu) by jorgegv (subscriber, #60484) [Link]

Niiiiice one... :-D

bummer the uutils licence is MIT and not GPL

Posted Mar 20, 2025 12:56 UTC (Thu) by h7KdD8Z (guest, #169613) [Link] (1 responses)

Real bummed to learn that the uutils project is MIT-licensed and not GPL. Anyone have any background on that decision? Curious to read more about the justification.

bummer the uutils licence is MIT and not GPL

Posted Mar 20, 2025 15:59 UTC (Thu) by patrick_g (subscriber, #44470) [Link]

Lots of information here:
https://fosdem.org/2025/schedule/event/fosdem-2025-6196-r...

The license issue is addressed during the talk.

Ubuntu going downhill...

Posted Mar 20, 2025 15:27 UTC (Thu) by pj (subscriber, #4506) [Link]

The coming chaos around this makes me glad I recently ditched Ubuntu (after ~15 years of dedicated use), though I did it because of the banner popups advertising Ubuntu Pro whenever I used apt.


Copyright © 2025, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds