Oxidizing Ubuntu: adopting Rust utilities by default
If all goes according to plan, the Ubuntu project will soon be replacing many of the traditional GNU utilities with implementations written in Rust, such as those created by the uutils project, which we covered in February. Wholesale replacement of core utilities at the heart of a Linux distribution is no small matter, which is why Canonical's VP of engineering, Jon Seager, has released oxidizr. It is a command-line utility that helps users easily enable or disable the Rust-based utilities to test their suitability. Seager is calling for help with testing and for users to provide feedback with their experiences ahead of a possible switch for Ubuntu 25.10, an interim release scheduled for October 2025. So far, responses from the Ubuntu community seem positive if slightly skeptical of such a major change.
Next 20 years of Ubuntu
Ubuntu celebrated the 20th anniversary of its first release in 2024. Seager reflected on that milestone and published his vision for the next 20 years of Ubuntu in February. One of his themes for the future is modernization, calling on the project to constantly assess the foundations of the distribution against the needs of its users:
We should look deeply at the tools we ship with Ubuntu by default - selecting for tools that have resilience, performance and maintainability at their core. There are countless examples in the open source community of tools being re-engineered, and re-imagined using tools and practices that have only relatively recently become available. Some of my personal favourites include command-line utilities such as eza, bat, and helix, the new ghostty terminal emulator, and more foundational projects such as the uutils rewrite of coreutils in Rust. Each of these projects are at varying levels of maturity, but have demonstrated a vision for a more modern Unix-like experience that emphasizes resilience, performance and usability.
On March 12, Seager published a follow-up to introduce his plan to start adopting some of the tools as defaults, with an eye to having them in place for the next Ubuntu long-term support (LTS) release, 26.04. The rationale for the switch is primarily "the enhanced resilience and safety that is more easily achieved with Rust ports". He cited a blog post by Rust core developer Niko Matsakis. The post, in a nutshell, is about Matsakis's vision for using Rust to write (or rewrite) foundational software; that is, "the software that underlies everything else".
Those who have been following the continuing debates and discussions about using Rust will find familiar themes in Matsakis's arguments in its favor: Rust provides the performance of C/C++ without demanding perfection from developers, it provides reliability, and it makes developers more productive regardless of experience level. Its reliability makes it particularly suitable for foundational software because "when foundations fail, everything on top fails also". Given Ubuntu's widespread adoption, Seager wrote, "it behooves us to be absolutely certain we're shipping the most resilient and trustworthy software we can".
Seager also thinks that embracing Rust will help meet another of his goals for Ubuntu, increasing the number of contributors. Not because Rust is necessarily easier to use than C, but because it provides a framework that makes it harder for contributors to commit potentially unsafe code. Presumably, though it was unsaid, that would make Rust a more attractive language for those interested in contributing but not interested in programming in C for whatever reason.
oxidizr
The abstract possibility that Rust utilities would be better, or even feasible, for Ubuntu is no substitute for hands-on experience. To that end, Seager created oxidizr as a way to quickly swap in (and out) Rust utilities in place of the traditional counterparts with relatively low risk. He released the first version, 1.0.0, on March 7. It is available under the Apache 2.0 license and, as one might expect, written in Rust.
The project is not yet packaged for Ubuntu, nor does Seager have a personal package archive (PPA) set up for users to install oxidizr with APT. There are binary releases on GitHub, or users can install the tool using cargo:
$ cargo install --git https://github.com/jnsgruk/oxidizr
The binary releases may be the easiest way to get started: oxidizr uses the exists function from the standard library's fs module, which was only added in Rust 1.81.0 (released in September 2024), while the Rust version in Ubuntu 24.10 is still at 1.80.1. I used rustup to install the most recent stable version of Rust, and then used cargo to install oxidizr.
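The function that sets that minimum compiler version can be sketched briefly; a minimal example of `std::fs::exists` (stabilized in Rust 1.81), unrelated to oxidizr's own code:

```rust
use std::fs;

fn main() {
    // std::fs::exists returns io::Result<bool>, distinguishing "the path
    // is absent" (Ok(false)) from "the check itself failed", such as a
    // permission error on a parent directory (Err(..)).
    match fs::exists("/etc/os-release") {
        Ok(true) => println!("present"),
        Ok(false) => println!("absent"),
        Err(e) => println!("could not check: {e}"),
    }
}
```

On Rust 1.80 and earlier this fails to compile, which is why the prebuilt binaries are the path of least resistance on Ubuntu 24.10.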
The oxidizr utility calls a set of utilities that can be independently replaced an "experiment". Experiments are Rust modules that define the packages to be installed (or removed) and handle renaming of the utilities to enable or disable use of the Rust versions. The current set of experiments include replacing GNU coreutils, findutils, or diffutils with the uutils coreutils, findutils, or diffutils, as well as replacing traditional sudo with the Rust-based sudo-rs.
For instance, to try out sudo-rs a user would run this command:
# oxidizr enable --experiments sudo-rs
That will install the sudo-rs package from the Ubuntu package repository, back up the sudo binary, and create a /usr/bin/sudo symbolic link that targets the Rust binary (/usr/lib/cargo/bin/sudo). To enable all experiments at once, a user would pass the --all flag instead:
# oxidizr enable --all
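The swap described above is essentially a rename-and-symlink dance. Here is a simplified sketch in Rust; the paths and function names are illustrative only, not oxidizr's actual implementation, which also handles package installation and multiple binaries per experiment:

```rust
use std::fs;
use std::os::unix::fs::symlink;
use std::path::Path;

// Back up the original binary, then point its well-known path at the
// Rust replacement via a symbolic link. Illustrative sketch only.
fn enable(original: &Path, backup: &Path, replacement: &Path) -> std::io::Result<()> {
    fs::rename(original, backup)?; // e.g. /usr/bin/sudo -> backup location
    symlink(replacement, original) // /usr/bin/sudo -> /usr/lib/cargo/bin/sudo
}

// Reverting removes the symlink and restores the backed-up binary.
fn disable(original: &Path, backup: &Path) -> std::io::Result<()> {
    fs::remove_file(original)?;
    fs::rename(backup, original)
}

fn main() -> std::io::Result<()> {
    // Demonstrate the swap on harmless files in a temporary directory.
    let dir = std::env::temp_dir().join("oxidizr-sketch");
    let _ = fs::remove_dir_all(&dir);
    fs::create_dir_all(&dir)?;
    let (orig, bak, repl) = (dir.join("sudo"), dir.join("sudo.bak"), dir.join("sudo-rs"));
    fs::write(&orig, "traditional")?;
    fs::write(&repl, "rust-based")?;

    enable(&orig, &bak, &repl)?;
    assert_eq!(fs::read_to_string(&orig)?, "rust-based"); // read resolves the symlink

    disable(&orig, &bak)?;
    assert_eq!(fs::read_to_string(&orig)?, "traditional");
    Ok(())
}
```

Because the original binary is only renamed, never deleted, disabling an experiment restores the system to its prior state.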
Finally, to revert the system to the traditional utilities and remove the replacement packages from the system:
# oxidizr disable --all

According to Seager, oxidizr works on all versions of Ubuntu after 24.04 LTS, but the uutils diffutils experiment is only supported on Ubuntu 24.10 or later. For safety's sake, he urged users to start testing on a virtual machine or some other machine that is not their production workstation or server. Seager reported that he hasn't had many problems, but he has run into one incompatibility: the uutils cp, mv, and ls replacements don't support the -Z flag yet, which is used to set the SELinux context of a file or (in the case of ls) print a file's security context.
In my brief testing, I did not run into any problems with the uutils versions of the utilities or the changes oxidizr made to the system in order to swap them in. However, I did note that oxidizr does not make any changes to the system's man pages. Even when the GNU utilities have been replaced with the uutils versions, the GNU man pages are left in place, so "man cp" still displays the GNU version. It would be good to switch the man pages too in order to expose users to any gaps in the uutils documentation as well as the utilities themselves.
Reactions
Fern Dziadulewicz asked if the move toward uutils meant that "Ubuntu is actually kind of heading towards GNUlessness", as with some other Linux distributions that shy away from GNU components. Seager responded that people should not read too much into the change:
This is not symbolic of any pointed move away from GNU components - it's literally just about replacing coreutils with a more modern equivalent.
Sure, the license is different, and it's a consideration, but it's by no means a driver in the decision making.
That response did not satisfy Joseph Erdosy, who wrote that he would migrate to Fedora or Rocky Linux if Ubuntu goes through with the change. He said that he liked Rust and the idea of better, memory-safe alternatives, but that he was unhappy that the biggest "oxidized" project was an MIT-licensed rewrite of GPL-licensed code.
This decision seems to align with a broader trend of companies deprecating GPL software in favour of more permissively licensed alternatives, often under the guise of "modernization." However, the real-world impact is clear: free software is increasingly co-opted into proprietary ecosystems, weakening the principles that made Linux successful.
A few other users quickly agreed with Erdosy, then Ian Weisser announced that he was putting the topic into "Slow Mode" to "prevent piling-on until the developers have a chance to respond and keep this topic constructive". Shortly after, Seager responded that he did not agree that this potential move posed a threat to Ubuntu, or its community. He reiterated that it was not indicative of a political agenda or wider move away from GPL'ed software, and said that most of Canonical's own software is and would continue to be GPL'ed.
Ubuntu is a collection of software that we curate to build a distribution. It's a project dedicated to shipping the latest, and best open source we can find. There is no evidence of foul play, bad practice or poor intentions from the uutils maintainers - they're a thoughtful, dedicated community who are building their own software, and even contributing back to GNU coreutils in some cases. They are achieving things I think we should aspire to with Ubuntu in the coming years, and I remain committed to giving this a chance at success - noting that we and others will need to work closely with them to resolve issues with locales, selinux support and other issues.
If the current situation changes and we believe that the interests of the uutils project are no longer aligned with those of Ubuntu, we can change the coreutils package we choose to ship with Ubuntu.
Sergey Davidoff wondered why the Debian alternatives system, which is used to designate default applications when multiple programs with the same function are installed, was not sufficient for experimenting with Rust utilities. Julian Andres Klode replied that the alternatives system would not be suitable because the existing package would need to cooperate. He also responded to another user, "rain", who had floated the idea of allowing users to switch out individual commands. Klode said that it was a bad idea to allow users to select between Rust and non-Rust implementations on a per-command level, as it would make the resulting systems hard to support.
Liam Proven asked about support on versions of Ubuntu for architectures other than x86_64 and Arm, such as s390 and ppc64le, since "the LLVM Rust toolchain is still a little immature and code generation for other architectures is lacking". Uutils project founder and Ubuntu developer Sylvestre Ledru asked if Proven had any bug reports to share, since Mozilla has been using the LLVM Rust toolchain to ship Firefox on those architectures for years. He pointed out that uutils had been successfully building on Debian and Ubuntu with those architectures as targets for a few years as well.
Next steps
Seager said that he had met with Ledru to discuss the idea of making uutils coreutils the default in Ubuntu 25.10, and Ledru felt that the project was ready for that level of exposure. Now it is just a matter of specifics, he said, and the Ubuntu Foundations team is already working up a plan to implement this in the next release cycle. He did acknowledge that there was a need for caution, and was open to the possibility that he would need to "scale back on the ambition" if making the switch meant compromising stability or reliability in an Ubuntu LTS release. If the switch doesn't work out, it should be easy enough to revert in time for next year's LTS release.
To date, Ubuntu seems to be the first major Linux distribution that has seriously considered a switch to uutils. If Ubuntu 25.10 ships with uutils coreutils, it will be a significant win for the uutils project that grants exposure to a much larger user base than it has enjoyed so far. The "oxidize Ubuntu" experiment has the potential to accelerate Rust's adoption and inspire further attempts to replace C-based utilities with Rust, or it might have a chilling effect if Ubuntu runs into serious problems. Either way, the project should be instructive for the larger community.
Posted Mar 18, 2025 17:00 UTC (Tue)
by arachnist (subscriber, #94626)
[Link] (29 responses)
worst case scenario, we're going to see one of the funnier ubuntu releases in recent memory. ;)
Posted Mar 18, 2025 17:17 UTC (Tue)
by jzb (editor, #7867)
[Link]
I wonder how many more issues like that will start popping up when more people start using these. It will be interesting to see, won't it? Definitely the kind of testing uutils needs to be a legit replacement. Whatever warts the GNU utilities may have, they've gotten a lot of use over the years and have been road-tested quite well. I am eager to see the results.
Posted Mar 18, 2025 18:11 UTC (Tue)
by LtWorf (subscriber, #124958)
[Link] (24 responses)
Posted Mar 18, 2025 19:01 UTC (Tue)
by khim (subscriber, #9252)
[Link] (2 responses)
I think one of the insights that actually made Rust possible is the observation that declaring memory leaks to be errors is completely infeasible. Tracing GC "doesn't have memory leaks", but only if you define "memory leaks" in an extremely perverse fashion: why would I care that my program "doesn't have memory leaks" when it instead wastes gigabytes of memory in "inactive caches"? And once you realize that eliminating memory leaks in the layman's sense is impossible, you immediately realize that a tracing GC is not needed, and can think about how to live in a world without one. And then you end up with Rust – not by design but by accident and/or observation.
Posted Mar 18, 2025 23:49 UTC (Tue)
by shahms (subscriber, #8877)
[Link] (1 responses)
Posted Mar 19, 2025 8:57 UTC (Wed)
by taladar (subscriber, #68407)
[Link]
Posted Mar 18, 2025 19:33 UTC (Tue)
by mb (subscriber, #50428)
[Link] (10 responses)
Of course they are considered problematic in Rust.
One basically can't simply forget to destroy an object, due to automatic drops (and "forget" is an explicit operation).
But Rust makes it much harder than, say, C to accidentally forget to free something in an obscure error path.
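That asymmetry can be shown with a tiny generic sketch (not code from any project discussed here): freeing is the default, leaking takes a deliberate call.

```rust
fn main() {
    {
        let v = vec![1, 2, 3];
        // Nothing to do here: v's heap buffer is freed automatically
        // when it goes out of scope at the end of this block.
        assert_eq!(v.len(), 3);
    }

    let w = vec![4, 5, 6];
    // Leaking, by contrast, is an explicit operation: mem::forget takes
    // ownership and suppresses the destructor, so w's buffer is never
    // freed and w cannot be used again.
    std::mem::forget(w);
}
```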
Posted Mar 19, 2025 14:08 UTC (Wed)
by cultpony (subscriber, #167240)
[Link] (9 responses)
Posted Mar 19, 2025 23:05 UTC (Wed)
by NYKevin (subscriber, #129325)
[Link] (8 responses)
Rust is a systems language with manual memory management, just like C, but drenched in a thick layer of syntactic sugar (to automatically free things when you're done using them) and static analysis (to detect when you free something before you're done using it). It does not make arbitrary decisions about when to deallocate things (contrast with a GC'd language, which does make such decisions). If the compiler did not deallocate something for you, it means that you have (knowingly or not) asked the compiler to keep that thing alive.
In most cases, if something no longer needs to exist, you can std::mem::drop() it, or just return from whichever scope owns the allocation. drop() is a safe function, meaning the compiler will not let you use anything that has potentially been dropped (by either means), and in fact drop() is really just a convenience function that takes ownership, does nothing, and immediately returns. You can't drop static variables or anything that you don't own.
There are objects with "more complicated" ownership models than that (a simple example being Rc/Arc), but those objects still have some notion of dropping (you can std::mem::drop() any variable, but if that variable is participating in some shared ownership chicanery, the shared allocation might outlive it).
There is also one other catch: Stack allocations always last until the function returns. That's not a Rust limitation, it's just how the stack works (at least, in any language that has a call stack). If a stack variable is moved from (or dropped early, but that's equivalent to moving it), what actually happens is that the variable's contents are memcpy'd into the new location, the drop flags are updated to indicate that the variable is now uninitialized garbage (and must not be dropped or otherwise used again), and the variable binding is deleted from the current namespace (so you can't use it again). But the stack allocation is still physically occupied until the function returns. This is rarely a problem because we usually allocate large objects on the heap (plus, the optimizer can do all sorts of things with the physical stack layout anyway).
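Those rules can be condensed into a small generic example (again, not from any of the projects discussed): an early drop() or a move invalidates the binding at compile time, even though the stack slot itself lives until the function returns.

```rust
fn main() {
    let s = String::from("hello");
    drop(s); // drop() takes ownership; the heap allocation is freed here
    // println!("{s}"); // would not compile: use of moved value

    let v = vec![1, 2, 3];
    let w = v; // move: the Vec's header is copied to w, v becomes unusable
    // println!("{:?}", v); // would not compile either
    assert_eq!(w, vec![1, 2, 3]);
}
```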
Posted Mar 20, 2025 1:06 UTC (Thu)
by wahern (guest, #37304)
[Link] (6 responses)
That's not how C works. Automatic variables, *including* VLAs, are scoped to blocks. If they weren't, then you'd have problems with loops and stack overflow. Allocations using the common "alloca" builtin do last for the entire function, but VLAs were deliberately given different semantics.
Posted Mar 20, 2025 4:13 UTC (Thu)
by NYKevin (subscriber, #129325)
[Link]
Posted Mar 20, 2025 13:40 UTC (Thu)
by tialaramex (subscriber, #21167)
[Link] (4 responses)
[If only one language had that mistake, or, even if several did this but they don't regard it as a mistake and fix it, that would be a different matter but in fact this mistake has happened several times and been fixed in IIRC at least C# and Go]
Posted Mar 20, 2025 22:39 UTC (Thu)
by wahern (guest, #37304)
[Link] (3 responses)
The stack does shrink. Example program: For `echo 200 100 5 | ./a.out` I get: As the size of successive stack allocations decreases, so does the frame size.
Posted Mar 21, 2025 13:39 UTC (Fri)
by tialaramex (subscriber, #21167)
[Link] (2 responses)
I assume, since your example is a VLA, that if you write a conventional C89 array (or any other type) it is not in fact creating and destroying the allocation.
Posted Mar 21, 2025 22:42 UTC (Fri)
by wahern (guest, #37304)
[Link] (1 responses)
For posterity: I've been using gcc version 14.2.0 (MacPorts gcc14 14.2.0_3+stdlib_flag) on an ARM M1 with these test cases. (__builtin_stack_address was too convenient, but not supported by the installed Apple clang toolchain, though it seems it is supported by the latest upstream clang release.)
Posted Mar 21, 2025 23:29 UTC (Fri)
by tialaramex (subscriber, #21167)
[Link]
So that's on me. It did cause me to go find out the current status of formally supported (rather than hacked-together) VLA-like Rust objects (i.e., a runtime-sized object that lives on the stack), and it seems like they're not close.
Posted Mar 21, 2025 17:18 UTC (Fri)
by anton (subscriber, #25547)
[Link]
Posted Mar 18, 2025 20:38 UTC (Tue)
by mbiebl (subscriber, #41876)
[Link]
Posted Mar 18, 2025 21:01 UTC (Tue)
by Phantom_Hoover (subscriber, #167627)
[Link] (8 responses)
Posted Mar 19, 2025 12:41 UTC (Wed)
by Baughn (subscriber, #124425)
[Link] (7 responses)
Posted Mar 19, 2025 20:28 UTC (Wed)
by Phantom_Hoover (subscriber, #167627)
[Link] (6 responses)
Posted Mar 19, 2025 22:08 UTC (Wed)
by excors (subscriber, #95769)
[Link] (5 responses)
I think a more useful formal definition is "allocated memory that will not be accessed in the future". (Formal definitions are happy to rely on oracles that can see the future). "Unreachable" is just an approximation with the (very useful) property of being a computable function, so that's what practical GCs use.
But there are many variations of "reachable": referenced by another allocated object (cycles won't be collected), reachable from a root set (cycles will be collected), reachable from some integer on the stack that happens to look like a pointer (conservative vs precise), reachable even if you ignore weak references, etc. Those details are quality-of-implementation issues, they're not a fundamental part of what a memory leak is.
"Not accessed in the future" is much more fundamental. It's uncomputable in general, but a human (or sophisticated algorithm) can sometimes determine that a reachable object will never be used, and I think it's fair to call that a memory leak. Then you can say e.g. "A cache with a bad policy is another name for a memory leak" - it doesn't matter that the cache contents are technically reachable (https://devblogs.microsoft.com/oldnewthing/20060502-07/?p...)
(https://inside.java/2024/11/22/mark-scavenge-gc/ expresses the same idea: "An object is said to be live if it will be accessed at some time in the future execution of the mutator" and "GCs typically approximate liveness using pointer reachability". And e.g. https://people.cs.umass.edu/~emery/pubs/gcvsmalloc.pdf implements a "liveness-based oracle" in Java, by recording every allocation and memory access and then replaying the program, to test how a GC implementation compares against a theoretically optimal freeing of memory.)
Informally you'd add "...and is large enough and long-lived enough to care about" to the definition of memory leak, but that's very subjective. Neither GC nor RAII can completely save you from wasting memory on non-live objects, so you'll always end up having to profile and debug to find the ones worth caring about. (They'll save you a lot of effort compared to manual memory management, though.)
Posted Mar 20, 2025 4:19 UTC (Thu)
by NYKevin (subscriber, #129325)
[Link]
The definition I use (at my day job as an SRE) is even more pragmatic: A program is leaking memory if, when you graph its memory usage over the last (e.g.) 12 hours, it's roughly a straight line going up and to the right. But that requires you to actually have real monitoring, which some people apparently don't.
Posted Mar 20, 2025 9:13 UTC (Thu)
by taladar (subscriber, #68407)
[Link]
A more elaborate example might be a work queue where a priority field is only used on enqueuing but still kept around until the task has been processed to completion.
Mostly that falls under your "is large enough to care about", but in general it is just a trade-off between the effort of restructuring your entire application's data structures so that pieces you no longer need can be freed independently, and the amount of extra memory used.
Posted Mar 20, 2025 12:12 UTC (Thu)
by farnz (subscriber, #17727)
[Link] (2 responses)
The details around "deallocation" are, IMO, the hard chunk of defining "will not be accessed in the future". We wouldn't consider

    let mut foo = Foo::new();
    foo.do_the_thing(); /* 1 */
    drop(foo);

as having a leak just because at point /* 1 */ there's an allocated object that will not be accessed again, but you might want to define the program as having a leak if, at /* 1 */, it spawned a thread that did all the rest of the program's work apart from freeing foo.
Posted Mar 21, 2025 13:27 UTC (Fri)
by khim (subscriber, #9252)
[Link] (1 responses)
If you count deallocation as access then we are back to square one, with tracing-GC-based languages never having any leaks while real-world Java programs waste gigabytes on stuff they will never need. Not a very useful definition.

But if you look at the issue of memory leaks from a layman's perspective – more precisely, a CFO's perspective – then the situation is much simpler: we don't care about bounded memory leaks at all. They don't raise our bill of materials unpredictably. What we do care about are unbounded leaks: situations where the ratio between memory spent on "useful work" and "memory leaks" goes to zero.

And it's much easier to define what an unbounded memory leak is. Imagine that your program runs alongside an oracle that tells it whether a certain object will be touched in the future or not (without counting destructors/deallocators). Count the amount of memory it needs. Now run the real program with the same inputs. How much memory does that run need? The smaller the ratio the better, and if it's not bounded by anything then you have an unbounded memory leak.

P.S. Note that most real-world programs use more memory than they theoretically could. Tracing-GC-based ones are especially egregious, since they usually need at least 2x more than the theoretical minimum (a simple, naïve mark-and-sweep algorithm simply requires 2x more to even be usable, while modern approaches can work with less headroom but their efficiency becomes drastically reduced). But as long as the ratio is bounded (you need to pay for 16GiB of memory if you plan to process 1GiB files, or something like that), the CFO can easily adjust the bill of materials. An unbounded leak, on the other hand, means you have no idea how much you will need to pay. And that is the critical difference.
Posted Mar 21, 2025 16:35 UTC (Fri)
by farnz (subscriber, #17727)
[Link]
This is why I said that, at one extreme, all allocated memory is leaked, because there's always a time when it's still allocated but has not yet been deallocated; at the other extreme, no allocated memory is leaked, because all memory is implicitly freed by program exit.
You need to find a definition of "deallocated" that is useful for the case you're considering; for example, memory is considered deallocated for leak purposes if you use the language level facility† to release it before the next language level memory allocation, function return, or function call. That way, you've allowed for RAII (you say that destructors run just before function return), but you've ensured that any route by which memory usage can grow unbounded is considered to be a leak, as are bounded leaks where you're "merely" late deallocating (such as your tracing GC example of having gigabytes allocated but never used).
† A language level facility could be something like C's free for heap objects and leaving the scope for stack objects, but for a language like Python, you could define it as "set the last reference to None or a different object, and ensure that there are no cyclical references" to get a useful definition.
Posted Mar 19, 2025 13:17 UTC (Wed)
by jhe (subscriber, #164815)
[Link] (2 responses)
Isn't it bad practice to do the checks for existence or inode type before open()? TOCTOU?
I wasn't able to check whether GNU coreutils does the same, as coreutils does not contain more(1). Which throws up another bunch of questions, but I will stop here.
Posted Mar 19, 2025 13:30 UTC (Wed)
by intelfx (subscriber, #130118)
[Link] (1 responses)
It's not a security boundary. It is only there to give a better error message (although the second check for the file existence looks like it could be safely dropped as it doesn't even provide a better error). Any TOU errors will be caught in File::open() a few lines below.
Posted Mar 19, 2025 13:51 UTC (Wed)
by intelfx (subscriber, #130118)
[Link]
Although, of course, nothing stops that code from matching on `ErrorKind::IsADirectory` in the same match{} statement instead of duplicating the error handling blurb.
So you're right that the code _is_ sloppy. But I'd say not exactly to the level of a security issue.
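The pattern being suggested can be sketched generically (this is an illustration of the open-then-classify approach, not the actual uutils code; note that `ErrorKind::IsADirectory` requires Rust 1.83 or later):

```rust
use std::fs::File;
use std::io::{ErrorKind, Read};

// Open and read first, classify the failure afterwards: there is no
// window between a stat() check and the open(), so no TOCTOU exposure,
// and the error messages stay friendly. Generic sketch only.
fn read_all(path: &str) -> Result<String, String> {
    let mut buf = String::new();
    let result = File::open(path).and_then(|mut f| f.read_to_string(&mut buf));
    match result {
        Ok(_) => Ok(buf),
        Err(e) if e.kind() == ErrorKind::NotFound => Err(format!("{path}: no such file")),
        Err(e) if e.kind() == ErrorKind::IsADirectory => Err(format!("{path}: is a directory")),
        Err(e) => Err(format!("{path}: {e}")),
    }
}

fn main() {
    println!("{:?}", read_all("/etc/hostname"));
    println!("{:?}", read_all("/no/such/file"));
}
```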
Posted Mar 18, 2025 17:21 UTC (Tue)
by lmb (subscriber, #39048)
[Link] (21 responses)
That's saddening, because otherwise, I'm a huge fan of Rust vs C(++), static linking aside.
Posted Mar 18, 2025 22:03 UTC (Tue)
by tchernobog (guest, #73595)
[Link] (19 responses)
However, in the case of such base utilities, you basically have to provide bug-for-bug compatibility with GNU coreutils by now.
I kinda doubt a company will take these utils, close source them, and resell them without redistributing sources. It would bring only marginal benefit.
I am much more worried about new, innovative implementations with a higher degree of complexity. For instance rsync, or the new ripgrep implementation are much more sophisticated and would be more worrisome without copyleft.
But most of these tools are merely painful to write, because of correctly wrapping the POSIX or Windows APIs; they are not inherently hard to code.
Posted Mar 19, 2025 0:06 UTC (Wed)
by parametricpoly (subscriber, #143903)
[Link] (13 responses)
Yes, there's C23, but the ML family of languages is much better if correctness is valued. Algebraic types are more expressive, dependent typing allows specifying useful invariants, automatic memory management makes a lot of sense now that systems have 64+ gigabytes of RAM. Parsing those languages is easier because the grammar is more straightforward. C has some nasty limitations: the lack of modules, the need for complex pre-processor makes incremental and efficient parsing almost impossible.
If people are going to switch to new apps, they will be making the switch based on the technical merits. If a Rust app is more secure, less crash prone, and faster to develop, it's a big win for the users. Now, to avoid being replaced by non-copyleft clones, new copyleft apps are needed. This means GNU needs to come up with new languages. I don't think Guile helps here.
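The point about algebraic types can be illustrated with a small generic sketch (not tied to any project mentioned here): a sum type plus exhaustive matching makes invalid states unrepresentable.

```rust
// A value is exactly one of these alternatives; the compiler checks
// that every match handles all of them.
enum Token {
    Number(i64),
    Word(String),
    End,
}

fn describe(t: &Token) -> String {
    match t {
        Token::Number(n) => format!("number {n}"),
        Token::Word(w) => format!("word {w:?}"),
        Token::End => "end of input".to_string(),
        // Adding a variant without handling it here is a compile error.
    }
}

fn main() {
    assert_eq!(describe(&Token::Number(42)), "number 42");
    assert_eq!(describe(&Token::End), "end of input");
}
```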
Posted Mar 19, 2025 0:57 UTC (Wed)
by Paf (subscriber, #91811)
[Link] (11 responses)
Posted Mar 19, 2025 8:52 UTC (Wed)
by anselm (subscriber, #2796)
[Link] (9 responses)
I obviously don't speak for the GNU project, but I would assume they prefer code they can compile with gcc, and the gcc-based Rust compiler isn't quite there yet.
Posted Mar 19, 2025 11:42 UTC (Wed)
by excors (subscriber, #95769)
[Link] (8 responses)
It seems typical for a successful new language to take 10-15 years to reach a reasonable level of maturity and acceptance. If there's an urgent need to defend copyleft, you can't afford to pause and build a whole new language first.
Posted Mar 19, 2025 12:18 UTC (Wed)
by anselm (subscriber, #2796)
[Link] (7 responses)
Which is, if anything, an argument for finishing the gcc-based Rust compiler, rather than coming up with an entirely new language from scratch.
I don't believe that the GNU project has a problem in principle with Rust, the language. The fact that a Rust frontend for gcc is in the works seems to suggest otherwise.
Of course if you're a “GPL maximalist” it kinda sucks if people who used to use the GPL'ed coreutils in C are jumping ship to a different package which is technically superior, coincidentally written in Rust, and unfortunately happens to be more liberally licensed. Having said that, if the GNU project is primarily interested in a more modern coreutils replacement for the mythical “GNU operating system”, then once gcc-rs can compile uutils it can simply declare that uutils is now “part of the GNU operating system” much like, e.g., X11 or TeX (neither of which were GPL-licensed, nor part of the GNU project) were stipulated to be “part of the GNU operating system” back when the idea was new.
In any case there is certainly no urgent need for the GNU project to come up with an entirely new “GNU language” just to be able to implement a new version of the GPL coreutils. The GNU project could always write their own version, under the GPL, in Rust, to be compiled with gcc-rs once that is ready. It's just that right now the GNU project may perhaps be excused for not doing development in Rust while their own compiler can't deal with it yet.
Posted Mar 19, 2025 13:38 UTC (Wed)
by ceplm (subscriber, #41334)
[Link] (4 responses)
Actually, I am not sure about that, and I am not even sure we shouldn't have a problem.
Posted Mar 19, 2025 14:38 UTC (Wed)
by rahulsundaram (subscriber, #21946)
[Link] (3 responses)
> https://softwarefreedom.org/podcast/2009/jul/07/0x11/
Don't see any relevance to this podcast on Rust. Why would FSF/GNU have any problems at all with Rust and if they have a problem, have they explained it?
Posted Mar 19, 2025 15:01 UTC (Wed)
by ceplm (subscriber, #41334)
[Link] (2 responses)
Posted Mar 19, 2025 15:08 UTC (Wed)
by daroc (editor, #160859)
[Link] (1 responses)
Cyclone was released in 2001, so even if someone had a patent before that which they could argue covered borrow checking, it has pretty clearly expired by now.
There are absolutely risks to using newer programming languages, but I'm not convinced that patent encumbrance is a particular problem in Rust's case.
Posted Mar 20, 2025 9:06 UTC (Thu)
by taladar (subscriber, #68407)
[Link]
Posted Mar 19, 2025 14:07 UTC (Wed)
by farnz (subscriber, #17727)
[Link]
Note, though, that the NBSoI is not the only way to end up with a new language - you can also have languages that are basically the same combination of ideas as existing languages, but with a different syntax or emphasis (e.g. the huge family of Lisp-like languages). It's just that the NBSoI is where things get interesting, since it's where techniques move from "great in theory, lousy in practice" to "this is usable now".
Posted Mar 20, 2025 23:22 UTC (Thu)
by jwakely (subscriber, #60262)
[Link]
The GNU project doesn't control GCC, so I don't think you can draw any conclusions about GNU's view on Rust from the existence of gccrs.
Posted Mar 19, 2025 11:46 UTC (Wed)
by tux3 (subscriber, #101245)
[Link]
Posted Mar 19, 2025 21:38 UTC (Wed)
by jmalcolm (subscriber, #8876)
[Link]
Posted Mar 19, 2025 10:53 UTC (Wed)
by mathstuf (subscriber, #69389)
[Link]
`rclone` already exists: <https://rclone.org/>. It's not a drop-in, but it also supports way more remote storage protocols (e.g., I `rclone` my `restic` backups to Google Drive and Backblaze (S3-ish) with it).
Posted Mar 21, 2025 22:39 UTC (Fri)
by ndiddy (subscriber, #167868)
[Link] (3 responses)
There was a podcast interview here: https://youtu.be/5qTyyMyU2hQ?t=1270 with the lead uutils maintainer where he brought up that some car manufacturers had already started using uutils in their products instead of the GNU core utils because it means they don't have to comply with the GPL. From a corporate standpoint, when you have one set of tools where you have to comply with the GPL, and then a drop-in replacement for them where you don't, of course you'll use the tools that don't require GPL compliance.
Posted Mar 22, 2025 9:18 UTC (Sat)
by Wol (subscriber, #4433)
[Link]
What it DOES bring them is a big reduction in pain. If I can ship a product, based on a publicly available tree, without all the hassle of tracking, responding to requests, etc etc, then that's a big attraction.
And regardless of whether you're an engineer, a programmer, an analyst, people at the sharp end like to collaborate. It's bean counters who all too often don't see the benefit of collaboration, but they do see the cost of getting sued.
What we need is a GPL-lite, that contains all the downstream protections, and rather than saying "you have to share the source" replaces it with "you must develop in public, and tell your customers where to find it". Basically, it has to be publicly readable, 3rd-party hosted, and advertised to upstream and downstream alike.
At the end of the day, engineers want to share, but they don't want all the Administrative Hassle that comes with the GPL. All bean counters can see is the cost. The GPL is making the wrong person pay! There's a good chance I will push my changes upstream because I can see the benefit. If I don't, upstream may (or may not) mine my repository because they see a benefit. And any customer who wants the source may have a bit of grief working out exactly which source they've got, but they have got it (and if I can't tell them, that may well be a cost to me). (Programming in Excel, it's costing me dear at the moment!)
Cheers.
Posted Mar 23, 2025 1:48 UTC (Sun)
by himi (subscriber, #340)
[Link] (1 responses)
Unless they're actually modifying the code, of course. Which . . . well, for coreutils? I'd have to assume that's just going to be compilation support for whatever platform they're using, in which case it'd make far more sense to submit patches upstream than to maintain their own fork in-house, and the same logic would apply whether they're using GNU coreutils or uutils.
It sounds like either the companies in question don't actually understand the way the GPL works (which shouldn't be an issue if they have competent lawyers), or they're pulling an Apple and avoiding any GPLed code on ideological grounds.
Posted Mar 23, 2025 7:09 UTC (Sun)
by Wol (subscriber, #4433)
[Link]
"6. Conveying Non-Source Forms.
You may convey a covered work in object code form under the terms of sections 4 and 5, provided that you also convey the machine-readable Corresponding Source under the terms of this License, in one of these ways:
a) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by the Corresponding Source fixed on a durable physical medium customarily used for software interchange."
Have you read the GPL? Have you understood it? I haven't quoted the entirety of section 6, but if you are a business there is a hell of a lot more than just "documenting you got it from upstream". You - CORPORATELY - are on the hook for making sure your customer can get the source. And that is expensive administrative hassle companies would much rather avoid.
There are "sort of" get-outs, 6c and 6e, but they're not aimed at corporates, and they still come with grief companies don't want. I've only just noticed 6e, but unless the company controls that location they're probably not complying with it, and if they do control it, that's more hassle that, again, they don't want.
Cheers,
Posted Mar 19, 2025 11:16 UTC (Wed)
by ballombe (subscriber, #9523)
[Link]
Posted Mar 18, 2025 17:23 UTC (Tue)
by wtarreau (subscriber, #51152)
[Link] (37 responses)
$ time for i in {1..2000}; do /bin/true;done
$ time for i in {1..2000}; do /usr/lib/cargo/bin/coreutils/true;done
That's 4.6 times slower. Same for other tools, wc is 3 times slower:
$ time for i in {1..2000}; do /bin/wc /dev/null >/dev/null;done
$ time for i in {1..2000}; do /usr/lib/cargo/bin/coreutils/wc /dev/null >/dev/null;done
and "expr" almost 3 times as well:
$ time for i in {1..2000}; do /bin/expr $RANDOM + $RANDOM >/dev/null;done
$ time for i in {1..2000}; do /usr/lib/cargo/bin/coreutils/expr $RANDOM + $RANDOM >/dev/null;done
I suspect that it's caused by the multi-call binary, which is enormous (23 MB):
$ ls -la /bin/true
$ ls -la /usr/lib/cargo/bin/coreutils/true
That's definitely something to take into consideration! Loading (and dynamically linking) a 23MB executable thousands of times per second in scripts is going to cost a lot for certain usages. I've seen numerous utilities in the field that used to be limited by fork-exec speed calling /bin/echo, /bin/expr and even /bin/true, and here the extra cost might make some users particularly unhappy and want to roll back or migrate to another distro.
One other question: on Ubuntu 24.04, the individual tools reportedly supported by the coreutils multi-call binary total 5.6 MB. That's a 4x inflation to reach 23 MB. Are there *that* many more features to inflate it that much?
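For what it's worth, the same comparison can be made without the shell-loop overhead. This is a rough sketch, not a benchmark harness; the paths are the ones quoted in this thread and may need adjusting on other systems:

```rust
use std::process::Command;
use std::time::{Duration, Instant};

/// Average wall-clock cost of spawning `path` once, over `runs` runs.
fn spawn_overhead(path: &str, runs: u32) -> Duration {
    let start = Instant::now();
    for _ in 0..runs {
        Command::new(path)
            .status()
            .expect("failed to spawn binary");
    }
    start.elapsed() / runs
}

fn main() {
    // Paths as in the thread: GNU true vs. the Debian/Ubuntu
    // rust-coreutils multi-call layout. Skip whichever is absent.
    for path in ["/bin/true", "/usr/lib/cargo/bin/coreutils/true"] {
        if std::path::Path::new(path).exists() {
            println!("{}: {:?} per exec", path, spawn_overhead(path, 200));
        }
    }
}
```

This still includes fork/exec and dynamic-linker cost, which is exactly the overhead being discussed; a tool like hyperfine (used further down the thread) also subtracts its own loop overhead.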
Posted Mar 18, 2025 17:38 UTC (Tue)
by tzafrir (subscriber, #11501)
[Link] (4 responses)
$ time for i in {1..2000}; do /bin/true;done
real 0m1.158s
$ time for i in {1..2000}; do busybox true;done
real 0m0.816s
Posted Mar 18, 2025 17:50 UTC (Tue)
by adobriyan (subscriber, #30858)
[Link]
$ strace -f ./true
$ time for i in $(seq 2000); do ./true ; done
It's a fun exercise to write a minimal assembly program which re-execs itself N times and then exits.
Posted Mar 18, 2025 17:56 UTC (Tue)
by wtarreau (subscriber, #51152)
[Link]
Posted Mar 18, 2025 17:59 UTC (Tue)
by dkg (subscriber, #55359)
[Link]
If the performance costs come from the dynamic linking itself (as they surely would for a trivial main() like true), it's no surprise that the binary with the heavier load of libraries to link would cost more.
Posted Mar 18, 2025 19:11 UTC (Tue)
by ma4ris8 (subscriber, #170509)
[Link]
time for i in {1..2000}; do /bin/true;done
real 0m4.322s
~/rust/true-bin$ time for i in {1..2000}; do ./target/release/true-bin;done
real 0m2.681s
------------ main.rs --------------
use core::panic::PanicInfo;
#[panic_handler]
#[unsafe(no_mangle)]
--------- Cargo.toml is --------------
[dependencies]
# Let's optimize for small size!
------------ build.rs is ----------------------
// The following options optimize for code size!
// Tell the linker to exclude the .eh_frame_hdr section.
Compilation: cargo build --release
Posted Mar 18, 2025 17:59 UTC (Tue)
by barryascott (subscriber, #80640)
[Link]
Posted Mar 18, 2025 18:13 UTC (Tue)
by eru (subscriber, #2753)
[Link] (1 responses)
Posted Mar 19, 2025 1:28 UTC (Wed)
by lkundrak (subscriber, #43452)
[Link]
Posted Mar 18, 2025 18:34 UTC (Tue)
by dskoll (subscriber, #1630)
[Link] (16 responses)
OK, I understand the desire for Rust wrt safety, but really... is it necessary to rewrite true (or false) in Rust rather than C? Really?
Posted Mar 18, 2025 18:39 UTC (Tue)
by atai (subscriber, #10977)
[Link] (1 responses)
Posted Mar 18, 2025 19:35 UTC (Tue)
by ma4ris8 (subscriber, #170509)
[Link]
objcopy -R .eh_frame -R .comment ./target/release/true-bin even-smaller-true-bin
./target/release/true-bin: 712 bytes
~/rust/true-bin$ time for i in {1..2000}; do /bin/true;done
~/rust/true-bin$ time for i in {1..2000}; do ./target/release/true-bin;done
~/rust/true-bin$ time for i in {1..2000}; do ./even-smaller-true-bin ; done
objdump --disassemble-all ./even-smaller-true-bin
./even-smaller-true-bin: file format elf64-x86-64
Disassembly of section .interp:
0000000000200078 <.interp>:
Disassembly of section .text:
00000000002000c8 <.text>:
Posted Mar 18, 2025 19:32 UTC (Tue)
by jkingweb (subscriber, #113039)
[Link] (12 responses)
Posted Mar 18, 2025 19:51 UTC (Tue)
by ma4ris8 (subscriber, #170509)
[Link] (11 responses)
It is good to have all binaries first, and optimize later.
/bin/true is 32544 bytes, and links three libraries.
In the Rust version, no memory allocator, libc, or vdso library reference was needed.
In this case, "unsafe" had to be used, since the program does not use the std library.
Posted Mar 18, 2025 22:30 UTC (Tue)
by willy (subscriber, #9762)
[Link] (10 responses)
(This is something that has always bugged me about GNU; tools shouldn't be forced to have those options)
Posted Mar 19, 2025 0:58 UTC (Wed)
by josh (subscriber, #17465)
[Link]
Posted Mar 19, 2025 16:47 UTC (Wed)
by hmh (subscriber, #3838)
[Link] (6 responses)
It is still fast when you use it as you're supposed to (i.e. no parameters), small enough on anything that isn't embedded (and would use busybox or toybox instead). And it is not going to create security issues that are not present everywhere else (because it just links to glibc, and it is using glibc's i18n). But yeah, it *is* bloated for no good reason: there's a man page, so /bin/cat really doesn't need or benefit in any way from --help or --version, and the minimal /bin/true and /bin/false would be a lot smaller.
That said, on anything worthy of note, true and false are going to be shell builtins.
Now, GNU /bin/cat is optimized to all heck, that would be a more interesting one to compare with the rust version...
Anyway, this particular rust project explicitly opted into the dependency hell pattern, and thus IMO it is too much of a dependency chain vulnerability for something that I'd run :-(
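The cat comparison mentioned above is instructive precisely because a naive Rust version is so short. A hedged sketch follows; the `cat_to` helper is my name for illustration, not a uutils API:

```rust
use std::env;
use std::fs::File;
use std::io::{self, Read, Write};

/// Copy everything from `input` to `output`, like `cat` on one file.
fn cat_to<R: Read, W: Write>(input: &mut R, output: &mut W) -> io::Result<u64> {
    io::copy(input, output)
}

fn main() -> io::Result<()> {
    let stdout = io::stdout();
    let mut out = stdout.lock();
    // Concatenate every file named on the command line to stdout.
    for arg in env::args().skip(1) {
        let mut f = File::open(&arg)?;
        cat_to(&mut f, &mut out)?;
    }
    Ok(())
}
```

GNU cat's tuning (for instance, sizing its I/O buffer from the target filesystem's block size) is exactly what a naive version like this lacks, which is what would make the comparison interesting.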
Posted Mar 20, 2025 11:25 UTC (Thu)
by chris_se (subscriber, #99706)
[Link] (5 responses)
Yes, that's my main issue with the current state of affairs w.r.t. Rust. I rather like the language itself, but I'm utterly baffled that many Rust people saw what was going on with npm and thought "sure, let's do more of that". (Ok, it's not quite as bad yet as leftpad, but still...)
Posted Mar 21, 2025 8:36 UTC (Fri)
by taladar (subscriber, #68407)
[Link] (4 responses)
I'd much rather have a hundred small Rust dependencies than one Qt or OpenSSL that comes with hundreds of critical bugs and security holes that do not even affect the part of it I am using, but whose related upgrades and CVEs I have to deal with anyway.
Posted Mar 21, 2025 11:26 UTC (Fri)
by excors (subscriber, #95769)
[Link] (1 responses)
In C++, if I want something very simple like a circular buffer class, I might find it as part of Boost. That's a huge dependency for such a little feature, which does have some drawbacks. But because it's huge I can be confident there are many developers working on the project. There are review processes, and if one developer tries to slip in something naughty then there's a reasonable chance another developer will spot it before it's released. Security researchers will be running their tools over it. If a vulnerability is reported, there are responsible maintainers who will respond promptly.
If I want the same in Rust, I'll probably find a library that is just one random guy on GitHub. A lot of the code has probably been reviewed by exactly zero other people. There is nothing to mitigate against that developer being malicious, or having their GitHub account compromised, or carelessly accepting a pull request from another random user. They might ignore a vulnerability report for months. They're lacking all the processes and shared responsibility that comes from being in a large project.
I'd agree the huge dependencies will probably have more accidental vulnerabilities, because the sheer quantity of code will outweigh the improved review processes - but Rust's memory safety should already mitigate a lot of that risk, compared to C/C++. That means deliberate backdoors are a relatively greater risk, even before attackers realise there aren't enough buffer overflows and use-after-frees left for them to exploit and they'll have to shift towards more supply chain attacks.
Posted Mar 24, 2025 10:13 UTC (Mon)
by taladar (subscriber, #68407)
[Link]
Posted Mar 23, 2025 15:53 UTC (Sun)
by surajm (subscriber, #135863)
[Link] (1 responses)
I hope to see this situation improve over time as larger organizations continue to adopt rust and place more strict rules on allowable dependencies.
Posted Mar 23, 2025 17:11 UTC (Sun)
by farnz (subscriber, #17727)
[Link]
Posted Mar 20, 2025 12:51 UTC (Thu)
by MortenSickel (subscriber, #3238)
[Link] (1 responses)
Posted Mar 20, 2025 12:54 UTC (Thu)
by MortenSickel (subscriber, #3238)
[Link]
Posted Mar 18, 2025 19:46 UTC (Tue)
by mathstuf (subscriber, #69389)
[Link]
Posted Mar 18, 2025 22:34 UTC (Tue)
by willy (subscriber, #9762)
[Link] (9 responses)
Posted Mar 18, 2025 23:10 UTC (Tue)
by PeeWee (guest, #175777)
[Link] (8 responses)
Posted Mar 18, 2025 23:14 UTC (Tue)
by PeeWee (guest, #175777)
[Link]
Posted Mar 18, 2025 23:15 UTC (Tue)
by willy (subscriber, #9762)
[Link] (6 responses)
And yes, having true as a separate binary is great for the cases you point out, and I never said it shouldn't exist.
My point, which I was quite explicit about, is that none of this matters for the performance of executing a shell script. Be it bash, ksh or some other shell of your choice, if you care about performance you already implemented true as a built-in.
Posted Mar 18, 2025 23:36 UTC (Tue)
by PeeWee (guest, #175777)
[Link] (3 responses)
Posted Mar 19, 2025 5:06 UTC (Wed)
by joib (subscriber, #8541)
[Link] (2 responses)
I don't think you can extrapolate from the startup overhead to the performance of other utilities doing more work.
Posted Mar 19, 2025 10:26 UTC (Wed)
by PeeWee (guest, #175777)
[Link] (1 responses)
At least this is worth having an eye on. If it turns out that the benefits outweigh the downsides, I'd be the last to insist on a fast true in a showstopper kind of way.
Posted Mar 19, 2025 15:13 UTC (Wed)
by joib (subscriber, #8541)
[Link]
Will they? Per the original post in this subthread, uutils has an invocation overhead of 4.6s/2000=0.0023s (minus the shell looping overhead). Keep in mind that more complex coreutils utilities will have higher overhead than /usr/bin/true as they need to link in more libraries and map more pages, reducing the relative penalty of uutils. And of course most uses of these utilities actually do more work, amortizing the startup overhead.
> But of course things like find and sort may end up being faster, depending on the size of their working set.
I think things like find, sort, cp etc. will be faster or slower depending on the implementation and tuning choices, algorithms used, etc. None of which is impacted by the overhead of launching the binary in the first place.
Posted Mar 19, 2025 1:00 UTC (Wed)
by josh (subscriber, #17465)
[Link]
Performance is regularly cited as a reason to keep /bin/sh pointing to dash rather than bash. If bash were faster, I think it's quite likely many distributions (including Debian) would have pointed /bin/sh to bash.
Posted Mar 19, 2025 4:40 UTC (Wed)
by interalia (subscriber, #26615)
[Link]
It'll be interesting to know how much of a difference uutils would make for running shell scripts if the startup time for the utilities is slower, e.g. if a script runs cp, ls, diff etc. in a loop. But I imagine with more use there'll be more focus on performance and startup time rather than just feature parity, so it'll improve over what it is now.
Posted Mar 19, 2025 12:21 UTC (Wed)
by ferringb (subscriber, #20752)
[Link]
Your stats don't make much sense to me: the file sizes look like a debug build, but the runtimes are in the range of a release build, at least against my system. I'm confused, in short. There's noise between your run and mine, but I'm using hyperfine for this. File size stats:
Posted Mar 20, 2025 22:24 UTC (Thu)
by gmatht (guest, #58961)
[Link]
Posted Mar 18, 2025 19:26 UTC (Tue)
by mathstuf (subscriber, #69389)
[Link] (19 responses)
Posted Mar 18, 2025 19:46 UTC (Tue)
by mathstuf (subscriber, #69389)
[Link]
Posted Mar 18, 2025 21:13 UTC (Tue)
by wtarreau (subscriber, #51152)
[Link] (1 responses)
Posted Mar 18, 2025 21:17 UTC (Tue)
by jrtc27 (subscriber, #107748)
[Link] (15 responses)
One could at least use dpkg-divert to move the non-Rust versions out of the way in a more proper manner, which it doesn't seem to do from a cursory grep of the source.
Posted Mar 18, 2025 22:22 UTC (Tue)
by PeeWee (guest, #175777)
[Link] (14 responses)
Posted Mar 18, 2025 22:32 UTC (Tue)
by willy (subscriber, #9762)
[Link] (6 responses)
Posted Mar 18, 2025 22:47 UTC (Tue)
by PeeWee (guest, #175777)
[Link] (3 responses)
I knew I shouldn't have used
Posted Mar 19, 2025 9:12 UTC (Wed)
by taladar (subscriber, #68407)
[Link] (2 responses)
Posted Mar 19, 2025 10:54 UTC (Wed)
by PeeWee (guest, #175777)
[Link] (1 responses)
Posted Mar 21, 2025 2:47 UTC (Fri)
by raven667 (subscriber, #5198)
[Link]
Posted Mar 19, 2025 0:37 UTC (Wed)
by PeeWee (guest, #175777)
[Link] (1 responses)
Posted Mar 20, 2025 8:33 UTC (Thu)
by riking (guest, #95706)
[Link]
Posted Mar 19, 2025 6:05 UTC (Wed)
by jrtc27 (subscriber, #107748)
[Link] (4 responses)
Posted Mar 19, 2025 10:46 UTC (Wed)
by PeeWee (guest, #175777)
[Link] (1 responses)
If they want people to do testing, they should provide safer ways to do it than some quick-and-dirty hacks, which will only result in some - very few - people setting up test environments, and in results from rather synthetic tests, which are not the same as road-testing in real scenarios.
Posted Mar 20, 2025 19:48 UTC (Thu)
by jrtc27 (subscriber, #107748)
[Link]
Posted Mar 19, 2025 11:11 UTC (Wed)
by bluca (subscriber, #118303)
[Link] (1 responses)
Posted Mar 20, 2025 19:50 UTC (Thu)
by jrtc27 (subscriber, #107748)
[Link]
Posted Mar 19, 2025 9:13 UTC (Wed)
by gdt (subscriber, #6284)
[Link] (1 responses)
Applications programmers should be able to rely on the Filesystem Hierarchy Standard, which requires some basic utilities to be in /bin and has a section with a heading " /usr/bin : Most user commands".
Ref: https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch03s04....
Posted Mar 19, 2025 11:19 UTC (Wed)
by PeeWee (guest, #175777)
[Link]
Posted Mar 18, 2025 21:38 UTC (Tue)
by fraetor (subscriber, #161147)
[Link] (3 responses)
Though as an amateur software archaeologist, it'll be a shame to lose some of the historical context in the coreutils manpages.
Posted Mar 18, 2025 23:44 UTC (Tue)
by PeeWee (guest, #175777)
[Link] (2 responses)
Posted Mar 18, 2025 23:48 UTC (Tue)
by interalia (subscriber, #26615)
[Link]
Posted Mar 27, 2025 23:49 UTC (Thu)
by raindog308 (guest, #176490)
[Link]
“See the info page for more details.”
Posted Mar 19, 2025 12:54 UTC (Wed)
by jhe (subscriber, #164815)
[Link]
Posted Mar 19, 2025 16:55 UTC (Wed)
by antiphase (subscriber, #111993)
[Link] (1 responses)
Posted Mar 20, 2025 16:24 UTC (Thu)
by jorgegv (subscriber, #60484)
[Link]
Posted Mar 20, 2025 12:56 UTC (Thu)
by h7KdD8Z (guest, #169613)
[Link] (1 responses)
Posted Mar 20, 2025 15:59 UTC (Thu)
by patrick_g (subscriber, #44470)
[Link]
The license issue is addressed during the talk.
Posted Mar 20, 2025 15:27 UTC (Thu)
by pj (subscriber, #4506)
[Link]
resource usage concerns
If it looks like a memory leak, acts like a memory leak, and makes life painful like a memory leak, then it probably is a memory leak – and proponents of tracing GC wouldn't convince me otherwise.
And leaks are hard to code by accident in Rust.
Of course it's possible to program memory leaks by creating cyclic references (although cyclic references are harder in Rust than in most other common languages) or lists that are never freed.
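A minimal sketch of such a cyclic-reference leak, using Rc; the Node type here is illustrative, not from any particular codebase:

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// A node that can point at another node, which is enough to form a cycle.
struct Node {
    next: RefCell<Option<Rc<Node>>>,
}

fn main() {
    let a = Rc::new(Node { next: RefCell::new(None) });
    let b = Rc::new(Node { next: RefCell::new(Some(Rc::clone(&a))) });
    // Close the cycle: a -> b -> a. Neither strong count can ever reach
    // zero, so both allocations outlive `a` and `b` -- a leak, in safe Rust.
    *a.next.borrow_mut() = Some(Rc::clone(&b));
    assert_eq!(Rc::strong_count(&a), 2);
    assert_eq!(Rc::strong_count(&b), 2);
}
```

Using Rc::downgrade (a Weak pointer) for the back edge is the standard way to break such cycles.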
#include <stdio.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

__attribute__((noinline))
static void showfp(unsigned n, intptr_t otop) {
    /* __builtin_stack_address() needs GCC >= 12 (or a recent Clang). */
    intptr_t top = (intptr_t)__builtin_stack_address();
    printf("n:%u off:%td\n", n, (ptrdiff_t)((otop > top) ? otop - top : top - otop));
}

int main(void) {
    unsigned n = 0;
    intptr_t top = (intptr_t)__builtin_stack_address();
    while (1 == scanf("%u", &n)) {
        char buf[n];       /* VLA re-sized on each iteration */
        memset(buf, 0, n); /* memset_s is C11 Annex K and not in glibc */
        showfp(n, top);
    }
    return 0;
}
n:200 off:272
n:100 off:176
n:5 off:80
Stack allocations always last until the function returns. That's not a Rust limitation, it's just how the stack works (at least, in any language that has a call stack).
No. In every language with guaranteed tail-call optimization, all stack allocation ends at the latest before the tail-call. In Prolog, compilers sort the variables by lifetime, and stack-deallocate before every call, such that only live variables consume stack memory on the call. This reduces the memory consumption for recursive predicates that are not tail-recursive; I expect that there are other implementations of languages where recursion is important that use the same technique.
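In Rust specifically it's worth separating two things: a value's destructor runs at the end of its scope, while the stack frame itself is only reclaimed on return (Rust makes no tail-call guarantee). A sketch of the destructor half, with illustrative names:

```rust
use std::cell::RefCell;

thread_local! {
    static LOG: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
}

/// A value whose destructor records when it runs.
struct Noisy(&'static str);
impl Drop for Noisy {
    fn drop(&mut self) {
        LOG.with(|l| l.borrow_mut().push(self.0));
    }
}

/// Drain the recorded drop order.
fn take_log() -> Vec<&'static str> {
    LOG.with(|l| std::mem::take(&mut *l.borrow_mut()))
}

fn f() {
    let _outer = Noisy("outer");
    {
        let _inner = Noisy("inner");
    } // `inner` is *dropped* here, at the end of its scope...
    LOG.with(|l| l.borrow_mut().push("checkpoint"));
    // ...but the stack frame (including `outer`'s slot) lives until return.
}

fn main() {
    f();
    assert_eq!(take_log(), vec!["inner", "checkpoint", "outer"]);
}
```

So scope-based Drop bounds when resources are released, but not when the frame's memory is handed back, which is the distinction the Prolog example exploits.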
You need a good definition of accessed for this, too - it's not just "accesses" in the sense of reads and writes, but also deallocations count as accesses (otherwise all memory is leaked by this definition, since there's a period between the last read/write and the deallocation, even if the program is careful to keep this small). It also needs to focus on the "right" set of accesses - you want, for example, to not always count main's stack as leaked since it's not freed until the end of the program, but you also don't want to count something as "not leaked" just because it happens that RAII will free it before the end of the program.
> but also deallocations count as accesses
If you count deallocation as access, then we are back to square one, with tracing-GC-based languages never having any leaks while real-world Java programs waste gigabytes on stuff they will never need.
The details around "deallocation" are, IMO, the hard chunk of defining "will not be accessed in the future".
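Rust, notably, treats leaking as safe and even provides explicit APIs for it, which is part of why the definitional question is tricky. A minimal sketch:

```rust
fn main() {
    // Box::leak deliberately never runs the destructor: the allocation
    // stays reachable (and usable) for the rest of the program.
    let leaked: &'static mut Vec<i32> = Box::leak(Box::new(vec![1, 2, 3]));
    leaked.push(4);
    assert_eq!(leaked.len(), 4);
    assert_eq!(leaked[3], 4);

    // mem::forget gives up ownership without running Drop either; under
    // the "never accessed again" definition this is a leak, yet it is
    // entirely safe Rust.
    let v = vec![0u8; 1024];
    std::mem::forget(v);
}
```

Both allocations are reclaimed by the OS at process exit, so whether they count as "leaks" depends on exactly the definition being debated here.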
Weakened license protection
The gcc-based Rust compiler is still a long way ahead of the gcc-based compiler for a hypothetical GNU language that hasn't been invented yet.
Against that, we're at about the right time in Rust's lifecycle for the Next Big Synthesis of Ideas (NBSoI) in programming language design to come together and produce something that's practically useful and academically interesting. If someone's going to do that under the GNU umbrella, that'd be great.
A new "GNU language"
There is no technical issue with shipping GPL software that depends on an MIT compiler, but I imagine this wouldn't feel very satisfying from the GNU point of view.
Performance concerns when heavily used in scripts ?
real 0m0.993s
user 0m0.568s
sys 0m0.479s
real 0m4.623s
user 0m0.784s
sys 0m2.530s
real 0m1.294s
user 0m0.627s
sys 0m0.719s
real 0m3.847s
user 0m0.916s
sys 0m2.983s
real 0m1.316s
user 0m0.631s
sys 0m0.750s
real 0m3.796s
user 0m0.873s
sys 0m2.998s
-rwxr-xr-x 1 root root 26936 Apr 5 2024 /bin/true
lrwxrwxrwx 1 root root 25 Feb 25 2024 /usr/lib/cargo/bin/coreutils/true -> ../../../../bin/coreutils
$ ls -la /usr/bin/coreutils
-rwxr-xr-x 1 root root 23361264 Feb 25 2024 /usr/bin/coreutils
user 0m0.803s
sys 0m0.408s
user 0m0.559s
sys 0m0.302s
execve("./true", ["./true"], 0x7ffe15f24de8 /* 73 vars */) = 0
exit(0) = ?
+++ exited with 0 +++
real 0m0.514s
user 0m0.289s
sys 0m0.292s
$ time ./a.out 69000
real 0m1.000s
user 0m0.017s
sys 0m0.973s
Your insanely fast busybox doesn't match what I'm seeing here. On Debian amd64, on a semi-recent Framework laptop, with a hot cache for all these binaries:
0 dkg@bob:~$ time for i in {1..2000}; do /usr/bin/coreutils true; done
real 0m3.020s
user 0m1.946s
sys 0m1.017s
0 dkg@bob:~$ time for i in {1..2000}; do /usr/bin/true; done
real 0m1.410s
user 0m0.996s
sys 0m0.404s
0 dkg@bob:~$ time for i in {1..2000}; do /usr/bin/busybox true; done
real 0m1.864s
user 0m1.285s
sys 0m0.558s
0 dkg@bob:~$
To be fair, busybox's dynamic linking is limited to libc and libresolv, while uutils' multicall coreutils binary links to libc, libgcc_s, libselinux, libm, and libpcre2-8. (/usr/bin/true only links to libc.)
Doing the naive exit(0) in main in Rust yields a better result using the "origin" crate:
https://crates.io/crates/origin
user 0m0.406s
sys 0m4.068s
user 0m0.233s
sys 0m2.625s
#![no_std]
#![no_main]

use core::panic::PanicInfo;
use origin::program;

// Unused fallback; the panic-halt crate (see Cargo.toml) supplies the
// actual #[panic_handler].
#[allow(dead_code)]
fn sleeping(_x: &PanicInfo) -> ! {
    loop {}
}

#[unsafe(no_mangle)]
unsafe fn origin_main(_argc: usize, _argv: *mut *mut u8, _envp: *mut *mut u8) -> i32 {
    program::exit(0)
}
---- main.rs ends ----
[package]
name = "true-bin"
version = "0.1.0"
edition = "2024"

[dependencies]
# Origin can be depended on just like any other crate. For no_std, disable
# the default features, and add the desired features.
origin = { version = "0.25.1", default-features = false, features = ["origin-start"] }
panic-halt = "1.0.0"
[profile.release]
# Give the optimizer more latitude to optimize and delete unneeded code.
lto = true
# "abort" is smaller than "unwind".
panic = "abort"
# Tell the optimizer to optimize for size.
opt-level = "z"
# Delete the symbol table from the executable.
strip = true
---------- Cargo.toml ends ------------
fn main() {
// Pass -nostartfiles to the linker. In the future this could be obviated
// by a `no_entry` feature: <https://github.com/rust-lang/rfcs/pull/2735>
println!("cargo:rustc-link-arg=-nostartfiles");
println!("cargo:rustc-link-arg=-Wl,--no-eh-frame-hdr");
// Tell the linker to make the text and data readable and writable. This
// allows them to occupy the same page.
println!("cargo:rustc-link-arg=-Wl,-N");
// Tell the linker to exclude the `.note.gnu.build-id` section.
println!("cargo:rustc-link-arg=-Wl,--build-id=none");
// Disable PIE, which adds some code size.
println!("cargo:rustc-link-arg=-Wl,--no-pie");
// Disable the `GNU-stack` segment, if we're using lld.
println!("cargo:rustc-link-arg=-Wl,-z,nognustack");
}
---------------- build.rs ends ------------------------
Does having the coreutils in memory optimise your tests via the kernel sharing the memory?
If so, try sleep 1000000 in one terminal and then retest.
/bin/true (was Performance concerns when heavily used in scripts ?)
even-smaller-true-bin: 504 bytes
real 0m4.393s
user 0m0.407s
sys 0m4.166s
real 0m2.688s
user 0m0.289s
sys 0m2.594s
real 0m2.226s
user 0m0.430s
sys 0m1.977s
200078: 2f (bad)
200079: 6c insb (%dx),%es:(%rdi)
20007a: 69 62 36 34 2f 6c 64 imul $0x646c2f34,0x36(%rdx),%esp
200081: 2d 6c 69 6e 75 sub $0x756e696c,%eax
200086: 78 2d js 0x2000b5
200088: 78 38 js 0x2000c2
20008a: 36 2d 36 34 2e 73 ss sub $0x732e3436,%eax
200090: 6f outsl %ds:(%rsi),(%dx)
200091: 2e 32 00 cs xor (%rax),%al
2000c8: 48 89 e7 mov %rsp,%rdi
2000cb: 55 push %rbp
2000cc: e9 00 00 00 00 jmp 0x2000d1
2000d1: b8 e7 00 00 00 mov $0xe7,%eax
2000d6: 31 ff xor %edi,%edi
2000d8: 0f 05 syscall
2000da: 0f 0b ud2
It was quite interesting to see that the original Rust binary went down to 712 bytes, and then to 504 bytes by removing two unnecessary ELF sections. All that needs to be done is to use CPU registers and the stack, and call the kernel's exit() syscall with a zero argument; even so, the code's safety is still near-obvious.
/bin/true bloat, and /bin/cat
I suspect, though, that this is more "vibes" than reality; sure, a big dependency with several maintainers looks healthier from the outside, but in practice, it's not that rare for a big dependency to internally be several small fiefdoms, each of which has just one maintainer. You thus have something that actually hasn't seen maintenance for years, but it "looks" maintained because it's part of a bigger thing where the other bits are well-maintained.
Try dash, the default /bin/sh provider in Ubuntu. true as an actual binary does have its uses outside of the shell context too. I remember using it to override some program shortcuts in desktop environments. Don't like Ctrl-Q for quitting Firefox? Just set a keyboard shortcut in GNOME, for instance, and point it to the executable /bin/true; it would be overkill and rather counter-intuitive to start a shell only to have it exit, i.e. sh -c :. Or try prototyping systemd service units; an

[Service]
ExecStart=true

is the simplest way to get started. true is, or rather should be, the cheapest-to-run binary there is.
I just realized that dash does implement true as a builtin, I just did the wrong check (which true instead of type true).
Performance concerns when heavily used in scripts ?
But all the other coreutils, findutils, what have you, play a big part in the performance of a shell script, since the shell is basically just the glue. And when true is already rather slow, what else is in store? Don't get too hung up on true is all I am saying. I'm sure the above examples were chosen because it is such a trivial executable and thus lends itself to measuring the overhead of running uu-coreutils.
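The overhead in question is essentially one fork()/exec() per invocation. A crude way to see it, assuming bash and /bin/true are present (the iteration count is arbitrary):

```shell
#!/bin/sh
# Run the external binary 1000 times, then the shell builtin
# 1000 times; bash's `time` keyword prints the timing of each.
bash -c 'time for i in $(seq 1000); do /bin/true; done'
bash -c 'time for i in $(seq 1000); do true; done'
```

On a typical system the builtin loop is orders of magnitude faster, since no process is spawned at all.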
Performance concerns when heavily used in scripts ?
Yes and no; if every call to a uutil incurs that kind of overhead, existing scripts will be noticeably slower. But of course things like find and sort may end up being faster, depending on the size of their working set. [...] true in a showstopper kind of way.
Performance concerns when heavily used in scripts ?
gnu coreutils-9.5 gcc Gentoo Hardened 14.2.1_p20241221 p7
8189040 unstripped
6573656 stripped
rust 1.85.0 from upstream, coreutils from main (sha: fc46a041f80dc)
rust coreutils profile=debug
127032616 unstripped
25930968 stripped # this should look familiar
rust coreutils profile=release
12510480 unstripped
11487080 stripped
rust coreutils profile=release-small (profile just produces stripped, I added it cause they had it)
7849832 stripped
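For context, the "release-small" numbers above come from a size-oriented Cargo profile. Such a profile typically looks something like the following; the exact settings in the uutils tree may differ, so treat these values as illustrative:

```toml
# Illustrative size-focused profile in Cargo.toml; the actual
# uutils profile may use different settings.
[profile.release-small]
inherits = "release"
opt-level = "z"    # optimize for size rather than speed
lto = true         # link-time optimization trims unused code
codegen-units = 1  # better whole-program optimization, slower builds
strip = true       # strip symbols from the final binary
panic = "abort"    # drop unwinding machinery
```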
Integrate Ash/Dash?
More robust oxidizr behavior?
Code that calls various binaries by their absolute path does exist
Such code is brittle by definition, in that it requires the respective binary to always be in that place. OK, there is the omnipresent #!/bin/sh, but nobody is talking about replacing shells or other interpreters, just some of the foundational utilities. Plus, the better approach, which I learned from some IBM coding examples, believe it or not, is #!/usr/bin/env sh or whatever interpreter is desired.
some of them are even good citizens and detect the location at configure time
I don't get how that makes a good citizen. The configure time may well be spent in a different environment, i.e. with a different $PATH, one that may not even exist at runtime. If anything, one should call which <command> or command -v <command> at runtime, or whatever equivalent lookup method the programming framework provides.
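A runtime lookup along those lines is a one-liner in POSIX sh; cp here is just a stand-in for whatever binary the program needs:

```shell
#!/bin/sh
# Resolve the tool through $PATH at runtime instead of hard-coding
# /bin/cp or /usr/bin/cp at build (configure) time.
CP=$(command -v cp) || { echo "cp: not found in PATH" >&2; exit 1; }
echo "using: $CP"
```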
One could at least use dpkg-divert to move the non-Rust versions out of the way in a more proper manner, which it doesn't seem to do from a cursory grep of the source.
That approach, and the suggestion of using update-alternatives, has been rejected already, because both require cooperation from the respective package, which is not feasible for this kind of temporary experimentation. I still maintain that update-alternatives could be used - bent, really - to do what oxidizr does, by using a different $DPKG_ROOT, i.e. DPKG_ROOT=/usr/local and some crafting of paths. But, having seen @mathstuf's suggestion, that seems like cracking nuts with a sledgehammer; IOW: KISS. Maybe there is some added value in using the alternatives system, because somebody on the discourse thread pointed out that the man pages remain the same, e.g. the GNU coreutils man pages are shown when in fact the uu-coreutils are in use. But for now that's all in my head and I haven't tried that approach yet; I may be missing something that could be a showstopper - it's been a while since I did anything remotely serious with it.
More robust oxidizr behavior?
But it is pretty much the only guaranteed path to exist, give or take a few I am too lazy to look up right now; the gist should be clear. Try #!/usr/bin/env python then. That way you can have your own local version in /usr/local/bin/ and don't need to change your script only to try a different iteration of the interpreter.
sh in that example. *sigh*
More robust oxidizr behavior?
Yes, as I have alluded to elsewhere in this thread. But that would be the only fixed and absolute path. BTW, why does $PATH exist if people insist on calling by absolute paths? Unless there is a very good reason, one should just not do that.
More robust oxidizr behavior?
BTW, that guarantee does not exist in POSIX, on the contrary:
Applications should note that the standard PATH to the shell cannot be assumed to be either /bin/sh or /usr/bin/sh, and should be determined by interrogation of the PATH returned by getconf PATH, ensuring that the returned pathname is an absolute pathname and not a shell built-in.
And it is quite involved to install scripts with the correct absolute path to sh, see further down in the spec - I bet that no current Linux distro does it that way. Given all that, and that env is also defined by POSIX, one may just as well use that installation routine to get the absolute path of it and install any script (not just sh ones) using that approach, or just assume that /bin/env or /usr/bin/env will always exist, and YOLO. I'll leave it at that, since this is getting off topic. The point was, and still is, that one should not rely (too heavily) on absolute paths; it's bad practice. I consider the need for absolute paths in shebangs to be a historical relic we have to live with, but anything beyond that should be fine with plain (relative to $PATH) executable names.
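The spec's installation dance can be approximated at runtime: query the guaranteed search path with getconf PATH and resolve sh against it, rather than assuming /bin/sh. A minimal sketch:

```shell
#!/bin/sh
# Ask the system for the POSIX-guaranteed PATH, then resolve sh
# against it; no absolute path is hard-coded anywhere.
posix_path=$(getconf PATH)
sh_bin=$(PATH=$posix_path command -v sh) || exit 1
echo "sh lives at: $sh_bin"
```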
More robust oxidizr behavior?
NixOS takes the position that env, sh, and ld-linux are in fact the only absolute paths to binaries you get:
$ ls /bin
sh
$ ls /usr/bin
env
$ ls /lib64
ld-linux-x86-64.so.2
More robust oxidizr behavior?
And what happens if coreutils gets an upgrade? The only context in which I have ever come across diversions is as part of preinst/postrm hooks in package install scripts. And oxidizr, as of now, is just a "3rd party" tool that is unrelated to package management and hence should not mess with files under the control of the package manager (see FHS: must not be written to), i.e. /usr/bin; that is just like tickling the dragon.
More robust oxidizr behavior?
No, application programmers should do away with onerous assumptions like that, or not make them to begin with. Some Apple devs said as much in a talk about launchd, IIRC, and I can only concur. Nowhere does the FHS say that one should call those programs by absolute path. $PATH is all you need to find the executable in question, and how that is set up is regulated elsewhere, which I am too lazy to look up now. Essentially, getconf provides the bare minimum that is guaranteed, and lo and behold:
$ lsb_release -i
Distributor ID: Ubuntu
$ getconf PATH
/bin:/usr/bin
And I am pretty certain that is also true for any distro worth its salt.
Also, if you have missed it, see my other post on the matter; not even /bin/sh should be assumed to exist in that exact location.
Manpages are important
Manpage reuse
OIL RIG
bummer the uutils licence is MIT and not GPL
https://fosdem.org/2025/schedule/event/fosdem-2025-6196-r...
Ubuntu going downhill...