DNF5 delayed

Posted Aug 19, 2023 2:38 UTC (Sat) by rsidd (subscriber, #2582)
Parent article: DNF5 delayed

Debian recently celebrated 30 years, and dpkg and apt haven't undergone incompatible changes for, I think, at least 20 of those years? Added features, improved frontends, yes. Back in the early 2000s I installed knoppix for my system, then "upgraded" to debian by changing some things in /etc/apt/sources.list and doing apt-get update and apt-get dist-upgrade, then "upgraded" to ubuntu in a similar way. Inadvisable for many reasons, but ubuntu's upgrade tool is basically apt dist-upgrade with a few preprocessing steps to handle possible base-system changes, as far as I can tell.

If it works, fix what doesn't work but why change it? Is python really the bottleneck here, and is rewriting in C++ likely to yield a more maintainable codebase?



DNF5 delayed

Posted Aug 19, 2023 2:51 UTC (Sat) by jccleaver (guest, #127418) [Link] (19 responses)

> If it works, fix what doesn't work but why change it? Is python really the bottleneck here, and is rewriting in C++ likely to yield a more maintainable codebase?

For the life of me, I can't see why dnf is particularly performance-sensitive. Getting it right is about a billion times more important than getting it fast when it comes to package management, and if you're depending on 'yum update' to run in 5 seconds instead of 12 for some critical function then you're probably doing something wrong.

Furthermore, Python is still a standard piece of the sysadmin toolkit, and far more grokkable at a human level than C++ would be.

If there's a time-sensitive core, sure, put that in compiled code. But the closer it gets to the UI, the less important I feel this should be.

DNF5 delayed

Posted Aug 19, 2023 7:10 UTC (Sat) by pbonzini (subscriber, #60935) [Link] (9 responses)

One reason is to make the minimal container as small as possible and remove the Python interpreter from it. In the past microdnf was used instead.

DNF5 delayed

Posted Aug 19, 2023 15:02 UTC (Sat) by meyert (subscriber, #32097) [Link] (7 responses)

I think a containerized app should never include any tools like DNF or apt, only the minimal libs needed to support the packaged application, to reduce attack surface.
I think most real-world containers in a security-sensitive environment will be based on distroless, alpine or use APKO.

DNF5 delayed

Posted Aug 19, 2023 17:21 UTC (Sat) by smoogen (subscriber, #97) [Link] (5 responses)

In theory not having tools like apt, dnf, etc inside of the container is the right thing to do.

The reality is that nearly everyone using containers starts yelling and screaming that a container can't be worked on properly because they have to do all this work just to get this one little thing(*) added to make it work.

(*) Narrator: It wasn't and never is one little thing. Eventually you find your 200 containers are all running their own sshd daemons, apt/dnf, and layers of additional software to make this one thing work the way you wanted it to. [And you ended up not being able to replicate that when rebuilding it, so you have kept this artisanal container for years past its life.]

DNF5 delayed

Posted Aug 19, 2023 17:43 UTC (Sat) by amacater (subscriber, #790) [Link] (1 responses)

One little thing - so much this. This is the reason why there are so many Docker images of varying quality, and why some folk choose to rebuild their own Docker images not by downloading some random image but by trying to rebuild from the Dockerfile.

It's another thing that's kept me away from using Docker extensively - you've no provenance. (I've no experience but would imagine the same problem will eventually hit Podman.)

DNF5 delayed

Posted Aug 20, 2023 14:37 UTC (Sun) by intelfx (subscriber, #130118) [Link]

Docker and Podman reimplement the completely identical underlying idea, so I see no reason why the same problem that supposedly applies to Docker should _not_ hit Podman.

DNF5 delayed

Posted Aug 19, 2023 18:12 UTC (Sat) by jccleaver (guest, #127418) [Link] (2 responses)

That's great, but that's container-world's problem to deal with.

Admin-level tools on real systems shouldn't be afflicted with reduced functionality and weird bugs and instability out of a need to accommodate the needs of the hyper-optimized container world.

Can size be reduced when there's low hanging fruit? Sure. But this is not that.

(See also: How systemd was pushed onto all of us)

DNF5 delayed

Posted Aug 22, 2023 6:52 UTC (Tue) by knotapun (guest, #166136) [Link] (1 responses)

What's so bad about systemd? It seems to be an appropriate tool, in the right place. There's some rough spots, but with most things it feels appropriate.

DNF5 delayed

Posted Aug 22, 2023 9:36 UTC (Tue) by zdzichu (subscriber, #17118) [Link]

Please don't reopen this topic, we all had our share of flamewars a decade ago.

DNF5 delayed

Posted Aug 24, 2023 19:09 UTC (Thu) by jond (subscriber, #37669) [Link]

> I think a containerized app should never include any tools like DNF or apt, only minimal libs to support packaged application, to reduce attack surface.

I completely agree but the tooling to support this needs to catch up (as per smoogen’s comment)

DNF5 delayed

Posted Aug 19, 2023 16:16 UTC (Sat) by jccleaver (guest, #127418) [Link]

>One reason is to make the minimal container as small as possible and remove the Python interpreter from it.

Premature minimalization, like premature optimization, is the source of a lot of annoyance at this point. Some of us still have system administration duties the old-fashioned way and aren't trying to save kilobytes of space that will only come back to haunt us at 3am, when we're called to do some sort of diagnosis on a machine only to discover 'ping' isn't installed.

That said, for those that do need this, it seems like the easier thing would be for yum/dnf to not be installed at all. If you're already tossing out every last unnecessary byte for static images, then you should already know precisely what packages you need rpm to lay down, and you don't need a dynamic or repo-based dependency generator at all.

>In the past microdnf was used instead.

This would make sense as a replacement, but the human tool does not need to be minimized in this way. We still use bash for day-to-day scripting, not busybox, after all.

DNF5 delayed

Posted Aug 19, 2023 11:19 UTC (Sat) by pizza (subscriber, #46) [Link] (1 responses)

> For the life of me, I can't see why dnf is particularly performance-sensitive.

It's not so much raw performance as it is outsized memory usage. Fedora has had to bump the minimal system specs solely due to DNF running out of memory when doing updates.

DNF5 delayed

Posted Aug 19, 2023 15:24 UTC (Sat) by gbailey (subscriber, #58) [Link]

I was wondering the same; I don't have issues with dnf other than occasionally on a new Fedora VM that can't run "dnf upgrade" because it runs out of RAM. But I am curious why memory issues can't be addressed by changing the local storage so that everything doesn't have to be held in RAM. I'd assume the C++ rewrite has to use some different method as well?
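To illustrate the general idea of not holding everything in RAM, here is a minimal sketch of streaming through a metadata file one record at a time. It is not how DNF or libdnf works; the XML layout (`<package>` elements with a `<name>` child) and the function name `iter_package_names` are hypothetical and chosen only for the example.

```python
import xml.etree.ElementTree as ET


def iter_package_names(xml_source):
    """Yield package names one at a time from a simplified, hypothetical metadata file."""
    # iterparse reads the input incrementally instead of building the full tree first.
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag == "package":
            yield elem.findtext("name")
            # Free this record's children and text; the root still keeps
            # a small empty stub element per package.
            elem.clear()
```

A caller can loop over the generator and keep only the records it cares about, so peak memory depends mostly on what is kept rather than on the size of the file.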

DNF5 delayed

Posted Aug 19, 2023 13:02 UTC (Sat) by vadim (subscriber, #35271) [Link] (3 responses)

Performance is relevant in several ways:

There are automated tasks, like Docker builds and CI runs that may need to call on the package manager.

And there's performance from the standpoint of a user. Slow performance can be a serious annoyance. On my high end, 16 core machine, `dnf info firefox` takes 2 seconds from a cold start, and 1.1 seconds once it's been cached. I think that's pretty bad, considering there are far slower machines out there that don't use an NVMe drive. There's no reason why a simple query shouldn't be near instant. On lower capability machines people may be waiting 15 seconds for an answer. Probably repeatedly if they don't get what they want on the first attempts.

And there's the metadata downloads, where you run a query and suddenly it takes 3 minutes for it to download stuff.
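For anyone who wants to reproduce this kind of measurement, a minimal sketch follows. The helper name `time_command` is made up for the example, and the numbers will depend on the machine and on what is already cached.

```python
import subprocess
import time


def time_command(argv, runs=3):
    """Run argv `runs` times and return each run's wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard the command's output; only the elapsed time matters here.
        subprocess.run(argv, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        durations.append(time.perf_counter() - start)
    return durations
```

Calling `time_command(["dnf", "info", "firefox"])` on a Fedora system would give one duration per run. The first run is only a true cold start if nothing relevant is already in the page cache, so treat the first number with some caution.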

DNF5 delayed

Posted Aug 19, 2023 16:12 UTC (Sat) by jccleaver (guest, #127418) [Link] (2 responses)

Yes, but how many times are you, as a person, doing that on a given machine? I get that annoyance factors add up, but a change from 2 seconds to 1 second does not seem to me to warrant a codebase shift like this, with all the errant bugs, incompatibility and instability that will come with it.

DNF5 delayed

Posted Aug 19, 2023 19:40 UTC (Sat) by vadim (subscriber, #35271) [Link] (1 responses)

No, I mean, it's 2 seconds on a high end machine. I expect it's a lot longer on something like a Raspberry Pi. That's why it's a problem, the package manager should have good performance on any suitable hardware, not just the latest stuff.

DNF5 delayed

Posted Aug 21, 2023 16:56 UTC (Mon) by jccleaver (guest, #127418) [Link]

>No, I mean, it's 2 seconds on a high end machine. I expect it's a lot longer on something like a Raspberry Pi. That's why it's a problem, the package manager should have good performance on any suitable hardware, not just the latest stuff.

The *package manager* should, yes. But the package manager here is RPM (and librpm), which runs well, fast, and stable.

Dependency and repo management is a distinct layer on top of this, and that's what I'm suggesting doesn't need to put speed over correctness.

DNF5 delayed

Posted Aug 19, 2023 19:14 UTC (Sat) by mseeber (subscriber, #126394) [Link] (2 responses)

DNF is used in RPM based yocto builds. The step that installs all packages into the image can take quite a big part of the build time in some cases, so I think speedups are welcome there.

DNF5 delayed

Posted Aug 21, 2023 4:21 UTC (Mon) by AdamW (subscriber, #48457) [Link] (1 responses)

This won't really help that much, unfortunately. Actual package install operations are still done by RPM and are more bound by what they're actually doing than by the package management layers.

The performance improvement with dnf5 is more in the stuff that happens before the actual package operations - metadata parsing, mainly.

DNF5 delayed

Posted Aug 24, 2023 14:01 UTC (Thu) by ehiggs (subscriber, #90713) [Link]

DNF5 delayed

Posted Aug 19, 2023 7:13 UTC (Sat) by pbonzini (subscriber, #60935) [Link] (2 responses)

> Debian recently celebrated 30 years, and dpkg and apt haven't undergone incompatible changes for, I think, at least 20 of those years?

It's the same here, isn't it? New hashing or compression algorithms have been introduced over the years, but the basic rpm tooling has stayed the same and yum/dnf/zypper are all frontends to handle dependency resolution and network access.

DNF5 delayed

Posted Aug 19, 2023 7:34 UTC (Sat) by rsidd (subscriber, #2582) [Link] (1 responses)

RPM is like deb, and yum/dnf/zypper are like apt, in my understanding.

Or even more accurately, apt is both the software management system, and the name of one particular front-end to it. Other front-ends exist like apt-get, aptitude, synaptic. It sounds like apt (the backend) is like libdnf. It has grown in features but not incompatibly. Meanwhile apt (the front-end) is almost drop-in compatible with apt-get, apt-cache and friends, which it replaced, and apt-get etc are still available if you want to use those.

DNF5 delayed

Posted Aug 19, 2023 21:54 UTC (Sat) by ballombe (subscriber, #9523) [Link]

apt was written in C++ from the start.

DNF5 delayed

Posted Aug 19, 2023 10:05 UTC (Sat) by mokki (subscriber, #33200) [Link] (1 responses)

I would guess one reason for dropping python is that bootstrapping a new release or architecture is harder if the package and dependency manager, which is needed as early as possible, is written in python.

DNF5 delayed

Posted Aug 21, 2023 4:23 UTC (Mon) by AdamW (subscriber, #48457) [Link]

That's not really a huge issue. We don't bootstrap new releases, they branch off rawhide. New arches have to be bootstrapped, but they happen very rarely and we do have a process for it when it's needed.

DNF5 delayed

Posted Aug 20, 2023 19:07 UTC (Sun) by Sesse (subscriber, #53779) [Link] (11 responses)

> Debian recently celebrated 30 years, and dpkg and apt haven't undergone incompatible changes for, I think, at least 20 of those years?

But also not all that much new innovation; quality-of-life improvements (like apt-get → apt) and lots of bugfixes sure, but I can't recall anything really big since multiarch (2005) and possibly triggers (2007). dpkg is still using flat text files (one per package, read from disk on startup) for keeping track of installed files. apt doesn't have anything like libsolv AFAIK (I believe it's pretty much hand-rolled, and I still need to go to aptitude when apt can't figure out the conflicts). But it _is_ nice that it doesn't break under you, indeed.

DNF5 delayed

Posted Aug 21, 2023 13:53 UTC (Mon) by foom (subscriber, #14868) [Link] (9 responses)

> flat text files (one per package, read from disk on startup)

That seems to have been a fine choice, since apt and dpkg appear reasonably performant, whereas people are always complaining about how slow yum or dnf are?

DNF5 delayed

Posted Aug 21, 2023 14:25 UTC (Mon) by mathstuf (subscriber, #69389) [Link] (2 responses)

Performance has never been a huge concern of mine as yum and dnf have always been way more *useful* in my book (though I did use `apt-rpm` way back in Fedora Core days, yum eventually improved enough to prefer it, probably around Fedora 8 or so). My main preferences for yum/dnf are around:

- dnf is a single tool that answers all my queries; I can never keep the set of apt tools for different queries straight (nevermind that some need more metadata downloaded…manually or not, I can't recall)
- `dnf install $file` works without hassle
- dnf tells me the versions that will be installed
- dnf won't "oh, there are no deps, but you obviously had no typos in your package name" auto-yes the "do you want to install this" prompt
- `dnf history undo` is so very nice
- dnf doesn't ask me questions in the middle of an installation process

DNF5 delayed

Posted Oct 9, 2023 8:42 UTC (Mon) by daenzer (subscriber, #7050) [Link] (1 responses)

> - dnf is a single tool that answers all my queries; I can never keep the set of apt tools for different queries straight (nevermind that some need more metadata downloaded…manually or not, I can't recall)

metadata for apt-file is now automatically downloaded along with other APT metadata.

Speaking of metadata, dnf automatically downloading it at random times can be annoying. dnf4 has --cacheonly to avoid that, dnf5 doesn't (yet?) though AFAICT.

> - `dnf install $file` works without hassle

This works with the new "apt" frontend (as opposed to the old "apt-get" one) as well.

> - dnf won't "oh, there are no deps, but you obviously had no typos in your package name" auto-yes the "do you want to install this" prompt

The flip side of this is that if the package name has no typos, and the package doesn't require installing any dependencies, the separate confirmation step is superfluous.

> - dnf doesn't ask me questions in the middle of an installation process

Neither does APT, it's per-package configuration scripts. (Now with debconf, at least any such questions should happen back-to-back early on; in the old pre-debconf days, they could happen more or less anytime)

DNF5 delayed

Posted Oct 10, 2023 16:07 UTC (Tue) by mathstuf (subscriber, #69389) [Link]

> Speaking of metadata, dnf automatically downloading it at random times can be annoying. dnf4 has --cacheonly to avoid that, dnf5 doesn't (yet?) though AFAICT.

Yes; `dnf -C info` is not uncommon in my history. Note that `dnf` does download file metadata on-demand, so you never fetch it unless using something like `dnf install */libfoo.so.*`. But it knows when it needs it and updates it as required, instead of that always being a separate call. It's good to know that there's at least one fewer command to care about now.

> This works with the new "apt" frontend (as opposed to the old "apt-get" one) as well.

I'll try and update my finger knowledge (`apt-get` is still in there as I think I needed it in the days before `apt` did everything and I haven't kept track of `apt` release announcements).

> The flip side of this is that if the package name has no typos, and the package doesn't require installing any dependencies, the separate confirmation step is superfluous.

Eh, I prefer the consistent behavior more myself. There's `-y` for scripts and the like.

DNF5 delayed

Posted Aug 21, 2023 15:06 UTC (Mon) by Sesse (subscriber, #53779) [Link] (5 responses)

dpkg feels pretty slow to me, and it's only getting slower as a typical system gets more and more packages. I mean, even dpkg -i hello.deb (installing 277 kB of files) needs over 500 ms on a modern NVMe drive! On a HDD, we're talking about several seconds. Full-upgrades can take many minutes just in unpacking packages, when the drive can sustain many gigabytes per second.

It may be that RPM is even slower, I don't know. But this is not fast by any reasonable standard.

DNF5 delayed

Posted Aug 21, 2023 17:44 UTC (Mon) by mbunkus (subscriber, #87248) [Link] (4 responses)

I think this might be more due to the fact that dpkg makes different tradeoffs than rpm wrt. file system security: it syncs the file system rather often. Here's a comparison of installing the same software I provide distro-specific packages for on Debian 12 & Fedora 38:

[0 root@8ea9c2baf151 …/mkvtoolnix] cat /etc/debian_version
bookworm/sid
[0 root@8ea9c2baf151 …/mkvtoolnix] strace -o ~/s.txt dpkg -i mkvtoolnix_79.0-0~ubuntu2304bunkus01_amd64.deb
Selecting previously unselected package mkvtoolnix.
(Reading database ... 86951 files and directories currently installed.)
Preparing to unpack mkvtoolnix_79.0-0~ubuntu2304bunkus01_amd64.deb ...
Unpacking mkvtoolnix (79.0-0~ubuntu2304bunkus01) ...
Setting up mkvtoolnix (79.0-0~ubuntu2304bunkus01) ...
Processing triggers for hicolor-icon-theme (0.17-2) ...
Processing triggers for man-db (2.11.2-1) ...
[0 root@8ea9c2baf151 …/mkvtoolnix] grep -E 'fsync|sync_file_range|fdatasync|syncfs' ~/s.txt | wc -l
136

[0 fc38(64) root@149617e45639 ~…/x86_64] cat /etc/fedora-release
Fedora release 38 (Thirty Eight)
[0 fc38(64) root@149617e45639 …/x86_64] strace -o ~/s.txt rpm -Uhv mkvtoolnix-79.0-1.fedora38.x86_64.rpm
warning: mkvtoolnix-79.0-1.fedora38.x86_64.rpm: Header V4 RSA/SHA512 Signature, key ID 10c052a6: NOKEY
Verifying... ################################# [100%]
Preparing... ################################# [100%]
Updating / installing...
1:mkvtoolnix-79.0-1.fedora38 ################################# [100%]
[0 fc38(64) root@149617e45639 …/x86_64] grep -E 'fsync|sync_file_range|fdatasync|syncfs' ~/s.txt | wc -l
5

The contents of both files aren't 100% comparable, but those numbers aren't even remotely comparable. Syncing hurts very much on HDDs, that's true.
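The grep step in both transcripts can also be broken down per syscall. The following is a minimal sketch that assumes the plain `strace -o` format used above, where each line starts with the syscall name; the function name `count_sync_calls` is made up for the example. Unlike the grep, it only matches at the start of a line, so a path that merely contains "fsync" is not counted.

```python
import re
from collections import Counter

# Same syscall names as the grep pattern in the transcripts above.
SYNC_CALL = re.compile(r"^(fsync|fdatasync|sync_file_range|syncfs)\(")


def count_sync_calls(strace_lines):
    """Count sync-family syscalls per name in `strace -o` output lines."""
    counts = Counter()
    for line in strace_lines:
        match = SYNC_CALL.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Passing an open file, for example `count_sync_calls(open("s.txt"))`, gives a per-syscall breakdown whose total should be close to the `wc -l` figure. Note that strace without `-f` traces only the process it starts, so any syncs issued by child processes would be missing from either count.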

DNF5 delayed

Posted Aug 21, 2023 17:48 UTC (Mon) by Sesse (subscriber, #53779) [Link] (3 responses)

dpkg syncs way too much compared to what you'd actually need, yes. It's entirely possible to fsync less and still be equally safe, so it's not like more fsyncs == safer.

dpkg is much faster under eatmydata, but still, reading the entire database into RAM (parsing text files line-by-line) is pretty unneeded.

DNF5 delayed

Posted Aug 21, 2023 17:54 UTC (Mon) by mbunkus (subscriber, #87248) [Link] (2 responses)

OK, pure conjecture here on my part garnished with some experience. If each invocation of dpkg reads the whole database, that should not take a lot of time save for the first time, assuming reading is done with some proper chunking (meaning only do a handful of big read calls, allowing for I/O speed). Unless the parsing algorithm itself is really bad, parsing several MB of data in-memory should be much faster than reading it from storage.

The next invocation should then get the whole database's data from the kernel's caches, shouldn't it? Sure, there are most likely more performant ways to store the data, or ways that would require fewer data to be read (and written, too), but does the database speed really matter that much compared to the FS syncs?

I'm talking about a system upgrade situation here, not about installing a single package.

Am I completely off base here?

DNF5 delayed

Posted Aug 21, 2023 18:11 UTC (Mon) by Sesse (subscriber, #53779) [Link] (1 responses)

> If each invocation of dpkg reads the whole database, that should not take a lot of time save for the first time, assuming reading is done with some proper chunking (meaning only do a handful of big read calls, allowing for I/O speed).

How can you do a handful of big read calls to read thousands of files? There's one for each package installed.

> The next invocation should then get the whole database's data from the kernel's caches, shouldn't it?

Parsing 600000+ lines of text (example number from my laptop) takes real CPU time, even if the I/O is free or nearly so.

DNF5 delayed

Posted Aug 21, 2023 18:19 UTC (Mon) by mbunkus (subscriber, #87248) [Link]

> How can you do a handful of big read calls to read thousands of files? There's one for each package installed.

Ooooh I didn't know that. I thought it only reads the files directly in /var/lib/dpkg, not all the .list files, too. Good to know! I agree, that seems like a rather inefficient way to handle the information.
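To get a feel for the numbers on a given machine, here is a minimal sketch that counts the `.list` files and their lines and times the pass. It assumes the files live in `/var/lib/dpkg/info`, the usual location on Debian systems; the function names are made up for the example. It only reads and counts lines, so it gives a lower bound and does not reproduce what dpkg actually does with them.

```python
import time
from pathlib import Path


def count_list_lines(info_dir):
    """Return (file_count, line_count) over all *.list files in info_dir."""
    files = 0
    lines = 0
    for path in Path(info_dir).glob("*.list"):
        files += 1
        with path.open("rb") as handle:
            lines += sum(1 for _ in handle)
    return files, lines


def timed_count(info_dir):
    """Return (file_count, line_count, elapsed_seconds) for one pass."""
    start = time.perf_counter()
    files, lines = count_list_lines(info_dir)
    return files, lines, time.perf_counter() - start
```

Running `timed_count("/var/lib/dpkg/info")` twice in a row gives a rough comparison between a first read and a read served from the kernel's caches.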

DNF5 delayed

Posted Aug 30, 2023 9:04 UTC (Wed) by jwilk (subscriber, #63328) [Link]

> multiarch (2005)

It was 2010 in APT; 2012 in dpkg.

> apt doesn't have anything like libsolv

APT has supported external solvers since 2011: https://packages.debian.org/unstable/apt-cudf

DNF5 delayed

Posted Aug 22, 2023 16:05 UTC (Tue) by nim-nim (subscriber, #34454) [Link]

> If it works, fix what doesn't work but why change it?

The number of repositories, the number of packages in those repositories, and the complexity of package relationships (as expressed in package metadata) continues to grow. That’s why algorithms that were good enough once upon a time need some rewriting.

At the same time old tools like yum/dnf accumulate flags and options, not always as consistently as one may like in hindsight, so some cleanup is welcome.

dnf5 tried and failed to achieve both too fast; it will take some more time to get right. That’s why the Fedora release process mandates contingency plans.

DNF5 delayed

Posted Aug 24, 2023 19:07 UTC (Thu) by jond (subscriber, #37669) [Link]

> Debian recently celebrated 30 years, and dpkg and apt haven't undergone incompatible changes for, I think, at least 20 of those years?

> If it works, fix what doesn't work but why change it?

To finally catch up with Debian? :)

> Is python really the bottleneck here

I don’t know whether it’s a performance bottleneck, but requiring even a subset of Python in a core tool is problematic for small systems and containers. We use microdnf in container builds to avoid dnf/yum/python, but it can creep back in via other routes (in particular at the moment, crypto-policies-scripts is a hard dep of nss, is necessary for FIPS, and is a python script).


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds