
DNF5 delayed

Posted Aug 21, 2023 15:06 UTC (Mon) by Sesse (subscriber, #53779)
In reply to: DNF5 delayed by foom
Parent article: DNF5 delayed

dpkg feels pretty slow to me, and it's only getting slower as a typical system gets more and more packages. I mean, even dpkg -i hello.deb (installing 277 kB of files) needs over 500 ms on a modern NVMe drive! On an HDD, we're talking about several seconds. A full-upgrade can take many minutes just unpacking packages, even when the drive can sustain many gigabytes per second.

It may be that RPM is even slower, I don't know. But this is not fast by any reasonable standard.



DNF5 delayed

Posted Aug 21, 2023 17:44 UTC (Mon) by mbunkus (subscriber, #87248) [Link] (4 responses)

I think this might be more due to the fact that dpkg makes different tradeoffs than rpm with respect to file system safety: it syncs the file system rather often. Here's a comparison of installing the same software (for which I provide distro-specific packages) on Debian 12 and Fedora 38:

[0 root@8ea9c2baf151 …/mkvtoolnix] cat /etc/debian_version
bookworm/sid
[0 root@8ea9c2baf151 …/mkvtoolnix] strace -o ~/s.txt dpkg -i mkvtoolnix_79.0-0~ubuntu2304bunkus01_amd64.deb
Selecting previously unselected package mkvtoolnix.
(Reading database ... 86951 files and directories currently installed.)
Preparing to unpack mkvtoolnix_79.0-0~ubuntu2304bunkus01_amd64.deb ...
Unpacking mkvtoolnix (79.0-0~ubuntu2304bunkus01) ...
Setting up mkvtoolnix (79.0-0~ubuntu2304bunkus01) ...
Processing triggers for hicolor-icon-theme (0.17-2) ...
Processing triggers for man-db (2.11.2-1) ...
[0 root@8ea9c2baf151 …/mkvtoolnix] grep -E 'fsync|sync_file_range|fdatasync|syncfs' ~/s.txt | wc -l
136

[0 fc38(64) root@149617e45639 ~…/x86_64] cat /etc/fedora-release
Fedora release 38 (Thirty Eight)
[0 fc38(64) root@149617e45639 …/x86_64] strace -o ~/s.txt rpm -Uhv mkvtoolnix-79.0-1.fedora38.x86_64.rpm
warning: mkvtoolnix-79.0-1.fedora38.x86_64.rpm: Header V4 RSA/SHA512 Signature, key ID 10c052a6: NOKEY
Verifying... ################################# [100%]
Preparing... ################################# [100%]
Updating / installing...
1:mkvtoolnix-79.0-1.fedora38 ################################# [100%]
[0 fc38(64) root@149617e45639 …/x86_64] grep -E 'fsync|sync_file_range|fdatasync|syncfs' ~/s.txt | wc -l
5

The contents of the two packages aren't 100% comparable, but the sync counts (136 vs. 5) aren't even remotely close. Syncing hurts very much on HDDs, that's true.
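
For anyone who wants a per-syscall breakdown rather than a single total, here is a minimal Python sketch. It assumes the plain `strace -o` format, where each line starts with the syscall name (optionally preceded by a PID when `-f` is used). The helper name `count_sync_calls` is made up for illustration. Unlike the grep above, it only matches the syscall name at the start of a line, so a path that merely contains "fsync" is not counted.

```python
import re
import sys
from collections import Counter

# Match a sync-family syscall at the start of a strace line,
# optionally preceded by a PID column (as produced by strace -f).
SYNC_CALL = re.compile(r"^(?:\d+\s+)?(fsync|fdatasync|sync_file_range|syncfs)\(")


def count_sync_calls(lines):
    """Return a Counter mapping syscall name to number of calls seen."""
    counts = Counter()
    for line in lines:
        match = SYNC_CALL.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python3 count_syncs.py ~/s.txt
    with open(sys.argv[1], encoding="utf-8", errors="replace") as trace:
        for name, n in count_sync_calls(trace).most_common():
            print(f"{n:6d} {name}")
```

Traces taken with timestamp options would need the pattern adjusted, since the syscall name is then no longer at the start of the line.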

DNF5 delayed

Posted Aug 21, 2023 17:48 UTC (Mon) by Sesse (subscriber, #53779) [Link] (3 responses)

dpkg syncs way too much compared to what you'd actually need, yes. It's entirely possible to fsync less and still be equally safe, so it's not like more fsyncs == safer.

dpkg is much faster under eatmydata, but still, reading the entire database into RAM (parsing text files line-by-line) is pretty unneeded.

DNF5 delayed

Posted Aug 21, 2023 17:54 UTC (Mon) by mbunkus (subscriber, #87248) [Link] (2 responses)

OK, pure conjecture here on my part, garnished with some experience. If each invocation of dpkg reads the whole database, that should not take a lot of time save for the first time, assuming reading is done with some proper chunking (meaning only do a handful of big read calls, allowing for I/O speed). Unless the parsing algorithm itself is really bad, parsing several MB of data in-memory should be much faster than reading it from storage.

The next invocation should then get the whole database's data from the kernel's caches, shouldn't it? Sure, there are most likely more performant ways to store the data, or ways that would require less data to be read (and written, too), but does the database speed really matter that much compared to the FS syncs?

I'm talking about a system upgrade situation here, not about installing a single package.

Am I completely off base here?

DNF5 delayed

Posted Aug 21, 2023 18:11 UTC (Mon) by Sesse (subscriber, #53779) [Link] (1 responses)

> If each invocation of dpkg reads the whole database, that should not take a lot of time save for the first time, assuming reading is done with some proper chunking (meaning only do a handful of big read calls, allowing for I/O speed).

How can you do a handful of big read calls to read thousands of files? There's one for each package installed.

> The next invocation should then get the whole database's data from the kernel's caches, shouldn't it?

Parsing 600000+ lines of text (example number from my laptop) takes real CPU time, even if the I/O is free or nearly so.
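
To get a rough feel for the CPU-only cost, here is a small Python sketch that parses synthetic file lists held entirely in memory, so no I/O is involved. Everything in it is an assumption for illustration: the data is made up, the helper names `make_synthetic_lists` and `parse_lists` are hypothetical, and this is not dpkg's parser (which is written in C), so the absolute timing says nothing about dpkg itself.

```python
import time


def make_synthetic_lists(n_packages, files_per_package):
    """Build fake per-package file lists: one path per line, one text blob per package."""
    return {
        f"pkg{p}": "".join(
            f"/usr/share/pkg{p}/file{i}\n" for i in range(files_per_package)
        )
        for p in range(n_packages)
    }


def parse_lists(lists):
    """Parse every list line by line into a path -> owning package index."""
    owner = {}
    for package, text in lists.items():
        for line in text.splitlines():
            if line:
                owner[line] = package
    return owner


if __name__ == "__main__":
    # 3000 packages x 200 files = 600000 lines, all already in RAM.
    lists = make_synthetic_lists(3000, 200)
    start = time.perf_counter()
    owner = parse_lists(lists)
    elapsed = time.perf_counter() - start
    print(f"parsed {len(owner)} lines in {elapsed * 1000:.0f} ms")
```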

DNF5 delayed

Posted Aug 21, 2023 18:19 UTC (Mon) by mbunkus (subscriber, #87248) [Link]

> How can you do a handful of big read calls to read thousands of files? There's one for each package installed.

Ooooh, I didn't know that. I thought it only read the files directly in /var/lib/dpkg, not all the .list files, too. Good to know! I agree, that seems like a rather inefficient way to handle the information.
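
To see how many files and lines are involved on a given machine, a sketch like the following counts the .list files and their total line count. It assumes the usual Debian location /var/lib/dpkg/info. The function name `count_list_files` is hypothetical. On a system without that directory it simply reports zero.

```python
from pathlib import Path


def count_list_files(info_dir="/var/lib/dpkg/info"):
    """Return (number of *.list files, total number of lines in them)."""
    n_files = 0
    n_lines = 0
    for path in Path(info_dir).glob("*.list"):
        n_files += 1
        # Read as bytes so unusual file name encodings cannot raise errors.
        with open(path, "rb") as handle:
            n_lines += sum(1 for _ in handle)
    return n_files, n_lines


if __name__ == "__main__":
    files, lines = count_list_files()
    print(f"{files} .list files, {lines} lines in total")
```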


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds