Relocating Fedora's RPM database
The deadlines for various kinds of Fedora 36 change proposals have mostly passed at this point, which led to something of a flurry of postings to the distribution's devel mailing list over the last month. One of those, for a seemingly fairly innocuous relocation of the RPM database from /var to /usr, came in right at the buzzer for system-wide changes on December 29. There were, of course, other things going on around that time, holidays, vacations, and so forth, so the discussion was relatively muted until recently. Proponents have a number of reasons why they would like to see the move, but there is resistance, as well, that is due, at least in part, to the longstanding "tradition" of the location for the database.
To /usr
As is normal for Fedora change proposals, program manager Ben Cotton posted the proposal to the mailing list on behalf of its owners: Chris Murphy, Michel Alexandre Salim, and Neal Gompa. It can also be found on the Fedora wiki. In a nutshell, it would move the database maintained by the RPM package manager from its current location, /var/lib/rpm, to /usr/lib/sysimage/rpm; the former would be replaced with a symbolic link to the latter. The DNF package-management tool would not be switching locations for its database, at least yet; the RPM developers have mixed feelings about the change, but are not standing in its way.
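The move-and-symlink arrangement described above can be illustrated with a minimal sketch. It is not Fedora's actual migration code, which the proposal text here does not show; the helper name `relocate_with_symlink`, the use of a relative link target, and the placeholder file are all assumptions made for illustration, and the demonstration runs entirely inside a temporary directory rather than on a real system.

```python
import os
import tempfile
from pathlib import Path


def relocate_with_symlink(old: Path, new: Path) -> None:
    """Move directory `old` to `new`, leaving a symlink at `old`.

    Hypothetical helper: a relative link target is assumed here so the
    link keeps working if the whole tree is mounted somewhere else.
    """
    if old.is_symlink():
        return  # already relocated, nothing to do
    new.parent.mkdir(parents=True, exist_ok=True)
    old.rename(new)  # same-filesystem move
    target = os.path.relpath(new, start=old.parent)
    old.symlink_to(target, target_is_directory=True)


def demo() -> tuple[str, bool]:
    """Run the relocation inside a throwaway root directory."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        old = root / "var/lib/rpm"
        new = root / "usr/lib/sysimage/rpm"
        old.mkdir(parents=True)
        # Placeholder standing in for the real database file.
        (old / "rpmdb.sqlite").write_text("placeholder")
        relocate_with_symlink(old, new)
        # The old path must still resolve to the same content.
        still_readable = (old / "rpmdb.sqlite").read_text() == "placeholder"
        return os.readlink(old), still_readable


if __name__ == "__main__":
    print(demo())
```

Tools that still open the old path keep working through the link, which is the point of leaving it behind.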
The RPM database tracks which packages have been installed and their metadata. In addition, it tracks all of the files that get installed to the system, their locations, and which package is responsible for them. RPM uses it to clean up packages when they are removed; users can query the database, as well, to answer various questions about their packages and files.
There are several benefits for Fedora listed in the proposal, but the main driving force seems to be support for rolling back failed or undesired updates of the operating system. The RPM database describes what is in /usr for the most part, so having it stored in the same directory hierarchy means that /usr can be rolled back as a unit, without needing to change /var. The switch is also in keeping with what the Fedora variants based on rpm-ostree (CoreOS, IoT, Silverblue, and Kinoite) already do; it is also aligned with another RPM-based distribution ecosystem, the SUSE distributions (SUSE Linux Enterprise, openSUSE, and Tumbleweed).
Much of the early discussion revolved around the Filesystem Hierarchy Standard (FHS), which Fedora packaging explicitly follows, with a few exceptions. Tom Hughes pointed out that the FHS descriptions for /usr ("shareable, read-only data") and for /var ("variable data files") seem somewhat at odds with the proposal. In particular, he said, the /var description "appears to exactly describe the RPM database". Thus the changes are not FHS compliant, nor do they follow the packaging guidelines, he said.
Stephen John Smoogen agreed and suggested that the change needed to be made in the FHS first, then it could be done for the distribution. Fedora project leader Matthew Miller did not think that was likely to happen, since the most recent release (FHS 3.0) was in 2015 "and the whole thing has been effectively dead since". On the other hand, Miller was in favor of reviving the FHS effort, especially with buy-in from other distributions.
But, as Murphy (and others) said, no matter what the FHS says, /usr cannot be more than "mostly read-only" or the system can never be updated:
In practice it is read-only data, except when software is being installed or updated. The FHS is a PITA sometimes, but it's not advocating for systems that can't be updated or changed.
RPM developer Panu Matilainen agreed that the database has generally only been changed during update operations, "but it doesn't mean it will stay that way forever more". There are unimplemented features that might change that situation; beyond that, there are already situations where changes are made completely outside of /usr but that require an update to the RPM database:
There seems to be this strange underlying assumption that all packaged content lives in /usr when that's not at all the case. To install a kernel, or a config-only package (under etc), or 3rd party software putting stuff under /opt, or... you need a writable rpmdb. Ditto for 'rpm --import'.
Vitaly Zaitsev also pointed to the FHS and its "read-only" language for /usr, but as Gordon Messmer said: "If /usr really is read-only, then it probably doesn't matter where the rpmdb is, since packages can't be installed (generally)." Gompa concurred with that, and expanded on the advantages of moving the database out of /var:
The bigger problem is that having the RPM database in /var makes it much harder to correctly implement a boot-to-snapshot scheme for Fedora Linux that reasonably maintains system state properly once /var is carved out. The reason that /var *isn't* carved out by default with our Btrfs configuration is because of the RPM database. Once the RPM database is moved, it becomes possible to split /var into its own subvolume and make it trivial to configure a full boot-to-snapshot system leveraging the technologies we have available to us.
Miller suggested that the "Benefit to Fedora" section of the change proposal add more information about that "pretty compelling benefit". Murphy made that change, but did caution: "There's more hurdles to jump, just so no one thinks snapshots and rollbacks are showing up in Fedora 36."
/state
Things had mostly quieted down in the discussion after the new year, at least until David Cantrell threw a bit of a curve ball on January 9. As with others in the thread, he was uncomfortable with moving the database to /usr, "but we should move it to gain the improvements as noted in the feature proposal". In his lengthy message, he suggested some other, novel ways to look at /usr and /var:
It is generally understood that /usr contains [most of] the installed system. What I think is a bigger requirement or [expectation] now is that one can tar up /usr and transport it to another system or virtual machine or container and expect that it will _probably_ work maybe with a bit of tinkering. This is a really valuable thing to have for developers. Moving the RPM database to this tree adds a component that is unnecessary and sort of out of place. [...] Reading comments and talking to people, the long standing understanding of /var is still "that's stuff you can rm -rf and the system will still work fine". Technically you could remove the RPM database and the system still function, but arguably would still be broken since you really want the RPM database. This use case of removing the RPM database and still having a functioning system is really only useful for data recovery scenarios. You're ultimately going to reinstall. Probably.
With that in mind, he suggested moving "the RPM database and other variable and stateful data" to a new top-level directory called /state. The FHS does not prevent the addition of new top-level directories, Cantrell said. Michael Catanzaro thought that adding a top-level directory was "a pretty big hammer" and that perhaps it was easier to just support two separate locations for the RPM database.
But rpm-ostree developer Colin Walters took exception to Cantrell's "unnecessary and sort of out of place" characterization. "Multiple independent groups who *actually work* on image based updates and/or client side snapshots all generally agree that the rpmdb should be in /usr." Murphy said that /state does not really solve the problem, it is "just rearranging the chairs". The RPM database holds information about multiple locations, so it needs to stay in sync with the rest of the system, no matter where it lives:
If /usr is to be truly portable and have e.g. 'rpm query, verify, remove, reinstall' work as expected, you need the metadata (the database) representing its state to always come along for the ride. Either the database is already in /usr, or you have to make sure /usr and /state are inseparable. If /usr and /state are inseparable, and if rpm can also describe anything in /etc or /var or /opt, then all or part of those directories are also inseparable from /state. And thus /usr. So I think /state doesn't help.
Cantrell is not alone in feeling that /usr is a bad location, however; Matilainen said that putting the RPM database there "just *feels so wrong*". There is other data like the RPM database, he said, so having a separate location for it all makes a lot of sense:
For many practical purposes it's probably just rearranging the chairs, but a separate top-level directory describing the *system* state seems instinctively *much* more correct solution to it than stuffing it somewhere deep inside a loosely related fs. Just FWIW, I would quit my whining about this right there if it went to a new toplevel directory instead because it just *feels* right unlike /usr.
Further pieces
Walters said that problems, such as PGP keys being added via rpm --import, need to be addressed as part of the adoption process:
The TL;DR for me is: I think everyone agrees that moving the rpmdb as it is today to /usr is not 100% a perfect fit. But it's a ~90% fit - almost all the raw data is just headers which are clearly immutable data generated elsewhere. And by making this change, we're basically saying we'll fix the remaining 10% of cases. [...] But, I hope we can get agreement about something like having `rpm --import` write to `/etc/pki/rpm-gpg` and dropping gpg keys from the rpmdb.
There is, it seems, an effort toward regularizing package installation in various ways with an end goal that is clear to some, but perhaps not to others. Zbigniew Jędrzejewski-Szmek described where he sees things heading:
Traditionally, packages installed all kinds of files all over the place. But we're slowly and painfully moving towards the model where:
- packages are only allowed to install under /usr, /var, and /etc. (Or under /opt, but I'd want to move that to /usr/opt…)
- packages must support /var/cache being wiped at any time, and most packages support anything under /var being wiped at any time.
- systemd and other projects are trying to only use /etc for local admin state, and support "factory reset" by wiping /etc and /var.
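The first rule in that list lends itself to a simple check. The sketch below is only an illustration of the model Jędrzejewski-Szmek describes, not a tool that Fedora ships; the names `ALLOWED_ROOTS` and `outside_allowed_roots` are hypothetical, and the file list in the example is made up.

```python
from pathlib import PurePosixPath

# Top-level trees that packages may install into under the model
# quoted above (hypothetical constant, for illustration only).
ALLOWED_ROOTS = ("/usr", "/var", "/etc", "/opt")


def outside_allowed_roots(paths, allowed=ALLOWED_ROOTS):
    """Return the paths that do not live under any allowed root."""
    offenders = []
    for raw in paths:
        path = PurePosixPath(raw)
        # is_relative_to() compares whole path components, so
        # "/usrlocal/x" is not mistaken for something under "/usr".
        if not any(path.is_relative_to(root) for root in allowed):
            offenders.append(raw)
    return offenders


if __name__ == "__main__":
    # Made-up file list for a made-up package.
    files = [
        "/usr/bin/example",
        "/etc/example.conf",
        "/boot/example-image",
        "/srv/example/index.html",
    ]
    print(outside_allowed_roots(files))
```

Running such a check over real package file lists would be one way to measure the gap between the model and current practice that Matilainen raises below.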
Based on that, he said that it makes sense to put the database under /usr somewhere, and the exact location is only a matter of convention, "so obviously we want to follow what opensuse and others are already using". But Matilainen was concerned that there was something of a "hidden agenda" behind the change proposal. He thought the goals should be more clearly stated:
I'm not saying these are necessary bad goals at all, it's just that there's a huge disconnect between reality and the above model on which this change seems based on, and not a single mention about these goals and changes needed to get there. I mean, I totally get that you can't change everything at once, but if there's a plan this big behind something then maybe it should be brought up front, no?
Matilainen said that "nearly all packages put something in /etc", for example, which is something that is being ignored, but Walters said that the handling of /etc is something that is being improved in the image-based update world:
rpm-ostree uses ostree, which introduces /usr/etc which are the pristine default config files. /etc is 3-way merged by ostree. One of the major benefits of this that I really love is `ostree admin config-diff` - at any point we can show you machine-local changes from the default, and it's trivial to reset back to defaults without redownloading a whole RPM. [...] There's no hidden agenda - the goal is to support image based updates as well as client side snapshots, factory reset, etc. And we're shipping today versions of Fedora that do a lot of this, and we want to continue to improve it.
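The three-way merge that Walters mentions can be sketched at the level of whole files. What follows is a simplified model and not ostree's implementation: it assumes that each tree is a plain mapping from path to content, that a file the administrator changed, added, or deleted keeps its local state, and that everything else follows the new defaults. The function names `merge_etc` and `config_diff` are hypothetical.

```python
def config_diff(default, local):
    """Report local changes relative to the pristine defaults.

    Both arguments map a file path to its content. Returns three
    sorted lists: (added, modified, removed).
    """
    added = sorted(set(local) - set(default))
    removed = sorted(set(default) - set(local))
    modified = sorted(
        path for path in set(default) & set(local)
        if default[path] != local[path]
    )
    return added, modified, removed


def merge_etc(old_default, local, new_default):
    """File-level three-way merge producing the next /etc.

    Start from the new defaults, then replay every local change that
    was made relative to the old defaults.
    """
    merged = dict(new_default)
    added, modified, removed = config_diff(old_default, local)
    for path in added + modified:
        merged[path] = local[path]   # the local version wins
    for path in removed:
        merged.pop(path, None)       # a local deletion is preserved
    return merged


if __name__ == "__main__":
    old = {"a.conf": "1", "b.conf": "x", "c.conf": "c"}
    cur = {"a.conf": "1", "b.conf": "local", "d.conf": "new"}
    new = {"a.conf": "2", "b.conf": "y", "c.conf": "c2", "e.conf": "e"}
    print(config_diff(old, cur))
    print(merge_etc(old, cur, new))
```

Keeping the pristine defaults alongside the live tree is what makes both operations cheap: the difference from the defaults can be computed at any time, and resetting a file means copying the default back.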
The discussion is still ongoing as of this writing, which likely means that the Fedora Engineering Steering Committee (FESCo) will not decide on the proposal right away. Several members are favorably disposed to it, as can be seen in the FESCo issue entry. Matilainen has said that he is "not going to stand in front of the Change truck", even though he has reservations about the proposal. While it may "feel wrong" to do so, at least for some, it seems that the writing may already be on the wall for this particular change. Whether that larger agenda (hidden or not) is adopted will play out over the next few releases.
Posted Jan 12, 2022 22:55 UTC (Wed)
by mathstuf (subscriber, #69389)
[Link] (5 responses)
- TeX module databases (though I don't see what this is exactly in the guidelines)
Posted Jan 13, 2022 0:20 UTC (Thu)
by shalem (subscriber, #4062)
[Link] (1 responses)
I'm not really involved in this whole discussion, but to me /state feels like a solution looking for a problem to solve, rather than the other way around.
Posted Jan 13, 2022 6:32 UTC (Thu)
by epa (subscriber, #39769)
[Link]
Posted Jan 13, 2022 1:54 UTC (Thu)
by ebassi (subscriber, #54855)
[Link] (2 responses)
g-i is machine readable ABI description; it's the equivalent of a header file and the shared library. Definitely not "state", unless you consider libraries as state. You cannot delete the introspection data without breaking applications written in any dynamic language, for instance.
Posted Jan 13, 2022 2:43 UTC (Thu)
by mathstuf (subscriber, #69389)
[Link] (1 responses)
Posted Jan 13, 2022 9:52 UTC (Thu)
by smcv (subscriber, #53363)
[Link]
The pattern where there's a compiled summary/cache/more-efficient-form for multiple files (like GSettings schemas glib-2.0/schemas/*.xml being summarized in gschemas.compiled) is useful if at least one of these is true:
* the files being summarized are in a format that is relatively costly to parse, compared with the number of programs that will want to parse it, so updating a summary during package management operations is cheaper than reading the source files from first principles every time (like schema XML)
None of those apply to GObject-Introspection: if a Python or JS program wants to use a GTK 3 UI, then it should already know (hard-coded in its source code) that it wants the Gtk typelib, API version 3.0, so it can load Gtk-3.0.typelib without needing to look it up in a summary. Typelibs are a binary format like a shared library, so the caller doesn't need to parse something like XML.
Posted Jan 13, 2022 0:27 UTC (Thu)
by dskoll (subscriber, #1630)
[Link]
I don't have skin in this game because I don't use Fedora, but /state just seems terrible. A filesystem is the very definition of state and adding a new top-level directory just muddies the waters.
On the issue of /usr vs /var, I can see the arguments for /usr and I'd probably lean towards that.
Posted Jan 13, 2022 3:16 UTC (Thu)
by Cyberax (✭ supporter ✭, #52523)
[Link] (4 responses)
I'm serious, just do it. PGP keys and other mutable stuff can go into /etc.
Posted Jan 13, 2022 3:56 UTC (Thu)
by rfunk (subscriber, #4054)
[Link]
Posted Jan 13, 2022 4:30 UTC (Thu)
by NYKevin (subscriber, #129325)
[Link] (2 responses)
Posted Jan 13, 2022 9:32 UTC (Thu)
by smurf (subscriber, #17840)
[Link] (1 responses)
Posted Jan 14, 2022 17:55 UTC (Fri)
by NYKevin (subscriber, #129325)
[Link]
That is not what "state" means to me, and frankly, I have never heard that definition of "state" before now.
Posted Jan 13, 2022 7:08 UTC (Thu)
by lamawithonel (subscriber, #86149)
[Link] (20 responses)
Gentoo uses /var/db/pkg for similar information, where each sub-directory is generated by the package it represents. If the rpmdb files were like this, it would be much less jarring to install them in /usr/lib, as most of them could be statically packaged in the rpms. They would be essentially the same as most [all?] other files in /usr. Then all you need-- maybe-- is an index, which could be regenerated, hence stored in /var/cache.
I understand this is a larger change with a lot of tools that would need updates, but I don't see what's so different about this information that it can't be stored like this. Am I missing something?
Posted Jan 13, 2022 7:49 UTC (Thu)
by LtWorf (subscriber, #124958)
[Link] (2 responses)
Posted Jan 13, 2022 11:47 UTC (Thu)
by taladar (subscriber, #68407)
[Link] (1 responses)
Posted Jan 13, 2022 21:31 UTC (Thu)
by gabrbedd (guest, #90145)
[Link]
Posted Jan 13, 2022 10:02 UTC (Thu)
by nim-nim (subscriber, #34454)
[Link] (16 responses)
/var as it exists today is in sore need of restructuring, but returning to the bad old days where everything static is in /usr and everything that does not fit the classification is stuffed in /home to avoid thinking about it is not a solution.
Posted Jan 13, 2022 10:31 UTC (Thu)
by matthias (subscriber, #94967)
[Link] (14 responses)
I am not sure whether they do. The idea seems to be to wipe /var (and /etc) as a factory reset. Having a real database in /var is no contradiction. Of course, the data would be lost if wiping /var, but this is actually what I would expect from a factory reset.
But I see your point that there is a clear difference between /var/cache and a database. The data in /var/cache is expected to be recoverable (e.g. by recomputation), while a database holds data that would be lost irrevocably. Thus having a better structure of /var would be nice. Or perhaps splitting /var, one part for user data (e.g. databases) and a second part for intermittent data (/var/cache, /var/spool, /var/log, etc.). The first part is the part that you might want to move to a different system just as you would want to transfer the files under /home, while you usually do not care about the second part if you migrate to a new system.
The rpm database fits into neither of these two categories. It cannot be easily regenerated and it is no user data that you might want to use on a different system.
Posted Jan 13, 2022 11:22 UTC (Thu)
by nim-nim (subscriber, #34454)
[Link] (12 responses)
1. installation files (recreated on install, identical over many systems)
It's a triptych, not a diptych.
The third category is usually the smallest in size but the most valuable. It is hard to manage, so the temptation to pretend it does not exist (or will be managed by someone else, usually the cloud nowadays) is permanent.
Putting 3. in /usr cannot work unless /usr is redefined as something that does not fit static installs.
3. is also often pre-seeded from static installation data, and it is tempting to conclude that because pre-seeding makes it "almost" installation data, it can be conflated with installation data. That does not work in real life (real-life test: what happens when the filesystem is wiped out?).
Also, a lot of apps do not support layered preseeding; preseeded data has to be deployed in the "valuable data" space on install, not in the "static data" space.
A lot of systems do not make the effort to separate 1., 2., and 3.; they are in the "90% almost working" category.
But "90% almost working" is known as "never working" from the user side.
Posted Jan 13, 2022 11:36 UTC (Thu)
by nim-nim (subscriber, #34454)
[Link] (3 responses)
Posted Jan 13, 2022 16:33 UTC (Thu)
by smoogen (subscriber, #97)
[Link] (2 responses)
Posted Jan 13, 2022 17:07 UTC (Thu)
by dbnichol (subscriber, #39622)
[Link]
Posted Jan 13, 2022 19:45 UTC (Thu)
by walters (subscriber, #7396)
[Link]
Posted Jan 13, 2022 11:51 UTC (Thu)
by taladar (subscriber, #68407)
[Link] (3 responses)
Posted Jan 13, 2022 18:02 UTC (Thu)
by rfunk (subscriber, #4054)
[Link] (2 responses)
Posted Jan 14, 2022 16:48 UTC (Fri)
by jccleaver (guest, #127418)
[Link] (1 responses)
There's an argument that updated "permanent state" should go in /etc/, since package install/removal makes changes to things in /etc/ as well. Consider /etc/shells, which may be updated by the installation of a new shell package. Or any of the myriad other caches and records in there.
I'd support a move from /var/lib/rpm to /var/rpm, but I'd sooner suggest /etc/ than a new top-level directory for this. The prohibition on binaries in /etc/ is about executable programs, not binary caches or records -- especially not of things that correlate explicitly with package install/removal.
Posted Jan 15, 2022 15:33 UTC (Sat)
by luto (guest, #39314)
[Link]
Posted Jan 13, 2022 12:08 UTC (Thu)
by matthias (subscriber, #94967)
[Link] (1 responses)
Of course. I did not talk about your 1., as I was talking about /var and the installation files should be under /usr. And actually I would add a fourth type: system configuration, i.e., the data usually found under /etc. Of course, configuration data is often valuable, but it is still different in the sense that it is also often system dependent. You cannot simply copy /etc to a different system and expect things to just work.
> The third category is usually the smallest in size but the most valuable.
This depends on the use case. In many cases it is the largest in size by orders of magnitude. Our family photo collection alone is far bigger than 1.+2.
The valuable data is usually stored in /home, /var, or both, depending on whether it is managed by a user or by a system application (like a database engine). For the data in /home it is quite obvious that one wants to have a backup, but some of the data in /var is equally important. And having a backup of the system configuration under /etc is definitely not the worst thing either.
> A lot of systems do not make the effort to separate 1. 2. and 3. they are in the 90% almost working category.
Actually, the installation files (1.) are mostly separated under /usr. Whether this is a separate volume or just a folder does not really matter for most people. And for those who care this can be easily arranged as a separate volume. The mess is below /var and also below /home. In both places 2. and 3. are mixed up. Under /home there are, e.g., browser caches. Clearly not the most important thing to backup.
And I definitely agree that there should be some effort to separate these things. In this way, I really like much of the systemd development. All the systemd support for containers and stateless systems needs a clean filesystem layout. Fork a new instance by reusing /usr and providing a fresh (empty) /var. And they are also working hard on reducing the necessary configuration under /etc, so that containers and similar lightweight systems can gather all necessary data from the environment (e.g. network configuration by dhcp) and /etc can be empty on system startup. Let's hope that they do not stop halfway and continue cleaning up. There is still some way to go.
Posted Jan 13, 2022 13:44 UTC (Thu)
by nim-nim (subscriber, #34454)
[Link]
The next frontier is sorting out 2. and 3.
Posted Jan 13, 2022 13:25 UTC (Thu)
by Wol (subscriber, #4433)
[Link] (1 responses)
And not that I had anything of importance there, I don't think, but I think I've just lost /var from my old system. And it had things like my personal wiki, I think the mail system stored a load of stuff there, etc etc. /var actually contains a lot of data (like other people said, databases), and its loss is NOT a minor matter.
I've now got into the habit of stashing most stuff in /home somewhere, but it would be nice if there was some way of separating stuff like user databases from other stuff like system databases (mail, printer spools, gentoo's installation stuff, whatever whatever).
Cheers,
Posted Jan 13, 2022 20:38 UTC (Thu)
by JanC_ (guest, #34940)
[Link]
Posted Jan 21, 2022 16:10 UTC (Fri)
by Jonno (subscriber, #49613)
[Link]
Doesn't /srv already serve that use case? The FHS says it is for "site-specific data which is served by this system", which I sure read as including user data such as databases (or websites, or maildirs, etc).
For example, my mysql server uses /srv/myqsl as the datadir, and my imap server uses /srv/mail. (My smtp server uses /var/spool/postfix for in-flight messages, but emails destined for local users are delivered to /srv/mail/${USER}/.INBOX/).
Posted Jan 13, 2022 14:09 UTC (Thu)
by walters (subscriber, #7396)
[Link]
Or to restate this, around separating /usr and /var - "factory reset" is only one half of the coin. The other half is knowing that OS upgrades won't affect your user data. See for example https://github.com/coreos/rpm-ostree/pull/888 which I am still very proud of =)
Posted Jan 13, 2022 11:16 UTC (Thu)
by k3ninho (subscriber, #50375)
[Link] (1 responses)
K3n.
Posted Jan 15, 2022 15:19 UTC (Sat)
by Conan_Kudo (subscriber, #103240)
[Link]
Posted Jan 13, 2022 11:56 UTC (Thu)
by taladar (subscriber, #68407)
[Link] (1 responses)
Actually I would be fine with anything under /usr being wiped three times a day over /var being wiped even once because /var is where all our important data lives (between /var/www, /var/lib/mysql and /var/lib/postgresql among others). /usr is the easily restorable distro stuff, /var is the unique data.
Posted Jan 13, 2022 13:05 UTC (Thu)
by matthias (subscriber, #94967)
[Link]
Posted Jan 13, 2022 12:20 UTC (Thu)
by ibukanov (subscriber, #3942)
[Link]
Posted Jan 15, 2022 4:27 UTC (Sat)
by songmaster (subscriber, #1748)
[Link]
Posted Jan 20, 2022 15:22 UTC (Thu)
by eduperez (guest, #11232)
[Link]
- font databases (fc-cache)
- ldconfig caches
- gconfig schemas
- gobject-introspection bits (maybe?)
* there are lots of files and many programs will want to load all of them, so reading the summary requires fewer syscalls and fewer disk seeks than reading the source files (like icons and fonts)
* the name of the correct file to open to get a particular "interface" is not immediately obvious unless you can look it up in the summary (like fonts, which have no obvious relationship between font name and filename)
/state
/usr/state
2. caches (transient stuff that exists to optimise performance and can be lost)
3. valuable data (the things that can not be recreated on install and are worth saving)
Configuration and State
Wol
> [...]
> Or perhaps splitting /var, one part for user data (e.g. databases) and a second part for intermittent data (/var/cache, /var/spool, /var/log, etc.).
boot-to-ENOSPC
I like btrfs but I currently assume -- which I hope is out of date -- that btrfs + snapshots + ENOSPC means corruption. Is that still the case? What changed?
That hasn't been the case for many years. The scheme discussed in the Change discussion (and referenced in the article) is the same one openSUSE and SUSE Linux Enterprise have used for many years now.