
Relocating Fedora's RPM database

By Jake Edge
January 12, 2022

The deadlines for various kinds of Fedora 36 change proposals have mostly passed at this point, which led to something of a flurry of postings to the distribution's devel mailing list over the last month. One of those, for a seemingly fairly innocuous relocation of the RPM database from /var to /usr, came in right at the buzzer for system-wide changes on December 29. There were, of course, other things going on around that time (holidays, vacations, and so forth), so the discussion was relatively muted until recently. Proponents have a number of reasons why they would like to see the move, but there is resistance as well, due at least in part to the longstanding "tradition" of the location for the database.

To /usr

As is normal for Fedora change proposals, program manager Ben Cotton posted the proposal to the mailing list on behalf of its owners: Chris Murphy, Michel Alexandre Salim, and Neal Gompa. It can also be found on the Fedora wiki. In a nutshell, it would move the database maintained by the RPM package manager from its current location, /var/lib/rpm, to /usr/lib/sysimage/rpm; the former would be replaced with a symbolic link to the latter. The DNF package-management tool would not be switching locations for its database, at least not yet. The RPM developers have mixed feelings about the change, but are not standing in its way.
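The mechanics of "move the directory, leave a symbolic link behind" can be illustrated with a short Python sketch. This is not how the change proposal implements the migration; the function name `relocate_with_symlink()` is invented for illustration, and the demo works inside a throwaway temporary directory rather than on a real system. Only the two paths come from the proposal.

```python
import os
import shutil
import tempfile


def relocate_with_symlink(old_dir: str, new_dir: str) -> None:
    """Move old_dir to new_dir, then leave a symlink at old_dir (hypothetical helper)."""
    if os.path.islink(old_dir):
        return  # already relocated; nothing to do
    os.makedirs(os.path.dirname(new_dir), exist_ok=True)
    shutil.move(old_dir, new_dir)
    # A relative target keeps the link valid if the whole tree is mounted elsewhere.
    target = os.path.relpath(new_dir, os.path.dirname(old_dir))
    os.symlink(target, old_dir)


def demo() -> str:
    """Build a fake root in a temporary directory and relocate a placeholder database."""
    root = tempfile.mkdtemp()
    old = os.path.join(root, "var", "lib", "rpm")
    new = os.path.join(root, "usr", "lib", "sysimage", "rpm")
    os.makedirs(old)
    with open(os.path.join(old, "rpmdb.sqlite"), "w") as f:
        f.write("placeholder")  # stand-in content, not a real database
    relocate_with_symlink(old, new)
    return root
```

After `demo()` runs, tools that still open the old path reach the same files through the link, which is the compatibility property the symbolic link is meant to provide.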

The RPM database tracks which packages have been installed and their metadata. In addition, it tracks all of the files that get installed to the system, their locations, and which package is responsible for them. RPM uses it to clean up packages when they are removed; users can query the database, as well, to answer various questions about their packages and files.
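As a rough mental model only, the bookkeeping described above amounts to a mapping from packages to files plus a reverse index from files to their owning package. The toy class below is a sketch under that assumption; it is not RPM's schema or API, and the package and file names in the usage lines are illustrative.

```python
from typing import Dict, List, Optional, Set


class ToyPackageDB:
    """Toy stand-in for a package database; not RPM's real schema."""

    def __init__(self) -> None:
        self.files_by_package: Dict[str, Set[str]] = {}
        self.owner_by_file: Dict[str, str] = {}

    def install(self, package: str, files: List[str]) -> None:
        """Record a package and the files it owns, refusing ownership conflicts."""
        for path in files:
            current = self.owner_by_file.get(path)
            if current is not None and current != package:
                raise ValueError(f"{path} already owned by {current}")
        self.files_by_package.setdefault(package, set()).update(files)
        for path in files:
            self.owner_by_file[path] = package

    def owner(self, path: str) -> Optional[str]:
        """Answer 'which package is responsible for this file?'"""
        return self.owner_by_file.get(path)

    def remove(self, package: str) -> List[str]:
        """Forget a package and return the files that should be cleaned up."""
        files = self.files_by_package.pop(package)
        for path in files:
            del self.owner_by_file[path]
        return sorted(files)


# Illustrative usage; note that owned files need not live under /usr.
db = ToyPackageDB()
db.install("bash", ["/usr/bin/bash", "/etc/skel/.bashrc"])
print(db.owner("/usr/bin/bash"))
print(db.remove("bash"))
```

The reverse index is what makes both file-ownership queries and removal cheap, and it is also why the database has to stay in sync with whatever directories the packages touch.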

There are several benefits for Fedora listed in the proposal, but the main driving force seems to be support for rolling back failed or undesired updates of the operating system. The RPM database describes what is in /usr for the most part, so having it stored in the same directory hierarchy means that /usr can be rolled back as a unit, without needing to change /var. The switch is also in keeping with what the Fedora variants based on rpm-ostree (CoreOS, IoT, Silverblue, and Kinoite) already do; it is also aligned with another RPM-based distribution ecosystem, the SUSE distributions (SUSE Linux Enterprise, openSUSE, and Tumbleweed).

Much of the early discussion revolved around the Filesystem Hierarchy Standard (FHS), which Fedora packaging explicitly follows—with a few exceptions. Tom Hughes pointed out that the FHS descriptions for /usr ("shareable, read-only data") and for /var ("variable data files") seem somewhat at odds with the proposal. In particular, he said, the /var description "appears to exactly describe the RPM database". Thus the changes are not FHS compliant, nor do they follow the packaging guidelines, he said.

Stephen John Smoogen agreed and suggested that the change needed to be made in the FHS first, then it could be done for the distribution. Fedora project leader Matthew Miller did not think that was likely to happen, since the most recent release (FHS 3.0) was in 2015 "and the whole thing has been effectively dead since". On the other hand, Miller was in favor of reviving the FHS effort, especially with buy-in from other distributions.

But, as Murphy (and others) said, no matter what the FHS says, /usr cannot be more than "mostly read-only" or the system can never be updated:

In practice it is read-only data, except when software is being installed or updated. The FHS is a PITA sometimes, but it's not advocating for systems that can't be updated or changed.

RPM developer Panu Matilainen agreed that the database has generally only been changed during update operations, "but it doesn't mean it will stay that way forever more". There are unimplemented features that might change that situation; beyond that, there are already situations where changes are made completely outside of /usr but that require an update to the RPM database:

There seems to be this strange underlying assumption that all packaged content lives in /usr when that's not at all the case. To install a kernel, or a config-only package (under etc), or 3rd party software putting stuff under /opt, or... you need a writable rpmdb. Ditto for 'rpm --import'.

Vitaly Zaitsev also pointed to the FHS and its "read-only" language for /usr, but as Gordon Messmer said: "If /usr really is read-only, then it probably doesn't matter where the rpmdb is, since packages can't be installed (generally)." Gompa concurred with that, and expanded on the advantages of moving the database out of /var:

The bigger problem is that having the RPM database in /var makes it much harder to correctly implement a boot-to-snapshot scheme for Fedora Linux that reasonably maintains system state properly once /var is carved out. The reason that /var *isn't* carved out by default with our Btrfs configuration is because of the RPM database. Once the RPM database is moved, it becomes possible to split /var into its own subvolume and make it trivial to configure a full boot-to-snapshot system leveraging the technologies we have available to us.

Miller suggested that the "Benefit to Fedora" section of the change proposal add more information about that "pretty compelling benefit". Murphy made that change, but did caution: "There's more hurdles to jump, just so no one thinks snapshots and rollbacks are showing up in Fedora 36."

/state

Things had mostly quieted down in the discussion after the new year, at least until David Cantrell threw a bit of a curve ball on January 9. As with others in the thread, he was uncomfortable with moving the database to /usr, "but we should move it to gain the improvements as noted in the feature proposal". In his lengthy message, he suggested some other, novel ways to look at /usr and /var:

It is generally understood that /usr contains [most of] the installed system. What I think is a bigger requirement or [expectation] now is that one can tar up /usr and transport it to another system or virtual machine or container and expect that it will _probably_ work maybe with a bit of tinkering. This is a really valuable thing to have for developers. Moving the RPM database to this tree adds a component that is unnecessary and sort of out of place.

[...] Reading comments and talking to people, the long standing understanding of /var is still "that's stuff you can rm -rf and the system will still work fine". Technically you could remove the RPM database and the system still function, but arguably would still be broken since you really want the RPM database. This use case of removing the RPM database and still having a functioning system is really only useful for data recovery scenarios. You're ultimately going to reinstall. Probably.

With that in mind, he suggested moving "the RPM database and other variable and stateful data" to a new top-level directory called /state. The FHS does not prevent the addition of new top-level directories, Cantrell said. Michael Catanzaro thought that adding a top-level directory was "a pretty big hammer" and that perhaps it was easier to just support two separate locations for the RPM database.

But rpm-ostree developer Colin Walters took exception to Cantrell's "unnecessary and sort of out of place" characterization. "Multiple independent groups who *actually work* on image based updates and/or client side snapshots all generally agree that the rpmdb should be in /usr." Murphy said that /state does not really solve the problem, it is "just rearranging the chairs". The RPM database holds information about multiple locations, so it needs to stay in sync with the rest of the system, no matter where it lives:

If /usr is to be truly portable and have e.g. 'rpm query, verify, remove, reinstall' work as expected, you need the metadata (the database) representing its state to always come along for the ride. Either the database is already in /usr, or you have to make sure /usr and /state are inseparable.

If /usr and /state are inseparable, and if rpm can also describe anything in /etc or /var or /opt, then all or part of those directories are also inseparable from /state. And thus /usr. So I think /state doesn't help.

Cantrell is not alone in feeling that /usr is a bad location, however; Matilainen said that putting the RPM database there "just *feels so wrong*". There is other data like the RPM database, he said, so having a separate location for it all makes a lot of sense:

For many practical purposes it's probably just rearranging the chairs, but a separate top-level directory describing the *system* state seems instinctively *much* more correct solution to it than stuffing it somewhere deep inside a loosely related fs.

Just FWIW, I would quit my whining about this right there if it went to a new toplevel directory instead because it just *feels* right unlike /usr.

Further pieces

Walters said that problems, such as PGP keys being added via rpm --import, need to be addressed as part of the adoption process:

The TL;DR for me is: I think everyone agrees that moving the rpmdb as it is today to /usr is not 100% a perfect fit. But it's a ~90% fit - almost all the raw data is just headers which are clearly immutable data generated elsewhere. And by making this change, we're basically saying we'll fix the remaining 10% of cases.

[...] But, I hope we can get agreement about something like having `rpm --import` write to `/etc/pki/rpm-gpg` and dropping gpg keys from the rpmdb.

There is, it seems, an effort toward regularizing package installation in various ways with an end goal that is clear to some, but perhaps not to others. Zbigniew Jędrzejewski-Szmek described where he sees things heading:

Traditionally, packages installed all kinds of files all over the place. But we're slowly and painfully moving towards the model where:
  1. packages are only allowed to install under /usr, /var, and /etc. (Or under /opt, but I'd want to move that to /usr/opt…)
  2. packages must support /var/cache being wiped at any time, and most packages support anything under /var being wiped at any time.
  3. systemd and other projects are trying to only use /etc for local admin state, and support "factory reset" by wiping /etc and /var.

Based on that, he said that it makes sense to put the database under /usr somewhere, and the exact location is only a matter of convention, "so obviously we want to follow what opensuse and others are already using". But Matilainen was concerned that there was something of a "hidden agenda" behind the change proposal. He thought the goals should be more clearly stated:

I'm not saying these are necessary bad goals at all, it's just that there's a huge disconnect between reality and the above model on which this change seems based on, and not a single mention about these goals and changes needed to get there. I mean, I totally get that you can't change everything at once, but if there's a plan this big behind something then maybe it should be brought up front, no?

Matilainen said that "nearly all packages put something in /etc", for example, which is something that is being ignored, but Walters said that the handling of /etc is something that is being improved in the image-based update world:

rpm-ostree uses ostree, which introduces /usr/etc which are the pristine default config files. /etc is 3-way merged by ostree. One of the major benefits of this that I really love is `ostree admin config-diff` - at any point we can show you machine-local changes from the default, and it's trivial to reset back to defaults without redownloading a whole RPM.

[...] There's no hidden agenda - the goal is to support image based updates as well as client side snapshots, factory reset, etc. And we're shipping today versions of Fedora that do a lot of this, and we want to continue to improve it.
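The three-way merge of /etc that Walters describes can be sketched in a few lines of Python. This is a deliberately simplified model, not ostree's code or API: trees are plain dictionaries mapping a relative path to file content, the function names are invented, and it assumes that local changes always win over new defaults. The idea is to compute the local changes relative to the old pristine defaults, then replay them on top of the new defaults.

```python
from typing import Dict, Set, Tuple

Tree = Dict[str, str]  # relative path -> file content (simplified model)


def config_diff(defaults: Tree, etc: Tree) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (added, modified, removed) paths in etc relative to the defaults."""
    added = set(etc) - set(defaults)
    removed = set(defaults) - set(etc)
    modified = {p for p in set(etc) & set(defaults) if etc[p] != defaults[p]}
    return added, modified, removed


def merge_etc(old_defaults: Tree, new_defaults: Tree, etc: Tree) -> Tree:
    """Replay machine-local changes on top of the new defaults.

    Assumption for this sketch: local changes always take precedence.
    """
    added, modified, removed = config_diff(old_defaults, etc)
    merged = dict(new_defaults)
    for path in added | modified:
        merged[path] = etc[path]
    for path in removed:
        merged.pop(path, None)
    return merged
```

In this model, resetting to defaults is simply discarding the local tree and copying the pristine one, which is why no package needs to be downloaded again.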

The discussion is still ongoing as of this writing, which likely means that the Fedora Engineering Steering Committee (FESCo) will not decide on the proposal right away. Several members are favorably disposed to it, as can be seen in the FESCo issue entry. Matilainen has said that he is "not going to stand in front of the Change truck", even though he has reservations about the proposal. While it may "feel wrong" to do so, at least for some, it seems that the writing may already be on the wall for this particular change. Whether that larger agenda (hidden or not) is adopted will play out over the next few releases.




Relocating Fedora's RPM database

Posted Jan 12, 2022 22:55 UTC (Wed) by mathstuf (subscriber, #69389) [Link] (5 responses)

Other bits that feel like they'd be at home in `/state` include:

- TeX module databases (though I don't see what this is exactly in the guidelines)
- font databases (fc-cache)
- ldconfig caches
- gconfig schemas
- gobject-introspection bits (maybe?)

Relocating Fedora's RPM database

Posted Jan 13, 2022 0:20 UTC (Thu) by shalem (subscriber, #4062) [Link] (1 responses)

Note that everything you listed is stuff which you update when you update files under /usr. Most of it is basically indexes that make looking things up under /usr faster, so regenerating this together with any changes to /usr and having them in sync with /usr really makes sense.

I'm not really involved in this whole discussion but to me /state feels like a solution which is looking for a problem to solve, rather than the other way around.

Relocating Fedora's RPM database

Posted Jan 13, 2022 6:32 UTC (Thu) by epa (subscriber, #39769) [Link]

Yep, there is a difference between metadata which is updated once when the installed software changes, and pure ‘cache’ which can be regenerated at any time.

Relocating Fedora's RPM database

Posted Jan 13, 2022 1:54 UTC (Thu) by ebassi (subscriber, #54855) [Link] (2 responses)

> - gobject-introspection bits (maybe?)

g-i is machine readable ABI description; it's the equivalent of a header file and the shared library. Definitely not "state", unless you consider libraries as state. You cannot delete the introspection data without breaking applications written in any dynamic language, for instance.

Relocating Fedora's RPM database

Posted Jan 13, 2022 2:43 UTC (Thu) by mathstuf (subscriber, #69389) [Link] (1 responses)

I thought there was a central db of g-i bits. Am I confusing it with something else or was this previous behavior that isn't done anymore?

Relocating Fedora's RPM database

Posted Jan 13, 2022 9:52 UTC (Thu) by smcv (subscriber, #53363) [Link]

I think you're confusing it with something else.

The pattern where there's a compiled summary/cache/more-efficient-form for multiple files (like GSettings schemas glib-2.0/schemas/*.xml being summarized in gschemas.compiled) is useful if at least one of these is true:

* the files being summarized are in a format that is relatively costly to parse, compared with the number of programs that will want to parse it, so updating a summary during package management operations is cheaper than reading the source files from first principles every time (like schema XML)
* there are lots of files and many programs will want to load all of them, so reading the summary requires fewer syscalls and fewer disk seeks than reading the source files (like icons and fonts)
* the name of the correct file to open to get a particular "interface" is not immediately obvious unless you can look it up in the summary (like fonts, which have no obvious relationship between font name and filename)

None of those apply to GObject-Introspection: if a Python or JS program wants to use a GTK 3 UI, then it should already know (hard-coded in its source code) that it wants the Gtk typelib, API version 3.0, so it can load Gtk-3.0.typelib without needing to look it up in a summary. Typelibs are a binary format like a shared library, so the caller doesn't need to parse something like XML.

/state

Posted Jan 13, 2022 0:27 UTC (Thu) by dskoll (subscriber, #1630) [Link]

I don't have skin in this game because I don't use Fedora, but /state just seems terrible. A filesystem is the very definition of state and adding a new top-level directory just muddies the waters.

On the issue of /usr vs /var, I can see the arguments for /usr and I'd probably lean towards that.

Relocating Fedora's RPM database

Posted Jan 13, 2022 3:16 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link] (4 responses)

Can we split the difference and do a /usr/state?

I'm serious, just do it. PGP keys and other mutable stuff can go into /etc.

/usr/state

Posted Jan 13, 2022 3:56 UTC (Thu) by rfunk (subscriber, #4054) [Link]

Some subdirectory of /usr makes the most sense to me too. That way it's under /usr so it can satisfy the needs of snapshots and so on, but anyone who really wants to can symlink it over to /var/state or whatever, or can make it a separate filesystem from /usr.

Relocating Fedora's RPM database

Posted Jan 13, 2022 4:30 UTC (Thu) by NYKevin (subscriber, #129325) [Link] (2 responses)

IMHO /usr/meta makes more logical sense (because it contains metadata about the rest of /usr). "state" is just a different way of saying "data," which conveys no useful information.

Relocating Fedora's RPM database

Posted Jan 13, 2022 9:32 UTC (Thu) by smurf (subscriber, #17840) [Link] (1 responses)

Sure it does; it says that there's constant package-specific data in there that's created dynamically, as opposed to the rest of /usr which is shipped 1:1 in the packages. Thus when I want to back up the state of my /usr, I only need to collect /usr/state – the rest can be restored from my package archive.

Relocating Fedora's RPM database

Posted Jan 14, 2022 17:55 UTC (Fri) by NYKevin (subscriber, #129325) [Link]

> Sure it does; it says that there's constant package-specific data in there that's created dynamically, as opposed to the rest of /usr which is shipped 1:1 in the packages.

That is not what "state" means to me, and frankly, I have never heard that definition of "state" before now.

Relocating Fedora's RPM database

Posted Jan 13, 2022 7:08 UTC (Thu) by lamawithonel (subscriber, #86149) [Link] (20 responses)

I'm not familiar with the schema, but what makes the rpmdb so special that it needs to be in a sqlite or berkdb database? To me it seems the real question is what types of files belong in /var versus /usr, and right now rpmdb is generated and unique per system, because it is a database. But does it have to be a binary database?

Gentoo uses /var/db/pkg for similar information, where each sub-directory is generated by the package it represents. If the rpmdb files were like this, it would be much less jarring to install them in /usr/lib, as most of them could be statically packaged in the rpms. They would be essentially the same as most [all?] other files in /usr. Then all you need-- maybe-- is an index, which could be regenerated, hence stored in /var/cache.

I understand this is a larger change with a lot of tools that would need updates, but I don't see what's so different about this information that it can't be stored like this. Am I missing something?

Relocating Fedora's RPM database

Posted Jan 13, 2022 7:49 UTC (Thu) by LtWorf (subscriber, #124958) [Link] (2 responses)

I guess speed is an important factor.

Relocating Fedora's RPM database

Posted Jan 13, 2022 11:47 UTC (Thu) by taladar (subscriber, #68407) [Link] (1 responses)

If speed is the only issue create an sqlite DB of the raw data as a cache in /var/cache

Relocating Fedora's RPM database

Posted Jan 13, 2022 21:31 UTC (Thu) by gabrbedd (guest, #90145) [Link]

Yes! I came here to say this ^^^.

Relocating Fedora's RPM database

Posted Jan 13, 2022 10:02 UTC (Thu) by nim-nim (subscriber, #34454) [Link] (16 responses)

One clear cons of the /usr proposal is that its proponents seem unable to see beyond the static /usr / wipeable /var divide. They are confusing /var with /var/cache (and maybe /var/spool). Where do you put databases in their system (actual databases that change over time and need saving, not special purpose databases like rpmdb that only change on updates) ?

/var as it exists today is in sore need of restructuring, but returning to the bad old days where everything static is in /usr and everything that does not fit the classification is stuffed in /home to avoid thinking about it is not a solution.

Relocating Fedora's RPM database

Posted Jan 13, 2022 10:31 UTC (Thu) by matthias (subscriber, #94967) [Link] (14 responses)

> They are confusing /var with /var/cache (and maybe /var/spool).

I am not sure whether they do. The idea seems to be to wipe /var (and /etc) as a factory reset. Having a real database in /var is no contradiction. Of course, the data would be lost if wiping /var, but this is actually what I would expect from a factory reset.

But I see your point that there is a clear difference between /var/cache and a database. The data in /var/cache is expected to be recoverable (e.g. by recomputation), while a database holds data that would be lost irrevocably. Thus having a better structure of /var would be nice. Or perhaps splitting /var, one part for user data (e.g. databases) and a second part for intermittent data (/var/cache, /var/spool, /var/log, etc.). The first part is the part that you might want to move to a different system just as you would want to transfer the files under /home, while you usually do not care about the second part if you migrate to a new system.

The rpm database fits into neither of these two categories. It cannot be easily regenerated and it is no user data that you might want to use on a different system.

Relocating Fedora's RPM database

Posted Jan 13, 2022 11:22 UTC (Thu) by nim-nim (subscriber, #34454) [Link] (12 responses)

Basically any system is a mix of :

1. installation files (recreated on install, identical over many systems)
2. caches (transient stuff that exists to optimise performance and can be lost)
3. valuable data (the things that can not be recreated on install and are worth saving)

It's a triptych, not a diptych.

The third category is usually the smallest in size but the most valuable. It is hard to manage so the temptation to pretend it does not exist (or will be managed by someone else, usually the cloud nowadays) is permanent

Putting 3. in /usr can not work unless /usr is redefined as something that does not fit static installs.

3. is also often pre-seeded from static installation data, and it is tempting to conclude that, because pre-seeding makes it “almost” installation data, it can be conflated with installation data, but that does not work in real life (real life test: what happens when the filesystem is wiped out).

Also a lot of apps do not support layered preseeding, preseeded data has to be deployed in the “valuable data” space on install, not in the “static data” space.

A lot of systems do not make the effort to separate 1. 2. and 3. they are in the 90% almost working category.

But 90% almost working is known as never working from the user side.

Relocating Fedora's RPM database

Posted Jan 13, 2022 11:36 UTC (Thu) by nim-nim (subscriber, #34454) [Link] (3 responses)

(For all its faults the FHS is the child of sysadmins that could not handwave away data loss)

Relocating Fedora's RPM database

Posted Jan 13, 2022 16:33 UTC (Thu) by smoogen (subscriber, #97) [Link] (2 responses)

Agreed. I got off the mailing lists and out of the argument because that is the fundamental gulf in understanding. There is computing which has clouds of failover data so that downtime can be limited due to 'unlimited VC money' and there is computing which is done on a typical shoestring budget which will take a long time to recover from if a 'reset' happens. Assuming /var should be wipeable and resetable at anytime works well in the 'we can just spin up more instances and keep cold copies or some other <fill in trend today> term' It doesn't work in 'real' systems administration where they are dealing with budgets set 20 years ago.

Relocating Fedora's RPM database

Posted Jan 13, 2022 17:07 UTC (Thu) by dbnichol (subscriber, #39622) [Link]

With my sysadmin hat on, I usually like to put application data (e.g., a website or repositories) in /srv whereas system program data (e.g., the package manager database) lives in /var. Some programs make this easier than others, but then it's straightforward for me to know where the data I actually care about is should it need to be backed up or migrated.

Relocating Fedora's RPM database

Posted Jan 13, 2022 19:45 UTC (Thu) by walters (subscriber, #7396) [Link]

No one is assuming /var is wipable at any time.

See https://lwn.net/Articles/881260/

Relocating Fedora's RPM database

Posted Jan 13, 2022 11:51 UTC (Thu) by taladar (subscriber, #68407) [Link] (3 responses)

I would actually split your 3 in two categories, configuration and state. Configuration is small and could be identical for a sort of soft-factory-reset, state is almost always the largest amount of data in size on the system and it is what most definitely can't be recovered if deleted, not even if you let someone work 24/7 for the next week or two (where you could probably rewrite the configuration from scratch).

Configuration and State

Posted Jan 13, 2022 18:02 UTC (Thu) by rfunk (subscriber, #4054) [Link] (2 responses)

Configuration files go in /etc. State currently goes in /var.

Configuration and State

Posted Jan 14, 2022 16:48 UTC (Fri) by jccleaver (guest, #127418) [Link] (1 responses)

> Configuration files go in /etc. State currently goes in /var.

There's an argument that updated "permanent state" should go in /etc/, since package install/removal makes changes to things in /etc/ as well. Consider /etc/shells, which may be updated by the installation of a new shell package. Or any of the myriad other caches and records in there.

I'd support a move from /var/lib/rpm to /var/rpm, but I'd sooner suggest /etc/ than a new top-level directory for this. The prohibition on binaries in /etc/ is about executable programs, not binary caches or records -- especially not of things that correlate explicitly with package install/removal.

Configuration and State

Posted Jan 15, 2022 15:33 UTC (Sat) by luto (guest, #39314) [Link]

Having /etc/shells be modified by installation of a shell is IMO a mistake. One could construe shells(5) as a *cache* indicating what shells live in /usr/bin, in which case it should live in /var/cache. Or one could construe it as administrative policy, in which case package scripts have no business touching it, and /etc/shells should be an optional override of /usr/etc/shells.

Relocating Fedora's RPM database

Posted Jan 13, 2022 12:08 UTC (Thu) by matthias (subscriber, #94967) [Link] (1 responses)

> It's a triptych, not a diptych.

Of course. I did not talk about your 1., as I was talking about /var and the installation files should be under /usr. And actually I would add a fourth type: system configuration, i.e., the data usually found under /etc. Of course, configuration data is often valuable, but it is still different in the sense that it is also often system dependent. You cannot simply copy /etc to a different system and expect the things to just work.

> The third category is usually the smallest in size but the most valuable.

This depends on the use case. In many cases it is the largest in size by orders of magnitude. Our family photo collection alone is far bigger than 1.+2.

The valuable data is usually stored in /home, /var, or both, depending on whether it is managed by a user or by a system application (like a database engine). For the data in /home it is quite obvious that one wants to have a backup, but some of the data in /var is equally important. And having a backup of the system configuration under /etc is definitely not the worst thing either.

> A lot of systems do not make the effort to separate 1. 2. and 3. they are in the 90% almost working category.

Actually, the installation files (1.) are mostly separated under /usr. Whether this is a separate volume or just a folder does not really matter for most people. And for those who care this can be easily arranged as a separate volume. The mess is below /var and also below /home. In both places 2. and 3. are mixed up. Under /home there are, e.g., browser caches. Clearly not the most important thing to backup.

And I definitely agree that there should be some effort to separate these things. In this way, I really like much of the systemd development. All the systemd support for containers and stateless systems needs a clean filesystem layout. Fork a new instance by reusing /usr and providing a fresh (empty) /var. And they are also working hard on reducing the necessary configuration under /etc, thus that containers and similar lightweight systems can gather all necessary data from the environment (e.g. network configuration by dhcp) and /etc can be empty on system startup. Let's hope that they do not stop halfway and continue cleaning up. There is still some way to go.

Relocating Fedora's RPM database

Posted Jan 13, 2022 13:44 UTC (Thu) by nim-nim (subscriber, #34454) [Link]

I agree 1. is mostly clean in /usr nowadays; it took decades of integrator work but it is mostly done. There is this awful temptation to declare “mission accomplished, everything else can be wiped out”. Some container people are definitely thinking this way.

The next frontier is sorting out 2. and 3.

Relocating Fedora's RPM database

Posted Jan 13, 2022 13:25 UTC (Thu) by Wol (subscriber, #4433) [Link] (1 responses)

> A lot of systems do not make the effort to separate 1. 2. and 3. they are in the 90% almost working category.

And not that I had anything of importance there, I don't think, but I think I've just lost /var from my old system. And it had things like my personal wiki, I think the mail system stored a load of stuff there, etc etc. /var actually contains a lot of data (like other people said, databases), and its loss is NOT a minor matter.

I've now got into the habit of stashing most stuff in /home somewhere, but it would be nice if there was some way of separating stuff like user databases from other stuff like system databases (mail, printer spools, gentoo's installation stuff, whatever whatever).

Cheers,
Wol

Relocating Fedora's RPM database

Posted Jan 13, 2022 20:38 UTC (Thu) by JanC_ (guest, #34940) [Link]

There is also /srv where you can store things like wikis & databases if they aren’t user-specific or have to be used by multiple users.

Relocating Fedora's RPM database

Posted Jan 21, 2022 16:10 UTC (Fri) by Jonno (subscriber, #49613) [Link]

> But I see your point that there is a clear difference between /var/cache and a database.
> [...]
> Or perhaps splitting /var, one part for user data (e.g. databases) and a second part for intermittent data (/var/cache, /var/spool, /var/log, etc.).

Doesn't /srv already serve that use-case? The FHS says it is for "site-specific data which is served by this system", which I sure read as including user data such as databases (or websites, or maildirs, etc).

For example, my mysql server uses /srv/myqsl as the datadir, and my imap server uses /srv/mail. (My smtp server uses /var/spool/postfix for in-flight messages, but emails destined for local users are delivered to /srv/mail/${USER}/.INBOX/).

Relocating Fedora's RPM database

Posted Jan 13, 2022 14:09 UTC (Thu) by walters (subscriber, #7396) [Link]

No one is saying that /var should be casually wiped. Databases (like postgres, etc.) normally go in /var/lib, not /var/cache.

Or to restate this, around separating /usr and /var - "factory reset" is only one half of the coin. The other half is knowing that OS upgrades won't affect your user data. See for example https://github.com/coreos/rpm-ostree/pull/888 which I am still very proud of =)

boot-to-ENOSPC

Posted Jan 13, 2022 11:16 UTC (Thu) by k3ninho (subscriber, #50375) [Link] (1 responses)

>The reason that /var *isn't* carved out by default with our Btrfs configuration is because of the RPM database. Once the RPM database is moved, it becomes possible to split /var into its own subvolume and make it trivial to configure a full boot-to-snapshot system leveraging the technologies we have available to us.
I like btrfs but I currently assume -- which I hope is out of date -- that btrfs + snapshots + ENOSPC means corruption. Is that still the case? What changed?

K3n.

boot-to-ENOSPC

Posted Jan 15, 2022 15:19 UTC (Sat) by Conan_Kudo (subscriber, #103240) [Link]

That hasn't been the case for many years. The scheme discussed in the Change discussion (and referenced in the article) is the same one openSUSE and SUSE Linux Enterprise have used for many years now.

Relocating Fedora's RPM database

Posted Jan 13, 2022 11:56 UTC (Thu) by taladar (subscriber, #68407) [Link] (1 responses)

> most packages support anything under /var being wiped at any time.

Actually I would be fine with anything under /usr being wiped three times a day over /var being wiped even once because /var is where all our important data lives (between /var/www, /var/lib/mysql and /var/lib/postgresql among others). /usr is the easily restorable distro stuff, /var is the unique data.

Relocating Fedora's RPM database

Posted Jan 13, 2022 13:05 UTC (Thu) by matthias (subscriber, #94967) [Link]

See my discussion with nim-nim above. /var is a mess of intermittent data (caches, logs, spool) and important data (as you mention). From an application perspective it is perfectly ok to wipe both, but the user will moan about the data. The idea is to be able to clone systems by forking /usr and providing an empty /var. Or doing factory resets (i.e., remove all user data) by wiping /var.

Relocating Fedora's RPM database

Posted Jan 13, 2022 12:20 UTC (Thu) by ibukanov (subscriber, #3942) [Link]

/usr is already a state of rpm. As such it makes sense to move the parts in /var to /usr.

Relocating Fedora's RPM database

Posted Jan 15, 2022 4:27 UTC (Sat) by songmaster (subscriber, #1748) [Link]

This probably isn’t going to help with speed, but maybe the rpmdb should be split up, so files that go into /usr get recorded somewhere like /usr/rpmdb (or /usr/.rpmdb perhaps) while files installed in /etc are recorded in /etc/rpmdb, and so on. There’s then no need for a separate /state at all as it gets distributed. Of course now the list of installed packages is fragmented and might have lots of duplication, but maybe all of that common data is really just a cache which could be reloaded from the network, so it ought to live in /var.

Relocating Fedora's RPM database

Posted Jan 20, 2022 15:22 UTC (Thu) by eduperez (guest, #11232) [Link]

Perhaps each RPM package should carry a small metadata file (or perhaps it could be autogenerated during the installation process), and install it under "/usr/lib/sysimage/rpm"; then, the RPM database can remain at "/var/lib/rpm", and be rebuilt from scratch when needed, using the individual metadata files.


Copyright © 2022, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds