Improving .deb
Debian Linux and its family of derivatives (such as Ubuntu) are partly characterized by their use of .deb as the packaging format. Packages in this format are produced not only by the distributions themselves, but also by independent software vendors. The last major change of the format internals happened back in 1995. However, a discussion of possible changes has been brought up recently on the debian-devel mailing list by Adam Borowski.
As documented in the deb(5) manual page, modern Debian packages are ar archives containing three members in a particular order. The first file is named debian-binary and has the format version number, currently "2.0", as one line of text. The second archive member is control.tar.xz, containing the package metadata files and scripts that are executed before and after package installation or removal. Then comes the data.tar.xz file, the archive with the actual files installed by the package. For both the control and data archives, gzip, not xz, was used for compression historically and is still a valid option. The Debian tool for dealing with package files, dpkg, has gained support for other decompressors over time. At present, xz is the most popular one both for Debian and Ubuntu.
The choice to use ar as the outer archive format might seem strange. After all, the only other modern application of this format is for static libraries (they are ar archives with object-code files inside), and the de-facto standard for archives in the Unix world is tar, not ar. The reason for this historical decision is, according to Ian Jackson, that "handwriting a decoder for ar was much simpler than for tar".
Before 1995, a different format, not based on ar, was used for Debian packages. It was, instead, a concatenation of two ASCII lines (format version and the length of the metadata archive) and two gzip compressed tar archives, one with metadata, similar to the modern control.tar.gz, and one with files, just like data.tar.gz. Even though old-format packages are not in active use now, modern dpkg can still create and install them.
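Based purely on the description above (and hedged accordingly, since the pre-1995 format is not specified here in full detail), a reader for such an old-format package would look roughly like this:

    import io
    import tarfile

    def read_old_format_deb(path):
        """Sketch of a reader for the pre-1995 format as described above:
        two ASCII lines, then two concatenated gzip-compressed tar archives."""
        with open(path, "rb") as f:
            version = f.readline().strip().decode("ascii")  # format version line
            control_length = int(f.readline())              # length of the control archive
            control_gz = f.read(control_length)
            data_gz = f.read()                              # everything that is left
        with tarfile.open(fileobj=io.BytesIO(control_gz), mode="r:gz") as control:
            control_names = control.getnames()
        with tarfile.open(fileobj=io.BytesIO(data_gz), mode="r:gz") as data:
            data_names = data.getnames()
        return version, control_names, data_names

Note that there is no fixed-width size field anywhere in this layout, which is why the old format does not share the size ceiling discussed below.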
What prompted Borowski to start a discussion about changing the internals of the package format was a handful of possible improvements that could easily be implemented. For example, he suggested that, while the xz compressor yields the smallest package size, switching to zstd for compression would improve the unpacking time by a factor of eight while still beating the venerable gzip in terms of compression ratio.
To be fair, this is not the first time developers have proposed zstd compression support for inclusion into Debian's dpkg. Also, Ubuntu 18.04 ships with zstd support already enabled in its version of dpkg.
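Such decompression-speed comparisons are easy to reproduce in rough form. The sketch below is only an illustration: it assumes two hypothetical input files, data.tar.xz and data.tar.zst, containing the same payload, and it relies on Python's standard lzma module plus the third-party zstandard module (not part of the standard library):

    import lzma
    import time
    import zstandard   # third-party binding to libzstd, e.g. "pip install zstandard"

    def timed(label, read_all):
        start = time.perf_counter()
        data = read_all()
        print(f"{label}: {len(data)} bytes in {time.perf_counter() - start:.2f}s")

    def read_xz():
        with lzma.open("data.tar.xz", "rb") as f:        # hypothetical input file
            return f.read()

    def read_zstd():
        with open("data.tar.zst", "rb") as raw:           # hypothetical input file
            with zstandard.ZstdDecompressor().stream_reader(raw) as f:
                return f.read()

    timed("xz", read_xz)
    timed("zstd", read_zstd)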
Beyond recommending support for a new compressor, Borowski suggested returning to the old format. The reason was that ar archives (and thus modern deb packages) store the size of their members as a string of no more than ten decimal digits. It means that data.tar.xz can be at most 9,999,999,999 bytes long, or approximately 9.3GiB. While there are no packages of this size in the Debian archive (the largest package is flightgear-data-base, taking "only" 1,178,833,172 bytes), this limitation is indeed a problem for some communities producing unofficial packages, as confirmed by Sam Hartman. The old format does not have a fixed-size length field and thus does not have such a limitation. In addition, in the benchmarks performed by Borowski, even in the apples-to-apples comparison using the gzip compressor for both format versions, the old format was slightly faster to decompress.
Jackson, as the developer who introduced the currently used format, responded that Borowski's suggestion is "an interesting proposal". He acknowledged that the size limitation is indeed a problem and explained the rationale behind the current format. Namely, the old format was not easy to extract without dpkg (e.g. on non-Debian systems) and was not easily extensible. A short discussion thereafter confirmed that people do routinely extract .deb files on "foreign" Linux distributions by hand and perceive this ability as an important property of the .deb package format. Extensibility, on the other hand, in practice amounted to the addition of new decompressors and new fields in files that are in the control tarball. All of that could be done with the old format just as well.
However, switching away from the current "ar with tar files inside" format does not necessarily mean returning to the old format. And that's exactly the objection raised by Ansgar Burchardt. He mentioned the use case of extracting only a few data files (such as the Debian changelog, or a pristine copy of the configuration files), which is currently slow. This operation is slow not only because of a slow decompressor, but also because, in order to get to a file in the middle of a compressed tar archive, one has to decompress and discard everything before it. In other words, fixing this slowness would require switching away from a "compressed tar" format for the data archive to something that supports random access. According to Burchardt, if the Debian project were to introduce one incompatible change to the package format anyway, it would also be a chance to move away from tar, or to tack on other improvements that require incompatible changes. Jackson, however, expressed disagreement with the idea of bundling several incompatible changes together.
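To make Burchardt's single-file use case concrete: with a stream-compressed data.tar.xz, even pulling out one changelog means decompressing every byte that precedes it. A small illustration with Python's tarfile module (the member path here is purely hypothetical):

    import tarfile

    # Extracting a single member still decompresses everything stored before it,
    # because xz (like gzip) is applied to the tar stream as a whole.
    with tarfile.open("data.tar.xz", mode="r:xz") as tar:
        member = tar.getmember("./usr/share/doc/foo/changelog.Debian.gz")  # hypothetical path
        with tar.extractfile(member) as f:
            changelog = f.read()
    print(len(changelog), "bytes extracted")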
Borowski measured the overhead of switching to a seekable archive format by compressing each file in the /usr directory and the Linux kernel source individually and comparing the total size of the compressed files with the size of a traditional compressed tar.xz archive. As it turns out, individually compressed files, which are needed for a seekable archive, took 1.8x more space, thus making the proposal too expensive. Burchardt suggested retesting with the 7z archiver, because it can do something in between compressing files individually and compressing the whole archive. Namely, to get a file from the middle of the archive, one needs to decompress everything not from the very beginning, but only from the beginning of a so-called "solid block" containing the file in question. The "solid block" size is tunable. Still, even with 16MiB solid blocks, according to Borowski's measurement, "the space loss is massive" (1.2x). This experiment convinced Burchardt that switching to a format that allows random access is just not worth it.
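Borowski's experiment is easy to approximate at a small scale. The sketch below is a rough reimplementation of the idea rather than his actual script: it xz-compresses every regular file under a directory individually, then compresses a single tar stream of the same tree, and prints the ratio between the two totals:

    import io
    import lzma
    import os
    import sys
    import tarfile

    def xz_size(data):
        return len(lzma.compress(data, preset=6))

    def compare(root):
        """Total size of individually compressed files vs. one compressed tar."""
        per_file_total = 0
        tar_buffer = io.BytesIO()        # kept in memory: fine for modest trees
        with tarfile.open(fileobj=tar_buffer, mode="w") as tar:
            for dirpath, _, filenames in os.walk(root):
                for name in filenames:
                    path = os.path.join(dirpath, name)
                    if os.path.islink(path) or not os.path.isfile(path):
                        continue
                    with open(path, "rb") as f:
                        per_file_total += xz_size(f.read())   # one xz stream per file
                    tar.add(path)                             # same file into one big tar
        whole_archive = xz_size(tar_buffer.getvalue())        # one xz stream for everything
        print(f"per-file: {per_file_total}  single archive: {whole_archive}  "
              f"ratio: {per_file_total / whole_archive:.2f}x")

    if __name__ == "__main__":
        compare(sys.argv[1])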
An idea of replacing ar with uncompressed tar as the outer archive format has also been proposed. This would eliminate the package-size limitation, while keeping the advantage that Debian packages can be examined and unpacked by low-level shell tools. This is actually the same as the opkg format used by embedded Linux distributions.
Guillem Jover (the maintainer of dpkg) acknowledged the problems with both old and current .deb package formats and, after examining possible alternatives, concluded that the proposal to switch the outer archive format to tar is "the most straightforward and simple of the options". He promised to present a diff to the .deb format documentation and to start adding support in dpkg version 1.20.x. However, Borowski objected to any "archive in archive" format design and especially did not like uncompressed tar as the outer archive, because it wastes bytes on so-called "blocks" that are only relevant for tape drives. Also, optional features of the tar archive format, such as sparse file support, would unnecessarily complicate the implementation.

Jackson suggested that it is possible to support only a strict subset of the tar format, without the problematic features. He noted that it is already the case for the usage of ar as the outer archive format, "to the point that it is awkward to *create* a legal .deb with a normal ar utility". He also brought up his old idea on how to deal with the data.tar.xz size limit: just split it into multiple files and store them in the ar archive as extra members. This proposal has the advantage that it is still compatible with third-party tools and amounts to absolutely no change if the existing package size limit is not hit.
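Jackson's splitting idea is mechanically simple; the sketch below only illustrates the principle, and the chunk member names are made up for the example (no such naming scheme has been standardized):

    AR_SIZE_LIMIT = 9_999_999_999   # largest value the ten-digit ar size field can hold

    def split_into_ar_members(path, limit=AR_SIZE_LIMIT):
        """Yield (member_name, chunk) pairs, each small enough for an ar header."""
        with open(path, "rb") as f:
            index = 0
            while True:
                chunk = f.read(limit)   # note: this sketch reads each chunk into memory
                if not chunk:
                    break
                # Hypothetical naming; only the first chunk keeps the usual name.
                name = "data.tar.xz" if index == 0 else f"data{index}.tar.xz"
                yield name, chunk
                index += 1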
At this point, the discussion had accumulated quite a large number of conflicting proposals and opinions. Because the issue proved so contentious, Jover retracted his promise to work on changing the format documentation, and the thread died off without any conclusions or action items. Still, at this time no official Debian packages come close to the limitations of the current .deb format, so no urgent action is needed. And, if someone needs to unofficially package something really big, they can do it right now, thanks to Borowski's reminder that the old format, which has no such size limit, is still supported.
Index entries for this article:
GuestArticles: Patrakov, Alexander E.
Posted May 28, 2019 18:24 UTC (Tue)
by smoogen (subscriber, #97)
[Link] (11 responses)

* ogg. It is a container format which could be used and has a lot of code for it in many different modes. This would allow for the greater portability that Debian strives for.
* zip. Also crossplatform and also default download type from say git(hub/lab) where you could download the deb directly. Make some changes and a zip2exe which installs Debian on a windows box could be possible.
* cpio. no no there are bridges too far.
Posted May 28, 2019 19:03 UTC (Tue)
by mbunkus (subscriber, #87248)
[Link] (9 responses)
But the worst of all is its horrible overhead. The whole stream is split into Ogg packets (one audio or video frame in one packet) — so far so good. But Ogg packets are again spread across one or more Ogg pages, adding more overhead for no real benefit (the reason is to have points to sync to in streaming scenarios as each Ogg page starts with a well known byte sequence, but that's completely irrelevant for file storage).
The most ridiculous thing of it all is how the size of Ogg packets is encoded. The algorithm is:

size = 0
do {
  byte = read_next_byte()
  size += byte
} while (size == 255)
So in order to encode the size of an arbitrary file, let's say /bin/bash at 1,166,912 bytes, the size will be encoded in 4,577 bytes. Encoding the size of the example from the article, 1,178,833,172 bytes, would require a whopping 4.5 MB.
Apart from that Ogg has no provisions for storing things a file container might need (ownership, permissions, ACLs, extended attributes, whatever else) while containing stuff you don't need for file storage (granulepos values used for inter-stream time synchronization, serial numbers). You'd need to invent all kinds of proprietary meta data formats. I'm certain all of the existing code handling Ogg files has been written with handling audio/video streams in mind, meaning there's zero cross-platform/cross-application gain.
At that point you're basically inventing a whole new container format.
There are tons of container formats out there tailored to file storage. You've already mentioned two of them. Here are a couple of others: tar, 7z. Neither of them is perfect, to be sure, but at least some of them (e.g. tar with certain extensions as created by e.g. the BSD's tar command) can deal with ACLs and extended attributes; several either offer different compression algorithms (7z, zip) or are completely agnostic to them (cpio, tar) and are therefore extensible. They're much, much better suited to file storage than Ogg is.
OK this has gotten much too long and ranty. I always get somewhat frustrated and emotional when talking about Ogg. I hope you didn't take any of this personal; it certainly wasn't meant that way.
Posted May 28, 2019 19:27 UTC (Tue)
by ncm (guest, #165)
[Link] (1 responses)
Posted May 28, 2019 19:53 UTC (Tue)
by mbunkus (subscriber, #87248)
[Link]
The fundamental difference between A/V containers & file containers is how multiple streams/tracks/files are laid out. In a file container all files are laid out one after the other. Accessing the content of one file is ideally as simple as seeking to its start position and doing one long read operation.
In an A/V container, on the other hand, you place those parts of each stream/track close together that need to be played together. All of the data is tightly interleaved by their timestamps. This is done in order not to have to seek forward and backward all the time, which is especially atrocious for transports with high latency (e.g. optical discs or online streaming). In the Good Old Days™ there were a lot of AVI files (and I even have a couple of MP4 files) where track content was laid out like in a file container (first all the video data, then all the audio data), and playing such a file from a CD-ROM was nigh impossible.
Posted May 28, 2019 22:19 UTC (Tue)
by lmartelli (subscriber, #11755)
[Link] (5 responses)
Posted May 29, 2019 6:58 UTC (Wed)
by mbunkus (subscriber, #87248)
[Link] (3 responses)
Posted May 29, 2019 12:57 UTC (Wed)
by jezuch (subscriber, #52988)
[Link] (2 responses)
Did you mean: while (byte == 255); ?
(Oh the perils of commenting on a forum full of programmers ;) )
Posted May 29, 2019 13:04 UTC (Wed)
by mbunkus (subscriber, #87248)
[Link] (1 responses)
Posted May 29, 2019 14:49 UTC (Wed)
by gevaerts (subscriber, #21521)
[Link]
Posted May 29, 2019 7:47 UTC (Wed)
by weberm (guest, #131630)
[Link]
Posted May 30, 2019 18:44 UTC (Thu)
by smoogen (subscriber, #97)
[Link]
Posted May 28, 2019 18:48 UTC (Tue)
by logang (subscriber, #127618)
[Link] (10 responses)
I don't expect I'd want to install any application whose compressed size is greater than 9GiB. What would be in such a monstrosity? It'd be enough to fit an entire typical debian desktop install in a single package, and then some.
Posted May 28, 2019 19:15 UTC (Tue)
by excors (subscriber, #95769)
[Link]
Posted May 28, 2019 19:19 UTC (Tue)
by pizza (subscriber, #46)
[Link] (4 responses)
Believe me, nobody "wants" to install Vivado. :-)
Posted May 28, 2019 19:30 UTC (Tue)
by logang (subscriber, #127618)
[Link] (3 responses)
I haven't touched Vivado in a long time but the Xilinx tools used to contain an entire Java runtime, Qt, perl, etc, etc. (I hope you don't care about security updates on all this code.) The real problem is the proprietary software bundles that don't use shared libraries and need to have the kitchen sink included.
Also, speaking to Vivado specifically, I think it would also be very sensible to split it up into a main deb plus one deb per FPGA family (Spartan, Virtex, Artix, Kintex, etc). That way users can choose what they want and they don't necessarily need to waste so much disk space.
So back to my main point, having the file size limit can force developers to do sensible things like solve the problems above.
Posted May 29, 2019 17:41 UTC (Wed)
by thoughtpolice (subscriber, #87455)
[Link]
If that was the case and how it worked in reality, they would have already fixed it. But they haven't: rather than getting a .deb package that can easily be installed like any others, they ship tarball installers with horrid self-extraction programs. In fact they probably wouldn't do it anyway even with that problem fixed, because they want one binary blob they ship to every platform, hence why they vendor everything under the sun. They have no interest in supporting multiple package formats. You're making a categorical error in thinking their goals (ship highly expensive, niche, proprietary software to users in controlled environments with support contracts) are the same as yours ("nice" Linux system integration). In fact the only people who suffer under this setup are the people who *do* want to be "nice" about Linux system integration, and ship .deb files of large programs/packages (for reasons that may not entirely be under their control.) Not Xilinx. And even if it wasn't a moot point, single device families (UltraScale+ IIRC) consume more space than is already allowed by a single .deb anyway, so there you go.
If you think a company like Xilinx that makes billions of dollars a year is going to change all of this because Debian has an artificial technical restriction on the size of .deb files: they won't, you're just not that important. They do not care about what Linux distro developers/users want or think is "ideal", and they have far more money and time than you do, so they can make it work. They have more money to burn than you have time to implement artificial technical restrictions to try and "force them" to behave. I don't know why people persist in believing this kind of trivial, easily-countered approach works: the entire premise relies on the faulty assumption that the dynamics of power are in your favor. They are not.
Posted May 30, 2019 6:00 UTC (Thu)
by Tov (subscriber, #61080)
[Link]
That is exactly what we have appimage/snap/flatpak for!
Posted May 30, 2019 15:41 UTC (Thu)
by imMute (guest, #96323)
[Link]
As a pro-distribution guy, I like the benefits that package maintainers bring. On the other hand, I love that the Xilinx toolchain is entirely self contained: no need to mess around with figuring out which dependencies are needed (because god help them if they actually document those things). I totally see why Flatpak (et al.) are rapidly gaining popularity.
Posted May 28, 2019 20:12 UTC (Tue)
by Cyberax (✭ supporter ✭, #52523)
[Link]
Posted May 28, 2019 21:47 UTC (Tue)
by flussence (guest, #85566)
[Link] (1 responses)
tar's default is a null-terminated string of 11 octal digits (8GiB limit!), some implementations store 12 non-null digits (64GiB), the GNU-proprietary extended format sets the MSB bit to 1 and uses the other 95 bits as a binary-encoded number (32ZiB), and PAX instead defines a size xattr that accepts arbitrary-precision integers. None of the large size formats have universal support.
That's just one facet of it. It's an awful format in modern times, kept alive entirely through inertia. The same could be said about a lot of container/compression formats.
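As an illustration of two of those size encodings (a sketch based on the descriptions above, not on any particular tar implementation):

    def octal_size_field(size):
        # Classic ustar: 11 octal digits followed by a NUL byte; 8GiB - 1 is the ceiling.
        if size >= 8 ** 11:
            raise ValueError("does not fit in 11 octal digits")
        return b"%011o\x00" % size

    def base256_size_field(size):
        # GNU extension: set the high bit of the first byte and store the value
        # big-endian in the remaining 95 bits of the 12-byte field.
        field = bytearray(size.to_bytes(12, "big"))
        field[0] |= 0x80
        return bytes(field)

    print(octal_size_field(1_178_833_172))
    print(base256_size_field(9_999_999_999))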
Posted May 31, 2019 7:11 UTC (Fri)
by nim-nim (subscriber, #34454)
[Link]
Are there still reasons not to use it?
Posted May 29, 2019 4:42 UTC (Wed)
by dvdeug (guest, #10998)
[Link]
Posted May 28, 2019 21:10 UTC (Tue)
by jhoblitt (subscriber, #77733)
[Link] (22 responses)
Posted May 28, 2019 21:27 UTC (Tue)
by compenguy (guest, #25359)
[Link] (4 responses)
The database format of rpms is painful to deal with, and really rather impractical for some of the use cases described in the article, such as extracting the installer contents for manual installation on an unsupported system.
Also, most of the linux installation professionals I know hate rpm with a passion and would much rather work with deb packages, for a host of reasons not directly relating to the file format itself. The state management for package install/upgrade/uninstall being more robust and intuitive for deb is one of the really big ones. I will say, though, that on the deb side of things, I miss rpm's autoreq/autoprov system. Deb's tooling doesn't let you provide/require a SONAME; rather, the tooling will look at known packages and use the name of the package that installs that lib as the dependency.
Most of the rest is kind of a grey area of just having different design patterns for solving different kinds of installation problems.
Posted May 30, 2019 15:50 UTC (Thu)
by imMute (guest, #96323)
[Link] (1 responses)
I wonder if this wouldn't be possible in debs with some creative use of Provides and Requires. A package containing a library that "provides" some SONAME could have a "Provides: SONAME-libfoo.so.2" on it. Packages that need that SONAME could add "Requires: SONAME-libfoo.so.2". Specific versioning would be tricky, since you can't know the exact versioning a providing package uses. I'm thinking epoch versions might throw a wrench in there... Also that the SONAME "version" number and the package version number (even just the "upstream" part) aren't always numerically the same.
Since everyone should already be using dh_makeshlibs / dh_shlibdeps, this might not even be too hard to prototype...
Posted May 31, 2019 14:58 UTC (Fri)
by patrakov (subscriber, #97174)
[Link]
Posted Jun 6, 2019 9:38 UTC (Thu)
by hensema (guest, #980)
[Link]
Rpm is using cpio as its archive format. Equally arcane as ar, but it does enable you to extract an rpm on foreign systems.
> Also, most of the linux installation professionals I know hate rpm with a passion and would much rather work with deb packages, for a host of reasons not directly relating to the file format itself.
So what you're saying is that it's better to invent a new format to avoid hurting the feelings of those "professionals"?
Let's face it: Debian can still be Debian even if they switch their underlying package format to RPM. Or any other vaguely modern package format.
Refactoring .deb is a good thing, but it does make sense to shop around for existing solutions that work, are mature and maintained.
Posted Jun 10, 2019 6:33 UTC (Mon)
by ceplm (subscriber, #41334)
[Link]
You know that's like twenty years out-of-date complaint, right? And the only meaning of the word "intuitive" is "what I am used to and I hate anything changing", right?
Posted May 28, 2019 22:08 UTC (Tue)
by amacater (subscriber, #790)
[Link] (16 responses)
You can't do that with rpms - and anyway "whose version of rpm" - I'm old enough to remember when Red Hat broke rpm such that you couldn't install updates, when Mandriva introduced a "newer" version of rpm that was a fork by an erstwhile maintainer, that OpenSUSE rpms don't work well with anyone else's.
Debian's strict policy on packaging and upgrades is what makes seamless upgrade from say, Debian 7 to Debian 9 remotely possible: if you're _really_ lucky, you might just be able to upgrade CentOS 6.8 to 7 or 7.6 to 8 but the rpm world is a reinstall to fix every problem.
Debian and Ubuntu share very similar package formats: Ubuntu developers do things differently at times with versions of gcc or whatever so you can't drop Debian and Ubuntu packages together: but you can easily use the source to rebuild them readily.
Disclaimer: I'm a Debian developer but a CentOS and Red Hat sysadmin advising engineers in my day job. I've been using both for ~24 years.
If you _really_ need more than 9GB in a single .deb, chances are that you're doing it wrong even now.
Posted May 29, 2019 0:26 UTC (Wed)
by kfox1111 (subscriber, #51633)
[Link] (11 responses)
I tried some weird experiments many years ago with the alien package converter. Installed redhat using the debian installer (And I think the storm linux installer...). Also installed debian using anaconda. There isn't a huge difference between rpms and debs at the end of the day.
Its what you put in them that counts. :)
Posted May 29, 2019 0:48 UTC (Wed)
by rahulsundaram (subscriber, #21946)
[Link]
You sure can. I have done that. rpm2cpio and convenience scripts like rpmls (part of rpmdevtools) make this really easy.
> if you're _really_ lucky, you might just be able to upgrade CentOS 6.8 to 7 or 7.6 to 8 but the rpm world is a reinstall to fix every problem.
Certainly hasn't been the case for years. Many RPM distributions support a straightforward upgrade path.
The complications in enterprise distributions have not much to do with RPM the package format or even the tooling (c.f. things like dnf system upgrade) but the fact that these distributions have very long lifecycles (10 to 15 years) and they tend to run many third party applications (including proprietary ones) that are brittle in the face of OS upgrades. The answer to that has been VMs and containers.
Minor variations in RPM (nearly all distributions have folded back into using RPM4 which has a very active and healthy upstream project now) don't matter as much. There are packaging differences because, unlike Debian and Debian derivatives like Ubuntu, the RPM world has a broad number of distributions which aren't all derived from Red Hat (ex: openSUSE), and even in such cases the divergences have steadily gone away with time. The number of patches that say Fedora, Mandriva, openSUSE etc. carry against the RPM package itself is pretty low at this point. Even macros have consolidated considerably.
> but you can easily use the source to rebuild them readily.
You certainly can do that with RPM pretty quickly and I have done that for dozens and dozens of packages. Lots of packages in Fedora do it to support EPEL and even more do it for things like openbuildservice.
All of these sounds like issues that are outdated at this point.
Posted May 29, 2019 17:48 UTC (Wed)
by wahern (subscriber, #37304)
[Link] (9 responses)
That's not at all fair to Debian packages. You can make do with RPM and the RPM ecosystem (Yum, DNF), but it's still a pockmarked hellscape. Here's a good jumping off point for the low-level sins of RPM specifically: https://xyrillian.de/thoughts/posts/argh-pm.html
Posted May 29, 2019 18:53 UTC (Wed)
by domo (guest, #14031)
[Link]
In the early 2010's I spent one month trying to figure out the rpm source code in order to make it work elsewhere. While doing that I got some knowledge about the format, and then got a good enough replacement made using perl(1).
Search for 'rrpmbuild' for code reference.
The format is quite complicated for a human observer...
Whatever the format is (IMO extending ar(5) is not the best option, old tools cannot understand it anyway, so something better could be devised), it should be simple enough that everyone can easily do their own tools (or help extend the existing ones).
Best would be some new "extensible linux package format" (w/ sane format, no xml etc) which could be adopted by all distributions. The format would have an extensible package metadata format, and then an extensible file (metadata, including file contents) format.
Even I could devise such a format; it's just that no one would adopt an implementation done by a random programmer very often...
Posted May 30, 2019 11:08 UTC (Thu)
by jond (subscriber, #37669)
[Link] (1 responses)
I'm now wondering how much rpm-ostree might side-step this madness, if at all.
Posted Jun 4, 2019 3:56 UTC (Tue)
by rahulsundaram (subscriber, #21946)
[Link]
I am not sure I follow what you are wondering about here. The internal low level implementation details of RPM format obviously doesn't affect end users. What matters to end users is functionality like library and file based dependencies or boolean dependencies or weak dependencies etc.
ostree based systems don't use RPM at all and therefore dependencies don't really matter all that much on these systems for end users. What you get is an OS that is constructed and pushed to end users as a single "immutable" image, and everything else is supposed to be running in containers of some sort. rpm-ostree provides some level of compatibility with traditional RPM packages but the more you use them, the more you move away from the advantages that an immutable base image provides. Instead, the recommended path is to use a wrapper like Fedora toolbox, within which you can just install plain rpm packages.
Posted May 31, 2019 1:15 UTC (Fri)
by bojan (subscriber, #14302)
[Link] (4 responses)
So, RPM has been tinkered with since its inception and now has a whole lot of baggage caused by various design errors, improvements, folks finding ways to bend it in new and useful ways. Shocking stuff. Who would have thought that a package format that is 22 years old would be like that. :-)
More planned, BTW: https://fedoraproject.org/wiki/Changes/Switch_RPMs_to_zst...
Posted May 31, 2019 2:18 UTC (Fri)
by wahern (subscriber, #37304)
[Link] (1 responses)
> Who would have thought that a package format that is 22 years old would be like that. :-)
Debian package users! The Debian package format is old and wrinkly, but it has aged incredibly well in terms of forethought and capabilities. The tooling is more complex but that's because the ecosystem is layered. Many of the biggest headaches in the land of Yum and RPM (sections, macros, file contents, dependencies, building, ...) are insurmountable and force everybody and everything to accommodate the limitations. (Ignorance is bliss, though!) For every headache one can identify in the land of .debs and Apt there are *both* dirty hacks and clean changes in approach that resolve them; rarely are you stopped in your tracks with the realization you simply cannot accomplish something functionally.
IMO the Debian packaging ecosystem continues to evolve and improve. There are improvements to the RPM ecosystem, but they asymptotically move RPM toward a wall.
Detailing all the issues here would be impractical (and I don't have the memory for it, only the scars), but if you have time carefully go through the history of the development of Modularity (you may need to use Wayback Machine to see how the project specifications changed) and you'll see how RedHat had to backtrack and literally reinvent Modularity late in the RHEL8 development cycle after they realized they couldn't surmount various limitations to RPM, particularly with regards to build-time and run-time dependency management. I remember a co-worker raving about how awesome it would be and me being incredulous that they could pull it off, and lo-and-behold it turned out that they couldn't.
Posted May 31, 2019 4:20 UTC (Fri)
by bojan (subscriber, #14302)
[Link]
So, I have no idea why folks go on these long rants to point out how everything Debian has an almost saint like quality and everything else is pure junk. The fact is that both systems are in widespread use and they work, each with their own limitations.
Posted May 31, 2019 8:05 UTC (Fri)
by amacater (subscriber, #790)
[Link] (1 responses)
If you keep your Fedora fully maintained then you'll be upgrading every 12 months or so and will lose support for your version at best every 18 months.
Now take a neglected Debian 7 - some two years out of support. Move it to 8 which is on long term support. Move it to 9. [In a month or so, you can move it to 10 when Buster comes out, maybe.] That includes the sysvinit-systemd transfer which needs a reboot. That takes you from kernel 3.10 - 4.4 seamlessly and 4.19 next month. Oh, and for fun, do this with no network access. You might do this with CentOS: you _can_ do this with DVD images and Debian :) [And yes, it's an "uphill, both ways in the snow" kind of story - but it's real, and there are lots of machines out there that are "only" two years out of support and have to be maintained and upgraded without data loss. ].
Posted May 31, 2019 8:49 UTC (Fri)
by bojan (subscriber, #14302)
[Link]
Red Hat decided they didn't want to support upgrades from RHEL 5 to 6, but 6 to 7 (for some products) and from 7 to 8 is possible:
https://access.redhat.com/solutions/637583
https://access.redhat.com/documentation/en-us/red_hat_ent...
Posted May 31, 2019 8:30 UTC (Fri)
by nim-nim (subscriber, #34454)
[Link]
You have all the tools to manipulate rpm files under Linux, you can even open them in generic non-Linux archiving tools like 7zip and it will *just* *work* (yes you will lose rpm-specific metadata. Just like you will lose iso-specific metadata when treating isos like a giant archive. If you absolutely refuse to use native rpm tools just uncompress the source rpm, the whole package is described in a human-readable spec file, you don't absolutely need to read the binary transformation of this same info).
The rpm installation/update process has a mind-numbing amount of entry points, with very specific (and weird) ordering, but the average packager does not have to think about them. When you *do* need to think about them, because the software being packaged has special needs, you’re happy to have them available (or, like pretty much everyone, you decide it’s all too complex, and try to do your own better simpler thing, and months later, when you've exhausted all the weird corner cases required by your software, and actually understand the problem space, you switch to native rpm-provided facilities, because now you actually understand why they need to behave the way they do. Of course some people are too lazy to actually fix all corner cases, or too proud to admit they were wrong, so they will push garbage that does not make use of the tech capabilities, and complain rpm is awful). It's the same difference between an init tech with barebones facilities, that requires you to write giant custom scripts to work (SysV init), and something with built-in capabilities, that requires knowing the manual to call the built-in capabilities correctly (systemd).
And the rest is just the packaging policies and rules of each distro, which are not the same, so anyone looking at the packages done by other distros will be lost and unhappy, and only people that mistake their habits with natural laws of nature will seriously complain about it (yes, Debian packaging is weird and crufty too when looked at by outsiders). And two rpm distros won't do things the same way because they don't have the same opinions, and so would two deb distros.
The rpm format is actually nice enough many distributions adopted it and do their own different thing with it. And yes it also provides automation facilities in form of macros, so you don't have to do it all by hand, and distros with different opinions and objectives will automate things differently, what's the problem with that? It's like complaining not two Firefox users install the same extensions, and it's too hard to understand why two Firefoxes do not behave the exact same way.
Posted Jun 3, 2019 22:22 UTC (Mon)
by logang (subscriber, #127618)
[Link]
Posted Jun 6, 2019 8:55 UTC (Thu)
by Wol (subscriber, #4433)
[Link] (2 responses)
> Debian's strict policy on packaging and upgrades is what makes seamless upgrade from say, Debian 7 to Debian 9 remotely possible: if you're _really_ lucky, you might just be able to upgrade CentOS 6.8 to 7 or 7.6 to 8 but the rpm world is a reinstall to fix every problem.
What you miss here, is that (afaik) all distros that use deb are DERIVATIVES of debian, so they inherited debian's packaging rules.
OpenSUSE (at least its parent) PREDATES rpm, heck iirc it even predates Red Hat, so while it adopted the rpm program and file format, it already had its own, completely different, packaging rules.
Things are a lot better on that front now, I believe ...
(SuSE began as a Slackware derivative, then was derived from some other obscure distro, then became its own master.)
Cheers,
Wol
Posted Jun 6, 2019 19:42 UTC (Thu)
by amacater (subscriber, #790)
[Link] (1 responses)
Deb "just works" but only because Debian puts a whole lot of policy in place and developers are constrained to work so that packages co-install, don't overwrite libraries from other packages and so on. It's hard nosed packaging policy that makes it work. [A colleague says "CentOS just downloads it's easy - Debian's too big!" but that's because Debian includes the world and its source ]
Posted Jun 6, 2019 20:50 UTC (Thu)
by rahulsundaram (subscriber, #21946)
[Link]
I agree with that view. That has nothing to do with the format of the archive. It is much more higher level.
>"CentOS just downloads it's easy - Debian's too big!"
Not sure what that means. Net installation works just fine in either.
Posted May 28, 2019 23:25 UTC (Tue)
by oliwarner (subscriber, #81320)
[Link] (1 responses)
Debs will need to gain more than a x8 speedup to survive the next generation of distributions.
Posted May 29, 2019 1:39 UTC (Wed)
by interalia (subscriber, #26615)
[Link]
Debian's long-term direction and future is a worthy discussion but should be a totally separate one. We shouldn't let perfect be the enemy of a good minor improvement.
Posted May 29, 2019 2:45 UTC (Wed)
by gus3 (guest, #61103)
[Link] (3 responses)

Why confine the discussion to Linux-based systems?
Posted May 29, 2019 10:52 UTC (Wed)
by hei8483j (guest, #124709)
[Link] (2 responses)
Posted May 31, 2019 0:11 UTC (Fri)
by compenguy (guest, #25359)
[Link] (1 responses)
What on earth do you think rpms are? Each rpm is a berkeley db hierarchical database.
> The whole Windows experience is so complicated in comparison with a Linux system. In Linux, the main effort is creating good build scripts. In Windows, you are always writing custom actions to supplement the installer itself, its dependencies and runtimes.
Actually, there are a lot of technical parallels between MSI design and execution and rpm design and execution. If you look at the order that rpm scripts are run during upgrade, it's a really mind-bending process and feels really unnatural. But it is, in fact, very very efficient with disk writes/erases especially when not all the files in the package might be changing. An MSI installer with a late-scheduled RemoveExistingProducts executes actions in a sequence _very_ similar to rpms.
In fact, if PowerShell, Wix, and Burn had been invented about a decade prior, the MSI installer development experience would have looked a good bit more like rpm than it currently does.
As it is, though, Microsoft is trying not to invest anything in MSI in order to push their app store distribution model. Apple deprecated their pkg installation system probably almost a decade ago, again "because appstore", but they still can't manage to kill it - it's just too useful (although the pkg system is pretty scary in its own right).
Posted Jun 3, 2019 23:26 UTC (Mon)
by brouhaha (subscriber, #1698)
[Link]
The system-wide RPM database is a berkeley db.
An individual RPM file is just an RPM header prepended to a cpio archive.
Posted May 29, 2019 9:50 UTC (Wed)
by SiB (subscriber, #4048)
[Link]
Problem solved? Big packages can use the old format, no changes required?
Posted May 29, 2019 9:56 UTC (Wed)
by geert (subscriber, #98403)
[Link] (2 responses)

Do they have a solution planned?
Posted May 29, 2019 10:27 UTC (Wed)
by gb (subscriber, #58328)
[Link]
Posted May 30, 2019 13:01 UTC (Thu)
by mort (guest, #132348)
[Link]
Posted May 29, 2019 15:07 UTC (Wed)
by jezuch (subscriber, #52988)
[Link] (3 responses)
May make somewhat more sense on source packages, though, but these are not .debs
Posted May 30, 2019 17:22 UTC (Thu)
by imMute (guest, #96323)
[Link] (2 responses)
For example, pbuilder itself uses compressed tarballs to store an image of the rootfs at rest. Each time you want to build a package, that tarball has to be uncompressed and extracted. I've found that you can use cowbuilder instead (I'm not sure exactly how the "instead" happens - git-buildpackage does it automagically for me) which keeps everything uncompressed/untarred and uses COW filesystems to copy the "pristine" rootfs for each build. It's incredibly fast to get the chroot ready to use, and then I find myself waiting 2-3 minutes for apt to install all my dependencies (obtained from a cache on the same ramfs; *not* from a mirror).
Posted Jun 3, 2019 10:44 UTC (Mon)
by jezuch (subscriber, #52988)
[Link] (1 responses)
But anyway, xz is relatively fast at decompression as all LZ77 type algorithms are. Is it not fast enough? On the other hand bandwidth is cheap these days so... :)
Posted Jun 4, 2019 3:03 UTC (Tue)
by pabs (subscriber, #43278)
[Link]
Posted May 29, 2019 18:42 UTC (Wed)
by wahern (subscriber, #37304)
[Link] (1 responses)
If you don't mind the uncleanliness and potential security issues of such redundant metadata, one can create an index for tar files, including compressed tar files. I've experimented with this (for both tar and tar+gzip), though nothing releasable. The upside is that adding an index could be done in a backward compatible manner--just another object in the outer archive that could be ignored.
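For the uncompressed case, such an index is only a few lines with Python's tarfile module; the index layout below (member name, data offset, size) is simply an assumption for illustration, not an existing convention:

    import json
    import tarfile

    def build_index(path):
        # Map each member name to the byte offset of its data and its size,
        # so a later reader can seek straight to the content it wants.
        index = {}
        with tarfile.open(path, mode="r:") as tar:   # "r:" = uncompressed only
            for member in tar.getmembers():
                if member.isfile():
                    index[member.name] = {"offset": member.offset_data,
                                          "size": member.size}
        return index

    print(json.dumps(build_index("data.tar"), indent=2))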
Posted Jun 6, 2019 21:36 UTC (Thu)
by dfsmith (guest, #20302)
[Link]
Posted May 29, 2019 20:16 UTC (Wed)
by mathstuf (subscriber, #69389)
[Link]
Even Windows static libraries are `ar` format (the linkable part of shared libraries are too, they just have some metadata which says "runtime load this .dll file").
Posted May 29, 2019 22:48 UTC (Wed)
by unixbhaskar (guest, #44758)
[Link]
I think if not months, a few years down the line, surely there must be some change in this respect. No point to make a hasty decision.
Posted May 30, 2019 9:08 UTC (Thu)
by bokr (guest, #58369)
[Link]
Their base long-term dependency is the stability of the linux kernel ABI, as I understand it, so old apps using old libraries can coexist with newer, so long as the kernel supports all the syscalls involved (which Linus is pretty good at enforcing as the kernel evolves),
https://www.gnu.org/software/guix/manual/en/html_node/Fea...
Posted May 30, 2019 15:05 UTC (Thu)
by eru (subscriber, #2753)
[Link]
Posted May 31, 2019 23:40 UTC (Fri)
by brouhaha (subscriber, #1698)
[Link]
This seems like a problem only if the format changes include having a significant number of files nested directly in the outer wrapper, rather than in an inner compressed archive.
Posted Jun 6, 2019 3:55 UTC (Thu)
by brunowolff (guest, #71160)
[Link] (2 responses)
Posted Jun 26, 2019 2:50 UTC (Wed)
by fest3er (guest, #60379)
[Link] (1 responses)
As I understand, uncompressing certain .xz archives (perhaps large archives) *can* require a lot of RAM
Haiku have created what sounds like a novel approach to packages. For most user packages, there's no need to unpack and install files. As I understand, the pkg file is simply put where it belongs; once there, its contents become available to the system. To remove the pkg, delete the pkg file. I've no idea how they made it work (perhaps some form of FS union).
Posted Jun 26, 2019 10:32 UTC (Wed)
by excors (subscriber, #95769)
[Link]
I don't believe it depends on the archive size, just on the dictionary size that was chosen when compressing, because the decompressor has to construct that dictionary in RAM. The man page says the default compression mode ("xz -6") uses an 8MB dictionary, and the most expensive preset ("xz -9") uses 64MB, though with custom settings it can support up to 1.5GB. Compression takes roughly 10x more RAM.
(For comparison, zlib(/gzip/etc) uses a 32KB dictionary by default, which is partly why modern algorithms can perform so much better.)
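The dictionary is chosen at compression time, which is what determines the decompressor's memory use later. As a small illustration with Python's lzma module (the 8MiB figure simply mirrors the xz -6 default mentioned above):

    import lzma

    # Custom filter chain: LZMA2 with an 8MiB dictionary.  Whoever decompresses
    # this stream needs roughly that much memory for the dictionary.
    filters = [{"id": lzma.FILTER_LZMA2, "dict_size": 8 * 1024 * 1024}]
    compressed = lzma.compress(b"example payload" * 1000,
                               format=lzma.FORMAT_XZ, filters=filters)
    print(len(compressed), "bytes")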