Fedora ponders Software Collections

By Jake Edge
December 20, 2012

A feature in Red Hat Enterprise Linux (RHEL) that supports multiple, parallel installations of programming languages and other normally system-wide tools was recently discussed on the fedora-devel mailing list. Matthew Miller, who recently started as the Fedora cloud architect, raised the idea of bringing Software Collections from RHEL to Fedora. The idea behind Software Collections is interesting, but no clear consensus on how appropriate they might be for Fedora emerged. As with many Fedora discussions of late, this one at least partly comes back to the question of the role that the distribution is meant to fill.

The problem that Miller initially presented is particularly acute in the Ruby and Java worlds, though Python and other tools (e.g. databases) sometimes suffer from it as well. Various packages may depend on different versions of the underlying tools, which makes it difficult to have them coexist on the same system. As an example he noted that the Fedora packages for the Puppet configuration management tool are broken because the Fedora Ruby version is too new. One way to solve that problem is to have multiple Ruby versions available that can be installed in parallel and chosen at runtime. That's exactly the problem that Software Collections sets out to solve.

A Software Collection (SC) uses the same packaging tools (RPM, Yum) already used by RHEL and Fedora, but installs the packages and their dependencies in the /opt/provider hierarchy. The provider piece is a specific string assigned to a vendor, which will allow multiple software providers to share the hierarchy without name collisions. There is also an scl tool that allows choosing one or more SCs to be active when running a specific command. Whatever is needed in the environment for the particular SC will be set up by "scriptlets" that get installed with the collection and are run when the SC is selected.
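To make the mechanism concrete, here is a minimal Python sketch of the kind of environment adjustment an SC "enable" scriptlet might perform. It is an illustration under stated assumptions, not the actual scl implementation: the provider name "exampleprovider", the collection name "ruby-old", the root/usr/... layout beneath the collection directory, and the helper names are all hypothetical.

```python
from pathlib import PurePosixPath

# Assumed layout: /opt/<provider>/<collection>/root/usr/{bin,lib64,share/man}.
# The article only states that collections live under /opt/provider; the
# subdirectories below are an assumption made for this sketch.
SEARCH_VARS = (
    ("PATH", "usr/bin"),
    ("LD_LIBRARY_PATH", "usr/lib64"),
    ("MANPATH", "usr/share/man"),
)


def collection_root(provider, collection, base="/opt"):
    """Return the (hypothetical) root directory of one collection."""
    return PurePosixPath(base) / provider / collection / "root"


def enabled_env(env, provider, collections):
    """Return a copy of env with each collection's directories prepended.

    Collections listed later end up earlier in the search paths, so they
    take precedence over collections listed before them.
    """
    new_env = dict(env)
    for name in collections:
        root = collection_root(provider, name)
        for var, subdir in SEARCH_VARS:
            entry = str(root / subdir)
            current = new_env.get(var, "")
            new_env[var] = entry + ":" + current if current else entry
    return new_env


if __name__ == "__main__":
    env = enabled_env({"PATH": "/usr/bin"}, "exampleprovider", ["ruby-old"])
    for var, _ in SEARCH_VARS:
        print(var, "=", env[var])
```

The resulting dictionary could be handed to subprocess.run(command, env=...) so that only the one command sees the collection, which is the spirit of choosing an SC at run time rather than system-wide.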

As Miller notes, there is nothing inherent in Java or Ruby that leads to version-mismatch problems; instead, they are caused by the expectations of developers building software using those languages. There is a strong preference for bundling various toolkits and libraries with such packages, which runs counter to the way Fedora and many other distributions do things. Red Hat Eclipse team member Alexander Kurtakov put it more bluntly:

As a Java guy I'm more and more sure that the problem is not in the packaging view but in the wrong view of developers not being capable of making an application if they don't bundle everything. You're [right] the problem is not in the languages it's in the developers :(.

But, the fast-paced nature of Fedora (normally a release every six months or so) may not make for a good match with SCs. Former Fedora project leader Jared Smith was a bit skeptical about the fit:

Given the short shelf-life of a Fedora release and the complication involved in Software Collections, I'm still not convinced that we really need this in Fedora. Can you give me a concrete case where Fedora really needs to be running two different versions of the same software, in a production environment? Given it's longer shelf life and different target audience, RHEL is a better candidate -- and [for] the record, the company I work for uses Software Collections that way. I'm just having a hard time justifying it in my mind for Fedora.

Miller had a ready answer. He outlined three separate uses he saw for SCs in Fedora, starting with handling problems like the Puppet issue. Allowing multiple versions of a language would give more choices for Fedora as a development platform. He also noted that RHEL and Fedora make up an ecosystem where developers targeting the former may well be developing on the latter. Access to SCs on Fedora might be quite useful since they are available for RHEL. Those two potential use cases did not require too much discussion, but the third one did:

On a long-lived platform, Software Collections can provide a way to move faster than the base. On a fast-moving platform like Fedora, we could use it in the other way: providing longer-lived versions of certain components even as the base is upgraded.

Bill Nottingham responded with a self-proclaimed "heretical" suggestion that Fedora be turned into a much smaller platform, with packages from the "grand Fedora universe" that target one or more of those platform releases. In that model, the enormous pile of software that Fedora deals with for each release would be greatly reduced. Miller and others—notably enterprise-leaning participants—looked favorably on the heresy, but it was recognized that it would be a difficult direction for Fedora to take.

There are, of course, downsides to managing software via SCs. The "library bundling problem" comes to mind, for example. If multiple SCs all include a library (or other component) that needs to be upgraded for security reasons, it may require a great deal of work. One could imagine several different vulnerable versions of a library lurking in SCs that are still being used. Each of those needs to be fixed and all of the SCs need to be updated. For RHEL, that's par for the course, but Fedora generally moves on before those kinds of problems become acute.

The conversation soon pivoted from Nottingham's suggestion to whether SCs might help external projects or companies in making their software available for Fedora. By providing a stable platform and a way for those entities to bundle up and install all of the needed pieces, more software might be made available for Fedora. Much of that software is, of course, proprietary, but there are advocates for making Fedora an easier target for those kinds of applications.

One of the problems that Fedora has faced over the years is the explosion of packages that it tries to maintain, test, and release in a short six-month time frame. Several commenters pointed to SCs (or something like them) as a way to decouple Fedora from that enormous list of packages. Projects could target Fedora, without actually becoming part of Fedora, as Fernando Nasser suggested.

Those kinds of ideas hearken back to an earlier arrangement for the Fedora distribution. While Adam Williamson's humorous idea of a return to Fedora Core and Fedora Extras was not taken seriously, some kind of similar split seemed to gain quite a bit of traction in the discussion. Whether it goes any further than that down the road remains to be seen, but SCs could potentially help the process if it does.

There are logistical and licensing questions—along with plenty of others—that would have to be resolved, of course. If there were a split, how would non-core (for lack of a better term) components manage their SCs and repositories? If they were under the Fedora umbrella, the distribution would have some responsibility for the contents of the packages. If they were not under the umbrella, users would have to somehow enter those repositories into their Yum configuration. Richard Jones outlined a number of issues that would need to be resolved under such a system.

While the conversation veered in a direction that Miller may not have expected, it does give an interesting view into the thinking of some (many?) in the Fedora development community. Some of the interest in a fairly radical change may come from frustration with the delays in the Fedora 18 cycle, but there seems to be more to it than that. For some time now, Fedora has been trying to find (or define) its niche. This conversation is another step along that path.


Why would one need Software Collections?

Posted Dec 20, 2012 18:06 UTC (Thu) by davecb (subscriber, #1574) [Link]

I can understand wishing to choose a particular interface for your mail, or wishing to choose a particular implementation for one's ldap, but why would one ever want to have multiple versions of a program?

I quite understand the versioning problem (I co-authored an ACM article on the subject) but with Linux's stable interfaces, there isn't a good reason for incompatibility except for bugs, and one can deal with them by symlinking, for example, /usr/bin/bizzare to /usr/bin/bizzare_42.0.

And anyone who ships versions of system libraries with applications deserves the sharp edge of Linus' tongue!

Back when I was one of the solarii, my evil twin (David J. Brown) imposed versioning on all library interfaces. If you shipped libfart.2.37.so, and you had changed the semantics (but not the signature) of the so-named library call, you had to ship the old .36 implementation as well as the new .37 implementation. Linkers linked the version that programs were compiled for, and compilers defaulted to compiling for the newest.

As soon as he'd ensured that was in place, he prohibited vendors from shipping system libraries (they already weren't supposed to), and greatly reduced the likelihood of them causing a patching/security problem.

If one installed a "too new" program, it had a dependency on the latest version of the system library, that got installed, and both old and new programs just quietly worked.
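A toy Python model of that scheme may help; it is only a sketch of the idea as described here, not of how the Solaris linker actually works. The class, the method names, and the "emit" interface are hypothetical, and versions are represented as tuples purely for illustration.

```python
class VersionedLibrary:
    """Toy model: one library that ships every implementation of an interface."""

    def __init__(self):
        # interface name -> {version tuple: implementation}
        self._impls = {}

    def export(self, name, version, func):
        """Ship an implementation of `name` introduced at `version`."""
        self._impls.setdefault(name, {})[version] = func

    def link(self, name, built_against=None):
        """Bind a program to the implementation it was compiled for.

        With no version given, default to the newest one, mirroring
        "compilers defaulted to compiling for the newest".
        """
        versions = self._impls[name]
        if built_against is None:
            built_against = max(versions)
        # Newest implementation that is not newer than the build version.
        candidates = [v for v in versions if v <= built_against]
        if not candidates:
            raise LookupError(f"{name} has no implementation for {built_against}")
        return versions[max(candidates)]


if __name__ == "__main__":
    lib = VersionedLibrary()
    lib.export("emit", (2, 36), lambda: "old semantics")
    lib.export("emit", (2, 37), lambda: "new semantics")
    print(lib.link("emit", built_against=(2, 36))())
    print(lib.link("emit")())
```

Because the one library carries both implementations, an old program and a new one can share a single installed copy, which is the property described above.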

Implementation- and use-choices are legitimate: collections as a way to paper over a bug are not.

--dave

Why would one need Software Collections?

Posted Dec 20, 2012 20:01 UTC (Thu) by hummassa (subscriber, #307) [Link]

> why would one ever want to have multiple versions of a program?

This one is simple: file/database formats change. Not all programs -- strike that -- almost no program maintains the whole history of file/database formats for upgrades.

Sometimes you need to have GenealogyProgramPlus v2.3 around together with v4.8, because the file format changed from v2 to v3 and again from v3 to v4 but v4 cannot read v2's files.

Why would one need Software Collections?

Posted Dec 20, 2012 22:07 UTC (Thu) by zlynx (subscriber, #2285) [Link]

Evolution is one of the worst for this. On their mailing list they often respond to upgrade questions with "Oh, you have to install Evolution 3.0, let it update your files, then run 3.4 or it won't work."

Well, to do that you have to build all of Evolution, from source, and hope really hard that one of the libraries it wants to link didn't get upgraded to an incompatible version in the meantime. If it did, you have to build ALL OF GNOME 3.0 from source, in its own installation directory, then build Evolution. Or, find each incompatible library and build an older version, then point the configure script to the older version, by hand, for each library.

You might, possibly, be able to point a Fedora 18 yum at a Fedora 14 repo and hope it worked. Or maybe try an install into a chroot. Or a minimal install into a virtual machine.

Software authors really need to *think* about these things when updating data formats. For example, if Evolution used standalone programs or scripts with no external dependencies for upgrades, the entire upgrade series could be installed with the package. Then you could run 2.20 to 2.24, 2.24 to 3.0, 3.0 to 3.4.
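As a rough Python sketch of that idea (not anything Evolution actually ships), a package could carry one small converter per format change and apply them in sequence. The version strings come from the example above; the store layout, the make_step helper, and the "history" field are hypothetical stand-ins for real conversion logic.

```python
def make_step(new_version):
    """Build a placeholder converter that upgrades a store to new_version."""
    def convert(store):
        upgraded = dict(store)  # leave the caller's data untouched
        upgraded["format"] = new_version
        upgraded["history"] = store.get("history", []) + [new_version]
        return upgraded
    return convert


# One standalone step per format change, keyed by the version it upgrades from.
UPGRADE_STEPS = {
    "2.20": make_step("2.24"),
    "2.24": make_step("3.0"),
    "3.0": make_step("3.4"),
}


def upgrade(store, target):
    """Apply the converters in order until the store reaches target."""
    seen = set()
    while store["format"] != target:
        current = store["format"]
        if current in seen or current not in UPGRADE_STEPS:
            raise ValueError(f"no upgrade path from {current} to {target}")
        seen.add(current)
        store = UPGRADE_STEPS[current](store)
    return store


if __name__ == "__main__":
    result = upgrade({"format": "2.20", "messages": []}, "3.4")
    print(result["format"], result["history"])
```

Since each step depends only on the data it is given, the whole series could be installed alongside the newest release without keeping any older copy of the application around.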

Why would one need Software Collections?

Posted Dec 21, 2012 1:45 UTC (Fri) by bats999 (subscriber, #70285) [Link]

Good example. Apparently local mail clients are insanely complex programs, beyond taming (OK, evolution is more than just a mail client...)

Lilypond would be a good counterexample; convert-ly keeps my files working across versions and is part of the suite, not a third party workaround. I'm sure there are other examples, but I can't think of any at the moment...

Why would one need Software Collections?

Posted Dec 21, 2012 19:04 UTC (Fri) by davecb (subscriber, #1574) [Link]

Ouch! I confess I've only seen that once, and I helped fix it. I forgot it's a real consideration for some customers/victims!

If you want, we can start a jumpstarter project to hire "Guido" to talk to those folks and give them an offer they can't refuse (;-))

--dave

Why would one need Software Collections?

Posted Dec 21, 2012 15:36 UTC (Fri) by mattdm (guest, #18) [Link]

> I can understand wishing to choose a particular interface for your mail, or wishing to choose a particular implementation for one's ldap, but why would one ever want to have multiple versions of a program?

Primarily, this isn't about versions of end-user applications, but of development and library stacks.

Why would one need Software Collections?

Posted Dec 21, 2012 19:06 UTC (Fri) by davecb (subscriber, #1574) [Link]

Good: that is more amenable to the "David J. Brown is as scary as Linus" technique (:-))

--dave

Why would one need Software Collections?

Posted Dec 21, 2012 22:48 UTC (Fri) by cry_regarder (subscriber, #50545) [Link]

When providing compute infrastructure to various users, such as at a research university or lab, quite frequently a researcher needs to keep an identical configuration throughout the lifetime of a particular project. Same libraries, same compiler, etc.

Having these providable in a sane manner other than every researcher having to build and maintain their own software stack is quite desirable.

Also as a software developer, it is valuable to be able to test your software's ability to be compiled by various versions of the compiler. Especially with C++11 as each compiler version adds capabilities.

Cry

Why would one need Software Collections?

Posted Dec 22, 2012 15:11 UTC (Sat) by davecb (subscriber, #1574) [Link]

That's a different problem from the one Software Collections was being proposed for: almost all my customers have the "invariance" problem, and uniformly deal with it by freezing the system and then running it in a virtual machine to which they only (auto-)apply security patches.

Of course, every once in a while the security patches break things and they have to revert to the previous version (:-))

--dave

Why would one need Software Collections?

Posted Dec 27, 2012 1:12 UTC (Thu) by rgmoore (✭ supporter ✭, #75) [Link]

> When providing compute infrastructure to various users, such as at a research university or lab, quite frequently a researcher needs to keep an identical configuration throughout the lifetime of a particular project.

Sure, but if you need a stable platform for your project, Fedora was probably the wrong choice to begin with. Fedora is not trying to be all things to all users; it's trying very much to be a distribution that provides the latest and greatest software to users who want to use the latest and greatest. Users who need more stability than Fedora provides should choose a distribution that concentrates on just that; RHEL would be a good choice for people who are otherwise happy with the general structure of Fedora.

Why would one need Software Collections?

Posted Dec 27, 2012 16:46 UTC (Thu) by amacater (subscriber, #790) [Link]

RHEL (and Fedora) have the problem that they don't provide enough packaged software for long term sustained software development use UNLESS you use repoforge/RPMForge/EPEL in addition.

My recommendation for people who want to develop within an established Linux distribution framework is always to consider Debian first - because everything is available.

This won't help you with Ruby/Python or Node.js development, however, since these languages are all barking mad and use their own packaging systems at insane paces of change, thus creating problems for every Linux distribution.

Why would one need Software Collections?

Posted Dec 27, 2012 18:05 UTC (Thu) by rahulsundaram (subscriber, #21946) [Link]

What you are referring to applies to RHEL since what it includes is only what Red Hat can support commercially. Debian is not a commercial distribution and the same constraints don't exist there. Similarly for Fedora, you don't need external repos for development typically. It includes everything volunteers are interested in maintaining. None of the repos you mentioned are even compatible with Fedora.

Why would one need Software Collections?

Posted Dec 28, 2012 12:53 UTC (Fri) by seyman (subscriber, #1172) [Link]

FWIW, I went the opposite route. After installing Debian at work, I started bashing my head against the wall because of all the Perl modules that were unavailable (stuff like psgi and starman) or outdated (the Perl core itself, JSON::RPC stuck on the 0.9x branch, ...).

Switching to Fedora made all these problems go away.

Why would one need Software Collections?

Posted Dec 28, 2012 13:22 UTC (Fri) by hummassa (subscriber, #307) [Link]

cpan2deb is your friend in Debian.

Why would one need Software Collections?

Posted Dec 28, 2012 22:49 UTC (Fri) by seyman (subscriber, #1172) [Link]

If you have the time to maintain the missing modules yourself, cpan2deb is indeed a solution.

Copyright © 2012, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds