A Modularity rethink for Fedora

By Jake Edge
January 3, 2018

We have covered the Fedora Modularity initiative a time or two over the years but, just as the modular "product" started rolling out, Fedora went back to the drawing board. There were a number of fundamental problems with Modularity as it was to be delivered in the Fedora 27 server edition, so a classic version of the distribution was released instead. But Modularity is far from dead; there is a new plan afoot to deliver it for Fedora 28, which is due in May.

The problem that Modularity seeks to solve is that different users of the distribution have differing needs for stability versus tracking the bleeding edge. The pain is most often felt in the fast-moving web development world, where frameworks and applications move far more quickly than Fedora as a whole can—even if it could, moving that quickly would be problematic for other types of users. So Modularity was meant to be a way for Fedora users to pick and choose which "modules" (a cohesive set of packages supporting a particular version of, say, Node.js, Django, a web server, or a database management system) are included in their tailored instance of Fedora. The Tumbleweed snapshots feature of the openSUSE rolling distribution is targeted at solving much the same problem.

Modularity would also facilitate installing multiple different versions of modules so that different applications could each use the versions of the web framework, database, and web server that the application supports. It is, in some ways, an attempt to give users the best of both worlds: the stability of a Fedora release with the availability of modules of older and newer packages, some of which would be supported beyond the typical 13-month lifecycle of a Fedora release. The trick is in how to get there.

The main problem that arose with the modular server edition was, in effect, a lack of modules. It turned out to be far more painful for packagers to build modules than expected, so few did. That left it up to the Modularity team to build the modules that would ship with Fedora 27. As Stephen Gallagher, who has been one of the driving forces behind the initiative, put it:

The most common feedback from users was: "How do I install package foo in Modular Server?". In this case, foo ranged across a wide variety of software, including everything from the screen package to complex third-party applications. In that version of Modularity, a system was either all Modular or none of it was. To make software available in the fully-modular system, its packages needed to be part of a module — and unfortunately, we didn't succeed in making many of those.

In addition, the first mechanism chosen to build modules relied on a "bootstrap" module that, among other things, made it difficult for existing Fedora users to upgrade into a modular server release. Third-party software was also problematic in this first approach, since it would need to be built into a module—something that was difficult for anyone but the Modularity team to accomplish.

New approach

The original plan was to define a build environment (buildroot) specifically for the modular server, but that seems to have caused more problems than it solved. The new plan is to use the "everything" repository for the Fedora 28 release as the underlying "platform module", which makes things more straightforward. Importantly, it makes things easier for module packagers:

What we decided instead is to treat Fedora’s "Everything" repository (essentially, the complete set of software available within a Fedora release) as the "platform module", though the tooling will not report this content as a module. In practical terms, this means that creators of modules will no longer need to go through the very painful process of tracking down which modules provide a dependency that they need. Instead, they will be able to depend on the system version available in the Everything repo.

That change will also make it easy for users to simply upgrade into a modular release. Modules and traditional packages can coexist on a system as well. So far, the plan has been for only the server edition to support modules, but with an easier upgrade path and the ability to support both packages and modules, the idea could be adopted by other editions (e.g. workstation) of Fedora.

In fact, the module-creation process will become so straightforward that automated tools will be provided to create the configuration to build single-source-package modules. "Even for more complex multi-package modules, the automatically-created module definitions provide an easy and obvious starting point." This will make it easy to support multiple versions, as Gallagher notes:

Instead of a complex collection of a package and all of its dependencies, modules will now only need to describe the parts that differ from the base repository. For example, Fedora 28 will ship with the Node.js 8.x LTS release in the standard repository, and a module could be built to provide the 9.x experimental release as an option. We could also easily provide the older 6.x LTS release to support older applications. In these cases, we can ship very simple module definitions which just lists the dist-git branches matching the desired upstream releases.
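
As a rough illustration of how little such a definition needs to say, the Python sketch below builds a minimal description of a single-source-package module. The field names (`name`, `stream`, `components`, `ref`) and the idea that each dist-git branch is named after its stream are assumptions made for illustration; they are not taken from the real module metadata format.

```python
def simple_module_definition(name, stream, dist_git_branch):
    """Describe a single-source-package module as plain data.

    Hypothetical layout: the keys are illustrative only and do not
    reproduce the real module metadata schema.
    """
    return {
        "name": name,
        "stream": stream,
        # The only real content: which dist-git branch to build from.
        "components": {name: {"ref": dist_git_branch}},
    }


# One definition per non-default Node.js stream mentioned in the quote.
# The branch names are assumed to match the stream names.
definitions = [
    simple_module_definition("nodejs", stream, branch)
    for stream, branch in [("9", "9"), ("6", "6")]
]

for definition in definitions:
    component = definition["components"][definition["name"]]
    print(definition["name"], definition["stream"], "->", component["ref"])
```

Everything else the module needs would, under the new plan, come from the base repository rather than from the definition itself.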

For future Fedora releases, there will be two sets of repositories to support both the traditional RPM-based distribution and the modular approach. Those who have no interest in modules can disable the modular repositories and continue on as they always have. For others who are looking for the modular approach, though, it will be as easy as simply using the DNF package manager with some new target-specification syntax to pick up modules.

It is not surprising that a change of this nature might run into some turbulence as it gets integrated into a well-established distribution packaging ecosystem like Fedora's. It is a pretty fundamental change to the distribution, so problems are to be expected during the upheaval. As Fedora project leader Matthew Miller put it in the announcement of the rethinking: "Sometimes experiments produce negative results. That's okay -- the project learns even when trying a path that doesn't work out, and it iterates to something better." For his part, Gallagher expressed optimism that the Modularity project is now on a better track:

This refined plan offers an understandable, approachable, and deliverable future for Modularity. Packagers who don't want to produce modules will be able to continue packaging exactly as they always have with no modification to their workflows. Those who want to provide alternative versions of software in a single release or to easily provide the same version across multiple releases will have new tools to simplify this.

As the number of available modules grows, users of Fedora will have a much easier access to the exact version of software they want to accomplish their tasks. People doing rapid-prototyping can more easily access newer versions of packages and at the same time people running older applications can continue to access the older streams that they need.

As Miller pointed out, progress is not made without some missteps along the way. It remains to be seen if Modularity represents progress, but the problem it addresses is certainly real, and the approach Fedora is taking seemingly has the potential to solve it. One of the major benefits of development in the open is that these kinds of missteps are not hidden behind delayed releases, vaporware, press releases, and other obfuscation techniques as they often are in the proprietary software world. In the free-software world, we get to see the sausage being made (and remade), so projects can learn from each other. With luck, that makes for better software throughout our ecosystem.

A Modularity rethink for Fedora

Posted Jan 4, 2018 3:47 UTC (Thu) by mattdm (subscriber, #18)

It's worth noting that while:

Modularity would also facilitate installing multiple different versions of modules so that different applications could each use the versions of the web framework, database, and web server that the application supports.

... is true, Modularity (in this version) doesn't address the problem of installing those different versions on the same system. In most cases where this is needed, we expect containers to be the best approach and didn't want to invent yet another technology in that space. And now, you can get the content for those containers (below your own application) from the same distribution.

A Modularity rethink for Fedora

Posted Jan 4, 2018 9:09 UTC (Thu) by TomH (subscriber, #56149)

I think that this sentence, while correctly describing the result, slightly misunderstands the causation:

It turned out to be far more painful for packagers to build modules than expected, so few did.

The reality is that, as far as I can recall, there was very little (essentially zero) effort to encourage packagers to build modules for their packages. There was a lot of general waffle, but I don't recall any sort of direct announcement of the "here's how you build modules, please go and do it" type.

Now, that is probably because the people working on modularity realised it was going to be too hard for packagers and so never made that request, but the exact cause-effect relationship suggested by that sentence never really occurred.

A Modularity rethink for Fedora

Posted Jan 4, 2018 13:44 UTC (Thu) by sgallagh (guest, #80524)

TomH is right here. While we published some blogs and general announcements about the process of building modules, we never took it to the general packaging public because we knew that, in the state it was in, no one outside the Modularity team would have been able to succeed at it. We had been focusing on building tools to simplify things, but we eventually got to a point where we realized that you can only put so many band-aids on the same leg before you need to amputate.

So we identified the biggest problem with the current approach -- the difficulty of managing dependencies -- and designed a new approach that allowed us to bypass that for the majority of cases. We're retooling a bit of the release-engineering pieces that need to be updated for this approach and then you can expect a new set of packaging guidelines and a HOWTO blog post.

This time around (Fedora 28), there will be a big push to get packagers on-board with building their own modules, particularly because we should be able to make it easier for the common case where we are building the same exact sources on multiple releases.

A Modularity rethink for Fedora

Posted Jan 4, 2018 13:45 UTC (Thu) by ewan (guest, #5533)

The new approach sounds remarkably similar, at least in end result, to the Red Hat Software Collections infrastructure for RHEL/CentOS.

A Modularity rethink for Fedora

Posted Jan 4, 2018 16:42 UTC (Thu) by sgallagh (guest, #80524)

There are two major differences from Software Collections:
1) Packaging RPMs to be used in modules does not require the use of SCL macros to relocate the content into /opt
2) Software Collections provide parallel-*installability*, while Modules only provide parallel-*availability* (meaning there are multiple versions available, but only one can be enabled at a time -- with some limited exceptions).

So there are plusses and minuses to both approaches, but given the extreme difficulty of creating and consuming SCLs, we think the module approach is a net improvement. It will also make things better for generating containers (which provide parallel-installability) because users will be able to choose a non-default version of software that is still supported by Fedora, rather than needing to roll their own inside their containers.
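
The parallel-availability rule in point 2 can be pictured with a toy model in Python: several streams of a module are on offer, but at most one is enabled at a time. The class and its behavior are assumptions for illustration only. In particular, this sketch silently replaces an earlier choice, whereas the real tooling may refuse or require an explicit switch, and the "limited exceptions" mentioned above are not modeled.

```python
class EnabledStreams:
    """Track at most one enabled stream per module name (toy model)."""

    def __init__(self, available):
        # `available` maps a module name to the set of streams offered for it.
        self.available = available
        self.enabled = {}

    def enable(self, name, stream):
        if stream not in self.available.get(name, set()):
            raise ValueError(f"no such stream: {name}:{stream}")
        # Enabling a stream replaces any other enabled stream of that module,
        # so two streams of one module are never enabled together.
        self.enabled[name] = stream


state = EnabledStreams({"nodejs": {"6", "8", "9"}})
state.enable("nodejs", "6")
state.enable("nodejs", "9")
print(state.enabled)
```

Running two streams side by side would, as the comment says, be a job for containers rather than for the module mechanism.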

A Modularity rethink for Fedora

Posted Jan 4, 2018 14:11 UTC (Thu) by paulj (subscriber, #341)

So... what is a module, and how does it differ from an RPM?

This article makes it sound like 'modules' are completely different things to RPMs, is that right? If so, what are they exactly?

A Modularity rethink for Fedora

Posted Jan 4, 2018 15:59 UTC (Thu) by sgallagh (guest, #80524)

As a quick-and-dirty answer, Modules are collections of RPMs that can be swapped in and out of the package manager. So Fedora might have a default which is the latest release of Ruby on Rails, but I might have an application that requires the previous stable release. So (assuming that a module for that previous release exists), I would `dnf module enable rails:version` and then install my application. The package manager would then use this module to satisfy the dependencies of the application.
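
A minimal Python sketch of that "swapping" idea, under stated assumptions: the package name `rubygem-rails`, the version strings, and the data layout are all made up for illustration, and the function is not how DNF is implemented. It only shows that enabling a stream changes which version is offered for the same package name.

```python
def candidate_version(package, defaults, module_streams, enabled):
    """Return the version of `package` that would be offered for install.

    `defaults` maps package names to versions in the base repository,
    `module_streams` maps (module, stream) to the packages it overrides,
    `enabled` maps module names to their currently enabled stream.
    """
    for (module, stream), packages in module_streams.items():
        # An enabled stream takes precedence over the base repository.
        if enabled.get(module) == stream and package in packages:
            return packages[package]
    return defaults[package]


# Hypothetical data: the default Rails and one older stream.
defaults = {"rubygem-rails": "5.1"}
module_streams = {("rails", "4.2"): {"rubygem-rails": "4.2"}}

print(candidate_version("rubygem-rails", defaults, module_streams, {}))
print(candidate_version("rubygem-rails", defaults, module_streams, {"rails": "4.2"}))
```

With no stream enabled the default version is chosen; with the older stream enabled, an application depending on `rubygem-rails` would have that dependency satisfied from the module instead.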

A Modularity rethink for Fedora

Posted Jan 4, 2018 17:51 UTC (Thu) by farnz (subscriber, #17727)

How does this differ from having multiple repositories enabled, with dependencies from the "module" repositories on the "base" repository and on other repositories in the set? Is there something special in place to check that dependencies are met and the set of base + permissible modules is transitively closed, or is there something else that makes this better than having N binary repositories that you enable and disable?

A Modularity rethink for Fedora

Posted Jan 4, 2018 18:46 UTC (Thu) by sgallagh (guest, #80524)

Most notably, it's much higher-performance than having multiple repositories (and easier for a user to manage). Among other things, yum/dnf scales very poorly to additional repositories (particularly in terms of retrieving the metadata). With modules, all of the content lives in the same repository with common repodata which will be much faster.

As for management, knowing which set of repositories must be enabled in order to get your particular framework is complicated; with Modules, we can set dependencies (similar to RPMs), so that if you want e.g. the "rails:5" module, it will automatically enable the "ruby:2.4" module implicitly. Add to that the ability to get a quick and easy view of all available modules (rather than something like COPR or Ubuntu PPAs where you have to go *find* a new repo) and I think the value over-and-above multiple repos becomes obvious.
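
The implicit enabling described here can be sketched in Python as a walk over module-level dependencies. The `requires` table and the function name are hypothetical, and refusing a conflicting stream is an assumption about reasonable behavior, not a description of what DNF does.

```python
def enable_with_dependencies(spec, requires, enabled=None):
    """Enable `name:stream` plus every module stream it depends on.

    `requires` maps a "name:stream" spec to the specs it needs.
    Returns a dict mapping module names to their enabled stream.
    """
    enabled = {} if enabled is None else enabled
    name, stream = spec.split(":", 1)
    current = enabled.get(name)
    if current == stream:
        # Already enabled; this also stops dependency cycles from looping.
        return enabled
    if current is not None:
        # Assumption: a different stream of the same module is a conflict.
        raise ValueError(f"{name}:{current} is enabled, cannot enable {spec}")
    enabled[name] = stream
    for dependency in requires.get(spec, []):
        enable_with_dependencies(dependency, requires, enabled)
    return enabled


# The dependency from the comment: rails:5 needs ruby:2.4.
requires = {"rails:5": ["ruby:2.4"]}
print(enable_with_dependencies("rails:5", requires))
```

The user asks only for `rails:5`; the `ruby:2.4` stream comes along without the user having to know which repositories carry it.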

A Modularity rethink for Fedora

Posted Jan 5, 2018 13:38 UTC (Fri) by Conan_Kudo (subscriber, #103240)

> Among other things, yum/dnf scales very poorly to additional repositories (particularly in terms of retrieving the metadata).

This is news to me. My Fedora system is able to handle more than a dozen repositories (COPR repos, third party applications, and Fedora main repositories) rather well.

Sure, it could be better (most of the DNF developer team would like to get rid of librepo and libcomps to do something less convoluted), but it works very well.

A Modularity rethink for Fedora

Posted Jan 5, 2018 14:02 UTC (Fri) by rahulsundaram (subscriber, #21946)

> My Fedora system is able to handle more than a dozen repositories (COPR repos, third party applications, and Fedora main repositories) rather well.

Unlike yum, dnf downloads the filelists regardless of whether it needs to, and that alone is a very significant problem for people with low bandwidth. I know there have been some discussions about this but, so far, it hasn't really been addressed.

A Modularity rethink for Fedora

Posted Jan 8, 2018 16:23 UTC (Mon) by mattdm (subscriber, #18)

I think it's useful to think of this in three separate parts:

* user experience
* what it looks like from an infrastructure/repo/mirror point of view
* packager experience

The decision to use a merged repo with special DNF support for modular awareness vs. managing multiple repos (like the mostly-discarded Copr Playground plugin http://dnf-plugins-core.readthedocs.io/en/latest/copr.html) _mostly_ affects the middle part. It leaks a little bit into the user experience when it comes to managing and enabling things, but ideally with _either_ approach it'd be mostly transparent.

To me, really, that middle part is basically "implementation detail" and it's the user and packager experiences that I care about.

A Modularity rethink for Fedora

Posted Jan 8, 2018 16:48 UTC (Mon) by farnz (subscriber, #17727)

Yeah - it's just that what sgallagh originally posted sounded to me like the UX and packager experience would be exactly the same as with multiple repos (like COPR Playground provided). I was hence curious; there was clearly a lot of information elided in the original comment, as otherwise people I assume are competent have spent a lot of time reinventing COPR Playground and similar tools.

The follow-on from sgallagh clarified things nicely, though - it's at least clear why it's hard, even if I'm not yet convinced that I'd solve the problems the way they are (but then, I'm not doing the work, nor do I understand well enough to provide informed criticism, so I'll leave them to do the work the way they see best).

A Modularity rethink for Fedora

Posted Jan 4, 2018 22:59 UTC (Thu) by zdzichu (subscriber, #17118)

Sadly, this is a complete non-answer. If you want to use an older version of something, you do "dnf downgrade something-older.rpm". If the app really needs the older version and has that correctly specified in its rpm metadata, a following "dnf upgrade" won't replace something-older.rpm with something-newer.rpm because something-newer.rpm won't fulfill the deps.
In other words, after years of hearing about modularity, we still don't know (understand?) how this differs from rpm packages.
And I've been a Fedora developer for almost a decade.

A Modularity rethink for Fedora

Posted Jan 8, 2018 16:33 UTC (Mon) by mattdm (subscriber, #18)

But if you're relying on "dnf downgrade something-older.rpm" and something-older.rpm isn't in the repo anymore, then what?

> In other words, after years of hearing about modularity, we still don't know (understand?) how this differs from rpm packages.

I think it's more useful to think of modules in comparison to "comps" groups — what you see with "yum grouplist" — rather than in comparison to RPM. Comps groups compare to RPM in that they are sets of RPMs. Modules are also sets of RPMs.

The important thing is that Modularity gives a way for packagers to easily (with the new design) create multiple version streams for a set of packages, and a way for users to easily consume those different streams. With _just_ RPMs, traditionally, we've done that in ways that don't scale — for example, by creating whole new packages as "package-compat" or "nameversion" as the package name, or SCLs.

A Modularity rethink for Fedora

Posted Jan 4, 2018 21:25 UTC (Thu) by bandrami (guest, #94229)

Isn't making developers and packagers "go through the very painful process of tracking down which modules provide a dependency that they need" the entire point of modularizing things?

A Modularity rethink for Fedora

Posted Jan 8, 2018 16:35 UTC (Mon) by mattdm (subscriber, #18)

> Isn't making developers and packagers "go through the very painful process of tracking down which modules provide a dependency that they need" the entire point of modularizing things?

I hope not. The point is to give users more options without exploding contributor work.