Packages, software centers, and AppData
At Flock 2014 in Prague, Red Hat's Richard Hughes presented an update of his work with AppData and AppStream, the under-the-hood elements that power the graphical GNOME Software Center package-installation tool used by Fedora. Although it might seem like orchestrating package information in an easy-to-browse installation tool would be a straightforward task, the process actually involves considerable work to handle all of the corner cases, ill-defined boundaries, and mutability that arise in modern software packaging.
The primary question, Hughes said, is "what is an application?" Fedora's previous graphical software installer, PackageKit, assumed a rather formal and absolute definition: application = package. But this is not really true, he said: there is often a many-to-many (N:N) relationship between packages and applications as users think of them. Some packages install multiple applications, some applications require multiple packages, and some packages include only libraries; in addition, some things that users regard as applications do not even appear anywhere in the menus (for example, alternate interface modes that can be launched with a special option or flag). Ultimately, the PackageKit model did not really do what users wanted, so it was time to step back and re-examine the subject.
AppData
For the purposes of a graphical installer, he said, it was decided that anything that includes a .desktop file that does not set the NoDisplay key to true is an "application." This position, he noted, makes command-line-only tools the domain of existing command-line package managers, rather than the graphical installer. But while the .desktop file provides much of the information that an installation tool needs (such as the application name and useful metadata like categories), it does not provide enough to describe the application in an "app store"-style installer. Additional things like screenshots and longer descriptions of the application are required.
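As a rough illustration of that rule (not the code GNOME Software Center actually uses), the check can be sketched in Python with the standard-library configparser module. The function name and the sample file contents are invented for this example; only the "Desktop Entry" group and the NoDisplay key come from the .desktop file format itself.

```python
import configparser

# Hypothetical sample .desktop contents, invented for this sketch.
VISIBLE = """[Desktop Entry]
Type=Application
Name=Example Editor
Exec=example-editor %F
Categories=Utility;TextEditor;
"""
HIDDEN = VISIBLE + "NoDisplay=true\n"


def is_graphical_application(desktop_text: str) -> bool:
    """Apply the rule from the talk: a .desktop file counts as an
    application unless it sets NoDisplay to true."""
    parser = configparser.ConfigParser(
        delimiters=("=",),   # .desktop files use "=" only
        interpolation=None,  # leave "%F"-style field codes alone
        strict=False,        # tolerate duplicate keys
    )
    parser.optionxform = str  # keys are case-sensitive; do not lowercase them
    parser.read_string(desktop_text)
    if not parser.has_section("Desktop Entry"):
        return False
    value = parser.get("Desktop Entry", "NoDisplay", fallback="false")
    return value.strip().lower() != "true"


if __name__ == "__main__":
    print(is_graphical_application(VISIBLE))
    print(is_graphical_application(HIDDEN))
```

A real installer would apply further checks; this sketch implements only the single criterion described in the talk.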
Ubuntu had already addressed this problem in its own Software Center, adding this rich metadata to .desktop files. The GNOME Software Center team looked at this, Hughes said, but ultimately decided that stuffing too much into these files would be problematic—particularly if every .desktop file needed to include the same rich text duplicated in every translation. Instead, he started the AppData specification, which pulls out this sort of metadata into a separate XML file. AppStream is the framework that, using AppData, powers GNOME Software Center.
And keeping long descriptions and screenshot information in a separate file has other advantages beyond keeping .desktop files compact, he said. For one thing, a separate AppData file provides a place for the distributor to override package names in those instances where multiple applications all want to register for the same generic name (e.g., GNOME Calculator, KDE Calculator, and Xfce Calculator all want to claim "Calculator"). For another, keeping long-text descriptions in structured XML allows them to be more easily translated, and it allows translatable captions to be added for screenshots.
In theory, the AppData file for any given application would best be written upstream, where the developers can keep it up to date and provide the best screenshots and descriptions. Consequently, when the AppData specification had stabilized, Hughes set out to personally email more than 400 upstream application projects and ask them to help update their AppData files.
In keeping with this preference for upstream data management, AppData files include fields for a project homepage URL and an "update contact" email address. Hughes said he expects to use the contact information twice a year at most. Other URL fields are also available, such as one for a "help site" and one for a donations page, although so far these URLs are not exposed in Software Center's interface.
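To make the shape of such a file more concrete, here is a minimal Python sketch that reads an AppData-like XML document and pulls out the kinds of fields mentioned above: a long description, screenshots, typed URLs, and an update contact. The element and attribute names in the sample are assumptions chosen for illustration and should be checked against the AppData specification; the application, URLs, and email address are made up.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only: element names are assumptions, not a
# verified copy of the AppData schema.
SAMPLE = """<application>
  <id type="desktop">example-editor.desktop</id>
  <description>
    <p>Example Editor is a small text editor.</p>
    <p>It is used here purely as a placeholder.</p>
  </description>
  <screenshots>
    <screenshot type="default">https://example.org/shot-main.png</screenshot>
  </screenshots>
  <url type="homepage">https://example.org/</url>
  <url type="donation">https://example.org/donate</url>
  <updatecontact>maintainer@example.org</updatecontact>
</application>
"""


def _text(element):
    """Return stripped element text, or an empty string if there is none."""
    return (element.text or "").strip()


def summarize_appdata(xml_text):
    """Collect the rich metadata fields from an AppData-like document."""
    root = ET.fromstring(xml_text)
    return {
        "id": (root.findtext("id") or "").strip(),
        "description": [_text(p) for p in root.findall("description/p")],
        "screenshots": [_text(s) for s in root.findall("screenshots/screenshot")],
        "urls": {u.get("type"): _text(u) for u in root.findall("url")},
        "update_contact": (root.findtext("updatecontact") or "").strip(),
    }


if __name__ == "__main__":
    for key, value in summarize_appdata(SAMPLE).items():
        print(key, value)
```

Keeping this data in its own structured file, rather than in the .desktop file, is what lets descriptions and screenshot captions be handled separately by translators.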
The weirdos
The trouble with AppData, however, is that it does not capture all of the user-installable things that people want. There are quite a few corner cases left out, each with its own set of challenges: GStreamer codecs, fonts, and input methods, for example, not to mention all of the plugins expected by users of GEdit, Firefox, Eclipse, and so forth.
So the definition of "application" has to be adjusted accordingly, to mean "anything that includes a .desktop file that does not set the NoDisplay key to true ... plus these exceptions." In an attempt to tackle this problem without confusing the meaning of the existing AppData XML file, Hughes created a separate file type called MetaInfo that would associate these corner cases with an application. Software Center can see from a MetaInfo file that (for example) a particular package contains a GEdit plugin and not a standalone program; the software center can then provide a link to the plugin on its GEdit screen, where it will be most useful.
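The association that MetaInfo provides can be pictured as a simple grouping step: each add-on names the application it belongs to, and the installer collects add-ons under that application's page. The Python sketch below models only that idea. The Addon class, its extends field, and the sample entries are hypothetical and do not reflect the MetaInfo file format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Addon:
    package: str  # package that ships the add-on (hypothetical names below)
    name: str     # human-readable add-on name
    extends: str  # hypothetical field: ID of the application it belongs to


def addons_by_application(addons):
    """Group add-on names under the application each one extends."""
    grouped = defaultdict(list)
    for addon in addons:
        grouped[addon.extends].append(addon.name)
    return dict(grouped)


# Made-up sample data for illustration.
SAMPLE_ADDONS = [
    Addon("gedit-plugin-example", "Example Plugin", "gedit.desktop"),
    Addon("gedit-plugin-other", "Other Plugin", "gedit.desktop"),
    Addon("firefox-addon-example", "Example Add-on", "firefox.desktop"),
]

if __name__ == "__main__":
    for app_id, names in addons_by_application(SAMPLE_ADDONS).items():
        print(app_id, names)
```

With the grouping in hand, an installer could show the two sample plugins on the GEdit page rather than listing them as standalone programs.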
Finding all of the important corner cases and creating the appropriate MetaInfo files for them, however, has been a difficult task that involves several steps, Hughes said. First, RPM packages are unpacked and parsed to see which ones contain useful metadata, a process he described as akin to a "poor man's Mock." Decompressing and analyzing all of Fedora takes about 20 minutes, saturating four CPU cores and 100% of the SATA bandwidth.
From the RPM analysis, alternate executables can be found (for example, Hughes explained, some photo editors can also be launched in a separate "just import pictures from the camera" mode; similarly some application packages may ship with auxiliary tools). The analysis can also find GNOME Shell extensions, menu bar applets, data packages intended for use with a specific application (like GIMP brush packs) and plugins.
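The first stage of that analysis, deciding which packages carry anything of interest, can be sketched as pattern matching over a package's file list. The Python example below does not unpack RPMs; it assumes the list of installed paths has already been extracted by some other means. The pattern table is an assumption made for this sketch: /usr/share/applications is the conventional location for .desktop files, while the AppData and font paths are illustrative guesses.

```python
from fnmatch import fnmatchcase

# Assumed locations, for illustration only.
METADATA_PATTERNS = {
    "desktop-file": "/usr/share/applications/*.desktop",
    "appdata": "/usr/share/appdata/*.appdata.xml",
    "font": "/usr/share/fonts/*.[ot]tf",
}


def classify_payload(paths):
    """Map each kind of interesting file to the matching paths in a package."""
    found = {}
    for path in paths:
        for kind, pattern in METADATA_PATTERNS.items():
            # fnmatchcase's "*" also matches "/" so nested directories match.
            if fnmatchcase(path, pattern):
                found.setdefault(kind, []).append(path)
    return found


# Made-up file list standing in for an unpacked package.
SAMPLE_FILES = [
    "/usr/bin/example-editor",
    "/usr/share/applications/example-editor.desktop",
    "/usr/share/fonts/example/Example-Regular.ttf",
]

if __name__ == "__main__":
    print(classify_payload(SAMPLE_FILES))
```

A package whose file list produces an empty result (a pure library, say) could be skipped without any further parsing.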
Several of the corner cases still require separate treatment, though. Fonts are processed to generate a thumbnail image and screenshot of the basic character set. Codecs, on the other hand, must be "fudged" on a case-by-case basis. For patent-encumbered codecs like MP3, Hughes explained, Fedora will include the codec in its AppStream search results, but the page for the codec will point users to a wiki page explaining that the codec cannot be installed automatically (and, hopefully, allowing the user to figure out what to do next).
Devil in the details
One interesting side effect of the RPM decompression and analysis process, though, is that there are a lot of opportunities to amass other interesting metadata about packages at the same time. For example, Hughes said, the analysis tool currently notes which applications have translations available, which integrate with the desktop notification area, which provide hooks for GNOME Shell's search feature, which have GObject Introspection files, which have (or lack) documentation, and which use outdated versions of important dependencies like GTK+.
For now, not all of this information is used in Fedora's version of Software Center. However, applications are given "extra stars" internally if they include useful features, and some applications are blacklisted if they are so out of date that they are of questionable value (such as GTK+1 applications). Hughes admitted that this blacklisting approach was controversial in some circles, but said that ultimately he had to make a call about where to put the cut-off. There are a few other criteria that can exclude a package from Fedora's Software Center, such as not having an icon of at least 48-by-48 pixels. If the upstream project and the community are not willing to take the steps necessary to produce that one icon, even after being asked, he said, one wonders how much interest there is in the program.
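Those inclusion rules amount to a filter plus a simple score. The Python sketch below is one way to express them; it is not Hughes's implementation. The 48-pixel minimum and the GTK+1 exclusion come from the talk, while the AppInfo class, the toolkit identifiers, and the list of bonus features are assumptions made for illustration.

```python
from dataclasses import dataclass, field

MIN_ICON_SIZE = 48                # from the talk: at least 48-by-48 pixels
BLACKLISTED_TOOLKITS = {"gtk1"}   # hypothetical identifier for GTK+1 apps
BONUS_FEATURES = {                # illustrative feature names, not a real list
    "translations",
    "search-provider",
    "notifications",
    "documentation",
}


@dataclass
class AppInfo:
    name: str
    largest_icon_px: int
    toolkit: str
    features: set = field(default_factory=set)


def is_listed(app: AppInfo) -> bool:
    """Exclude blacklisted toolkits and apps without a large enough icon."""
    if app.toolkit in BLACKLISTED_TOOLKITS:
        return False
    return app.largest_icon_px >= MIN_ICON_SIZE


def extra_stars(app: AppInfo) -> int:
    """Count the recognized useful features an application provides."""
    return len(app.features & BONUS_FEATURES)


if __name__ == "__main__":
    app = AppInfo("example-editor", 64, "gtk3", {"translations", "documentation"})
    print(is_listed(app), extra_stars(app))
```

Keeping the cut-offs in named constants makes the controversial part of the policy, where exactly to draw the line, easy to find and change.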
Altogether, Hughes said, the AppData and MetaInfo files produced by processing the entire Fedora package collection weigh in at 9MB, uncompressed. After collecting the data, naturally, he ran it through a validation process. At the moment, about 35.4% of Fedora's packages have the desired rich-text descriptions, but, he cautioned, considerably more have added them upstream, so the number will be much higher for the next Fedora release cycle.
The result, eventually, will be a much richer Software Center experience for Fedora users. A well-maintained AppData database will allow the software center to show useful recommendations, link to plugins and other add-ons, and enable users to see which applications are the most up to date. Hughes listed a few things that he has yet to tackle, including a working ratings system (specifically, one that cannot be easily gamed by malicious users), integration with Fedora Account System (FAS) accounts (so that users can more easily install the same apps on multiple machines), and a mechanism for useful cross-application recommendations (such as noting that most Inkscape users also use MyPaint).
So, while there may be much more work to be done, the basic framework is well in place to align what the software installation tool sees as an "application" with what the user really wants.
[The author would like to thank the Fedora project for travel
assistance to attend Flock 2014.]
