
Tracking package updates with release-monitoring.org

By Jonathan Corbet
January 18, 2017

linux.conf.au 2017
Nick Coghlan started his 2017 linux.conf.au talk by saying that securing network services is a hard problem; indeed, it is one of the hardest problems that we are facing today. As the use of free software continues to grow, the old methods for managing this problem are breaking down. Fortunately, at the same time, new techniques are being developed to address some of the resulting problems.

The OWASP top ten list for 2013 listed the ten most significant sources of web-service vulnerabilities seen at that time. Number nine on that list is "using components with known vulnerabilities". Clearly, updating the components used by a web service when vulnerabilities are fixed is an important part of any site's security model. But, as the number of components grows, keeping up with them all can be a challenge. As a result, update management has become a key security concern; if an update is not actually deployed into running systems, it will not be helpful.

There are, Coghlan said, three possible approaches to the security problem. The traditional method is the "hardened bunker"; the attack surface of the service to be protected is minimized to the greatest extent possible and the software is frozen. When components are updated, patches are backported into the bunker and applied, with a lot of attention paid to not breaking things.

Increasingly, though, automation enables another way, one which he called the "moving target". Services running in containers can be easily replaced at any time; new containers can be created at will, and unwanted containers can simply be deleted. Rapid container turnover alone can help, since a successful attacker cannot create a permanent foothold when the compromised "systems" are ephemeral. But, in a world with automated testing and continuous-integration systems, we can do better. Web services can be automatically rebuilt whenever a new version of some component is released, ensuring that they are always running current software with all known fixes applied.

The third way, incidentally, is called the "sitting duck"; that's what happens when no effort is made to keep a service secure. Coghlan did not recommend this approach.

Linux distributions were born in the era when publishing software meant putting up a tarball somewhere. The distributors then came up with their own packaging formats that facilitated the creation of an integrated system. At that time, most upstream projects could not do automated testing, so quality assurance was, to an extent, left to downstream users and, at the higher levels, to commercial vendors. Thus the long-term support distribution model was created, where distributors would offer a curated, quality-controlled set of packages.

But times change. The cost of running web services has dropped dramatically, and the availability of those services has grown correspondingly. Services like GitHub Continuous Integration are made available for free for open-source projects. Increasingly, those projects are requiring that the regression tests pass before a change can even be considered for merging. In addition, most modern language communities have their own publication systems making it easy for new releases to get out to users. All of this makes some interesting new things possible.

Consider Libraries.io, an upstream-project monitoring service aimed at developers. Such a service is clearly useful to those trying to turn their web services into moving targets. It was launched relatively recently — March 2015 — and is now monitoring over 2 million projects on 33 separate publishing platforms. One could compare that to Open Hub (700,000 projects), Freshmeat/Freecode (50,000), Debian (50,000), or Fedora (20,000). In recent years, with the advent of services like Libraries.io, there has been a growth of about two orders of magnitude in the number of packages being watched.

That growth makes the moving-target model possible, raising questions about the role that Linux distributors will play in the future. Coghlan emphasized that the hardened-bunker model remains appropriate for some types of services and will not be going away anytime soon. But a lot of software deployments are moving to on-demand models; this is even true in the embedded devices area. Platforms like resin.io are designed to allow devices to become moving targets as well. In this era, a distributor's role shifts beyond quality assurance toward enabling automatic quality assurance done by others.

Thus release-monitoring.org, a monitoring service focused on redistributors of software. It performs release tracking, but also manages mappings between upstream names and package names used by distributors. The service is based on two free components:

  • Anitya performs the release tracking, handles mappings to package names, and emits events when changes are observed. Unlike Libraries.io, Anitya watches both libraries and application projects. It has a series of plugins that can monitor different publication platforms; the simplest of those takes a URL and a regular expression to fish out the release information.

    Anitya understands the concept of upstream ecosystems; each of those has its own namespace. That prevents it from trying to track, for example, two separate packages called "requests" on the PyPI platform, while allowing "requests" packages to exist on other platforms. It mostly handles mappings to Linux distributions at the moment, but it was designed to be more flexible so that it can, for example, track packages shipped by commercial language vendors.

  • Fedmsg is a message bus used to distribute notifications. The name was originally short for "Fedora Message Bus", but that was changed to "Federated Message Bus" after Debian started using it as well. It is written in Python and built on ZeroMQ. It has message-source authentication built in, based on either GnuPG keys or X.509 certificates. Fedmsg was designed to work without a broker process if desired, but a brokered configuration is also possible: the fedmsg-relay utility can be used to that end when users want to set up a single endpoint and hide the details of the underlying system.
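
As a rough illustration of the simplest kind of plugin described above (a URL plus a regular expression), the following Python sketch pulls version strings out of a page and picks the newest one. It is not Anitya's code: the function names, the sample page, and the pattern are all hypothetical, and the page text is hard-coded rather than fetched from a URL. Keying projects by an (ecosystem, name) pair is one plausible way to model per-ecosystem namespaces, assumed here for illustration.

```python
import re

# Hypothetical page listing; a real plugin would fetch this from a URL.
SAMPLE_PAGE = """
<a href="frobnicate-1.9.0.tar.gz">frobnicate-1.9.0.tar.gz</a>
<a href="frobnicate-1.10.2.tar.gz">frobnicate-1.10.2.tar.gz</a>
<a href="frobnicate-1.2.tar.gz">frobnicate-1.2.tar.gz</a>
"""

# Hypothetical pattern: one capture group that isolates the version string.
VERSION_RE = r"frobnicate-(\d+(?:\.\d+)*)\.tar\.gz"


def version_key(version):
    """Turn '1.10.2' into (1, 10, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def find_versions(page_text, pattern):
    """Return the distinct versions matched by pattern, oldest first."""
    return sorted(set(re.findall(pattern, page_text)), key=version_key)


def latest_version(page_text, pattern):
    """Return the newest version on the page, or None if nothing matched."""
    versions = find_versions(page_text, pattern)
    return versions[-1] if versions else None


# Projects are keyed by (ecosystem, name), so the same name can be
# registered once per ecosystem without colliding.
projects = {}


def register(ecosystem, name, pattern):
    """Add a project; refuse a duplicate name within one ecosystem."""
    key = (ecosystem, name)
    if key in projects:
        raise ValueError("%s already tracked in %s" % (name, ecosystem))
    projects[key] = pattern
```

Sorting on a numeric tuple rather than on the raw string matters here: a plain string sort would rank "1.9.0" above "1.10.2". Real-world version strings (release candidates, date-based schemes) would need a more forgiving comparison than this sketch provides.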

This system is up and running now; interested users can create an account, submit projects to monitor, and use fedmsg to get events for projects of interest.
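
A consumer of those events might look something like the sketch below. It deliberately avoids importing fedmsg so that it runs on its own; in practice the (topic, body) pairs would arrive from the bus. The topic suffix and the message fields (`project`, `ecosystem`, `upstream_version`) are assumptions about what a release notification carries, not a documented schema, and the sample messages are made up.

```python
# Assumed topic suffix for "a new upstream version was seen" events.
UPDATE_SUFFIX = "anitya.project.version.update"

# Hypothetical watch list of (ecosystem, project name) pairs.
WATCHED = {("pypi", "requests"), ("npm", "left-pad")}


def interesting_update(topic, body, watched=WATCHED):
    """Return (name, version) for a watched project's update, else None.

    `topic` is the message topic string and `body` the decoded payload;
    the field names used below are assumptions, not a documented schema.
    """
    if not topic.endswith(UPDATE_SUFFIX):
        return None
    project = body.get("project", {})
    key = (project.get("ecosystem"), project.get("name"))
    if key not in watched:
        return None
    return project["name"], body.get("upstream_version")


# Made-up messages standing in for what would arrive over the bus.
sample = [
    ("org.example.anitya.project.version.update",
     {"project": {"name": "requests", "ecosystem": "pypi"},
      "upstream_version": "2.13.0"}),
    ("org.example.anitya.project.add",
     {"project": {"name": "requests", "ecosystem": "pypi"}}),
    ("org.example.anitya.project.version.update",
     {"project": {"name": "unwatched", "ecosystem": "pypi"},
      "upstream_version": "0.1"}),
]

updates = [hit for hit in (interesting_update(t, b) for t, b in sample) if hit]
```

In a moving-target deployment, each pair in `updates` would be the natural trigger for an automated rebuild and test run of the services that depend on that project.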

This work is just getting started, though, and the list of future enhancements is not small. To begin with, release-monitoring.org does not have anything close to the 2 million projects covered by Libraries.io; it would be nice to close that gap. Needless to say, the current manual-entry method for adding projects is an impediment to that goal, so work is underway to add an automated registration mechanism. The immediate goal is to ensure that all packages shipped by Fedora are covered; it should be possible to add Debian's packages as well without too much trouble.

A possible future addition would be a backend for obtaining information from Libraries.io. While the monitoring of projects is "heavy lifting" that somebody has to do, it's not clear that everybody needs to do it. Getting the actual release information from Libraries.io seems like a more efficient way to go. A less-certain addition is the ability to track the versions of projects shipped by downstream distributors. That would help users understand what has actually been packaged, and would help distributors track their performance in keeping up with their upstreams.
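
To make the downstream-tracking idea concrete, here is a minimal sketch of the comparison it implies: given the newest upstream version of each project and the version a distributor actually ships, report which packages have fallen behind. The data and function names are invented for illustration, and the version parsing assumes purely numeric, dot-separated versions.

```python
def version_key(version):
    """Turn '2.13.0' into (2, 13, 0) for numeric comparison."""
    return tuple(int(part) for part in version.split("."))


def outdated(upstream, packaged):
    """Return {name: (packaged, upstream)} for packages behind upstream."""
    behind = {}
    for name, shipped in packaged.items():
        newest = upstream.get(name)
        if newest and version_key(shipped) < version_key(newest):
            behind[name] = (shipped, newest)
    return behind


# Made-up data: newest upstream releases and what a distribution ships.
upstream = {"requests": "2.13.0", "flask": "0.12", "six": "1.10.0"}
packaged = {"requests": "2.10.0", "flask": "0.12", "six": "1.10.0"}

report = outdated(upstream, packaged)
```

Packages with no known upstream entry are skipped rather than flagged, on the assumption that an unmapped name says nothing about how current the package is.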

Coghlan concluded by saying that open-source software has exploded in the last ten years. Our old ways of tracking and packaging all this software simply are not keeping up anymore. New techniques offer some intriguing possibilities, though; the next few years are going to be interesting.

[Your editor would like to thank linux.conf.au and the Linux Foundation for assisting with his travel to the event.]


release-monitoring.org page

Posted Jan 19, 2017 10:53 UTC (Thu) by jnareb (subscriber, #46500) [Link] (3 responses)

A nitpick: you can see that release-monitoring.org is in beta stage; the page title is "Anitya" instead of "Release monitoring" or "release-monitoring.org", or "Monitoring project releases".

release-monitoring.org page

Posted Jan 19, 2017 14:53 UTC (Thu) by mattdm (subscriber, #18) [Link]

It's not in a "beta" stage — it's in production and does real work in Fedora. Of course, there's always room for improvement and it's certainly not in a "final" stage.

release-monitoring.org page

Posted Jan 20, 2017 0:29 UTC (Fri) by lsl (subscriber, #86508) [Link] (1 responses)

I guess one is its name and the other is its purpose. "Monitoring project releases" would be a horrible name, don't you think?

release-monitoring.org page

Posted Jan 25, 2017 11:44 UTC (Wed) by ncoghlan (guest, #85242) [Link]

Right, the actual open source project for the web service is Anitya: https://github.com/fedora-infra/anitya/

The service and the project co-evolved though, so some elements use the project name, while others use the name of the shared public instance.

Tracking package updates with release-monitoring.org

Posted Jan 25, 2017 17:17 UTC (Wed) by ScottMinster (subscriber, #67541) [Link]

Two million packages? I figured that would mean a lot of overlap in functionality, but it looks like there's even a lot of plain duplicates. As an example, I searched for ImageMagick and found 722 hits! Sure, you'd expect variations like IM wrapper for Perl, Python, etc, but there were 195 Node.js (npm) hits and 339 Go hits. Many of them appeared to be github forks of one another. At least a number of the npm hits appear to be different wrappers (duplicate functionality if not code) or projects that use ImageMagick.

It's clear to me at least that there is still value in having a distribution comb through all of that to find the "right" version of the library. Having a high absolute count is not really desirable if they are all duplicates.


Copyright © 2017, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds