
Fedora checking community health with EKG


By Jake Edge
October 15, 2008

Measuring the health of communities is an interesting, difficult task. The Fedora project has recently started using a new tool, called EKG, to try to get an overview of the demographics of the free software projects that are sponsored by the distribution. EKG is still young, but already provides some interesting information. Because it is GPL-licensed, as is the Fedora norm, it can be picked up by other distributions or interested parties to target their own projects.

At its core, EKG is a few Ruby scripts that process mailing list data so that graphs can be produced. Currently, it produces both pie charts and line graphs that indicate the number of Red Hat posters versus those from elsewhere. A portion of the most recent set of graphs can be seen at right.

[EKG output]
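The kind of tally behind such graphs can be sketched in a few lines. The following is not EKG's code (which is written in Ruby); it is a minimal Python illustration that assumes the poster's affiliation can be guessed from the domain of the From: address. The names INSIDE_DOMAINS, domain_of, and tally_posts are hypothetical, and the sample headers are made up.

```python
from collections import Counter
from email.utils import parseaddr

# Assumption: posts from these domains count as "inside" (Red Hat).
# EKG's actual classification rule may differ.
INSIDE_DOMAINS = {"redhat.com"}


def domain_of(from_header):
    """Return the lower-cased domain of a From: header value, or "" if none."""
    _name, address = parseaddr(from_header)
    if "@" not in address:
        return ""
    return address.rpartition("@")[2].lower()


def tally_posts(from_headers):
    """Count posts per category ("inside"/"outside") and per domain."""
    by_category = Counter()
    by_domain = Counter()
    for header in from_headers:
        domain = domain_of(header)
        if not domain:
            continue  # skip headers without a usable address
        by_domain[domain] += 1
        # Treat subdomains of an inside domain as inside, too.
        inside = any(
            domain == d or domain.endswith("." + d) for d in INSIDE_DOMAINS
        )
        by_category["inside" if inside else "outside"] += 1
    return by_category, by_domain


# Made-up sample data, for illustration only.
sample = [
    "Alice Example <alice@redhat.com>",
    "Bob Example <bob@example.org>",
    "carol@example.net",
    "Dave Example <dave@redhat.com>",
]
categories, domains = tally_posts(sample)
```

The per-category counts would feed an inside-versus-outside pie chart, while the per-domain counts show which outside domains post the most. Classifying by address is only a heuristic: an employee posting from a personal address would be counted as an outsider.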

Red Hat's Michael DeHaan has taken on development of EKG to use as a tool to measure how well various projects are building a community separate from Red Hat. There are lots of free software projects that have been released by Red Hat—or Fedora, which often amounts to the same thing—but may or may not be seen as useful tools outside of Fedora. By looking at the mailing list traffic, particularly over time, some idea of which projects are building a community, and which aren't, can be derived. As the project page puts it:

The premise is simple... what are the demographics behind open source projects that we run in Fedora?
  • Who posts
  • Who contributes
  • What projects are most active?
  • What projects need a little help?

Mailing lists are just one measure of the health of a project, of course, so DeHaan is looking at other metrics. Commits to the project repository, along with the identity of the committer, would seem an obvious choice. Better graphs with more useful information on each axis, as well as time series of the pie charts, are also on the "to do" list. He is also looking at derived statistics that will allow direct comparison of different projects by using equations that in some way model success.
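A time series of the inside/outside split might look something like the sketch below. It is only an illustration, not anything from EKG: the function outside_share_by_month, its inside_domain parameter, and the sample posts are all hypothetical, and the input is assumed to be a list of (date, email address) pairs already extracted from list archives or commit logs.

```python
from collections import defaultdict
from datetime import date


def outside_share_by_month(posts, inside_domain="redhat.com"):
    """Map "YYYY-MM" to the fraction of that month's posts from outside.

    posts is an iterable of (date, email_address) pairs.
    """
    totals = defaultdict(int)
    outside = defaultdict(int)
    for day, address in posts:
        month = f"{day.year:04d}-{day.month:02d}"
        totals[month] += 1
        domain = address.rpartition("@")[2].lower()
        if domain != inside_domain:
            outside[month] += 1
    # Sort the months so the result reads as a chronological series.
    return {month: outside[month] / totals[month] for month in sorted(totals)}


# Made-up sample data, for illustration only.
posts = [
    (date(2008, 8, 3), "alice@redhat.com"),
    (date(2008, 8, 9), "bob@example.org"),
    (date(2008, 9, 1), "carol@example.net"),
    (date(2008, 9, 20), "dave@example.org"),
]
shares = outside_share_by_month(posts)
```

A rising outside share over successive months would suggest that a project is attracting participants beyond its sponsor. The same grouping would work for commits if each one is reduced to a date and a committer address.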

It is difficult to draw any conclusions from the limited graphs that are currently available. One thing that does stand out, though, is the popularity of email addresses, which seem to account for around one-quarter of posts. One can also certainly see projects that are completely dominated by "inside" (i.e. Red Hat) folks. The JBoss lists are a good example.

Projects are trying various ways to measure how well they are doing their job; EKG is another way to do that. For the kernel, LWN gathers statistics on each release, and the Linux Foundation does so over longer periods. Ubuntu has its Upstream Report, which looks at how well bugs are getting to upstream bug trackers. Undoubtedly other projects have their own ways of trying to measure their impact.

As yet, there is no mailing list for EKG development. We look forward to the day when EKG is applied to its own development list. It would seem that some kind of "metahealth" measurement of the community could be derived from that data.


Copyright © 2008, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds