By Jake Edge
October 15, 2008
Measuring the health of communities is an interesting, difficult task. The
Fedora project has recently started using a new tool, called EKG, to try to get an overview of
the demographics of the free software projects that are sponsored by
the distribution. EKG is still young, but already provides some
interesting information. Because it is GPL-licensed, as is the Fedora
norm, it can be picked up by other distributions or interested parties to
target their own projects.
At its core, EKG is a few Ruby scripts that process mailing list data so
that graphs can be produced. Currently, it produces both pie charts and
line graphs that indicate the number of Red Hat posters versus those from
elsewhere. A portion of the most
recent set of graphs can be seen at right.
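The kind of counting behind those graphs can be sketched briefly. What follows is not EKG's code (EKG is written in Ruby); it is a minimal Python illustration that assumes the input is a list of `From:` header values and that the "inside" domain is `redhat.com`. The names `INSIDE_DOMAINS`, `sender_domain`, and `tally_posts`, along with the sample addresses, are hypothetical.

```python
from collections import Counter
from email.utils import parseaddr

# Assumption: posters from these domains count as "inside" (Red Hat).
INSIDE_DOMAINS = {"redhat.com"}


def sender_domain(from_header):
    """Return the lowercased domain of a From: header, or None if there is none."""
    _, address = parseaddr(from_header)
    if "@" not in address:
        return None
    return address.rsplit("@", 1)[1].lower()


def tally_posts(from_headers):
    """Count posts from inside and outside domains."""
    counts = Counter()
    for header in from_headers:
        domain = sender_domain(header)
        if domain is None:
            continue  # skip headers without a usable address
        counts["inside" if domain in INSIDE_DOMAINS else "outside"] += 1
    return counts


# Made-up sample data, for illustration only.
sample = [
    "Alice Example <alice@redhat.com>",
    "bob@gmail.com",
    "Carol <carol@example.org>",
    "Dave <dave@REDHAT.COM>",
]
totals = tally_posts(sample)
```

The two totals are what a pie chart for one list would show; a real tool would also need to decide how to treat subdomains and people who post from more than one address.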
Red Hat's Michael DeHaan has taken on development of EKG to use as a tool
to measure how
well various projects are building a community separate from Red
Hat. There are lots of free software projects that have been released by
Red Hat—or Fedora, which often amounts to the same thing—but
may or may not be seen as useful tools outside of Fedora. By looking at
the mailing list traffic, particularly over time, some idea of which
projects are building a community, and which aren't, can be derived. As
the project page puts it:
The premise is simple... what are the demographics behind open source
projects that we run in Fedora?
- Who posts
- Who contributes
- What projects are most active?
- What projects need a little help?
Mailing lists are just one measure of the health of a project, of course,
so DeHaan is looking at other metrics. Commits to the project
repository, along with the identities of the committers, would seem
an obvious choice. Better graphs with more useful information on each axis,
as well as time series of the pie charts, are also on the "to do" list.
He is also looking at derived statistics that will allow direct comparison
of different projects by using equations that in some way model success.
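A time series of the inside/outside split could be computed along the following lines. This is a rough Python sketch rather than anything taken from EKG; it assumes each post has already been reduced to a (date, sender domain) pair, and the function name `monthly_inside_share` and the sample data are hypothetical.

```python
from collections import defaultdict
from datetime import date


def monthly_inside_share(posts, inside_domains=frozenset({"redhat.com"})):
    """Map 'YYYY-MM' to the fraction of that month's posts from inside domains."""
    inside = defaultdict(int)
    total = defaultdict(int)
    for posted_on, domain in posts:
        month = posted_on.strftime("%Y-%m")
        total[month] += 1
        if domain.lower() in inside_domains:
            inside[month] += 1
    # Only months with at least one post appear, so the division is safe.
    return {month: inside[month] / total[month] for month in sorted(total)}


# Made-up sample data, for illustration only.
posts = [
    (date(2008, 8, 3), "redhat.com"),
    (date(2008, 8, 17), "redhat.com"),
    (date(2008, 9, 2), "redhat.com"),
    (date(2008, 9, 20), "gmail.com"),
]
shares = monthly_inside_share(posts)
```

A falling inside share over successive months would suggest that a list is attracting posters from outside the sponsoring company, though post counts alone say nothing about the quality of that participation.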
It is difficult to draw any conclusions from the limited graphs that are
currently available. One thing that does stand out, though, is the
popularity of gmail.com email addresses, which seem to account
for around one-quarter of posts. One can also certainly see projects that
are completely dominated by "inside" (i.e. Red Hat) folks. The JBoss lists
are a good example.
Projects are trying various ways to measure how well they are doing their
job; EKG is another way to do that. For the kernel, LWN gathers statistics
on each release, and the Linux Foundation does so over longer
periods. Ubuntu has its Upstream Report, which looks at
how well bugs are getting to upstream bug trackers. Undoubtedly other
projects have their own ways of trying to measure their impact.
As yet, there is no mailing list for EKG development. We look forward to
the day when EKG is applied to its own development list. It would seem
that some kind of "metahealth" measurement of the community
could be derived from that data.