
FOSDEM: Infrastructure as an open source project

February 8, 2012

This article was contributed by Koen Vervloesem

The open source development model has many interesting properties, so it's not surprising that it has also been applied in domains other than software. In his talk at FOSDEM (Free and Open Source Software Developers' European Meeting) 2012 in Brussels, Ryan Lane explained how the Wikimedia Foundation is treating its infrastructure as an open source project, which enables the community to help run the Wikimedia web sites, including the popular Wikipedia.

Ryan Lane is an Operations Engineer at the Wikimedia Foundation and the Project Lead of Wikimedia Labs, a project aimed at improving the involvement of volunteers in operations and software development for Wikimedia projects. These projects, like Wikipedia, Wikibooks, and Wikimedia Commons, are well-known because of their large community of volunteers contributing content. Moreover, MediaWiki, the wiki software originally developed for Wikipedia and now also used in many other wikis, is an open source project.

In the early days, Wikimedia volunteers had a say not only on content and software, but also on infrastructure. There was no operations staff; the server infrastructure was managed entirely by volunteers. Since then, however, operations has been professionalized, and now it's all done by staff. Ryan's message in his talk was: "We want to change this, because operations is currently a bottleneck: it doesn't scale as well as software. That's why we had the idea to re-open our infrastructure to volunteers." But how do you give volunteers access to an infrastructure?

Puppet repositories

Wikimedia has already shared a lot of knowledge about its infrastructure on wikitech. This public wiki describes their network and server infrastructure in detail, including the open source software they use, such as Ubuntu, Apache, Squid, PowerDNS, Memcached, MySQL, and the configuration management tool Puppet to maintain a consistent configuration for all their servers.

Ryan's approach to opening up Wikimedia's infrastructure even more was twofold. First, Wikimedia's system administrators spent a few weeks cleaning up Wikimedia's Puppet configuration. After stripping all private and sensitive information, they published the Puppet files in a public Git repository. The sensitive material was moved to a private repository that is only available to Wikimedia staff and volunteers with root access.

But Ryan wanted more than just sharing knowledge about how Wikimedia manages its servers (the information in wikitech and the public Puppet repository): he wanted to treat operations as a real open source project where community members could edit Wikimedia's server architecture just like they did with Wikimedia's content and software. So he had to build a self-sustaining operations community around Wikimedia. For this to happen without sacrificing the reliability of Wikimedia's servers, a group of volunteers created a clone of the production cluster, which is mostly set up now. Thanks to this, staff and community operations engineers can push their changes to a test branch of the Puppet repository to try out new things on the cloned cluster. After a code review of the changes by the staff operations engineers, the code is evaluated by a test suite. If the code passes the tests, the changes are pushed to the production branch of the Puppet repository and hence the production systems are managed by the new Puppet configuration.
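The path a change takes from the test branch to production can be sketched in a few lines of Python. This is only an illustration of the gating order described above (test branch, then staff review, then test suite, then production); the names `Change`, `Repository`, `push_to_test`, and `promote` are hypothetical and are not taken from Wikimedia's actual tooling.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Change:
    """A proposed change to the Puppet repository (hypothetical model)."""
    author: str
    description: str
    approved_by_staff: bool = False


@dataclass
class Repository:
    """Tracks which changes sit on the test and production branches."""
    test_branch: List[Change] = field(default_factory=list)
    production_branch: List[Change] = field(default_factory=list)


def push_to_test(repo: Repository, change: Change) -> None:
    # Staff and community engineers alike may push to the test branch.
    repo.test_branch.append(change)


def promote(repo: Repository, change: Change,
            test_suite: Callable[[Change], bool]) -> bool:
    """Move a change to production only if it was reviewed and passes tests."""
    if change not in repo.test_branch:
        return False  # it must have been tried on the cloned cluster first
    if not change.approved_by_staff:
        return False  # staff code review comes before the test suite
    if not test_suite(change):
        return False  # a failing test suite blocks the change
    repo.production_branch.append(change)
    return True
```

In this sketch a change that has not been reviewed, or that fails the supplied `test_suite` callable, never reaches `production_branch`, which mirrors the order of checks in the talk.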

Wikimedia Labs is using OpenStack as a private cloud to run their server instances (virtual machines). At the moment, there are 83 instances running in the test cluster, managed by various Puppet classes, including base (for the configuration that applies to every server instance), exim::simple-mail-sender (for every server that has to send email), nfs::server (for an NFS server), misc::apache2 (for a web server), and so on.
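As a rough illustration of how such classes combine, the following Python sketch computes the class list for an instance: `base` always applies, and further classes follow from the roles an instance plays. The class names come from the list above, but the role names and the role-to-class mapping are invented for this example and do not reflect Wikimedia's real node definitions.

```python
from typing import Dict, List

# Class names are taken from the article; the mapping itself is an assumption.
BASE_CLASS = "base"
ROLE_CLASSES: Dict[str, List[str]] = {
    "mail": ["exim::simple-mail-sender"],
    "nfs": ["nfs::server"],
    "web": ["misc::apache2", "exim::simple-mail-sender"],
}


def classes_for(roles: List[str]) -> List[str]:
    """Return the Puppet classes for an instance: base first, then role classes."""
    classes = [BASE_CLASS]
    for role in roles:
        for cls in ROLE_CLASSES.get(role, []):
            if cls not in classes:  # avoid listing a class twice
                classes.append(cls)
    return classes


if __name__ == "__main__":
    print(classes_for(["web", "nfs"]))
```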

Managing projects

There are also 47 projects defined in the Wikimedia Labs project, each of them implementing a specific task such as adding a new feature, adding monitoring, or puppetizing infrastructure that has been set up manually in the past. For instance, there are projects for bots, the continuous integration tool Jenkins, Nginx, Search, Deployment-prep which implements the clone of the production infrastructure, and so on. Each project has a project page on the wiki with documentation, the group members, and other information.

The interesting thing about these project pages is that most of the information is automatically generated. For example, when a server instance is running for a project, this instance is automatically shown at the bottom of the wiki page. And when someone types the command !log <project> <message> on the #wikimedia-labs IRC channel, the message is automatically logged on the project page under the heading "Server Admin Log", which is subdivided by day. That way, volunteer server administrators can explain what they did, so other volunteers, who may be living in a different timezone on the other side of the world, can follow what is happening in the project.
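A minimal Python sketch of that logging mechanism might look like the following. Only the command syntax `!log <project> <message>` and the per-day grouping come from the talk; the `AdminLog` class, the entry format, and the in-memory storage are assumptions for illustration, and the real bot presumably writes to the wiki instead.

```python
import re
from collections import defaultdict
from datetime import datetime
from typing import Dict, List, Optional, Tuple

# "!log", a project name without spaces, then the free-form message.
LOG_COMMAND = re.compile(r"^!log\s+(\S+)\s+(.+)$")


def parse_log_command(line: str) -> Optional[Tuple[str, str]]:
    """Return (project, message) for a !log line, or None for anything else."""
    match = LOG_COMMAND.match(line.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)


class AdminLog:
    """Keeps log entries per project, subdivided by day (hypothetical store)."""

    def __init__(self) -> None:
        # project -> ISO date -> list of "HH:MM nick: message" entries
        self.entries: Dict[str, Dict[str, List[str]]] = defaultdict(
            lambda: defaultdict(list)
        )

    def handle(self, nick: str, line: str, when: datetime) -> bool:
        """Record the line if it is a !log command; report whether it was."""
        parsed = parse_log_command(line)
        if parsed is None:
            return False
        project, message = parsed
        day = when.strftime("%Y-%m-%d")
        self.entries[project][day].append(f"{when:%H:%M} {nick}: {message}")
        return True
```

Grouping by project first and by date second matches the layout described above, where each project page carries its own log with one subsection per day.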

The power of the community

So now that anyone has been able to push changes from ideas to production on Wikimedia's cluster for a couple of months, what are the results? According to Ryan, there are 105 users now in the Wikimedia Labs project who have contributed a variety of Puppet configurations:

One volunteer puppetized our existing Nagios monitoring setup (which was not managed by Puppet) in a very neat way. The bot infrastructure has also been improved much by volunteers. And at the San Francisco hackathon in January 2012 we had a project created, implemented, tested, and deployed to production during the hackathon. We have a custom UDP logging module written for nginx, and it had a couple of bugs in the format. Abe Music built an instance, installed our nginx source package, added the change to fix the formatting, then pushed them up for review. We reviewed the change, then pushed it to production. All of this happened during the hackathon.

So has this ambitious experiment been successful? According to Ryan, the original goal to lessen the bottleneck of the operations team definitely succeeded. However, he points out that the bottleneck has shifted: "We have to do these code reviews now, but fortunately it takes less time to review code than it does to make a lot of changes." Another issue Ryan sees is trust: "Giving out root to volunteers is dangerous, so we have to audit our infrastructure often. Moreover, there's always the danger of social engineering: newcomers can try to build trust to have us give them sensitive information about our infrastructure." But luckily the staff can count on a core of community people whom they trust to do these code reviews and audits.

All in all, Ryan thinks that the model Wikimedia Labs uses can also be applied in other organizations to set up a volunteer-driven infrastructure. In particular, non-profits or software development projects that rely on a big infrastructure could profit from treating operations as an open source project. In addition to tapping into the technical talent in the community, opening operations is also a great way to identify skilled and passionate people to hire for a staff position.



FOSDEM: Infrastructure as an open source project

Posted Feb 9, 2012 2:19 UTC (Thu) by nirik (subscriber, #71)

Fedora Infrastructure has been doing this for years. ;)

Our puppet repo is not (yet) public, but it's on my list to make it so.

http://fedoraproject.org/wiki/Infrastructure/GettingStarted

Also, I can't resist a plug: I'm going to be talking about Fedora Infrastructure at tomorrow night's Boulder Linux Users Group meeting. ;) http://lug.boulder.co.us/ for more info.

FOSDEM: Infrastructure as an open source project

Posted Feb 9, 2012 9:04 UTC (Thu) by misc (subscriber, #73730)

Mageia also have been doing this from the start ( ie 1.5 years, but that's more impressive to say "from the start" ) and the code is public ( since we just started at the right moment, technology wise, we were able to separate password from core ).

http://svnweb.mageia.org/adm/puppet/

We even took the time ( even if not perfect ) to split the modules into 2 groups, the reusable ones and the specific ones.

AFAIK, Debian also uses puppet, and publishes part of the configuration ( likely for the same reason as everybody else, the need to audit before publishing, etc ).

FOSDEM: Infrastructure as an open source project

Posted Feb 9, 2012 20:52 UTC (Thu) by sciurus (subscriber, #58832)

I attended a Puppet conference last month, and the Puppet Labs employees were practically begging the attendees to publish modules to http://forge.puppetlabs.com/ rather than just sticking them on github.

FOSDEM: Infrastructure as an open source project

Posted Feb 10, 2012 4:16 UTC (Fri) by corvus (subscriber, #82)

This was a great article with a lot of useful links, and I'm really glad Wikimedia is doing this!

Ryan and Roan have been very visible in the OpenStack community. We're doing similar things with code review and testing using Gerrit and Jenkins, and I'm excited that we have a lot of direct cross-pollination going on there right now. Thanks!

We're running into more and more people who are doing infrastructure-as-a-project and it seems like we all want to share our puppet modules with each other. But when it gets down to it, we've all written fairly non-generalizable modules that work for us but aren't useful to others without a lot of work. It just seems to be something that's encouraged by puppet's syntax and the way people use it.

As another commenter alluded, perhaps if the module forge were full of robust easy to use and extend modules for many everyday tasks, things would be different. I think an ideal configuration management system would be designed around sharing from the start. Puppet keeps evolving, maybe we just need to figure out the best way to use it in projects like this.

This seems like an area with a lot of opportunity.

Here's our puppet repo: https://github.com/openstack/openstack-ci-puppet

Copyright © 2012, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.