
The state of the OSU Open Source Lab

By Jake Edge
March 26, 2019

SCALE

The Oregon State University Open Source Lab (OSU OSL) has been a longtime hosting site for a wide variety of free and open-source software (FOSS) projects. At SCALE 17x, OSL director Lance Albertson gave an overview of what the lab does, some of its history, and its role in mentoring undergraduates at OSU. There are a lot of facets to the lab and its work, most of which flies under the radar, which is why Albertson came to Pasadena, CA to fill attendees in.

Background

OSL acts as a FOSS hosting company, providing free or low-cost hosting to a variety of projects. It offers colocation or virtual machines (VMs) in a private cloud. It can also provide access to a wide array of different CPU architectures. Beyond that, the lab is a distribution and mirroring site for multiple projects.

Something that is not as well known, he said, is that OSL mentors undergraduates. This allows them to gain real-world experience with production systems. A number of lab alumni, including the founders of CoreOS, have moved into high-profile jobs in the industry. The lab has one full-time employee, Albertson, and typically six to ten undergraduates.

[Lance Albertson]

It was started in 2003 by Scott Kveton and Jason McKerr, who worked for the OSU information services department. "The cloud did not exist" back then, so the lab offered colocation hosting for FOSS projects. Three early projects were Gentoo, Debian, and Freenode. In those early days, the existence of the lab spread by word of mouth among the projects, eventually attracting kernel.org, the Apache Software Foundation, Drupal, and the Linux Foundation.

Its initial funding came from OSU, based on the cost savings the university realized by switching to FOSS. Google and RealNetworks were early sponsors. OSL moved to the college of engineering in 2013. Its ongoing funding model is to get corporate donations; IBM, Google, and Facebook are big donors. It also has hosting contracts with the Linux Foundation, Drupal, and the Open Source Robotics Foundation. Other companies donate hardware or bandwidth and there are individual donors as well. At this point, OSL gets no direct funding from OSU or the state of Oregon, which makes fundraising a yearly challenge.

The role of the lab is to be a neutral hosting facility and to foster relationships between FOSS projects and companies, Albertson said. It provides a stable, physical home for core FOSS projects that is flexible to the needs of each project. It gives access to less-common hardware and CPU architectures, including OpenPOWER and, soon, RISC-V, along with compute and storage resources, such as software mirroring and continuous integration and deployment (CI/CD). The lab also helps projects with their systems engineering needs and helps train the next generation of open-source leaders.

He put up a list of new projects (which can be seen in his PDF slides or the YouTube video of the talk), which showed around a dozen new projects for general hosting and double that for OpenPOWER hosting. The list of current projects was an eye chart over two slides, totaling up to almost 200 projects. That list does not include subprojects and several of the listed projects (e.g. the Linux Foundation and Apache Software Foundation) have lots of subprojects.

Students

Many of the alumni from the lab have landed in prominent positions in the industry, including at CoreOS (as mentioned earlier), the Linux Foundation, Microsoft, Amazon, Apple, Tesla, Red Hat, and more. Students at the lab interact with open-source projects on a daily basis; the lab runs a help desk that is staffed by the students, so they handle requests via email, IRC, Slack, and so on. OSL is a "Chef shop", he said, so students spend a lot of time creating new cookbooks and maintaining existing ones.

One of the more important pieces is that students get hands-on experience with hardware at the lab. It is relatively difficult to get that kind of experience these days, since many companies are using public clouds instead of their own data centers. The students get to learn about the quirks of installing (and retiring) real hardware, which is valuable. In addition, students handle all of the support tickets for OSL for a week on a rotating basis. This ensures they get wide experience with all of the different systems in the lab.

The hiring process consists of an open-book quiz, with basic questions about Linux; there are simple Bash and Chef exercises as well. After that is an in-person interview with both technical and non-technical questions. Applicants are not expected to necessarily know the answers; how they think about the problem will help assess their problem-solving abilities.

Once a student has been hired, there is an onboarding process that includes a walkthrough guide and some Chef training. Students start working with test cookbooks and work up to changes to the production cookbooks, which involve pull requests with review from more senior students. After two to three months, students get added into the support-ticket rotation.
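The weekly support-ticket rotation can be pictured with a short sketch. Everything here is an assumption for illustration: the student names, the hire dates, and the 90-day onboarding cutoff (the talk only says "two to three months"). It is not a description of how OSL actually schedules the rotation.

```python
from datetime import date, timedelta
from itertools import cycle

# Hypothetical roster: student name -> hire date.
ROSTER = {
    "alex": date(2018, 6, 18),
    "blake": date(2018, 9, 24),
    "casey": date(2019, 2, 4),
}

# Assumed cutoff; the talk only says "two to three months".
ONBOARDING = timedelta(days=90)


def eligible(roster, on_day):
    """Students who finished onboarding by on_day, in hire order."""
    by_hire_date = sorted(roster.items(), key=lambda item: item[1])
    return [name for name, hired in by_hire_date
            if hired + ONBOARDING <= on_day]


def schedule(roster, first_monday, weeks):
    """Assign one eligible student to each week, cycling through the pool.

    Eligibility is evaluated once, at the start of the schedule; a real
    rotation would presumably re-check it as new students finish onboarding.
    """
    pool = cycle(eligible(roster, first_monday))
    return [(first_monday + timedelta(weeks=i), next(pool))
            for i in range(weeks)]


plan = schedule(ROSTER, date(2019, 4, 1), 4)
for week_start, student in plan:
    print(week_start.isoformat(), student)
```

With this made-up roster, the most recent hire is still in onboarding on the start date, so the four weeks alternate between the two longer-serving students.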

One of the challenges is that "summer is coming". That means some students will graduate, some will get internships or other opportunities, and some can work full time at the lab. Students with internships may come back to work for another year or two at OSL or they may get part-time remote work with the company where they interned. It is something that he has to juggle each year; summer is when some of the larger projects get tackled, because students who are not taking classes can work full time.

Platforms

He then went into all of the various platforms that OSL handles. The current and new systems in the lab are running CentOS 6 or 7 for servers and Debian 8 ("jessie") or 9 ("stretch") for staff workstations. They are all managed by multi-platform Chef cookbooks, which have both unit and integration tests. There is also a pile of legacy systems that are CentOS 6 or Gentoo Linux managed by CFEngine.

The lab does not have a hardware budget, he said; instead it relies on in-kind donations. In 2012, Intel donated a bunch of servers that had been hosting MeeGo, for example. EMC donated hardware in 2016, as did Facebook, while Hudson Trading donated pallets of 10Gb switches in 2018. The lab has a wish list, which includes 1U/2U compute and storage nodes, large (>3TB) SATA hard drives and SSDs, 40Gb end-row switches, and 1Gb top-of-rack switches.

The core services that OSL provides to FOSS projects include mailing lists, email forwarding, DNS, and web application hosting. It also provides systems engineering consulting to the projects. Projects can either have managed or unmanaged hosting. The managed projects have systems that are kept up to date, with services configured and managed by OSL students, all via Chef. On the other end, unmanaged projects get a VM and have to manage all aspects of the system; the lab only requires an account with sudo privileges for troubleshooting and emergencies.

For software mirroring, there is a three-server cluster, with hosts in Corvallis at OSU, Chicago, and New York. Round-robin DNS spreads the load across the cluster, which handles an average of 1.7Gbps across the three nodes. It can store up to 15TB, of which 12TB are currently being used for more than 100 projects. The hardware is overpowered for what is needed, but came from a donation by IBM: POWER8 systems with 256GB of RAM.
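As a rough illustration of how round-robin DNS spreads requests, here is a toy model in which the server rotates its list of address records on every query and each client simply uses the first one returned. The addresses are placeholders from the RFC 5737 documentation ranges, not OSL's real mirror hosts, and real resolvers and caches make the distribution less tidy than this. The last two lines just restate the traffic and storage figures from the talk as ratios.

```python
from collections import Counter, deque

# Hypothetical mirror addresses (RFC 5737 documentation ranges), one per site.
MIRRORS = {
    "corvallis": "192.0.2.10",
    "chicago": "198.51.100.10",
    "new-york": "203.0.113.10",
}


class RoundRobinZone:
    """Toy model of a DNS server that rotates its A records on every query."""

    def __init__(self, addresses):
        self._records = deque(addresses)

    def resolve(self):
        # Return the current ordering, then rotate so that the next
        # query sees a different address first.
        answer = list(self._records)
        self._records.rotate(-1)
        return answer


def simulate(queries):
    """Count which mirror each client hits when it uses the first record."""
    zone = RoundRobinZone(MIRRORS.values())
    hits = Counter()
    for _ in range(queries):
        hits[zone.resolve()[0]] += 1
    return hits


hits = simulate(300)
per_node_gbps = 1.7 / len(MIRRORS)  # the quoted average, split evenly
used_fraction = 12 / 15             # 12TB used out of 15TB of capacity
```

In this idealized model, each of the three mirrors receives exactly one third of the queries. An even split of the quoted 1.7Gbps average would be a bit under 0.6Gbps per node, and the mirror storage is 80% full.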

There are more than 300 colocated hosts for various projects; certain projects (e.g. Gentoo, Linux Foundation, Apache Software Foundation) have their own project racks. These are all in a data center that is shared with OSU. Of the 70 racks in that data center, OSL uses 32, though not all of them are full.

The lab has recently "dived into Ceph", Albertson said. It has built two storage clusters using Ceph: a five-node cluster for OpenPOWER OpenStack and an eight-node cluster for x86 OpenStack and other OSL services. The primary use for both is as block storage for OpenStack. There are future plans to investigate object storage for Ceph.

OSL runs two different cloud platforms: OpenStack and Ganeti. The primary platform is Ganeti, which came out of Google, though the company has since moved away from it. Ganeti is stable, easy to maintain, and came about before OpenStack even existed; OSL has been using it since 2009. But Ganeti has no public API, so it is poor at providing self-service.

OpenStack, on the other hand, has gotten really stable but it is difficult to maintain, he said. OSL started deploying it in 2013. OpenStack does have a public API and is "really good for self-service". It started as a PowerPC little-endian (i.e. ppc64le) cluster, but there is now an x86 cluster available as well. It is used by multiple projects including the Linux Foundation, GCC, and the GNOME project.

The main production cluster runs Ganeti and provides VMs for multiple projects. There are also project clusters that are managed by the lab, such as for the Python Software Foundation and CiviCRM.

Collaboration with IBM

OSL has collaborated with IBM on a variety of projects over the last ten years. One way or another, OSL has had some hardware available for Power ISA builds. That really started to pick up after the release of POWER8, which resulted in an OpenStack cluster of five POWER8 systems (with around 225 VMs) and three POWER9 systems (with around 22 VMs). Over 100 projects are using the cluster; many of the ppc64 and ppc64le binaries that attendees likely use were built on this cluster, he said. That hardware has either been donated or loaned by IBM. There are also bare-metal POWER machines for the GCC compile farm, Debian, and FreeBSD.

There is a collaboration between OSL and the OSU Center for Genome Research and Biocomputing (CGRB) to provide GPU access on OpenPOWER systems to FOSS projects. The hardware is managed by CGRB and projects can access it via Son of Grid Engine, which is a kind of batch scheduler for high-performance computing.

OSL has created a Jenkins portal that will allow projects to submit CI jobs to the GPU hardware. Albertson is working on using OpenStack Zun to provide a container facility for shell access to the GPUs. It is not feasible to share GPUs via VMs due to limitations with PCI passthrough. Right now, he is using older GPUs provided by CGRB, but if the project is successful, IBM will consider making some more advanced hardware available.

Beyond that, there are two LPARs on an IBM ZSeries (s390x) system at Marist College in New York that can be used by projects. OSL has a Jenkins portal and Docker images can be submitted to be run. There is some AIX hardware that OSL is hosting, but does not manage. Select projects are given access to those systems for building and testing on AIX.

Recent work

The lab manages 130 or so hosts with Chef. Over the last year, OSL moved to Chef 13 and will start to migrate to Chef 14 soon. There is an ongoing move from CFEngine to Chef that started in 2013; he believes it will finally be done this year, as a lot of progress was made in 2018.

There is a compile farm based on Open Compute Project (OCP) hardware that was donated by Facebook. There are 90 compute nodes, each with 140GB of RAM, 3TB of disk, and a 10Gb NIC. The GCC compile farm was the first project to start using these nodes, but now there are multiple projects using them, including Debian and Fedora RISC-V builds using QEMU. 59 of the 90 nodes have been allocated at this point.
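To put the compile-farm figures in aggregate terms, the per-node numbers quoted in the talk can be multiplied out. This is only arithmetic on those figures; the function and field names are made up for the example.

```python
# Figures quoted in the talk for the OCP-based compile farm.
NODES = 90
ALLOCATED = 59
RAM_GB_PER_NODE = 140
DISK_TB_PER_NODE = 3


def farm_summary(nodes, allocated, ram_gb, disk_tb):
    """Aggregate per-node figures into farm-wide totals."""
    return {
        "free_nodes": nodes - allocated,
        "allocated_pct": round(100 * allocated / nodes, 1),
        "total_ram_gb": nodes * ram_gb,
        "total_disk_tb": nodes * disk_tb,
    }


summary = farm_summary(NODES, ALLOCATED, RAM_GB_PER_NODE, DISK_TB_PER_NODE)
```

That works out to 31 unallocated nodes, with 12,600GB of RAM and 270TB of disk across the whole farm.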

That hardware actually sat idle for a few years because there were logistical problems installing the seven-foot-tall racks, which have special power requirements. The OSU data center is on the second floor, but the elevators were not tall enough to transport the racks. That was solved when OSL moved into a new building (for other reasons) and repurposed a ground-floor room as a mini-data-center for the OCP racks. He is still seeking around $150K in donations for cooling upgrades.

Albertson closed his talk with a laundry list of goals for the coming year. That included upgrades of various tools like OpenStack, Ceph, and Chef. Beyond that, there is a need to start replacing the aging OSL core network. CentOS 8 is on the horizon, so OSL needs to get ready to migrate systems to that, perhaps eliminating CentOS 6 along the way. An OpenStack cluster of Arm systems may be in the offing as well; an Arm startup has approached the lab to replicate what it has done with IBM, but with Arm systems.

The talk provided a detailed look at the innards of a longtime FOSS ecosystem player. OSL provides a lot of compute (and human) power throughout the FOSS world, but it is rarely seen or heard from. Albertson helped bridge that gap; one can only hope that OSL continues to prosper for a long time to come.

[I would like to thank LWN's travel sponsor, the Linux Foundation, for travel assistance to Pasadena for SCALE.]


The state of the OSU Open Source Lab

Posted Mar 27, 2019 8:04 UTC (Wed) by quozl (guest, #18798) [Link]

Recently I was contacted by one of the OSL team to help with archiving and decommissioning a forum instance for One Laptop per Child that had lost a DNS CNAME years ago. We had forgotten about it. It had apparently just sat there waiting for requests that never came. I was really impressed with the professionalism and care with which the ticket was handled. Well done!

The state of the OSU Open Source Lab

Posted Mar 28, 2019 2:49 UTC (Thu) by taggart (subscriber, #13801) [Link]

Great to hear what the folks at OSL are up to, cool stuff! IMO they are an example of what _all_ public universities should be doing to both train students for the real world and fulfill their service charter. If only it was as easy to fork their organization as it is a repo in git! :)

The state of the OSU Open Source Lab

Posted Mar 28, 2019 12:32 UTC (Thu) by error27 (subscriber, #8346) [Link]

Back in the day when my employer sold rackmounted systems we always had to ask about the door heights and elevators before we sold a system. They come with wheels on the bottom but they weigh the same as a small car when they're fully loaded.

At a trade show one of our customers said that they received a rack that was too tall for the door. It's not that hard to remove all the computers and then lean it over and carry it through. But instead of that they decided that they weren't ever going to need the top shelves of the rack so they took a chainsaw and chopped it to the right height.

This is a customer, so of course you congratulate them on their resourcefulness, but inside you're weeping.


Copyright © 2019, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds