Fedora harnesses the power of idle computers with Nightlife
Bryan Che, a member of the product management team at Red Hat, recently introduced Fedora Nightlife, a project he hopes will motivate people to donate their computer's downtime to processing data for scientific research and other socially beneficial work. The heavy lifting will be done by the University of Wisconsin-Madison's Condor workload management system, which will handle the scheduling and logistics of donated computer power. In the end, Che hopes to build a network of more than a million Fedora nodes to help process data for everything from Web-indexing projects to medical research.
"[W]e have begun talking with the guys over at Wikia about helping them index the Web for their open source search engine," says Che. "It would be great if we could help with tasks for the Fedora infrastructure team at some point with things like automated builds or tests. There is a lot of scientific research that requires lots of computing power, and there are lots of students who could use access to a grid for research. I'd love to have all sorts of projects like these participate."
Che says that the scope and type of projects that join will largely be dictated by the community, and he's hoping to draw on its collective expertise to "shape Nightlife into a useful community service." His end goal, however, isn't just to make computer resources available but to also develop a basis for larger infrastructure projects. Che notes, "For example, much of the high performance computing (HPC) jobs these days are done on Linux — and particularly Fedora or Red Hat. This puts us in a prime position to be able to shape and build out an entire open source stack for research computing on grids. Today, many people depend upon proprietary (and often costly) libraries for their scientific research or even enterprise computing. Nightlife will provide us a great forum to engage these users to see what are their needs and provide them with a fully open source solution that they can use for their valuable research."
Naturally, security is of primary importance when individual computers are clustered together or outside data is inserted into a system for processing. Che says the Nightlife team takes security very seriously and has a number of measures in place to protect users' computers and ensure the application code is safe as well.
"[W]e will require that projects that want to leverage Nightlife must distribute their packages and source code through Fedora," explains Che. "This will allow us to inspect what the applications are doing and make sure there isn't anything malicious. On the execution side, one of the capabilities that we've added to Condor recently is integration with our libvirt virtualization technology. This will enable people to execute Nightlife jobs entirely within a virtual machine bubble that is shielded from their physical computers.
"We are also looking at taking advantage of SELinux technology, which we've developed with the NSA, as a mechanism for tightly locking down jobs so that they can only perform tasks for which they are explicitly granted permission."
Che is quick to point out that although Fedora has committed plenty of resources to Nightlife, it is not Fedora-specific; indeed, it's not even Linux-specific. Since Condor supports executing processes on many different platforms, machines running Mac OS, Windows, Unix, and Linux distributions of any flavor are capable of donating resources. Some features will be unavailable on non-Linux platforms that lack the underlying technologies, however. For instance, Windows lacks a built-in hypervisor for running virtual environments and doesn't support SELinux for lock-downs.
"I would welcome anyone to donate spare capacity to Nightlife [and] I'd hope that people from all sorts of platforms join us," encourages Che. "[T]here isn't any reason why other communities couldn't participate with us and even start adding some of these capabilities to a Nightlife client for their platforms. From a development standpoint, the upstream code lives in the Condor project at the University of Wisconsin. So, anyone can contribute at that project as well without having any involvement with Fedora."
When the project was announced last week, some community members were puzzled as to why Fedora chose to use Condor instead of BOINC, a similar project developed by the University of California, Berkeley. Che points out that, though the two efforts have a lot in common, each has an entirely different focus. He says BOINC's mission is "very much focused on enabling desktops/laptops to provide computing capacity as part of a larger grid [while] Condor is more general-purpose; it can take idle capacity and utilize it well, but it is primarily a good resource scheduler for dedicated grids."
While some people's comparisons of Condor and BOINC focus on the technology behind the projects, others see similarities between the BOINC and Nightlife projects themselves. In actuality, they are quite different. "Condor's client can use a BOINC client to process data as backfill (when there are no other jobs to run)," notes Che. "So, there is no need to view these projects as competitive. Indeed, one possibility is to use Nightlife to increase the number of machines participating in BOINC." Of course, a low barrier to entry is also important for widespread adoption of Nightlife. Since many enterprises and researchers already run Condor for their dedicated grids, Che says it was a logical choice for the project.
Dr. Keith Laidig can easily see the intrinsic value of Nightlife and how it will benefit the scientific community at large. He runs the computing infrastructure for the computational biophysics group in the Department of Biochemistry at the University of Washington, and regularly relies on outside computing power to crunch data for researchers. About four years ago, under the direction of Professor David Baker, the group created Robetta, an automated prediction server that farms out work to other systems via Condor, which has proven "quite successful at keeping the wait times [for research results] down to the range of 'months'."
Laidig recently told the Nightlife community, "If we had access to more computing power, even that available from modest periods of inactivity, we could put that power to work to address many pressing issues in bio-medical research such as HIV/AIDS vaccine design, improvement of existing drugs and/or design new drugs, and creation of new methods to harness biology to address issues such as carbon sequestration."
As Laidig explained to LWN, reducing the wait times for results to even a matter of weeks is not out of the question. "Given sufficient computing power, the processing time would drop even further. In principle, the processing could take a day or less — depending on computing power, queue depth, etc."
Laidig says it's hard to estimate just how much donated computer access his lab would need in order to see an appreciable improvement in research turnaround time, but he estimates they currently use around 300 to 400 processors running around the clock to maintain the current work flow. "Should we gain, say, 1,500 machines that could work for 8 hours... we'd be matching that, taking into account overhead. Now, I'd like to increase that by a factor of ten or more."
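Laidig's comparison can be checked with a quick back-of-the-envelope calculation. The sketch below is only an illustration, not anything from the Nightlife project: it assumes that each donated machine contributes one processor, that "around the clock" means 24 hours a day, and that the 8 hours of donated time occur once per day. The function names are hypothetical.

```python
# Rough check of Laidig's figures. Assumptions (not from the article):
# one processor per donated machine, 24-hour days, 8 donated hours per day.
HOURS_PER_DAY = 24


def processor_hours_per_day(processors: int, hours_active: int) -> int:
    """Total processor-hours delivered in one day."""
    return processors * hours_active


def tolerable_overhead(raw_hours: int, target_hours: int) -> float:
    """Fraction of donated time that could be lost to overhead
    (scheduling, data transfer, interrupted jobs) while still
    delivering target_hours of useful work."""
    return (raw_hours - target_hours) / raw_hours


# The lab's current dedicated capacity: 300 to 400 processors, all day.
dedicated_low = processor_hours_per_day(300, HOURS_PER_DAY)
dedicated_high = processor_hours_per_day(400, HOURS_PER_DAY)

# The hypothetical donation: 1,500 machines for 8 hours each.
donated_raw = processor_hours_per_day(1500, 8)

print(f"dedicated: {dedicated_low} to {dedicated_high} processor-hours/day")
print(f"donated (before overhead): {donated_raw} processor-hours/day")
print(f"overhead budget: {tolerable_overhead(donated_raw, dedicated_high):.0%}"
      f" to {tolerable_overhead(donated_raw, dedicated_low):.0%}")
```

Under these assumptions the donated pool supplies 12,000 raw processor-hours per day against 7,200 to 9,600 for the dedicated processors, so it would still match the current capacity even if a fifth to two-fifths of the donated time were lost to overhead.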
Though he would be happy to see Nightlife flourish, Laidig notes there are some things to consider before committing your computer's resources to the project. "Not to throw a wet blanket on things, but [there are] issues that folks should keep in mind. Their gear would be using electricity and generating heat. There are also network bandwidth considerations as well — some data-sets necessary to undertake distributed work can be sizable (100 MBs) which can soak up resources. There's the local disk space usage, too.
"Folks should be made aware of the 'costs' of contributing. Then, should their desire to contribute outweigh the costs, they should join up!"
Some community members have indeed expressed concerns about the energy consumption associated with idling computers and suggest that the ecological harm of running the CPUs and fans of an unattended machine outweighs the benefit of charity in the name of science. In response to an animated discussion about Nightlife at Slashdot, one enterprising commenter measured how much energy his idle computer uses and found that it cost upwards of $70 per year. Che responded to the criticism by acknowledging that although cycle harvesting can be viewed as a "waste of energy," it can, in fact, save energy in the long run. The energy needed to process the data would be spent somewhere anyway, and Nightlife distributes that consumption over a wide geographical area, thereby reducing the energy burden on any single data center or location.
Future plans for Nightlife include making it a first-boot option for Fedora so when a user does a fresh install, they are prompted to donate computer power to the project. Of course, before Che can attain his million-node goal, there are several smaller goals to accomplish along the way. "At the earliest, we wouldn't be able to start reaching numbers at this level until after Fedora 10 — and that's probably pushing it."
| Index entries for this article | |
|---|---|
| GuestArticles | Hoover, Lisa |
