June 4, 2008
This article was contributed by Lisa Hoover
Bryan Che, a member of the product management team at Red Hat, recently
introduced Fedora
Nightlife, a project he hopes will motivate people to donate their
computer's downtime to processing data for scientific research and other
socially beneficial work. The heavy lifting will be done by the University
of Wisconsin-Madison's Condor
workload management system, which will handle the scheduling and
logistics of donated computer power. In the end, Che hopes to build a
network of more than a million Fedora nodes to help process data
for everything from Web-indexing projects to medical research.
"[W]e have begun talking with the guys over at Wikia about helping them index the Web
for their open source search engine," says Che. "It would be great if we
could help with tasks for the Fedora infrastructure team at some point with
things like automated builds or tests. There is a lot of scientific
research that requires lots of computing power, and there are lots of
students who could use access to a grid for research. I'd love to have all
sorts of projects like these participate."
Che says that the scope and type of projects that join will largely be
dictated by the community, and he's hoping to draw on its collective
expertise to "shape Nightlife into a useful community service." His end
goal, however, isn't just to make computer resources available but also to
develop a basis for larger infrastructure projects. Che notes, "For
example, much of the high performance computing (HPC) jobs these days are
done on Linux — and particularly Fedora or Red Hat. This puts us in a
prime position to be able to shape and build out an entire open source
stack for research computing on grids. Today, many people depend upon
proprietary (and often costly) libraries for their scientific research or
even enterprise computing. Nightlife will provide us a great forum to
engage these users to see what are their needs and provide them with a
fully open source solution that they can use for their valuable
research."
Naturally, security is of primary importance when individual computers
are clustered together or outside data is inserted into a system for
processing. Che says the Nightlife team takes security very seriously and
has a number of measures in place to protect users' computers and ensure
the application code is safe as well.
"[W]e will require that projects that want to leverage Nightlife must
distribute their packages and source code through Fedora," explains
Che. "This will allow us to inspect what the applications are doing and
make sure there isn't anything malicious. On the execution side, one of the
capabilities that we've added to Condor recently is integration with our
libvirt virtualization technology. This will enable people to execute
Nightlife jobs entirely within a virtual machine bubble that is shielded
from their physical computers.
"We are also looking at taking advantage of SELinux technology, which we've
developed with the NSA, as a mechanism for
tightly locking down jobs so that they can only perform tasks for which
they are explicitly granted permission."
Che is quick to point out that although Fedora has committed plenty of
resources to Nightlife, it is not Fedora-specific — indeed it's not
even Linux-specific. Since Condor supports executing processes on many
different platforms, Mac OS, Windows, Unix, and Linux distributions of any
flavor are capable of donating resources. Not all features will be
available on non-Linux platforms, however, if they lack certain underlying
technologies. For instance, Windows lacks a built-in hypervisor for running
virtual environments and doesn't support SELinux for lock-downs.
"I would welcome anyone to donate spare capacity to Nightlife [and] I'd
hope that people from all sorts of platforms join us," encourages
Che. "[T]here isn't any reason why other communities couldn't participate
with us and even start adding some of these capabilities to a Nightlife
client for their platforms. From a development standpoint, the upstream
code lives in the Condor project at the University of Wisconsin. So, anyone
can contribute at that project as well without having any involvement with
Fedora."
When the project was announced last week, some community
members were puzzled as to why Fedora chose to use Condor instead of BOINC, a similar
project developed by the University of California, Berkeley. Che points out
that, though the two efforts have a lot in common, they each have an
entirely different focus. He says BOINC's mission is "very much focused on
enabling desktops/laptops to provide computing capacity as part of a larger
grid [while] Condor is more general-purpose; it can take idle capacity and
utilize it well, but it is primarily a good resource scheduler for
dedicated grids."
While some people's comparisons of Condor and BOINC focus on the
technology behind the projects, others see similarities between the BOINC
and Nightlife projects themselves. In actuality, they are really quite
different. "Condor's client can use a BOINC client to process data as
backfill (when there are no other jobs to run)," notes Che. "So, there is
no need to view these projects as competitive. Indeed, one possibility is
to use Nightlife to increase the number of machines participating in
BOINC." Of course, a low barrier to entry is also important for widespread
adoption of Nightlife. Since many enterprises and researchers already run
Condor for their dedicated grids, Che says it was a logical choice for the
project.
Dr. Keith Laidig can easily see the intrinsic value of Nightlife and how
it will benefit the scientific community at large. He runs the computing
infrastructure for the computational biophysics group in the Department of
Biochemistry at the University of Washington and regularly relies on outside
computing power to crunch data for researchers. About four years ago, under
the direction of Professor David Baker, the group created Robetta, an
automated prediction server that farms out work to other systems via Condor,
which has proven
"quite successful at keeping the wait times [for research results] down to
the range of 'months'."
Laidig recently told
the Nightlife community, "If we had access to more computing power,
even that available from modest periods of inactivity, we could put that
power to work to address many pressing issues in bio-medical research such
as HIV/AIDS vaccine design, improvement of existing drugs and/or design new
drugs, and creation of new methods to harness biology to address issues
such as carbon sequestration."
As Laidig explained to LWN, reducing the wait times for results to even
a matter of weeks is not out of the question. "Given sufficient computing
power, the processing time would drop even further. In principle, the
processing could take a day or less — depending on computing power,
queue depth, etc."
Laidig says it's hard to estimate just how much donated computer access
his lab would need in order to see an appreciable improvement in research
turn-around time, but he estimates they currently use around 300 to 400
processors running around the clock to maintain the current
workflow. "Should we gain, say, 1,500 machines that could work for 8
hours... we'd be matching that — taking into account overhead. Now,
I'd like to increase that by a factor of ten or more."
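Laidig's comparison can be checked with some back-of-the-envelope arithmetic. The sketch below is only an illustration: it assumes each donated machine contributes a single processor, and the function and variable names are made up for this example rather than taken from Nightlife or Condor.

```python
# Rough check of Laidig's figures, measured in processor-hours per day.
# Assumption (not from the article): one processor per donated machine.

HOURS_PER_DAY = 24


def processor_hours_per_day(processors: int, hours: float) -> float:
    """Processor-hours delivered in one day by `processors` running `hours` each."""
    return processors * hours


# Current dedicated usage: 300 to 400 processors around the clock.
dedicated_low = processor_hours_per_day(300, HOURS_PER_DAY)
dedicated_high = processor_hours_per_day(400, HOURS_PER_DAY)

# Hypothetical donation: 1,500 machines working 8 hours each.
donated = processor_hours_per_day(1500, 8)

# Share of the donated time that could be lost to overhead
# while still matching the dedicated capacity.
overhead_if_400 = 1 - dedicated_high / donated
overhead_if_300 = 1 - dedicated_low / donated

print(f"dedicated: {dedicated_low:.0f} to {dedicated_high:.0f} processor-hours/day")
print(f"donated:   {donated:.0f} processor-hours/day")
print(f"overhead allowance: {overhead_if_400:.0%} to {overhead_if_300:.0%}")
```

Under that assumption, the donated pool supplies more raw processor-hours than the lab uses today, and the difference is the room left for the overhead Laidig mentions.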
Though he would be happy to see Nightlife flourish, Laidig notes there
are some things to consider before committing your computer's resources to
the project. "Not to throw a wet blanket on things, but [there are] issues
that folks should keep in mind. Their gear would be using electricity and
generating heat. There are also network bandwidth considerations as well
— some data-sets necessary to undertake distributed work can be
sizable (100 MBs) which can soak up resources. There's the local disk space
usage, too.
"Folks should be made aware of the 'costs' of contributing. Then, should
their desire to contribute outweigh the costs, they should join up!"
Some community members have indeed expressed concerns about the
energy consumption associated with idling computers and suggest that the
ecological harm of running the CPUs and fans of an unattended machine
outweighs the benefit of charity in the name of science. In response to an
animated discussion about Nightlife at Slashdot, one enterprising
commenter tested how much energy his idle computer uses and discovered
that it cost upwards of $70 per year. Che responded
to the criticism by arguing that, although cycle harvesting can be
viewed as a "waste of energy," it can, in fact, save energy in the
long run. Besides the argument that the energy needed to process the data
would be spent sooner or later anyway, Nightlife also
spreads energy consumption over a wide geographical area, thereby
reducing the energy burden on any single data center or location.
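To put the $70-per-year figure in perspective, it can be converted into an average power draw. The sketch below is a rough illustration only: the electricity price of $0.10 per kWh is an assumed value, not one reported by the commenter, and the function name is hypothetical.

```python
# Convert an annual electricity cost into the average power draw it implies.
# The price per kWh below is an assumption chosen for illustration.

HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def implied_average_watts(annual_cost_usd: float, price_per_kwh_usd: float) -> float:
    """Average draw in watts for a machine left on all year."""
    kwh_per_year = annual_cost_usd / price_per_kwh_usd
    return kwh_per_year * 1000 / HOURS_PER_YEAR


ASSUMED_PRICE_PER_KWH = 0.10  # hypothetical rate in US dollars
watts = implied_average_watts(70, ASSUMED_PRICE_PER_KWH)
print(f"about {watts:.0f} W on average")
```

A higher electricity rate would imply a proportionally lower draw for the same annual cost, so the result depends entirely on the assumed price.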
Future plans for Nightlife include making it a first-boot option for
Fedora so that, when a user does a fresh install, they are prompted to donate
computer power to the project. Of course, before Che can attain his
million-node goal, there are several smaller goals to accomplish along the
way. "At the earliest, we wouldn't be able to start reaching numbers at
this level until after Fedora 10 — and that's probably pushing
it."