August 2, 2007
This article was contributed by Donnie Berkholz
O'Reilly's annual OSCON in Portland, Ore., is perhaps the only major
conference in North America that spans the entire spectrum of open-source
communities. This makes it a great opportunity to learn from people who may
be encountering the same sorts of problems in a vastly different
environment. Other events such as FOSDEM or LCA already provide this kind
of environment, but for those of us who are US-based, it's helpful to have
one that demands a smaller travel budget. I highly recommend giving a talk if you're going so you get
in free, though, since registration costs hover around US$1000 and up. It's
clearly not a nonprofit conference.
Numerous groups met before the main part of the conference, among them
people involved with running a variety of free/open-source projects. At
this foundations summit, most of the discussion centered on the issues
facing nonprofits, such as trademarks, fundraising and bookkeeping. But as
at a full conference, the "hallway track" was the most useful part. As the
number of people grows, the discussion gets slower and slower, but meeting
the people involved with other foundations is invaluable. The summit ended
Tuesday, and the next day the exhibit hall and regular sessions began.
In his session, Arjan van de Ven talked about efforts to reduce power use,
focusing on a few main problems to avoid in your code. The first, not
surprisingly, was polling. There is no excuse for polling, with the advent
of things like inotify. He said, "Frequent polling causes spattergroit."
His second enemy was timers. It costs power to keep moving your CPU in and
out of idle states, so you want to group timer events together rather than
letting a number of programs scatter them randomly through time. On the
kernel side, you can use round_jiffies() or round_jiffies_relative(), and in
userland, you can use glib's g_timeout_add_seconds() rather than
g_timeout_add(). Some work is underway to add this
functionality to glibc as well. You don't want the entire Internet doing
this at the same time, however, so each computer must group its events at a
slightly different time.
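The grouping idea can be sketched in a few lines of Python. This is only an illustration of the principle, not the kernel or glib code: the function names, the hostname-derived offset, and the 100 ms spread are all assumptions made for the example. Each requested wakeup is rounded up to the next whole second, then shifted by a small offset that is stable for one machine but differs between machines.

```python
import math
import zlib


def host_offset(hostname: str, spread: float = 0.1) -> float:
    """Derive a stable per-host offset in [0, spread) seconds (illustrative)."""
    return (zlib.crc32(hostname.encode("utf-8")) % 1000) / 1000.0 * spread


def round_wakeup(deadline: float, offset: float = 0.0) -> float:
    """Round a wakeup time up to the next whole second plus a per-host offset.

    All timers on one host land on the same instants (n + offset), so the
    CPU wakes once for the whole group instead of once per timer.
    """
    return math.ceil(deadline - offset) + offset


if __name__ == "__main__":
    # Two timers that wanted to fire 0.5 s apart now share one wakeup.
    print(round_wakeup(10.2), round_wakeup(10.7))
```

Rounding up rather than to the nearest second is a choice made here so that a timer never fires early; code that can tolerate firing slightly early could round either way.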
Arjan's final enemy was disk I/O. Since disks have moving parts, they consume
a lot of power (at least until solid-state disks grow more
common). High-speed links such as SATA and SCSI also eat power when not in
power-saving mode. Gotchas here include opening files, even when in cache,
because of the access time update (use the O_NOATIME flag to open() when
possible), and looking for files or directories that don't exist (even when
using inotify, this always goes to disk).
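As a minimal sketch of the O_NOATIME suggestion, here is how it might look from Python, which exposes the flag as os.O_NOATIME on Linux. The helper name open_without_atime is made up for this example. The flag is only permitted for the file's owner or a privileged process, so the sketch falls back to a plain open when the kernel refuses.

```python
import os


def open_without_atime(path: str) -> int:
    """Open path read-only, asking the kernel not to update its access time.

    O_NOATIME is Linux-specific; where Python does not define it we pass 0,
    which makes this an ordinary read-only open.
    """
    noatime = getattr(os, "O_NOATIME", 0)
    try:
        return os.open(path, os.O_RDONLY | noatime)
    except PermissionError:
        # Not the file's owner: retry without the flag rather than fail.
        return os.open(path, os.O_RDONLY)
```

The caller gets back a plain file descriptor and is responsible for closing it with os.close().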
A special case of this is media playback. The key is avoiding constant
spinups of DVDs as well as hard drives by using large buffers — Arjan
suggested 20 minutes of video or a minute of audio. Also, decode in large
batches so you can be idle longer.
Tools such as powertop and strace are key in tracking down the
culprits. Powertop can tell you where to look, and strace can tell you more
about what any programs are doing. Near the end, Arjan showed a graph of how
tuning and recent fixes dropped a Fedora 7 default installation from a
power consumption of 21W down to about 15.5W. That just a few fixes dropped
it by so much shows how broken things were, but we're now on the right
track. A good goal is to aim for 50 or fewer wakeups a second, because
getting below that level generally doesn't gain you much more.
A man with the job title "Disruptive Innovator" gave a talk with about 550
slides in 45 minutes. Rolf Skyberg of eBay applied Maslow's hierarchy of
needs to technology to try to explain how users behave. The first level is
survival, the second is security, and the third is belonging. Computer
programs apparently haven't managed to get any higher up on the scale
yet. In terms of programs, survival means the program runs without
segfaults; security means the program is useful; and belonging means the
program is pretty. The more energy users spend finding the basics (help,
logging in, etc.), the less they have to spend doing something useful. But
one thing worth remembering is that people using a program may have higher needs
than you expected. For example, the iPod isn't just useful, it's pretty. And
people really care about that prettiness despite the lack of features like
an FM transmitter, a recorder, etc. that many other, less popular MP3
players have.
Luke Kanies talked about Puppet, a server automation tool he wrote in
Ruby. It's a replacement for earlier popular tools such as cfengine. He
really promoted the architecture, because any component in the entire system
can be replaced and reused separately. Puppet's made of three main layers:
server, networking and client. The server layer contains a compiler, a
file server, a certificate authority and a report handler. The networking
layer is XML-RPC over HTTPS. The client layer includes a resource abstraction layer,
transactions and a resource server. Each of these individual components can
be ripped out and replaced if you don't like it. You could change the
configuration language, use a different method of communication, or whatever
else your heart desires.
The resource abstraction layer contrasts the most with other tools such as
cfengine. It abstracts all the concepts like "install a package," "add a
user," "add a group" and so forth so you can run Puppet on any Linux or
other Unix-like OS and retain a simple configuration file without
OS-specific details. The layer supports about 10 different distributions and
other operating systems, and it's not difficult to add more.
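To make the idea of a resource abstraction layer concrete, here is a rough Python sketch. It is not Puppet's code or API (Puppet is written in Ruby); the Package class, the INSTALL_COMMANDS table and the platform names are hypothetical, chosen only to show how one OS-independent description can be translated into platform-specific commands.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical provider table: platform name -> command prefix that
# installs a package on that platform.
INSTALL_COMMANDS = {
    "debian": ["apt-get", "install", "-y"],
    "fedora": ["yum", "install", "-y"],
    "gentoo": ["emerge"],
}


@dataclass(frozen=True)
class Package:
    """An OS-independent statement that a package should be installed."""

    name: str

    def install_command(self, platform: str) -> List[str]:
        """Translate the abstract resource into a platform-specific command."""
        try:
            return INSTALL_COMMANDS[platform] + [self.name]
        except KeyError:
            raise ValueError(
                f"no package provider for platform {platform!r}"
            ) from None
```

In this sketch, supporting another operating system means adding one entry to the table, while every configuration that says Package("openssh") stays unchanged.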
Work is underway to create a library of Puppet config files (or recipes) to
reduce all the duplication, and that should greatly ease adoption of
Puppet. Puppet seems like a well-thought-out and extensible tool, so it will
be interesting to watch where it goes.
Clinton Nixon talked about dealing with legacy PHP code, but many of the
points are generally applicable to refactoring any code. His three primary
suggestions were to separate the controller and the view, even if you don't
have a solid MVC architecture; to call methods instead of including code
that runs from the include file; and to get rid of global variables.
His rules for view code were that control structures, printing, and
display-specific, unnested functions were allowed, but assignment and other
function calls were prohibited. He suggested beginning by drawing a line at
the top of the code and adding a comment that says "view code below here,"
then gradually migrating controller code above the line until you can move
it to a separate file. For loops, encapsulate the variables in an
object. Once you've gotten to this point, you may find duplicated views that
you can factor out.
Untangling a web of included files is a process of figuring out the inputs
and outputs, wrapping the entire file in a method, then refactoring. The
nice part about this style of refactoring is that the code always
works. There's never a point where you check in the code and it's broken.
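The talk was about PHP, but the same step can be sketched in Python. Everything here is hypothetical (the cart data, the function names); the point is only the shape of the change: code that used to run on inclusion and communicate through globals becomes a function with explicit inputs and a return value, and the view is left with nothing but formatting.

```python
# Before (sketch): a file that runs as soon as it is included, reading the
# global `cart` and printing directly.
#
#     total = 0
#     for item in cart:
#         total += item["price"] * item["qty"]
#     print("Total: %.2f" % total)


def cart_total(cart):
    """Controller side: explicit input (cart), explicit output (a number)."""
    return sum(item["price"] * item["qty"] for item in cart)


def render_total(total):
    """View side: formatting only, no assignment or business logic."""
    return f"Total: {total:.2f}"


if __name__ == "__main__":
    cart = [{"price": 2.5, "qty": 2}, {"price": 1.0, "qty": 3}]
    print(render_total(cart_total(cart)))
```

Because the old file can first be turned into a thin wrapper that calls these functions, each intermediate commit still behaves as before.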
Finally, he recommended two books: Working Effectively with Legacy Code, by
Michael Feathers, and Refactoring, by Martin Fowler. Although the Fowler
book is a classic, he recommended the newer book by Feathers because it's
more approachable.
At the close of the sessions Thursday, Dave Jones gave his now-infamous
"User Space Sucks" talk. Since most people have gotten the basic idea of
this talk, I'm only going to mention the new information. Dave re-ran his
tests a week ago on Fedora 7 to look at disk I/O during the bootstrap
process, and he found that it had actually gotten even worse since FC6. Counts of stat(),
open() and exec() calls had either increased or stayed the same. But the
problem has grown harder, because the offenders no longer stand out in the
same way as the originals.
OSCON always provides some entertaining and educational talks, provided
you've got a way to get into them. But its free content isn't too shabby
either. The exhibit hall, all of the BOFs and parties (of which there are
many), and the accompanying OSCAMP (like FooCamp, BarCamp, etc.) and FOSCON
(mostly about Ruby) are all gratis. It stands nearly alone in the U.S. as a
conference that spans all of the open-source world, although a niche
certainly exists for a lower-margin meeting like FOSDEM or LCA on this side
of the ocean.