By Jonathan Corbet
January 26, 2010
The 2010 edition of linux.conf.au was held from January 18 to 22 in
Wellington, New Zealand. A number of the talks from this event have been
covered elsewhere on LWN, with more to come; this article will talk about
several other sessions and your editor's impressions of the conference as a
whole. In brief: it was a highly successful event which easily lived up to
the high standards set by LCA.
One often goes to conferences to see the speakers perform. It's a rare
event, however, which gets them up on stage together to do a Maori war
dance. The speakers' dinner on Tuesday night featured plenty of good food,
"Fiasco" wine, and a group which gave lessons on how to do the Haka (which only coincidentally
sounds a lot like "hacker"). Much noise was made, much fun was had, and,
much to the participants' chagrin, videos were made.
Benjamin Mako Hill presented the Wednesday morning keynote. He started off
with a discussion of the open source/free software divide, noting that he
is very much in the free software camp. The open source side, he said,
emphasizes practical benefits, whereas freedom has inherent benefits.
The rest of his talk was dedicated to one specific benefit (a rather
practical one, in your editor's opinion) that comes with free software:
freedom from antifeatures.
Antifeatures are behaviors added to proprietary software as a way of
exerting some sort of control over users. It can be a simple matter of
extracting money from users - requiring them to pay more to have
advertising or spyware features removed, for example. It can be a matter
of market segmentation; see, for example, the several versions of Windows
Vista offered by Microsoft or the removal of raw image support from some
Canon cameras. Vendors may be trying to secure monopolies; software which
detects third-party batteries in devices and disables the power-saving
features is an example. "Protecting copyrights" is another motivation;
there are, he says, no Facebook fan clubs for dongles or the unskippable
tracks at the beginning of DVDs.
In all of these cases, the cited behavior works against the interests of
the people actually using that software; these features are not something
that users have requested. They are all also features which are entirely
unsustainable in the free software world. Even if a free software project
were to implement this sort of antifeature - something which happens rarely
- others would quickly disable it; see the Okular cut-and-paste story for
an example. Software freedom means the freedom to remove functionality we
don't want.
Mako has set up a wiki site
where he is collecting interesting examples of software antifeatures.
How can we make a community which is more welcoming? Matthew Garrett
addressed this question from a number of viewpoints, without necessarily
coming to a lot of conclusions. The problem, he says, is that, as a
community, we tend to be hostile - even if truly unprovoked aggressive behavior
is relatively rare. We tend to value code over everything else, and we
value technical excellence above behavioral excellence. The result is that
the community is not terribly functional as a whole; it has not gained the
behavioral standards that one would normally associate with a community,
and we're getting big enough that we really need to do something about it.
In general, we don't hate each other; we can get together at conferences
and not punch each other in the face. It has only happened to him once at
LCA, Matthew says, and he deserved it.
So what do we do? Codes of conduct can help, but only if we are willing to
enforce them. We need to decide whether we are willing to tolerate
poisonous people if they are technically strong enough. There should be a
greater willingness to point out unacceptable behavior; Matthew would
especially like to see respected community members doing more of this.
What works best, though, may be the simple power of positive examples.
Glyn Moody's keynote focused on the power of sharing, and how ideas from
our community have spread out and influenced the wider world. For
example, consider open access to scientific results, which have been
increasingly bottled up by the publishing industry. The ArXiv.org repository was announced within a
week of when Linus announced his first kernel release; since then, open
access has become an increasingly strong force in the scientific community.
Related to that was the race to completely sequence the human genome. A
company called Celera was a late entry with a scary agenda: sequence the
genome, then patent as much of it as possible. In the end, though, a lone
hacker named Jim Kent was able to bash out a system which solved the
problem first, using a 200-system Linux cluster. He won the race by a few
days and put the results into the public domain, heading off the patent
threat.
Project Gutenberg - which predates Linux by some years - is an interesting
example. Despite having significant resources, this project only had ten
books online by 1991. By 1997, though, that number had expanded to 1000.
The spread of the Internet clearly helped in this regard, but a wider
understanding of the importance of freely-available information also
helped.
Sharing is moving into a number of other realms; Glyn described sites like
Facebook and Twitter as simply a means for the sharing of lives. Openness
is also moving into government - to an extent. The use of a Creative
Commons license for the content on the Change.gov site was a clear sign that things
are changing. Still, things are not really open; it's the traditional
power structure with a bit of data released - "shared source government."
The final part of the talk went rather far afield into the areas of climate
change, environmental problems, and the financial crisis. In the end, Glyn
said, these problems are all the result of a failure to share. Our
community, he said, has shown how sharing is done, and we've exported that
knowledge widely. Now we need to find a way to apply it to these larger
problems. That is quite the challenge; your editor can't wait to see the
patches that result.
Andrew "Tridge" Tridgell is concerned about a different threat: patent
attacks on free software. These attacks, he fears, are only going to
become more common; the community as a whole needs to learn how to defend
itself. Patent defense, Tridge says, begins with the developers.
To that end, developers should learn how to read patents, a process which isn't
obvious from the outset. Many developers have come to the conclusion that
looking at patents can be dangerous - triple damages for willful
infringement and all that. Tridge's point is that most free software
projects cannot withstand even single damages. There is no point in
worrying about a triple death when a single death is enough. So, rather
than walking through the minefield with a blindfold on, it's better to take
the blindfold off and step around the mines.
There are three ways to defend against patent claims. Developers tend to
turn to prior art, but that is a difficult and dangerous way to go;
establishing prior art can be much harder than most people expect.
Invalidating patents is even worse; that can almost never be done
successfully. The best defense, he says, is finding ways to not infringe
on the patent in the first place. The cost is low, the certainty is
higher, and it can lead to a stronger defense for free software in
general. Non-infringement, normally, is achieved through a combination of
careful reading of the patent and the crafting of workarounds where needed.
The problem is that the GPL requires broad licensing of patents; if a
patent is not licensed for all users of the code, that code cannot be
distributed. There are good reasons for this requirement, but it also can
make us into an attractive target: a company which wishes to settle a
patent suit cannot stop with buying a license for itself; it must buy a
license for the entire community. That's the sort of situation which makes
patent trolls dream of dollar signs.
The situation changes, though, when we find an effective workaround for a
patent. That workaround essentially invalidates the patent, eliminating
the threat. When proprietary companies find workarounds, they tend to keep
them to themselves; there's no point in helping their competition avoid the
payment of royalties. In the free software world, though, we can
distribute workarounds broadly, to the point that proprietary software
companies can pick them up too. That will kill the value of the patent
entirely, drying up any associated revenue stream. After a few episodes
like that, the free software community will
look like the "toughest, meanest kid on the block," and patent trolls will
be inclined to leave us alone.
Workarounds must be done rigorously, though, with help from lawyers. That
is a challenge: the legal community is not known for open sharing of
information on topics like this. We need a forum where engineers and
lawyers from competing companies can talk openly about patents, but such a
forum does not yet exist.
Josh Berkus updated attendees on the state of PostgreSQL; it is, he
says, an exciting time for the project. He started by announcing that the
upcoming release will be named 9.0, not 8.5 as had been previously
expected. That's because this release contains a number of features which
they hadn't thought would be ready by now; these include hot standby,
streaming replication, a 64-bit Windows port, the new DO
statement, and more. The dot-zero number also reflects the fact that some
of these features "might not work perfectly" in this release.
The PostgreSQL development process has changed in the last couple of years
in response to the difficult 8.2 cycle which dragged out for six months
longer than anybody had expected. It has proved difficult to manage
committer and reviewer time for PostgreSQL. The way it works now is that,
every other month, the project enters a "commitfest," at which point the
outstanding patch queue is emptied. Patches may be merged, rejected, or
deferred, but every patch gets some sort of disposition. This
process helps to ensure that patches move through the system, it allows
contributors to see which patches are stalled and why, and it should help
to train new reviewers and committers for the future.
The final commitfest for 9.0 goes through the end of January; after that
the project goes into stabilization mode, with the final release expected
sometime around June or July.
One widely-anticipated feature for 9.0 is hot standby. This feature works
by taking the transaction logs from the primary database server and copying
them to one or more standby systems. Those systems fold the logs into
their copy of the database. The result is that the backup systems may be
slightly behind the primary database, but they stand ready to take over at
any time. While they are in standby mode, they are able to handle
read-only queries, helping to distribute the load somewhat.
A related new feature is streaming replication. It addresses much the same
problem as hot standby, but with different priorities: streaming replication is
for sites which are concerned about never losing any data, want minimal (as
in a few seconds) downtime should a failover be necessary, and which are
less concerned about multi-node scalability. Such sites can set up
replicated servers which receive transaction log data almost immediately
after each transaction completes. The replicated servers are thus very
close to the state of the primary server. This feature works, though Josh
notes that its administration is a bit awkward in 9.0.
The "explain" feature has been enhanced in 9.0. In addition to the
semi-human-readable version that PostgreSQL has used for some time,
"explain" can now output its results in XML, JSON, or YAML format. This
change is meant to make it easier for graphical frontends to interpret the
output, but developers are starting to discover that some of the formats
(YAML in particular) are easier to read than the classic format.
Finally, Josh talked about the project's upcoming transition to git for its
source code management. They are hoping to free themselves of CVS in the
next development cycle, but a couple of developers are still dragging their
feet. It seems that this little problem will be overcome sooner or later.
Meanwhile, the PostgreSQL project appears to be in good shape and getting
better.
In conclusion: LCA 2010 was a busy and interesting event. Your
editor's main grumble was that the schedule was so full of useful talks
that he never got to go out and enjoy the beautiful, sunny weather which
only occurred while the conference was in session. LCA retains the things
that make it special: interesting talks on a wide variety of topics, a
unique mix of people, lots of fun, and a generally friendly atmosphere.
Also notable was the presence of more women than at any other event your
editor has ever seen - and the fact that nobody even felt the need to
comment on it.
Even an article of this length - along with the other half-dozen LWN
articles coming from this conference - cannot cover all of the interesting
things that happened there. Also noteworthy were Selena Deckelmann's
lightning talk on using free software to help overturn a rigged African
election, Gabriella Coleman's keynote on free software culture, Patrick
Brennan's talk on Albany Senior High School, which abruptly
shifted to Linux in 2009, Joel Stanley's push for hardware designed
explicitly to run free software, and, needless to
say, the traditional Penguin Dinner, even if memories from that particular
event tend to be a bit fuzzy.
LCA 2010 organizers Andrew and Susanne Ruthven are to be commended on their
stewardship of this venerable event. LCA might not have been in Australia
this year, but they managed to keep all that makes LCA worthwhile while
bringing it to an interesting new venue. For added fun - since organizing
a conference like LCA is evidently not enough work on its own - they also
threw having a baby into the mix and still kept everything together (with a
lot of help from the rest of the organizing team, needless to say). They
are probably more than ready to pass the baton on to next year's organizing
team, which announced that LCA 2011 will return to Brisbane, Australia,
probably in early February.
By Jonathan Corbet
January 22, 2010
Taras Glek works for Mozilla, but he is not a browser hacker; instead, he
works on GCC and other tools aimed at making the browser development
process better. It is, he says, a good job. While carrying out his
duties, Taras has been able to put a new GCC feature to work in ways which
may prove to be useful well beyond Mozilla.
Development tools are important; they can help us to produce software more
quickly and with far fewer problems. Unfortunately, Taras says, we are
stuck in the stone age of software development, using tools from the
1970s. Our code base is growing, though, to the point that developers
often cannot understand the entirety of even a single application. We need
some way to amplify our capabilities so that we can continue to make more
powerful applications; static analysis tools can bring some of those
capabilities.
Static analysis, in essence, treats the code as data which is then the
subject of further analysis. It has often been seen as a backwater, an
area of primarily academic interest. When static analysis tools have found their way
into more common use, it has generally been in their ability to find
certain classes of bugs. But there's more that can be done with these
tools: finding API abuse, generating library bindings, improved code base
visualization, and more. Static analysis has been put to use with Mozilla
to find dead code; thousands of lines of code have been found to be
completely unused, despite the fact that engineers were putting their time
into maintaining it.
The Mozilla project has an especially strong need for good tools. It is a
huge code base (1.7 million lines of C++ and 1 million lines of
JavaScript); humans just do not scale to that amount of code. This code
base is under constant optimization work, so refactorings are frequent.
Without some help, keeping this code in good condition is a major challenge.
Much of Taras's work seems to be aimed at mitigating some of the pains that
come with C++ development. One of those pains is that the language is just
about impossible to parse; the parser must actually instantiate types
before it can complete its job. So anybody who wants to analyze C++ code
must first find a decent parser for it. The available options are
limited. The LLVM compiler is promising, but it's going to be another year
or two before it's really ready for prime time. The Elsa tool can be used, but it's
essentially unmaintained and not really guaranteed to be correct.
The one other option - one which is known to have a complete C++ parser -
is GCC. But the GCC code has a bit of a nasty reputation, so Taras started
off using Elsa for his work. Eventually, though, he turned back to GCC for something
more solid, and hasn't looked back - the hairiness of GCC has, perhaps, been
exaggerated. But, more to the point, the upcoming GCC 4.5 release is,
he says, "the most exciting release ever." The reason for that is the
long-delayed addition of the plugin API, which became possible once the runtime library license
exemption finally went into place. With this API, analysis code can
easily hook into the compiler and inspect code at whatever stage of the
process best suits its needs.
Beyond plugins, GCC has a few other features which make it suitable for
static analysis work. The ability to attach attributes to objects in the
compiled code makes it easy to pass hints through to later processing
steps. The new pass manager brings a relatively modern structure to a
compiler which did not originally have one. And the GIMPLE intermediate
representation provides much of the rest of what's needed for code which
needs to inspect other code.
There are a few interesting plugins in the works.
One of them is the LLVM compiler, which can be plugged in to perform the
back-end functions for GCC. Another is milepost,
which uses a brute-force approach to figure out the optimal settings of the
command-line flags for a specific body of code. Then, there are "the
hydras," which are Taras's work.
These plugins take an interesting approach, in that the actual
analysis work is done in JavaScript scripts. The idea was originally seen
as amusing - "wouldn't it be fun to put Spidermonkey into GCC?" - but it
has actually worked out well. JavaScript is a relatively nice, concise
language which makes it easy to implement the needed capabilities.
The first plugin is Dehydra, so named
because the control flow graph in Mozilla somewhat resembles a Hydra
monster. Dehydra produces a JSON-like representation of the objects found
in a C++ program; individual JavaScript scripts can then use this
representation to analyze the program. The Treehydra plugin,
instead, provides a JavaScript interface to the GIMPLE representation of
the program; it can be used for more traditional sorts of static analysis
tasks.
One of the pains that come with large C++ programs is that simply finding
code can be difficult. It's not always clear which method will be invoked
in a specific situation, even in the absence of things like macro tricks.
To help with this problem, Dehydra has been used as the base of a source
browsing tool called DXR; it's like LXR, but with a great deal of semantic
information thrown in. DXR users can find types defined by macros, look up
parent class information, and so on. There's also a call graph tool which
can find all the callers of a specific method; that's important in C++,
where overloading can make grep thoroughly unusable for this kind of task.
It is, Taras says, "Eclipse-like stuff," except that, unlike Eclipse, it
scales to a Mozilla-size code base.
Various other tools have been written. The final.js script (a
dozen lines of code which can be seen on this
page) looks
for C++ methods tagged with the "final" attribute; any attempt
to override those methods will result in a compilation error. It is, in
other words, a port of the Java final keyword to C++. A checker
which might be interesting in other environments - including the kernel -
is flow.js, which can add a constraint that all exits from a
function must flow through a specific label. Consider this common kernel
pattern:
	if (something_wrong)
		goto out;
	/* Do some real work */
  out:
	release_locks();
	free_memory();
	cancel_self_destruct();
	return something;
It's a common mistake to add a return statement to the middle of a
function like this, shorting out the cleanup code; flow.js can
catch errors like that at compile time.
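To make the failure concrete, here is a minimal sketch in C. The function names and the lock counter are hypothetical stand-ins invented for illustration; this is not kernel or Mozilla code, and it says nothing about how flow.js itself is written or configured. It only shows the class of mistake such a checker is meant to flag.

```c
/* Hypothetical stand-in for real locking: it only counts outstanding locks. */
int locks_held;

static void take_lock(void)    { locks_held++; }
static void release_lock(void) { locks_held--; }

/* Correct: every path after take_lock() leaves through the "out" label. */
int process_ok(int input)
{
	int ret = -1;

	take_lock();
	if (input < 0)
		goto out;
	ret = input * 2;	/* the "real work" */
out:
	release_lock();
	return ret;
}

/* Buggy: a return added in the middle skips the cleanup at "out". */
int process_leaky(int input)
{
	int ret = -1;

	take_lock();
	if (input < 0)
		goto out;
	if (input == 0)
		return 0;	/* the lock is never released on this path */
	ret = input * 2;
out:
	release_lock();
	return ret;
}
```

Every path through process_ok() leaves locks_held at zero, while process_leaky(0) returns with the lock still counted as held. The compiler accepts both functions without complaint, which is why a check at compile time is attractive.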
Additional modules include must-override.js, which can mark
methods which must be overridden (but which cannot be virtual);
outparams.js, which ensures that any output function parameters
have been set on a successful return from the function; and
stack.js, which enforces a requirement that specific classes only
be instantiated on the stack, since the garbage collector is not prepared
to deal with them. Taras is also working on a checker for variables which
shadow class members - a mistake which GCC does not catch now.
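The output-parameter rule is easy to picture with a small sketch, written here in C rather than Mozilla's C++. The helper functions below are hypothetical, made up for this illustration; they are not Mozilla code and do not show how outparams.js performs its analysis. They show only the kind of path that violates the rule: a return that reports success without ever writing the output parameter.

```c
#include <stddef.h>

/* Hypothetical helper: returns 0 on success and stores the length in
 * *len_out; returns -1 on failure and leaves *len_out untouched. */
int name_length_ok(const char *name, size_t *len_out)
{
	size_t n = 0;

	if (name == NULL)
		return -1;
	while (name[n] != '\0')
		n++;
	*len_out = n;
	return 0;
}

/* Buggy variant: the empty-string shortcut reports success but never
 * writes *len_out, so the caller goes on to read a stale value. */
int name_length_buggy(const char *name, size_t *len_out)
{
	size_t n = 0;

	if (name == NULL)
		return -1;
	if (name[0] == '\0')
		return 0;	/* success path with the outparam unset */
	while (name[n] != '\0')
		n++;
	*len_out = n;
	return 0;
}
```

A caller that trusts the return code of name_length_buggy("", &len) will use whatever len happened to contain beforehand. That is a bug which only shows up for one particular input, so it can easily slip past testing.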
For the time being, this work is mostly used within the Mozilla project,
though Taras would clearly like to see users from the wider community. He
looks forward to a day when libraries are distributed with a plugin which
ensures that the library is being used correctly. Another nice feature
would be a distribution-wide DXR, enabling cross-package source browsing.
For now, though, we have a set of tools that serves as a good proof of the
concept that GCC plugins can be used for static analysis.
By Jonathan Corbet
January 27, 2010
Bright purple hair seems certain to make Liz Henry distinct from the crowd,
but it's another attribute that she came to linux.conf.au 2010 to talk
about: her wheelchair. It is, in essence, a machine to move her body
around. It's not surprising that she would like it to be easy to fix or to
hack on, but that is not how things are. Cars can be fixed easily; anybody
with a few skills can start a car repair business. But this cannot be done
with wheelchairs, which are much simpler devices. A wheelchair is a
medical device, so the normal rules don't apply. Liz would like to
change those rules; she also wants the rest of us to understand why we want
to change them too.
People with disabilities may seem like a distinct group, but the fact of
the matter is that almost all of us will be people with disabilities at
some point in our lives. The average human, Liz says, will spend about
eight years coping with some sort of disability. The result is a huge
business, fueled by large amounts of money from insurance companies and
government. That business is not greatly concerned with empowering
disabled people; that's something we're going to have to take care of
ourselves. We cannot depend on nanobots to keep us going as our bodies
age; instead, we should be designing and coding for our future now.
People who want to hack their own disability solutions will find relatively
little useful information online. Why? Possible reasons include profit
motives in a highly lucrative industry, the perceived need for the
intervention of medical experts when creating solutions, and concerns about
liability should things go wrong. Disabled people also tend to be pushed
into the role of passive charity recipients and isolated from each other.
So what disability solutions exist come from the "medical industrial
complex." Most of us will need these solutions at some point, and we'll
want to be able to hack on them; the medical industrial complex is not much
interested in helping us to do that.
The best progress which has been made so far is in the areas of vision,
speech, and gaming. We're seeing less in mobility, so far. But, even
there, simple hacks exist: it's common to see users of walkers who have fitted
tennis balls over the feet to make them glide properly. (Your editor
notes, with amusement, that Walmart is selling
walker tennis balls for a mere $28 - the price of dozens of normal
balls.) This is a hack which is easily done, easily noticed, and easily
copied, so it has spread widely. Pockets for crutches made of duct tape
were another example presented in the talk.
A good example of how things fall down can be seen in the area of ramps. A
ramp is not a complex device, but ramps must still be built properly if
they are not to collapse or dump their users on the floor. Information on proper
ramp building is discouragingly rare on the net, and what is there is not
open to contributions. Other bits of interesting information - such as the
soda bottle prosthesis - are available, but what we're seeing, still, is
relatively small attempts. There's no real model for building community
around this kind of information yet.
Disability-friendly software is not an easy hack either; accessibility tends
to be treated as a last-minute add-on. Web site accessibility, likewise, is
often an afterthought, and tends to be user-focused. This approach leads
to sub-standard solutions, and it also fails to foster a free,
do-it-yourself culture. We need good accessibility for developers too.
Liz talked about a number of projects aimed at making life better (and more
hackable) for people with disabilities. Consider voice synthesis and
screen reading: much of what's happening in this area is proprietary, but
there are also projects like Festvox, Fire Vox, NVDA, and the tools at Full Measure (Speakup was not mentioned). Other
interesting projects include:
Liz also mentioned the BBC
accessible newsreader; she wishes that the BBC would release the code
so that it could be incorporated into content management systems and made
widely available.
On the other side, there are antifeatures which make life harder for those
who would hack better solutions. These include systems which people with
disabilities cannot contribute to and one-off solutions which cannot be
extended or improved upon. Especially harsh words were reserved for those
who exploit vulnerable people; there is an awful lot of incredibly
expensive assistive technology out there. "Freaking out about liability"
is also an antifeature; Liz feels that many of those concerns are greatly
overblown. Selling out to industry - going for patents and profit rather
than making technology available - is also a step in the wrong direction.
As an example of good and bad ways of doing things, Liz contrasted the Free
Wheelchair Mission and Whirlwind Wheelchair
International. The former makes dirt-cheap wheelchairs out of lawn
chairs and bicycle wheels, then ships them by the container load to poor
countries. It seems like a good idea, but dumping all those cheap chairs
devastates any local market that may have developed. When the chairs break
(which tends to happen soon), there's nobody left to help keep them going.
Whirlwind, instead, is focused on partnering with local industry and
sharing information, creating a more hackable solution with more people to
hack on it.
The core message from the talk was that disabled people are hackers by
necessity; we should bring them in, get their input, and enable them to
create their own solutions. Their solutions will become our solutions. We
should, Liz says, prepare to open-source our way out of the retirement
prisons which are waiting for us.
Page editor: Jonathan Corbet