
LWN.net Weekly Edition for January 28, 2010

An LCA 2010 overview

By Jonathan Corbet
January 26, 2010
The 2010 edition of linux.conf.au was held on January 18 to 22 in Wellington, New Zealand. A number of the talks from this event have been covered elsewhere on LWN, with more to come; this article will talk about several other sessions and your editor's impressions of the conference as a whole. In brief: it was a highly successful event which easily lived up to the high standards set by LCA.

One often goes to conferences to see the speakers perform. It's a rare event, however, which gets them up on stage together to do a Maori war dance. The speakers' dinner on Tuesday night featured plenty of good food, "Fiasco" wine, and a group which gave lessons on how to do the Haka (which only coincidentally sounds a lot like "hacker"). Much noise was made, much fun was had, and, much to the participants' chagrin, videos were made.

Benjamin Mako Hill presented the Wednesday morning keynote. He started off with a discussion of the open source/free software divide, noting that he is very much in the free software camp. The open source side, he said, emphasizes practical benefits, whereas freedom has inherent benefits. The rest of his talk was dedicated to one specific benefit (a rather practical one, in your editor's opinion) that comes with free software: freedom from antifeatures.

Antifeatures are behaviors added to proprietary software as a way of exerting some sort of control over users. It can be a simple matter of extracting money from users - requiring them to pay more to have advertising or spyware features removed, for example. It can be a matter of market segmentation; see, for example, the several versions of Windows Vista offered by Microsoft or the removal of raw image support from some Canon cameras. Vendors may be trying to secure monopolies; software which detects third-party batteries in devices and disables the power-saving features is an example. "Protecting copyrights" is another; there are, he says, no Facebook fan clubs for dongles or the unskippable tracks at the beginning of DVDs.

In all of these cases, the cited behavior works against the interests of the people actually using that software; these features are not something that users have requested. They are all also features which are entirely unsustainable in the free software world. Even if a free software project were to implement this sort of antifeature - something which happens rarely - others will quickly disable it; see the Okular cut-and-paste story for an example. Software freedom means the freedom to remove functionality we don't want.

Mako has set up a wiki site where he is collecting interesting examples of software antifeatures.

How can we make a community which is more welcoming? Matthew Garrett addressed this question from a number of viewpoints, without necessarily coming to a lot of conclusions. The problem, he says, is that, as a community, we tend to be hostile - even if truly unprovoked aggressive behavior is relatively rare. We tend to value code over everything else, and we value technical excellence above behavioral excellence. The result is that the community is not terribly functional as a whole; it has not gained the behavioral standards that one would normally associate with a community, and we're getting big enough that we really need to do something about it.

In general, we don't hate each other; we can get together at conferences and not punch each other in the face. It has only happened to him once at LCA, Matthew says, and he deserved it.

So what do we do? Codes of conduct can help, but only if we are willing to enforce them. We need to decide whether we are willing to tolerate poisonous people if they are technically strong enough. There should be a greater willingness to point out unacceptable behavior; Matthew would especially like to see respected community members doing more of this. What works best, though, may be the simple power of positive examples.

Glyn Moody's keynote focused on the power of sharing, and how ideas from our community have spread out and influenced the wider world. For example, consider open access to scientific results, which have been increasingly bottled up by the publishing industry. The ArXiv.org repository was announced within a week of when Linus announced his first kernel release; since then, open access has become an increasingly strong force in the scientific community.

Related to that was the race to completely sequence the human genome. A company called Celera was a late entry with a scary agenda: sequence the genome, then patent as much of it as possible. In the end, though, a lone hacker named Jim Kent was able to bash out a system which solved the problem first, using a 200-system Linux cluster. He won the race by a few days and put the results into the public domain, heading off the patent threat.

Project Gutenberg - which predates Linux by some years - is an interesting example. Despite having significant resources, this project only had ten books online by 1991. By 1997, though, that number had expanded to 1000. The spread of the Internet clearly helped in this regard, but a wider understanding of the importance of freely-available information also helped.

Sharing is moving into a number of other realms; Glyn described sites like Facebook and Twitter as simply a means for the sharing of lives. Openness is also moving into government - to an extent. The use of a Creative Commons license for the content on the Change.gov site was a clear sign that things are changing. Still, things are not really open; it's the traditional power structure with a bit of data released - "shared source government."

The final part of the talk went rather far afield into the areas of climate change, environmental problems, and the financial crisis. In the end, Glyn said, these problems are all the result of a failure to share. Our community, he said, has shown how sharing is done, and we've exported that knowledge widely. Now we need to find a way to apply it to these larger problems. That is quite the challenge; your editor can't wait to see the patches that result.

Andrew "Tridge" Tridgell is concerned about a different threat: patent attacks on free software. These attacks, he fears, are only going to become more common; the community as a whole needs to learn how to defend itself. Patent defense, Tridge says, begins with the developers.

To that end, developers should learn how to read patents, a process which isn't obvious from the outset. Many developers have come to the conclusion that looking at patents can be dangerous - triple damages for willful infringement and all that. Tridge's point is that most free software projects cannot withstand even single damages. There is no point in worrying about a triple death when a single death is enough. So, rather than walking through the minefield with a blindfold on, it's better to take the blindfold off and step around the mines.

There are three ways to defend against patent claims. Developers tend to turn to prior art, but that is a difficult and dangerous way to go; establishing prior art can be much harder than most people expect. Invalidating patents is even worse; that can almost never be done successfully. The best defense, he says, is finding ways to not infringe on the patent in the first place. The cost is low, the certainty is higher, and it can lead to a stronger defense for free software in general. Non-infringement, normally, is achieved through a combination of careful reading of the patent and the crafting of workarounds where needed.

The problem is that the GPL requires broad licensing of patents; if a patent is not licensed for all users of the code, that code cannot be distributed. There are good reasons for this requirement, but it also can make us into an attractive target: a company which wishes to settle a patent suit cannot stop with buying a license for itself; it must buy a license for the entire community. That's the sort of situation which makes patent trolls dream of dollar signs.

The situation changes, though, when we find an effective workaround for a patent. That workaround essentially invalidates the patent, eliminating the threat. When proprietary companies find workarounds, they tend to keep them to themselves; there's no point in helping their competition avoid the payment of royalties. In the free software world, though, we can distribute workarounds broadly, to the point that proprietary software companies can pick them up too. That will kill the value of the patent entirely, drying up any associated revenue stream. After a few episodes like that, the free software community will look like the "toughest, meanest kid on the block," and patent trolls will be inclined to leave us alone.

Workarounds must be done rigorously, though, with help from lawyers. That is a challenge: the legal community is not known for open sharing of information on topics like this. We need a forum where engineers and lawyers from competing companies can talk openly about patents, but such a forum does not yet exist.

Josh Berkus updated attendees on the state of PostgreSQL; it is, he says, an exciting time for the project. He started by announcing that the upcoming release will be named 9.0, not 8.5 as had been previously expected. That's because this release contains a number of features which they hadn't thought would be ready by now; these include hot standby, streaming replication, a 64-bit Windows port, the new DO() statement, and more. The dot-zero number also reflects the fact that some of these features "might not work perfectly" in this release.

The PostgreSQL development process has changed in the last couple of years in response to the difficult 8.2 cycle which dragged out for six months longer than anybody had expected. It has proved difficult to manage committer and reviewer time for PostgreSQL. The way it works now is that, every other month, the project enters a "commitfest," at which point the outstanding patch queue is emptied. Patches may be merged, rejected, or deferred, but, anyway, some sort of disposition is decided upon. This process helps to ensure that patches move through the system, it allows contributors to see which patches are stalled and why, and it should help to train new reviewers and committers for the future.

The final commitfest for 9.0 goes through the end of January; after that the project goes into stabilization mode, with the final release expected sometime around June or July.

One widely-anticipated feature for 9.0 is hot standby. This feature works by taking the transaction logs from the primary database server and copying them to one or more standby systems. Those systems fold the logs into their copy of the database. The result is that the backup systems may be slightly behind the primary database, but they stand ready to take over at any time. While they are in standby mode, they are able to handle read-only queries, helping to distribute the load somewhat.

A related new feature is streaming replication. It aims to solve the same problem as hot standby, with some changes: streaming replication is for sites which are concerned about never losing any data, want minimal (as in a few seconds) downtime should a failover be necessary, and which are less concerned about multi-node scalability. Such sites can set up replicated servers which receive transaction log data almost immediately after each transaction completes. The replicated servers are thus very close to the state of the primary server. This feature works, though Josh notes that the administration is a bit awkward in 9.0.

The "explain" feature has been enhanced in 9.0. In addition to the semi-human-readable version that PostgreSQL has used for some time, "explain" can now output its results in XML, JSON, or YAML format. This change is meant to make it easier for graphical frontends to interpret the output, but developers are starting to discover that some of the formats (YAML in particular) are easier to read than the classic format.

Finally, Josh talked about the project's upcoming transition to git for its source code management. They are hoping to free themselves of CVS in the next development cycle, but a couple of developers are still dragging their feet. It seems that this little problem will be overcome sooner or later. Meanwhile, the PostgreSQL project appears to be in good shape and getting better.

In conclusion: LCA 2010 was a busy and interesting event. Your editor's main grumble was that the schedule was so full of useful talks that he never got to go out and enjoy the beautiful, sunny weather which only occurred while the conference was in session. LCA retains the things that make it special: interesting talks on a wide variety of topics, a unique mix of people, lots of fun, and a generally friendly atmosphere. Also notable was the presence of more women than at any other event your editor has ever seen - and the fact that nobody even felt the need to comment on it.

Even an article of this length - along with the other half-dozen LWN articles coming from this conference - cannot cover all of the interesting things that happened there. Also noteworthy were Selena Deckelmann's lightning talk on using free software to help overturn a rigged African election, Gabriella Coleman's keynote on free software culture, Patrick Brennan's talk on Albany Senior High School, which abruptly shifted to Linux in 2009, Joel Stanley's push for hardware designed explicitly to run free software, and, needless to say, the traditional Penguin Dinner, even if memories from that particular event tend to be a bit fuzzy.

LCA 2010 organizers Andrew and Susanne Ruthven are to be commended on their stewardship of this venerable event. LCA might not have been in Australia this year, but they managed to keep all that makes LCA worthwhile while bringing it to an interesting new venue. For added fun - since organizing a conference like LCA is evidently not enough work on its own - they also threw having a baby into the mix and still kept everything together (with a lot of help from the rest of the organizing team, needless to say). They are probably more than ready to pass the baton on to next year's organizing team, which announced that LCA 2011 will return to Brisbane, Australia, probably in early February.


LCA: Static analysis with GCC plugins

By Jonathan Corbet
January 22, 2010
Taras Glek works for Mozilla, but he is not a browser hacker; instead, he works on GCC and other tools aimed at making the browser development process better. It is, he says, a good job. While carrying out his duties, Taras has been able to put a new GCC feature to work in ways which may prove to be useful well beyond Mozilla.

Development tools are important; they can help us to produce software more quickly and with far fewer problems. Unfortunately, Taras says, we are stuck in the stone age of software development, using tools from the 1970's. Our code base is growing, though, to the point that developers often cannot understand the entirety of even a single application. We need some way to amplify our capabilities so that we can continue to make more powerful applications; static analysis tools can bring some of those capabilities.

Static analysis, in essence, treats the code as data which is then the subject of further analysis. It has often been seen as a backwater, an area of primarily academic interest. When static analysis tools have found their way into more common use, it has generally been in their ability to find certain classes of bugs. But there's more that can be done with these tools: finding API abuse, generating library bindings, improved code base visualization, and more. Static analysis has been put to use with Mozilla to find dead code; thousands of lines of code have been found to be completely unused, despite the fact that engineers were putting their time into maintaining it.

The Mozilla project has an especially strong need for good tools. It is a huge code base (1.7 million lines of C++ and 1 million lines of JavaScript); humans just do not scale to that amount of code. This code base is under constant optimization work, so refactorings are frequent. Without some help, keeping this code in good condition is a major challenge.

Much of Taras's work seems to be aimed at mitigating some of the pains that come with C++ development. One of those pains is that the language is just about impossible to parse; the parser must actually instantiate types before it can complete its job. So anybody who wants to analyze C++ code must first find a decent parser for it. The available options are limited. The LLVM compiler is promising, but it's going to be another year or two before it's really ready for prime time. The Elsa tool can be used, but it's essentially unmaintained and not really guaranteed to be correct.
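
A much smaller version of the parsing problem already exists in plain C, where the meaning of a statement depends on what earlier declarations said about a name. The sketch below is illustrative only: it is not taken from the talk, the function names are hypothetical, and it shows the C typedef ambiguity rather than the C++ type-instantiation problem Taras described.

```c
typedef int T;   /* at file scope, T names a type */

/* Here "T * x" is a declaration: x is a pointer to int. */
int star_as_declaration(void)
{
    T * x = 0;
    return x == 0;
}

/* Here a local variable hides the typedef, so "T * x" is a multiplication. */
int star_as_expression(void)
{
    int T = 6, x = 7;
    return T * x;
}
```

A parser cannot classify the tokens "T * x" without tracking which names are types in the current scope; C++ adds templates on top of this, which is where the need to instantiate types comes from.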

The one other option - one which is known to have a complete C++ parser - is GCC. But the GCC code has a bit of a nasty reputation, so Taras started off using Elsa for his work. Eventually, though, he turned back to GCC for something more solid, and hasn't looked back - the hairiness of GCC has, perhaps, been exaggerated. But, more to the point, the upcoming GCC 4.5 release is, he says, "the most exciting release ever." The reason for that is the long-delayed addition of the plugin API, which became possible once the runtime library license exemption finally went into place. With this API, analysis code can easily hook into the compiler and inspect code at whatever stage of the process best suits its needs.

Beyond plugins, GCC has a few other features which make it suitable for static analysis work. The ability to attach attributes to objects in the compiled code makes it easy to pass hints through to later processing steps. The new pass manager brings a relatively modern structure to a compiler which did not originally have one. And the GIMPLE intermediate representation provides much of the rest of what's needed for code which needs to inspect other code.
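
As a rough illustration of how attributes carry hints from source code to a later checking step, the sketch below uses warn_unused_result, an attribute that GCC itself understands. The MUST_CHECK macro and the reserve_slot() function are hypothetical names chosen for this example; annotations consumed by plugins such as the ones described in this article are defined by those plugins and are not shown here.

```c
#include <stddef.h>

/* Expand to the GCC attribute when available, to nothing otherwise. */
#if defined(__GNUC__)
#define MUST_CHECK __attribute__((warn_unused_result))
#else
#define MUST_CHECK
#endif

/*
 * Take one slot from the pool. Returns 0 on success, -1 if no slot is
 * left. The attribute asks the compiler to warn about callers which
 * ignore the return value.
 */
MUST_CHECK int reserve_slot(int *slots_left)
{
    if (slots_left == NULL || *slots_left <= 0)
        return -1;
    (*slots_left)--;
    return 0;
}
```

A caller which writes a bare "reserve_slot(&n);" should draw a warning from GCC; the annotation travels with the declaration, so the check applies wherever the function is used.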

There are a few interesting plugins in the works. One of them is the LLVM compiler, which can be plugged in to perform the back-end functions for GCC. Another is milepost, which uses a brute-force approach to figure out the optimal settings of the command-line flags for a specific body of code. Then, there are "the hydras," which are Taras's work. These plugins take an interesting approach, in that the actual analysis work is done in JavaScript scripts. The idea was originally seen as amusing - "wouldn't it be fun to put Spidermonkey into GCC?" - but it has actually worked out well. JavaScript is a relatively nice, concise language which makes it easy to implement the needed capabilities.

The first plugin is Dehydra, so named because the control flow graph in Mozilla somewhat resembles a Hydra monster. Dehydra produces a JSON-like representation of the objects found in a C++ program; individual JavaScript scripts can then use this representation to analyze the program. The Treehydra plugin, instead, provides a JavaScript interface to the GIMPLE representation of the program; it can be used for more traditional sorts of static analysis tasks.

One of the pains that come with large C++ programs is that simply finding code can be difficult. It's not always clear which method will be invoked in a specific situation, even in the absence of things like macro tricks. To help with this problem, Dehydra has been used as the base of a source browsing tool called DXR; it's like LXR, but with a great deal of semantic information thrown in. DXR users can find types defined by macros, look up parent class information, and so on. There's also a call graph tool which can find all the callers of a specific method; that's important in C++, where overloading can make grep thoroughly unusable for this kind of task. It is, Taras says, "Eclipse-like stuff," except that, unlike Eclipse, it scales to a Mozilla-size code base.

Various other tools have been written. The final.js script (a dozen lines of code which can be seen on this page) looks for C++ methods tagged with the "final" attribute; any attempt to override those methods will result in a compilation error. It is, in other words, a port of the Java final keyword to C++. A checker which might be interesting in other environments - including the kernel - is flow.js, which can add a constraint that all exits from a function must flow through a specific label. Consider this common kernel pattern:

    if (something_wrong)
        goto out;
    /* Do some real work */

  out:
    release_locks();
    free_memory();
    cancel_self_destruct();
    return something;

It's a common mistake to add a return statement to the middle of a function like this, shorting out the cleanup code; flow.js can catch errors like that at compile time.
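
For readers who want to see the single-exit pattern in a complete function, here is a minimal sketch in plain C. The measured_copy() function and the live_buffers counter are hypothetical, invented for this example; the code shows the convention which a checker like flow.js is described as enforcing, not the checker itself.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical counter of buffers currently allocated, to make leaks visible. */
static int live_buffers = 0;

/*
 * Copy src into a scratch buffer of `limit` bytes and return its length,
 * or -1 on any error. Every path leaves through the "out" label.
 */
int measured_copy(const char *src, size_t limit)
{
    int ret = -1;              /* pessimistic default */
    char *buf = NULL;

    if (src == NULL || limit == 0)
        goto out;

    buf = malloc(limit);
    if (buf == NULL)
        goto out;
    live_buffers++;

    if (strlen(src) >= limit)
        goto out;              /* a bare "return -1" here would leak buf */

    strcpy(buf, src);
    ret = (int)strlen(buf);

out:
    if (buf != NULL) {
        free(buf);
        live_buffers--;
    }
    return ret;
}
```

Because the cleanup lives in one place, the function releases its buffer whether it succeeds or fails; replacing any of the goto statements after the allocation with a direct return would leave live_buffers nonzero.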

Additional modules include must-override.js, which can mark methods which must be overridden (but which cannot be virtual); outparams.js, which ensures that any output function parameters have been set on a successful return from the function, and stack.js, which enforces a requirement that specific classes only be instantiated on the stack, since the garbage collector is not prepared to deal with them. Taras is also working on a checker for variables which shadow class members - a mistake which GCC does not catch now.
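
The rule that outparams.js is said to enforce can be pictured with an ordinary C function which reports status through its return value and hands back its result through a pointer. This is a sketch of the convention only, with a hypothetical parse_int() helper; it is not Mozilla code and says nothing about how the checker is implemented.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Parse a base-10 integer. Returns 0 on success and stores the value in
 * *out; returns -1 on failure and leaves *out untouched.
 */
int parse_int(const char *text, long *out)
{
    char *end;
    long value;

    if (text == NULL || out == NULL || *text == '\0')
        return -1;

    errno = 0;
    value = strtol(text, &end, 10);
    if (errno != 0 || *end != '\0')
        return -1;             /* out of range or trailing junk */

    *out = value;              /* the only success path sets the out-parameter */
    return 0;
}
```

The bug such a checker looks for is a success return which skips the assignment to *out, leaving the caller to read an uninitialized value.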

For the time being, this work is mostly used within the Mozilla project, though Taras would clearly like to see users from the wider community. He looks forward to a day when libraries are distributed with a plugin which ensures that the library is being used correctly. Another nice feature would be a distribution-wide DXR, enabling cross-package source browsing. For now, though, we have a set of tools that serves as a good proof of the concept that GCC plugins can be used for static analysis.


LCA: HackAbility

By Jonathan Corbet
January 27, 2010
Bright purple hair seems certain to make Liz Henry distinct from the crowd, but it's another attribute that she came to linux.conf.au 2010 to talk about: her wheelchair. It is, in essence, a machine to move her body around. It's not surprising that she would like it to be easy to fix or to hack on, but that is not how things are. Cars can be fixed easily; anybody with a few skills can start a car repair business. But this cannot be done with wheelchairs, which are much simpler devices. A wheelchair is a medical device, so the normal rules don't apply. Liz would like to change those rules; she also wants the rest of us to understand why we want to change them too.

People with disabilities may seem like a distinct group, but the fact of the matter is that almost all of us will be people with disabilities at some point in our lives. The average human, Liz says, will spend about eight years coping with some sort of disability. The result is a huge business, fueled by large amounts of money from insurance companies and government. That business is not greatly concerned with empowering disabled people; that's something we're going to have to take care of ourselves. We cannot depend on nanobots to keep us going as our bodies age; instead, we should be designing and coding for our future now.

People who want to hack their own disability solutions will find relatively little useful information online. Why? Possible reasons include profit motives in a highly lucrative industry, the perceived need for the intervention of medical experts when creating solutions, and concerns about liability should things go wrong. Disabled people also tend to be pushed into the role of passive charity recipients and isolated from each other. So what disability solutions exist come from the "medical industrial complex." Most of us will need these solutions at some point, and we'll want to be able to hack on them; the medical industrial complex is not much interested in helping us to do that.

The best progress which has been made so far is in the areas of vision, speech, and gaming. We're seeing less in mobility, so far. But, even there, simple hacks exist: it's common to see users of walkers who have fitted tennis balls over the feet to make them glide properly. (Your editor notes, with amusement, that Walmart is selling walker tennis balls for a mere $28 - the price of dozens of normal balls). This is a hack which is easily done, easily noticed, and easily copied, so it has spread widely. Pockets for crutches made of duct tape were another example presented in the talk.

A good example of how things fall down can be seen in the area of ramps. A ramp is not a complex device, but ramps must still be built properly if they are not to collapse or dump their users on the floor. Information on proper ramp building is discouragingly rare on the net, and what is there is not open to contributions. Other bits of interesting information - such as the soda bottle prosthesis - are available, but what we're seeing, still, is relatively small attempts. There's no real model for building community around this kind of information yet.

Disability-friendly software, too, is not an easy hack; accessibility tends to be treated as a last-minute add-on. Web site accessibility, too, is often an afterthought, and tends to be user-focused. This approach tends to lead to sub-standard solutions, but it also fails to lead to a free, do-it-yourself culture. We need good accessibility for developers too.

Liz talked about a number of projects aimed at making life better (and more hackable) for people with disabilities. Consider voice synthesis and screen reading: much of what's happening in this area is proprietary, but there are also projects like Festvox, Fire Vox, NVDA, and the tools at Full Measure (Speakup was not mentioned). Other interesting projects include:

Liz also mentioned the BBC accessible newsreader; she wishes that the BBC would release the code so that it could be incorporated into content management systems and made widely available.

On the other side, there are antifeatures which make life harder for those who would hack better solutions. These include systems which people with disabilities cannot contribute to and one-off solutions which cannot be extended or improved upon. Especially harsh words were reserved for those who exploit vulnerable people; there is an awful lot of incredibly expensive assistive technology out there. "Freaking out about liability" is also an antifeature; Liz feels that many of those concerns are greatly overblown. Selling out to industry - going for patents and profit rather than making technology available - is also a step in the wrong direction.

As an example of good and bad ways of doing things, Liz contrasted the Free Wheelchair Mission and Whirlwind Wheelchair International. The former makes dirt-cheap wheelchairs out of lawn chairs and bicycle wheels, then ships them by the container load to poor countries. It seems like a good idea, but dumping all those cheap chairs devastates any local market that may have developed. When the chairs break (which tends to happen soon), there's nobody left to help keep them going. Whirlwind, instead, is focused on partnering with local industry and sharing information, creating a more hackable solution with more people to hack on it.

The core message from the talk was that disabled people are hackers by necessity; we should bring them in, get their input, and enable them to create their own solutions. Their solutions will become our solutions. We should, Liz says, prepare to open-source our way out of the retirement prisons which are waiting for us.


Page editor: Jonathan Corbet

Inside this week's LWN.net Weekly Edition

  • Security: Encrypting users' web data with Grendel; New vulnerabilities in acroread, dokuwiki, kernel, phpgroupware,...
  • Kernel: fincore(); Graphics drivers; Back to the drawing board for utrace?; Replacing ptrace().
  • Distributions: Fedora launches Cloud SIG; Debian Edu lenny rc1; Guitar-ZyX LiveOS 0.4.1; EOL for Debian 4.0; Brockmeier leaves Novell, new community manager wanted; Ubuntu changes to Yahoo; ArchCon2010.
  • Development: Numerical computations with Scilab 5.2, GNOME in 2010, KDE Tech Talks, new versions of SpamAssassin, Kamailio, Apache Lenya, Ardour, Mozilla Lightning, Veusz, KDE, PyQt, SIP, Urwid, Wine, GNUmed, Pyspread, Firefox, GCC, IcedTea6, Parrot, simavr, LDTP, Git.
  • Announcements: Opensource.com launch, Red Hat state of union, EU clears Oracle/Sun, MS sues TiVo, Kindle Dev Kit, SUSE Appliance Toolkit, Operating System Tracing, LSE migrates to Linux, Misa Guitar, GUADEC cfp, OSCON cfp, Cloud Computing Forum, FTC Privacy Roundtable, Global Ignite Week, Python Ireland, SCALE schedule, Thailand MiniDebCamp.

Copyright © 2010, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds