OLPC's software update problem

From the outside, much of the work going on at the One Laptop Per Child project appears to be oriented toward hardware. Successive test versions of the now well-known little green computer have been produced, each with more powerful components and (presumably) fewer glitches than those which came before. Work on getting suspend/resume functioning properly - critical for the laptop to meet its power use goals - is heading toward the final stages. It looks like a nice machine.

The software side of the OLPC project is just as interesting as the hardware. The project has been occasionally criticized, though, for concentrating on hardware and being slow to get its software together. Much of this criticism is not really warranted; work on the Sugar environment has been underway for quite some time, and there are a number of interesting applications coming together for this platform. In an area or two, however, it does seem like problems are being addressed a little later than might have been optimal.

One of those areas, as evidenced by a series of discussions on the project's mailing list, is the issue of software updates. The OLPC project plans to deploy millions of laptops into environments where skilled system administrators are scarce. It seems certain that, sooner or later, there will be a need to update the software installed on those systems - perhaps urgently. It is reasonable to expect that the children using these laptops might just not be entirely diligent in checking for and installing updates. So something with a relatively high degree of automation will be required.

There are some additional complications which must be taken into account. The OLPC project has decided to dispense with Linux-style package managers in favor of a whole-image approach. OLPC has the resources to fund some fairly strong servers and network bandwidth, but putting together the resources which can handle pushing an update to millions of laptops at the same time might still be a challenge. In fact, simply coping with update-availability queries from that many laptops would require significant resources. So how will OLPC handle software updates? It turns out that they still don't really know.

Discussions started when Alexander Larsson showed up on the list with an announcement that he was working on the software update task. His proposal was an interesting combination of tools. In this scheme, a system image looks a lot like a git repository; it contains a "manifest" which (like a git index) has a list of files associated with SHA1 hashes of their contents. Updating a system involves getting a new manifest, seeing which files have changed, grabbing their contents, and dropping them in place. The actual safe updating of the system image is done by way of the Bitfrost security model which was announced last February.
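
The manifest comparison at the heart of this scheme can be illustrated with a minimal Python sketch. This is not Alex's code: the manifest is modeled here as a plain dictionary mapping paths to SHA1 hex digests, and the function names (build_manifest, diff_manifests) and sample files are hypothetical.

```python
import hashlib


def build_manifest(files):
    """Map each path to the SHA1 hex digest of its contents.

    `files` is a dict of path -> bytes, standing in for a system image
    (an assumption made for this sketch).
    """
    return {path: hashlib.sha1(data).hexdigest() for path, data in files.items()}


def diff_manifests(old, new):
    """Return (to_fetch, to_delete) for a move from manifest `old` to `new`."""
    # A file must be fetched if it is new or if its hash has changed.
    to_fetch = sorted(p for p, digest in new.items() if old.get(p) != digest)
    # A file must be removed if the new manifest no longer lists it.
    to_delete = sorted(p for p in old if p not in new)
    return to_fetch, to_delete


old_image = {"/bin/sh": b"shell v1", "/etc/motd": b"hello", "/usr/lib/old.so": b"x"}
new_image = {"/bin/sh": b"shell v2", "/etc/motd": b"hello", "/usr/bin/new": b"y"}

to_fetch, to_delete = diff_manifests(build_manifest(old_image),
                                     build_manifest(new_image))
print(to_fetch)   # ['/bin/sh', '/usr/bin/new']
print(to_delete)  # ['/usr/lib/old.so']
```

Only the files whose hashes differ need to cross the network; unchanged files such as /etc/motd are left alone.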

Alex's proposal uses the Avahi resource discovery protocol to find updates. Once one system on a given network (often the school server) obtains a copy of the update, it advertises it via Avahi. All laptops on the network can then notice the availability of the update and apply it. Once a laptop has the update, it, too, can make that update available over the mesh network, facilitating the distribution of the update to all systems on the net.

Ivan Krstić, the author of Bitfrost, has a different approach. It starts by taking advantage of one of the OLPC's more controversial features: the phone-home protocol. Laptops have to make regular contact with special servers to check whether they have been stolen; laptops which have been reported stolen can be shut down hard by the anti-theft server. Ivan's update proposal has the laptops checking for software updates while doing the "am I stolen?" check; the servers will be able to reply that the laptop remains with its owner, but that it is running old software and should update.

If the laptop needs an update, it will attempt to obtain the necessary files (using rsync) from the school server. If these attempts fail for a day or so, the laptop will eventually fall back to an "upstream master server" for the update files. The use of rsync allows updates to be transferred in a relatively bandwidth-friendly manner. Only changed parts of changed files need be transmitted over the net. It also has the advantage of being a known quantity; there is no doubt that rsync can be made to work in this setting. There is some concern that rsync tends to be resource intensive on the server side, meaning that those upstream master servers would probably have to be relatively powerful systems. If all goes well, though, the load on those servers would be mitigated by distributing updates through the school servers and staggering updates over time.
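
The fallback rule described above might look something like the following Python sketch. The one-day threshold is an assumed reading of "a day or so", and the function name and server labels are hypothetical; the actual rsync transfer is left out entirely.

```python
# "A day or so" in the proposal; the exact value here is an assumption.
FALLBACK_AFTER_SECONDS = 24 * 60 * 60


def choose_update_source(first_failure_time, now):
    """Pick which server the laptop should try next.

    first_failure_time is None if the school server has not failed yet,
    otherwise the timestamp (in seconds) of the first failed attempt.
    """
    if first_failure_time is None:
        return "school-server"
    if now - first_failure_time >= FALLBACK_AFTER_SECONDS:
        # The school server has been unreachable for too long; go upstream.
        return "upstream-master"
    return "school-server"


print(choose_update_source(None, 0))           # school-server
print(choose_update_source(0, 3600))           # school-server
print(choose_update_source(0, 2 * 24 * 3600))  # upstream-master
```

Keeping laptops on the school server for as long as possible is what keeps the load off the upstream master servers.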

Ivan's proposal has also been criticized because it requires the use of central servers rather than distributing updates through the mesh network. He responds:

It requires a server because I think it's outrageous to consider spending engineering time on inventing secure peer-to-peer OS upgrades, never before done in a mainstream system, over a network stack never before used in a mainstream system, two months before we ship.

As an aside, this conversation also brought out some serious unhappiness about the use of Linux-VServer in Bitfrost. The (seemingly permanent) out-of-tree status of Linux-VServer makes it harder to support over the long term; it seems that the project may well move to a different solution once it has shipped its first set of systems.

Back on the update front, yet another proposal was posted by C. Scott Ananian. In this scheme, each laptop will occasionally poll a master server to see if an update is available; this poll might take the form of a DNS lookup. The more systems there are on the local network, the less frequently these polls will happen.
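
One way to make polls less frequent as the local network grows is to scale each laptop's polling interval by the number of machines it can see, so that the combined query rate from the network stays roughly constant. The Python sketch below assumes exactly that rule; the proposal as described does not specify a formula, and the function name, the base interval, and the random jitter are all illustrative assumptions.

```python
import random


def next_poll_delay(base_interval, visible_peers, rng=random):
    """Return the number of seconds to wait before the next update poll.

    base_interval: delay a lone laptop would use (hypothetical parameter).
    visible_peers: number of other laptops seen on the local network.
    """
    laptops = max(1, visible_peers + 1)  # count this laptop too
    mean_delay = base_interval * laptops
    # Jitter keeps laptops from all polling at the same moment.
    return rng.uniform(0.5 * mean_delay, 1.5 * mean_delay)


alone = next_poll_delay(3600, 0)
crowded = next_poll_delay(3600, 99)
print(1800 <= alone <= 5400)        # True
print(180000 <= crowded <= 540000)  # True
```

With a rule of this kind, a hundred laptops sharing a network would together generate about as many queries as a single isolated machine.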

If a laptop discovers that an update is available, it will start pulling it down from the master server. This update will be divided into a number of small chunks, each of which is independently checksummed and signed. As those chunks come in, the receiving laptop will send them out to a multicast address on the local mesh; all other laptops in the area should then see it and grab a copy as it goes by. Once all of the required pieces have been received, the update can be applied. If a laptop misses a segment as it goes by, it will eventually time out and start actively grabbing (and rebroadcasting) pieces itself.
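
The chunk handling can be sketched in Python as follows. This is an illustration under stated assumptions rather than the proposed implementation: signatures are omitted (only a SHA-256 checksum is verified, so the sketch detects corruption but does not authenticate anything), the chunk size is tiny for demonstration, and the names split_update and Receiver are hypothetical.

```python
import hashlib

CHUNK_SIZE = 4  # tiny for demonstration; the real chunk size is not specified


def split_update(blob, chunk_size=CHUNK_SIZE):
    """Split `blob` into a list of (index, data, sha256_hex) chunks."""
    chunks = []
    for offset in range(0, len(blob), chunk_size):
        data = blob[offset:offset + chunk_size]
        chunks.append((offset // chunk_size, data,
                       hashlib.sha256(data).hexdigest()))
    return chunks


class Receiver:
    """Collects chunks in any order and reassembles the update."""

    def __init__(self, total_chunks):
        self.total = total_chunks
        self.have = {}

    def accept(self, index, data, digest):
        """Store a chunk seen on the network; reject it if the checksum fails."""
        if hashlib.sha256(data).hexdigest() != digest:
            return False
        self.have[index] = data
        return True

    def missing(self):
        """Indexes still to be fetched (or waited for)."""
        return [i for i in range(self.total) if i not in self.have]

    def assemble(self):
        if self.missing():
            raise ValueError("update incomplete")
        return b"".join(self.have[i] for i in range(self.total))


update = b"new system image"
chunks = split_update(update)
rx = Receiver(len(chunks))
for index, data, digest in reversed(chunks[1:]):  # chunk 0 "goes by" unseen
    rx.accept(index, data, digest)
print(rx.missing())             # [0]
rx.accept(*chunks[0])           # fetched actively after the timeout
print(rx.assemble() == update)  # True
```

The missing() list is what a laptop would consult after timing out: those are the pieces it has to start pulling down, and rebroadcasting, itself.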

Which approach will be adopted is not clear; if the project has decided on a proposal (or a combination of them), that decision has not been posted on a public list. Time is tight, though, and a rock-solid solution will have to be in place before the first production systems ship. It is, after all, risky to count on being able to fix the remote update system (remotely) after the fact.

For a more general view of the state of OLPC software, see this message from Walter Bender (the OLPC president for software and content). A lot is happening, but a number of desired features (including the famous "view source" key) will not be functioning when the first systems ship. The OLPC software, he says, is a work in progress - much like the rest of our software. The "progress" part is clearly happening, though, and OLPC appears to be on course to deliver a system which will bring computing power and network connectivity to millions of children - and which will change our views of how that should be done.

The conclusion of the GPLv3 process

The GNU General Public License has always been a controversial document. To some, it is at the core of what free software should be. To others, it is a needlessly complex license (at best) or an intrusive and unwelcome attempt to control how others use "free" software. Regardless of how one feels about it, the GPL has, since version 2 was written, become an important piece of regulation for the software industry. So it is not surprising that the effort to create a new major release of the GPL created some conflict. In fact, the surprising part might be just how little conflict there was.

In early 2005, before the rewrite process really took off, Eben Moglen gave a talk which discussed what was coming. There were, he said, four completely different sets of goals which a new license had to meet:

  • The GPL is a worldwide copyright license - a relatively rare thing in an industry where licenses tend to be specifically written to a particular country's laws.

  • It is a code of industry conduct, describing how players in the free software world can be expected to deal with each other. At this stage in the development of the industry, a new code of conduct cannot be imposed without extensive consultations with the affected companies.

  • It is a political document, the constitution of the free software movement.

  • Finally, the GPL is very much the product of Richard Stallman's thought. Mr. Moglen was clear from the outset that any revision of the license would have to be acceptable to Richard Stallman.

That is a wide set of criteria to satisfy; this is not a challenge that just anybody would want to take on.

Regardless of what one thinks of the final result, one cannot accuse Eben Moglen of failing to think hard about the process. Several committees were formed to represent the interests of different constituencies. Lawyers from all over the world were called together to work on language with truly global applicability. Major industry players were brought together on regular conference calls to discuss the progress of the license. Several draft releases were made - each with supporting documentation - and a mechanism by which anybody could make comments was created. Meetings were held all over the planet.

The final result was released on June 29. There are few who would call this result perfect; Mr. Moglen says:

It is a little too long; it is a little too complex. It divides cases where they might with some analytical clarity have been merged, and it merges cases that might with some analytical clarity have been divided. It isn't one man's work of art -- it's a community's work of self-definition. And in that process, it replicates an early version of a 21st century reality which is that if in the 21st century what is produced is produced by communities, not by individuals and not by factories, then under 21st century conditions, what produces law is communities, not individuals and not the factories we call legislatures.

The process would appear to have met all of the objectives set out for it. The language of GPLv2 is very much oriented toward U.S. law; GPLv3 makes it global. The free software industry, for the most part, has made a show of welcoming the new license; this appears to be a code of conduct that it can live with. The people who identify themselves strongly with the free software movement seem to be quite happy with this license. And, one expects, Richard Stallman is not overly displeased with what he got.

Others in the community have been very vocally unhappy with GPLv3. To them, this license overreaches, trying to regulate how people use the software instead of just how they distribute it. It has too many legal kludges and special cases. It has, in the view of some people, failed to live up to the Free Software Foundation's promise that revisions of the GPL would be "similar in spirit" to GPLv2. Instead, they say, the FSF has taken this rewrite as an opportunity to force its views on a world which may not otherwise be ready to adopt those views.

The good news is that those people, and the projects they represent, need not move to GPLv3. Version 2 of the license remains valid and usable; despite its American-style language it appears to be enforceable over much of the world. Nobody is trying to force any project to change to a license it does not like.

Expect spirited discussions within some projects as they try to decide whether to move to the new license or not. But the wider discussion is done, and GPLv3 is a reality. It will take years to see what the effect of this new license is. The patent licensing and anti-DRM clauses may well cause some companies to reconsider the use of free software in their products; in the worst case we could be seeing the beginning of the BSD comeback. As worst cases go, that one can only be seen as relatively benign.

This rewrite has probably gone as well as it could have, given the parameters within which the FSF operates. Never before has the FSF sought so much input - and actually acted on it. Whether one likes the end result or not, it is appropriate to thank the FSF for putting in its best effort, and especially to thank Eben Moglen for devoting so much of his life to such a difficult project.

Linux Symposium 2007 - a summary

The 2007 Ottawa Linux Symposium has run its course. All of the casualties from the closing party (perhaps made more numerous by the new practice of sending around waiters with trays full of shots of tequila) should have found their way home by now. Your editor has returned from this year's event; here's his summary of what took place.

Greg Kroah-Hartman has been digging through the kernel source repositories for statistics much like your editor has. The resulting numbers are similar, though Greg has cranked through the full 2+ years of history in the mainline git repository and, thus, has a longer-term sort of view. Among other things, he concluded that the kernel developers have averaged almost three changes per hour - every hour - over that period. About 2000 lines of code are added every day. That is a pace of development which is matched by few - if any - projects anywhere in the world. Greg also notes that the number of developers involved is growing with each release. This, he says, is a good sign; the kernel community is bringing in new developers, which is important for keeping the process healthy.

Those interested in the detailed numbers can find them in Greg's paper (all of the OLS papers are available online). What many people found as interesting as the numbers, however, was Greg's chain-of-trust poster. He took the signed-off-by path from every patch in 2.6.22 and plotted all of them as a big graph. The result, showing the approximately 900 developers who got patches into 2.6.22, was a plot some 40 feet long which crashed almost every printer he tried to print it on. The plot for the entire git repository history would have been nice, but, Greg says, it would have printed out at 250 feet.

One might have expected the plot to look like a nice, neat tree showing how patches move up through the subsystem maintainers toward the mainline. In fact, says Greg, it's "a mess." The interactions between kernel developers are broad and do not fit into any sort of simple hierarchy; it is a loose and flexible system. Greg encouraged all developers represented on the plot to sign their little bubbles; after the poster has run up some frequent-flier miles and acquired enough signatures, it will be auctioned off for some good cause. Over the course of the conference, just over 100 developers added their signatures.

Jon "maddog" Hall is not quite the ubiquitous figure at Linux conferences that he was a few years ago, so it was nice to see him show up at OLS this year. Maddog remains an engaging and amusing speaker. His topic this time was how we are really going to get Linux systems to the masses - especially in the urban environments which house much of the population in the developing world. His answer is thin clients. He would like to see most users working with small, low-power, fanless boxes with a nice screen and the ability to talk with a central server which hosts software, user files, and more. All running Linux, of course.

His vision for where this could go is ambitious: he would like to see 150 million of these thin clients deployed in Brazil, for example, supported by as many as 2 million servers. This would bring affordable computing to almost all of Brazil's city dwellers in an ecologically sensible way while providing about 2 million technical jobs. And it could all be done through private initiatives. If this sort of development can be made to happen, says Maddog, we may truly achieve the potential offered by computers and by free software.

Martin Bligh has an interesting job: he gets to find out what causes the occasional machine to go wrong in the middle of the massive Google network. It can be a real pain when, on occasion, one machine out of thousands will crash or slow way down in a non-reproducible way. And only in production, of course. Martin described a few such problems and how they were tracked down through the use of a set of tracing tools used at Google. Finding this kind of problem requires the ability to collect data in a flexible manner without disrupting ongoing operations. Google has developed the tools to do this sort of tracing; much of the resulting work will be merged into the LTTng project and made available to the community.

The keynote speaker this year was James Bottomley, who spoke on the topics of diversity and evolution. Diversity is the stream of new ideas which are always being directed toward any active free software project; evolution is the (sometimes harsh) process which selects the ideas which actually work. Evolution in this context is selecting mostly on the patience and innovation of the development community - not necessarily on the usefulness of a given patch. KAIO (kernel asynchronous I/O support) was given as an example here.

Maintainers play a vital part in the evolutionary process. The key to being a good maintainer - one who helps move the community forward - is to not reject changes out of hand but to work with developers to bring things up to kernel standards. Being a maintainer, says James, is not about saying "no"; it is about saying "no, but..."

Fragmentation is often raised by proprietary vendors as a way of scaring people away from Linux. Bringing up fragmentation is a way of calling up memories of the Unix wars, where fragmentation truly was a damaging phenomenon for just about everybody involved. In the free software world, though, we don't have fragmentation; instead, we have forking. James claims that forking is an essential source of diversity; it's necessary for continued innovation. No project, he says, is truly open unless it can fork. In the end, openness and evolution drive forks to merge back together, propagating the good ideas that resulted.

One final topic was nearly inevitable: closed-source drivers. Unlike some other speakers, James was unwilling to characterize such drivers as being either illegal or immoral. Instead, he looked at the costs involved in keeping drivers closed source - costs for both the vendor and the users - and concluded that closed-source drivers are simply "bloody stupid." Happily, he says, some vendors are figuring this out. He announced that Adaptec has become the first vendor to make use of the Linux Foundation's NDA program to provide information for the creation of free drivers for its products.

This year marks the first time in some years that the Kernel Summit was not held just before the Linux Symposium started; many people expressed concerns that kernel developers would stay away this year and OLS would not be as interesting an event. There was a reduction in the number of high-profile kernel developers this year, though quite a few were still in evidence. The 100 signatures on the 2.6.22 poster make an effective demonstration that OLS is able to attract kernel developers even without the summit. One result of the change may be that a few more relatively new and inexperienced developers were able to present this year; that should be seen as a good thing.

Something that fewer people worried about, but which may have hurt the conference more, was the absence of the desktop developers summit. Desktop developers were generally absent, making the 2007 Linux Symposium, if anything, even more kernel-centered than in previous years. Bringing together developers from all over our wider community is an important function of a conference; one hopes that the desktop folks will be back next year.

On the other hand, it was a pleasure to see the large "Linux on Cell" contingent sent by Sony. The embracing of Linux by a company which has not always been known for its openness can only be a good thing, and nobody was complaining about the frequent giveaways of PlayStation 3 systems - though your editor, with his usual luck, failed to win one. The Cell architecture seems destined to do interesting things, especially if the companies which are working with it continue to promote and support the use of Linux.

Back to the topic of next year: 2008 will be the tenth Linux Symposium; the organizers are clearly already thinking about how they can make it the best one yet. There is thought of moving it out of Ottawa to another Canadian city, and some possible changes to the organization of the event, including a track-oriented schedule and tutorial days, have been mentioned. This is all good; OLS is probably due for a makeover after all of these years. The 2007 event has shown that OLS can be successful on its own, without leaning on the kernel summit; perhaps 2008 will show us where this important community event can go in the future.

Page editor: Jonathan Corbet

Inside this week's LWN.net Weekly Edition

  • Security: Open proxy honeypots; New vulnerabilities in avahi, firebird, gimp, wireshark...
  • Kernel: OLS: Three talks on power management; Group scheduling for CFS; fallocate().
  • Distributions: Package management in Gentoo Linux; Slackware 12.0, Gutsy Gibbon Tribe 2; Fedora Board elections, Gentoo Council elections; FC5 EOL
  • Development: The launch of the Apache mod_atom module, improving GNOME, new versions of Allmydata-Tahoe, ddrescue, conntrack-tools, jwhois, Ardour, QjackCtl, Lightning and Sunbird, gEDA/gaf, Kicad, SQL-Ledger, cairo, Wine, gv, eSpeak, Gran Paradiso, Cpio, WengoPhone, SBCL, libmatheval.
  • Press: Moglen on GPLv3, the impermanence of proprietary data formats, aKademy coverage, OLS coverage, Google Desktop for Linux, Red Hat finances booming, Red Hat talked patents with Microsoft, the Power of Google Gears, Troubleshooting Linux Audio, new ATI control panel, OO.o calc options, Massachusetts adds Office OpenXML as an open format.
  • Announcements: FFII prize against OOXML, Six questions on OOXML, GPLv3 released, OpenMoko update, myFUNAMBOL portal, Linspire announces document translation effort, Motorola announces OpenSAF consortium, Palamida IP analyzer goes GPLv3, OLS papers.

Copyright © 2007, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds