LWN.net Weekly Edition for March 15, 2012

A look in on Apache OpenOffice

By Jake Edge
March 14, 2012

The Apache OpenOffice (AOO) project is in the final stretch toward its first release, AOO 3.4, but there are still some hurdles to clear. The current focus is largely on identifying and fixing the "release blocker" bugs that are being found in various developer snapshots. All of that is pretty normal for a project getting ready for a release, but AOO also needs to handle a few other loose ends. Because it is an Apache incubator project (a "podling" in Apache terms), it must undergo an intellectual property (IP) review and get approval before making the release.

The IP concerns stem from the change in license after Oracle donated the OpenOffice.org code to Apache. All Apache projects must release all of their code under the Apache Software License (ASL); any OpenOffice.org code that came directly from Oracle is easily switched from the LGPL, but code from other projects that has been incorporated into the office suite may not be available under the ASL. That has led AOO to carefully audit all of the code used and to remove or replace any non-ASL pieces. The IP review will then vet those changes to try to ensure that nothing has been missed. The process is well documented on the incubator site.

In fact, there is a truly eye-opening amount of documentation at the Apache incubator site that describes, sometimes in great detail, the life of a podling. It covers such things as how the podling should get set up in terms of organization and infrastructure, how it should prepare for a release and get IP clearance, along with the steps needed to eventually graduate to a full Apache Software Foundation (ASF) project. On one hand, documenting all of these processes is important and useful, but the sheer level of bureaucracy has to be daunting to some.

A podling first needs to get set up in the Apache infrastructure, which means setting up mailing lists and a Subversion repository for its code, but it must also learn "The Apache Way". From all of the documentation, as well as the gentle prodding from AOO mentors and other longtime Apache members on the ooo-dev mailing list, it is clear that the ASF is quite happy with its policies and procedures—not surprising given its level of success over the years. But all of that "extra" effort has certainly delayed the release of 3.4, to the point where frustration among users and developers is becoming evident.

The last release of OpenOffice.org (3.3) came more than a year ago, in January 2011. Oracle donated the code to the ASF in June 2011, but it has taken the better part of a year since then to get close to a new release. That's not to say that the project has been idle (far from it, as documented in Rob Weir's timeline), but it is a lot of work to move a project of its size to a new home. In the meantime, though, there hasn't been a lot of time to add new features.

New for 3.4

The most talked about new feature of AOO 3.4 is the native scalable vector graphics (SVG) import feature. OpenOffice.org had an external filter that used six GPL/LGPL libraries, which needed to be replaced. The new code has an SVG interpreter in the core, which provides better SVG support while also reducing the memory footprint and startup time—not to mention removing non-ASL code. While the feature was available as a filter in OpenOffice.org (and natively in Go-OO-derived versions of the suite including LibreOffice), it is new for AOO.

As the 3.4 release notes draft points out, there are two classes of updates: those that came from Oracle with the 3.4 beta in progress at the time of the transfer and those that have been added by AOO contributors since then. The Oracle contributions are largely incremental improvements to existing functionality, while those created for AOO may be more visible to users. Certainly SVG import fits in there, but there is also a new color picker dialog, new regular expression engine, support for line caps (i.e. how lines terminate and connect visually), and more.

All of the features and bug fixes are to the good, though they have been a long time in coming. For Linux systems, the AOO 3.4 release is likely to be largely a non-event, as most distributions switched to LibreOffice (LO) long ago. Most of the features from the 3.4 beta are already present in LO; should any of the AOO additions be of interest, they can be adopted as well, of course. It is in the Windows world (and to a lesser extent Mac OS X) that any rivalry between AOO and LO will really play out.

Apache OpenOffice and LibreOffice

It's clear that a rivalry does still exist between the projects, and that the bad blood between them has not been cleared up. A recent effort by Simon Phipps to clarify some facts about AOO seems to have run aground at least partly because of the unhappiness between the projects. After posting his query to the mailing list, Phipps was asked to put it into FAQ form on the wiki, which he did, but that doesn't seem to have helped. It could be argued that his wording was insufficiently neutral—many have—but his attempt was meant to answer questions that are commonly asked in various forums, mailing lists, and so on. His biggest mistake, it seems, was mentioning LO as a possible interim solution until the 3.4 release is ready. Eventually, Phipps gave up trying to work on the FAQ after Weir rewrote most of it.

Some users are understandably concerned that no releases of any form of "OpenOffice" have been made for more than a year now. Undoubtedly they are interested in new features, but bugs, particularly security bugs, haven't been addressed in that time either. It may be that there are no known security problems with OOo 3.3, but there is reason to believe otherwise. Some suggested the proprietary IBM Lotus Symphony (which is based on the OpenOffice code) as an alternative in the interim but that doesn't appear in the draft FAQ either at this point.

That conversation, which spreads itself out over at least three threads, is indicative of the tension between the two projects. There seems to be a fair amount of energy being expended in fairly pointless—quite possibly counter-productive—arguments about which project is the rightful owner of the "OpenOffice" brand and community going forward, along with combating things "in the press" and elsewhere that are deemed to be FUD. What's really needed, as is often pointed out, is to focus on the release. Right now, anyone asserting that AOO is superior to other alternatives is missing an important point: there is no AOO currently available and that won't change for a bit.

That is not to say that there aren't provocations from some on the LO side—there are. But at this point, the split has happened and there is no going back, so dwelling on it seems like wasted effort. It's likely that as the projects mature, there will be less sniping; it's rare to see KDE and GNOME engage in that sort of thing these days, for example. Once there is an AOO release, and the project graduates to a full-fledged Apache project, assuming that happens, some of the bad blood may start fading away.

Progress toward graduation

At least two of the podling mentors believe that progress is being made toward graduation. Ross Gardler listed numerous steps the project has taken toward that goal, concluding:

"In summary, yes I think the AOO project is well on its way to graduation. A release is a pre-requisite to graduation as that is the point at which the ASF is able to assert that the code is fully license compliant. Once the first release is complete I imagine graduation will not be far behind.

"I look forward to seeing AOO code allowing the further adoption of ODF alongside other great ODF related projects."

Joe Schaefer agreed, though he is "concerned about the level of commit activity being on the low-side". He hopes to see that pick up post-release as the project heads toward a 4.0 release. But, both Schaefer and Gardler are concerned about another problem, "learning to play nice with those not fully aligned to 'the one true vision'", as Gardler put it. There is a strong chorus of anti-LO sentiment that pervades the mailing list at times, even when it may not be in the best interest of OpenOffice users. That chorus is often led by Weir, who is one of the prime movers behind AOO and perhaps the most prolific mailing list poster.

As Schaefer pointed out, that is not an "Apache-esque" view of things: "At Apache we aren't in competition with other projects, we provide our work for the public benefit and leave discretion about adoption to the public." But Weir disagrees with that view, and his tone and demeanor sometimes seem to grate on contributors and potential contributors as well as on some of the project's mentors.

In the end, though, as many point out, it will come down to the code. Can AOO get a solid release out the door, and then continue that success down the road? That, much more than any branding question, is going to determine the long-term success of the project. At this point, it seems that there are only a handful of release-blocking bugs, and the first release candidate may be imminent. But, so far, there have been no comments on an attempt to get the wider Apache community to start looking at the IP issues, so that may still take some time.

While it is in many ways unfortunate that the LO/AOO split ever occurred, the projects can certainly benefit from competition. Even if code can really only flow one way (and divergence is likely to limit that eventually), good ideas can certainly flow both ways. There is plenty that these two communities can work together on: ODF interoperability and enhancements, security issues in the shared code, promoting free office suite alternatives, and so on. One hopes we will see more of that in the future.

Vagrant 1.0: Virtual machines at your fingertips

March 14, 2012

This article was contributed by Koen Vervloesem

If you want to get up and running quickly with virtual machines, Vagrant could come in handy. After two years of development, the project has announced Vagrant 1.0, which is the first stable release and the first release for which the developers promise backward compatibility.

Vagrant starts from the idea that many developers do their development and/or testing in virtual machines, whether to work with different distributions, to avoid reboots, or to keep their main workstation's operating system free of conflicting dependencies or simply bad packages. But having a couple of virtual machines installed requires managing them, and each time you need a fresh developer VM, you have to spend some time installing and configuring it. That's where Vagrant comes in: it's a tool that can automatically set up pre-configured virtual machine instances for development and testing purposes, based on one of many VM templates. According to the 1.0 release announcement, Vagrant is in use by Mozilla, LivingSocial, EventBrite, Yammer, Disqus, and many more organizations.

For the moment, Vagrant is focused on the creation of virtual machines for Oracle's VirtualBox, so you need VirtualBox installed (version 4.0 or higher). The Vagrant web site offers rpm, deb and Arch Linux packages of version 1.0 for 32- and 64-bit x86 Linux, as well as packages for Mac OS X and Windows. Alternatively, since Vagrant is written in Ruby, you can install it with RubyGems, Ruby's package manager (gem install vagrant).

Getting started

The project has published excellent and up-to-date documentation on its web site, as well as a "Getting Started" guide. Vagrant is controlled through subcommands of the vagrant command, and the configuration is done per project (preferably with each project in a separate directory) in a Vagrantfile, which serves a purpose similar to that of a Makefile in a development project. A Vagrantfile is actually a file containing Ruby code, which configures the project's virtual machine. Vagrant can create an initial Vagrantfile with the "vagrant init" command; the resulting file documents the most common configuration options in long comments.

Another important concept is that of "base boxes." Instead of creating a virtual machine instance from scratch, Vagrant bases its instances on templates, which are called base boxes. A base box is basically a tarball containing a root file system and a VM configuration with things like RAM and disk size. With a "vagrant box add" command you can download a base box from an HTTP URI or copy one from the local filesystem into your local Vagrant installation. After that, you can use this base box as a template for any of your projects by specifying its name in the Vagrantfile. The Vagrant web site contains a 32-bit, 259 MB Ubuntu Lucid Lynx base box, as well as a 64-bit variant.
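As an illustration, a minimal Vagrantfile that selects a base box might look like the sketch below. It assumes the 1.0-era configuration syntax (a Vagrant::Config.run block) and a box that has already been added under the name "lucid32"; that name is an assumption chosen for this example, not something Vagrant mandates.

```ruby
# Vagrantfile (plain Ruby, evaluated by Vagrant; 1.0-era syntax assumed)
# Assumes a base box was added beforehand under the name "lucid32",
# e.g. with: vagrant box add lucid32 <URI or path of the .box file>
Vagrant::Config.run do |config|
  # Name of the base box this project's VM is created from.
  config.vm.box = "lucid32"
end
```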

When running "vagrant up" for the first time in a project, Vagrant creates a virtual machine based on the base box and starts a headless instance using VirtualBox. Now you can do some work with the virtual machine. You can suspend and resume it ("vagrant suspend" and "vagrant resume"), or you can completely halt it with "vagrant halt", which shuts down the VM. If the virtual machine has been shut down, a "vagrant up" doesn't re-create the machine but boots the existing one instead. Another option is to completely delete the virtual machine with "vagrant destroy", which of course deletes the whole VM image and thus the files included in it. After a virtual machine is deleted, a "vagrant up" command will re-create it based on the configuration in the Vagrantfile.
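The lifecycle described above can be summarized as a small state table. The following Ruby sketch is only a toy model of the commands mentioned in the text, not Vagrant's own code; the state names (:not_created, :running, :saved, :poweroff) are labels invented for this illustration, and transitions the text does not mention are deliberately left out.

```ruby
# Toy model of the VM lifecycle; state names are hypothetical labels.
TRANSITIONS = {
  [:not_created, :up]      => :running,     # first "vagrant up" builds the VM from the base box
  [:running,     :suspend] => :saved,       # "vagrant suspend"
  [:saved,       :resume]  => :running,     # "vagrant resume"
  [:running,     :halt]    => :poweroff,    # "vagrant halt" shuts the VM down
  [:poweroff,    :up]      => :running,     # boots the existing VM, no re-creation
  [:running,     :destroy] => :not_created, # "vagrant destroy" deletes the VM image
  [:saved,       :destroy] => :not_created,
  [:poweroff,    :destroy] => :not_created,
}.freeze

# Return the state that follows `state` after running `command`.
# Raises ArgumentError for combinations this sketch does not model.
def next_state(state, command)
  TRANSITIONS.fetch([state, command]) do
    raise ArgumentError, "cannot #{command} a VM that is #{state}"
  end
end

# Walk one VM through a typical session.
state = :not_created
[:up, :halt, :up, :destroy].each { |cmd| state = next_state(state, cmd) }
# state is :not_created again, so the next "up" would re-create the VM
```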

An advantage of Vagrant is that these virtual machines are easily shareable, for instance with co-workers: you can package a virtual machine with the "vagrant package" command. Beyond that, you can create your own base boxes for your favorite Linux distribution. That way you can package a complete development environment in a Vagrant box and distribute it to others who can use this reproducible environment with a single command.

Configuration management

If the virtual machines you could create with Vagrant were limited to copies of the base boxes, this wouldn't be so useful, as you would have to create a base box for every configuration you need. Thankfully, Vagrant allows you to provision your virtual machines using the configuration management systems Puppet or Chef. This allows you to use a base box with the very basic functionality that all your virtual machines need, and then add extra packages and configuration changes using a Puppet manifest or a Chef cookbook that you refer to in the Vagrantfile.

Provisioning is done when you enter "vagrant up" or "vagrant reload" (which reloads the VM's complete configuration from the Vagrantfile), but you can also use "vagrant provision" to re-apply only the Puppet or Chef configuration after you have changed it. Vagrant can provision your virtual machines even if you don't want to run a Puppet or Chef server: it calls these modes Chef Solo provisioning and Puppet provisioning. The only thing you have to do is add the location of your manifests or cookbooks to the Vagrantfile. Of course Vagrant is also able to provision your virtual machines using an existing Puppet or Chef server.
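A serverless provisioning setup might be configured roughly as follows. This is a sketch assuming the 1.0-era Vagrantfile syntax; the box name, the "manifests" and "cookbooks" directories, the manifest file name, and the recipe name are all placeholders invented for the example.

```ruby
# Vagrantfile fragment: provisioning without a Puppet or Chef server.
# All names below are illustrative assumptions.
Vagrant::Config.run do |config|
  config.vm.box = "lucid32"  # assumed base box name

  # Puppet provisioning: apply a local manifest inside the VM.
  config.vm.provision :puppet do |puppet|
    puppet.manifests_path = "manifests"  # directory relative to the project
    puppet.manifest_file  = "base.pp"    # manifest to apply
  end

  # Chef Solo alternative: run recipes from local cookbooks instead.
  # config.vm.provision :chef_solo do |chef|
  #   chef.cookbooks_path = "cookbooks"
  #   chef.add_recipe "apache2"
  # end
end
```

After editing the manifest or cookbooks, "vagrant provision" would re-apply them to the running VM without a full reload.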

Talking to the virtual machine

But Vagrant isn't just about creating virtual machines based on templates and provisioning them. Its most powerful idea is that it sets up some channels to communicate with your virtual machines. For instance, it provides SSH access: with the command vagrant ssh, it logs you into the virtual machine so you'll be able to enter commands. X11 forwarding isn't enabled by default, but this can be configured in the Vagrantfile, which could come in handy if you have X installed in the virtual machine and you want to run graphical programs using ssh -X.

Moreover, Vagrant automatically configures your project's directory as a VirtualBox shared folder and mounts it in the virtual machine at /vagrant. The virtual machine has both read and write access to this directory, so you can easily use this to exchange files between your host system and the virtual machine. If the performance of the VirtualBox shared folder is insufficient (which is typically the case when you have thousands of files), you can also set up NFS shared folders.
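Both options can be switched on in the Vagrantfile. The fragment below is a sketch assuming the 1.0-era syntax; the box name and the IP address are arbitrary choices for illustration, and the host-only network is included on the assumption that NFS shares need one (an NFS server on the host is also required).

```ruby
# Vagrantfile fragment: X11 forwarding and an NFS-backed project folder.
# Box name and IP address are illustrative assumptions.
Vagrant::Config.run do |config|
  config.vm.box = "lucid32"

  # Forward X11 over "vagrant ssh" (disabled by default).
  config.ssh.forward_x11 = true

  # Host-only network so the guest can reach the host's NFS export.
  config.vm.network :hostonly, "192.168.33.10"

  # Re-declare the default project share so it is mounted at /vagrant
  # over NFS rather than as a VirtualBox shared folder.
  config.vm.share_folder "v-root", "/vagrant", ".", :nfs => true
end
```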

Vagrant also allows you to configure port forwarding in the Vagrantfile. For example, you could fire up a virtual machine with a test web server, forward its port 80 to a port on your host system, and then easily access the web server using a localhost URI, so you don't have to remember the virtual machine's IP address. It's also possible to create a multi-VM environment with multiple virtual machines (for instance a web and a database server) communicating with each other.
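A two-machine environment with a forwarded web port could be described along these lines. Again this is a sketch under the assumption of 1.0-era syntax; the machine names, box name, addresses, and host port are invented for the example.

```ruby
# Vagrantfile fragment: a web VM and a database VM on a host-only network.
# Machine names, box name, IP addresses, and ports are illustrative.
Vagrant::Config.run do |config|
  config.vm.define :web do |web|
    web.vm.box = "lucid32"
    web.vm.network :hostonly, "192.168.33.10"
    # Guest port 80 becomes reachable on the host as localhost:8080.
    web.vm.forward_port 80, 8080
  end

  config.vm.define :db do |db|
    db.vm.box = "lucid32"
    # The web VM can reach the database at this address.
    db.vm.network :hostonly, "192.168.33.11"
  end
end
```

With a configuration like this, the test web server would be reachable from the host at http://localhost:8080/ while the two machines talk to each other over the host-only addresses.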

Development

Vagrant is open source, released under the MIT License. The code is on GitHub, and its README file offers some help on how to contribute to Vagrant. There's the #vagrant IRC channel on Freenode and the mailing list for questions; the project also has an issue tracker for reporting bugs.

Vagrant was started in January 2010 by Mitchell Hashimoto and John Bender. The first release was version 0.1.0 on March 7, 2010, and exactly two years later it saw a 1.0 release. Vagrant development is not backed by any single company, but it's sponsored by Engine Yard and Kiip and has attracted contributions from over a hundred individuals during those two years. One of these outside contributions is the Veewee tool, created by Patrick Debois to make building your own base boxes easier:

"Veewee tries to automate this and to share the knowledge and sources you need to create a basebox. Instead of creating custom ISOs from your favorite distribution, it leverages the 'keyboardputscancode' command of Virtualbox to send the actual 'boot prompt' keysequence to boot an existing iso."

Veewee comes with a lot of templates for various Linux distributions, including CentOS, Debian, Fedora, Arch Linux, Gentoo, openSUSE, Ubuntu, and so on, as well as FreeBSD, OpenBSD, OpenIndiana, and even Windows. The best thing about these templates is that you can see how they are made, so you can adapt them to your needs.

Other contributors have created plugins for Vagrant. A simple:

    gem list -r | grep vagrant

command reveals more than a dozen RubyGems for Vagrant plugins. For example, Igor Sobreira has created the vagrant-screenshot plugin to take a screenshot from a running virtual machine to help debug booting issues. And Tyler Croy has integrated Vagrant with the continuous integration tool Jenkins.

The Vagrant project welcomes any contribution: code, documentation, as well as financial aid. There's a rather detailed explanation about how companies can support the project financially, by donating, sponsoring, and paying for specific feature implementations or bug fixes. The project is also very open about its current and future costs.

The future

While previous Vagrant releases regularly changed the syntax of the Vagrantfile, which could lead to some frustrations if you were an early adopter, the 1.0 release marks the end of this time of experimenting, according to the release announcement:

"Equally important is that Vagrant 1.0 is the first release where backwards compatibility for the Vagrantfile will be maintained for the far future. Backwards incompatible changes to the Vagrantfile will no longer happen (exactly how this will be achieved will be revealed in the future, as I've devised a way to do so without compromising innovation)."

Currently Vagrant only supports VirtualBox, but the plan is to support additional hypervisors, such as KVM, VMware Fusion, VMware vSphere, and so on. If you need extra functionality, you can add it using Vagrant's plugin system. All in all, the basic idea of distributable boxes, coupled with extensibility through plugins, makes Vagrant a handy tool for development and testing. Add to this the excellent documentation and the ecosystem of Veewee templates, and Vagrant may well be able to save you a lot of time.

OIN expands its coverage

By Jonathan Corbet
March 13, 2012

The Open Invention Network recently announced the expansion of its "Linux System Definition," meaning that a larger range of software is now covered by the group's patent license agreement. New packages on the list include Git, OpenJDK and WebKit; that list has also been updated to cover current versions of the listed packages. This expansion is welcome, but it also highlights some of the limitations of what an organization like OIN can accomplish.

OIN is meant to be a sort of patent club that reduces the risk of patent litigation for its members. OIN members sign on to the organization's patent license agreement, granting a license to their patents to all other members for use with Linux. There is a set of patents owned by OIN itself; companies gain access to those patents by signing the agreement. But the real value in OIN membership is meant to be protection from other OIN members; no member may assert patent claims against another member (with some exceptions - see below) without risking the loss of its own patent use rights under the agreement. The list of OIN licensees makes it clear that a lot of companies, including Cisco Systems, Collabora, Canonical, Google, HP, IBM, Mozilla, NEC, Novell, Oracle, Philips, Red Hat, Sony, and Twitter, see value in this arrangement.

That said, there are some obvious limitations to the benefits of OIN membership. It is sometimes said that members may use the full set of licensed patents in their defense, but there is nothing in the agreement that allows that use. No OIN member is required to use their patents (or to allow them to be used) in a counterattack against a patent aggressor. Indeed, if one OIN licensee (call it "EvilCorp") sues another ("NiceCorp"), a third licensee (that we'll call "ConcernedCorp") still cannot, by the agreement, withdraw the patent license it granted to EvilCorp - though, interestingly, the license for patents owned by OIN itself can be withdrawn in this situation.

In other words, OIN reduces the chances of being attacked by its other members, along with reducing the chances that such an attack would succeed. It offers no real counterattack capability at all. The agreement also only covers OIN licensees; it says nothing about their customers, who could still be the target of an attack.

The license agreement only applies to the "Linux System," a well-defined list of programs that must be used with the Linux kernel. That list contains almost 1900 programs making up the bulk of what one might expect to find on a typical Linux system, though certain types of applications - mplayer and VLC, for example - are notably missing. The agreement applies to specific versions of these programs; the 3.1.0 kernel is on the latest list, for example. "Successor releases" are also covered with an interesting exception:

"to the extent such later release contains modifications to existing functionality for: compatibility (e.g., standards compliance or porting), performance enhancements (e.g., increasing execution speed, code maintainability, security or bug resistance), usability, and localization and internationalization, but to the extent the later release contains new functionality which does not exist in such component, the portion of the later release providing such new functionality is not included..."

So just about anything can be tossed in as long as it's a bug fix or a performance or usability enhancement; as soon as it crosses the line into adding "new functionality" the coverage ceases. One can easily imagine a future court case hinging on whether a change is a usability improvement (covered) or a new feature (not covered). To be covered, the code must be distributed by the project's maintainer. Private changes are not covered, but the unchanged code remains covered in private versions.

There are some exceptions, though, even with regard to the exact versions of packages on the list. Anything that implements something that looks like a digital video recorder, DVD player or recorder, or an electronic program guide is excluded. Anything involving codecs is also excluded except for those found on this list; GIF, PNG, and FLAC are all covered, as is "RAW" (whatever that means), but many others, including some intended to be unencumbered, are absent from the list. Codecs remain a patent minefield, and OIN has not attempted to solve that problem.

While Philips and Sony are OIN licensees, they have carved out some additional exceptions for themselves. These include anything having to do with Blu-ray, "receiver functionality," anything related to DRM, or "digital display technology." And those are the small ones. These companies also except anything having to do with wireless networking - including both WiFi and networking through a cellular network. "Camera functionality" - anything capable of capturing an image - is excluded. There is also an exception for "technology for human-computer interaction, including interaction and appearance of applications, and remote control technology." For good measure, Philips also excludes virtualization.

In other words, Philips and Sony want the protection of OIN for everything not directly related to their product areas, but they want the ability to sue for anything else. And OIN is willing to accept them on those terms, evidently thinking that half a license is better than none. It is worth noting that both of those companies are listed as "founding members," a title which, presumably, does not come for free. The fact that no other companies have joined with such conditions suggests that they are expensive indeed; that is probably a good thing.

With all these exceptions, one might well wonder how much benefit actually derives from OIN membership. The fact that both Oracle and Google are members has not prevented Oracle from filing patent suits against Google (albeit relating to code that is not on OIN's list). Outright patent trolls will, of course, not be interested in OIN membership and will not be bound by its license. Similarly, companies like Apple and Microsoft have, thus far, declined the opportunity to be a part of OIN. All told, there is no evidence that the OIN has ever prevented a patent shakedown.

That said, one must recognize that any such evidence would be most difficult to find. No company will announce that it would have asserted its patents against another had it not been for those meddling OIN kids. It will always be difficult to measure the success of an organization like OIN; one can only try to read between the lines when looking at what companies do and don't do. For example, Microsoft's settlement of the TomTom suit, evidently on relatively favorable terms, happened shortly after TomTom joined OIN. Whether there is causality there or merely correlation is only really known to Microsoft's lawyers, but some people have certainly seen a connection.

Legal organizations like OIN are about reducing risk; in that regard OIN, by gathering together a long list of companies that are willing to license their patents for use with Linux, has almost certainly succeeded. It is also important as a very public statement by those companies that the free software commons (or, at least, a significant subset thereof) should be a sort of patent commons as well. OIN is certainly not a solution to the software patent problem, but it is a useful mitigating factor in a world where software patents continue to exist. So the updating and expansion of its list of covered software can only be a good thing.

Page editor: Jonathan Corbet

Inside this week's LWN.net Weekly Edition

  • Security: CAP_SYS_ADMIN: the new root; New vulnerabilities in freetype, glibc, Mozilla products, python-pam, ...
  • Kernel: Kernel competition in the enterprise space; The trouble with stable pages; A deep dive into CMA.
  • Distributions: Running Android on x86; Arch, CentOS, Dream Studio, Skolelinux, ...
  • Development: OpenSSL and IPv6; bzr, Firefox, gnuplot, laborejo, ...
  • Announcements: EFF on Ubuntu 12.04 privacy options, InfoWorld on OIN, ...

Copyright © 2012, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds