By Jonathan Corbet
November 9, 2010
Back when the 2010 Linux Plumbers Conference was looking for presentations,
the LibreOffice project had not yet announced its existence. So Michael
Meeks put in a vague proposal for a talk having to do with OpenOffice.org
and promised the organizers it would be worth their time. Fortunately,
they believed him; in an energetic closing keynote, Michael talked at
length about what is going on with LibreOffice - and with the free software
development community as a whole. According to Michael, both good and
bad things are afoot. (Michael's slides [PDF] are available for those who
would like to follow along.)
Naturally enough, LibreOffice is one of the good things; it's going to be
"awesome." It seems that there are some widely diverging views on the
awesomeness of OpenOffice.org; those who are based near Hamburg (where
StarDivision was based) think it is a wonderful tool. People in the rest
of the world tend to have a rather less enthusiastic view. The purpose of
the new LibreOffice project is to produce a system that we can all be proud
of.
Michael started by posing a couple of questions and answering them, the
first of which was "why not rewrite into C# or HTML5?" He noted with a
straight face that going to a web-based approach might not succeed in
improving the program's well-known performance problems. He also said that
he has yet to go to a conference where he did not get kicked off the
network at some point. For now, he just doesn't buy the concept of doing
everything on the web.
Why LibreOffice? Ten years ago, Sun promised the community that an
independent foundation would be created for OpenOffice.org. That
foundation still does not exist. So, quite simply, members of the
community got frustrated and created one of their own. The result, he
says, is a great opportunity for the improvement of the system; LibreOffice
is now a vendor-neutral project with no copyright assignment requirements.
The project, he says, has received great support. It is pleasing to
have both the Open Source Initiative and the Free Software Foundation
express their support, but it's even more fun to see Novell and
BoycottNovell on the same page.
Since LibreOffice launched, the project has seen 50 new code contributors
and 27 new translators, all of whom had never contributed to the project
before. These folks are working, for now, on paying down the vast pile of
"technical debt" accumulated by OpenOffice.org over the years. They are
trying to clean up an ancient, gnarled code base which has grown
organically over many years with no review and no refactoring. They are
targeting problems like memory leaks which result, Michael said, from the
"opt-in approach to lifecycle management" used in the past. After ten
years, the code still has over 100,000 lines of German-language comments;
those are now being targeted with the help of a script which repurposes the
built-in language-guessing code which is part of the spelling checker.
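The script itself was not shown in the talk as reported here, so the following is only a minimal sketch of the general idea: pull the comments out of C++ source and flag the ones that look German. It assumes a crude stopword count in place of the spelling checker's language-guessing code, and the word lists and the function names looks_german() and german_comments() are hypothetical, chosen for illustration.

```python
import re

# Hypothetical hint words; a real language guesser relies on much richer
# statistics than a handful of common words.
GERMAN_HINTS = {"der", "die", "das", "und", "ist", "nicht", "wird", "wenn",
                "für", "mit", "ein", "eine", "auf", "noch", "auch", "muss"}
ENGLISH_HINTS = {"the", "and", "is", "not", "if", "for", "with", "a", "an",
                 "this", "that", "of", "to", "be", "must"}

# Matches "// ..." line comments and "/* ... */" block comments.
# Naive by design: it does not skip comment markers inside string literals.
COMMENT_RE = re.compile(r"//([^\n]*)|/\*(.*?)\*/", re.DOTALL)


def looks_german(comment: str) -> bool:
    """Guess whether a comment is German by counting hint words."""
    words = re.findall(r"[a-zäöüß]+", comment.lower())
    german = sum(word in GERMAN_HINTS for word in words)
    english = sum(word in ENGLISH_HINTS for word in words)
    return german > english


def german_comments(source: str) -> list[str]:
    """Return the text of every comment in source that looks German."""
    found = []
    for match in COMMENT_RE.finditer(source):
        text = (match.group(1) or match.group(2) or "").strip()
        if looks_german(text):
            found.append(text)
    return found


if __name__ == "__main__":
    sample = (
        "int a; // der Wert ist noch nicht gesetzt\n"
        "int b; /* the value is not set */\n"
    )
    for comment in german_comments(sample):
        print(comment)
```

A heuristic this simple will misjudge short or mixed-language comments, which is presumably why reusing a real language guesser is attractive; the output of such a pass is a list of candidates for a human to translate, not an automatic rewrite.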
OpenOffice.org has a somewhat checkered history when it comes to revision
control. CVS was used for some years, resulting in a fair amount of pain;
simply tagging a release would take about two hours to run. Still, they
lived with CVS for some time until OpenOffice.org launched into a study to
determine which alternative revision control system would be best to move
to. The study came back recommending Git, but that wasn't what the
managers wanted to hear, so they moved to Subversion instead - losing most
of the project's history in the process. Later, the project moved to
Mercurial, losing history again. The result is a code base littered with
commented-out code; nobody ever felt confident actually deleting anything
because they never knew if they would be able to get it back. Many code
changes are essentially changelogged within the code itself as well. Now
LibreOffice is using Git and a determined effort is being made to clean
that stuff up.
LibreOffice is also doing its best to make contribution easy. "Easy hacks"
are documented
online. The project is making a point of saying:
"we want your changes." Unit tests are being developed. The crufty old
virtual object system - deprecated for ten years - is being removed. The
extensive pile of distributor patches is being merged. And they are
starting to see the addition of interesting new features, such as inline
interactive formula editing. There will be a new mechanism whereby
adventurous users will be able to enable experimental features at run time.
What I really came to talk about was...
There is a point in "Alice's Restaurant" where Arlo Guthrie, at the
conclusion of a long-winded tall tale, informs the audience that he was
actually there to talk about something completely different. Michael did
something similar after putting up a plot showing the increase in outside
contributions over time. He wasn't really there to talk about a desktop
productivity application; instead, he wanted to talk about a threat
he sees looming over the free software development community.
That threat, of course, comes from the growing debate about the ownership
structure of free software projects. As a community, Michael said, we are
simply too nice. We have adopted licenses for our code which are entirely
reasonable, and we expect others to be nice in the same way. But any
project which requires copyright assignment (or an equivalent full-license
grant) changes the equation; it is not being nice. There is some
behind-the-scenes activity going on now which may well make things worse.
Copyright assignment does not normally deprive a contributor of the right
to use the contributed software as he or she may wish. But it
reserves to the corporation receiving the assignments the right to make
decisions regarding the complete work. We as a community have
traditionally cared a lot about licenses, but we have been less concerned
about the conditions that others have to accept. Copyright assignment
policies are a barrier to entry to anybody else who would work with the
software in question. These policies also disrupt the balance between
developers and "suit wearers," and they create FUD around free software
license practices.
Many people draw a distinction between projects owned by for-profit
corporations and those owned by foundations. But even assignment policies
of the variety used by the Free Software Foundation have their problems.
Consider, Michael said, the split between emacs and xemacs; why does xemacs
continue to exist? One reason is that a good chunk of xemacs code is owned
by Sun, and Sun (along with its successor) is unwilling to assign copyright
to the FSF. But there is also a group of developers out there who think
that it's a good thing to have a version of emacs for which copyright
assignment is not required. Michael also said that the FSF policy sets a
bad example, one which companies pushing assignment policies have been
quick to take advantage of.
Michael mentioned a study entitled "The Best
of Strangers" which focused on the willingness to give out personal
information. All participants were given a questionnaire with a long list
of increasingly invasive questions; the researchers cared little about the
answers, but were quite interested in how far participants got before
deciding they were not willing to answer anymore. Some
participants received, at the outset, a strongly-worded policy full of
privacy assurances; they provided very little information. Participants
who did not receive that policy got rather further through the
questionnaire, while those who were pointed to a questionnaire on a web
site filled it in completely. Starting with the legalese ruined the
participants' trust and made them unwilling to talk about themselves.
Michael said that a similar dynamic applies to contributors to a free
software project; if they are confronted with a document full of legalese
on the first day, their trust in the project will suffer and they may just
walk away. He pointed out the recently-created systemd project's policy,
paraphrased as "because we value your contributions, we require no
copyright assignments," as the way to encourage contributors and earn their
trust.
Assignment agreements are harmful to the hacker/suit balance. If you work
for a company, Michael said, your pet project is already probably owned by
the boss. This can be a problem; as managers work their way into the
system, they tend to lose track of the impact of what they do. They also
tend to deal with other companies in unpleasant ways which we do not
normally see at
the development level; the last thing we want to do is to let these
managers import "corporate aggression" into our community. If suits start
making collaboration decisions, the results are not always going to be a
positive thing for our community; they can also introduce a great deal of
delay into the process. Inter-corporation agreements tend to be
confidential and can pop up in strange ways; the freedom to fork a specific
project may well be compromised by an agreement involving the company which
owns the code. When somebody starts pushing inter-corporation agreements
regarding code contributions and ownership, we need to be concerned.
Michael cited the agreements around the open-sourcing of the OpenSPARC
architecture as one example of how things can go wrong. Another is the
flurry of lawsuits in the mobile area; those are likely to divide companies
into competing camps and destroy the solidarity we have at the development
level.
Given all this, he asked, why would anybody sign such an agreement? The
freedom to change the license is one often-cited reason; Michael sees
using permissive licenses or "plus licenses" (those which allow "any later
version") as a better way of addressing that problem. The ability to offer
indemnification is another reason, but indemnification is entirely
orthogonal to ownership. One still hears the claim that full ownership is
required to be able to go after infringers, but that has been decisively
proved to be false at this point. There is also an occasional appeal to
weird local laws; Michael dismissed those as silly and self-serving. There
is, he says, something else going on.
What works best, he says, is when the license itself is the contributor
agreement. "Inbound" and "outbound" licensing, where everybody has the
same rights, is best.
But not everybody is convinced of that. Michael warned that there is "a
sustained marketing drive coming" to push the copyright-assignment agenda.
While we were sitting in the audience, he said, somebody was calling our
bosses. They'll be saying that copyright assignment policies are required
for companies to be willing to invest in non-sexy projects. But the fact
of the matter is that almost all of the stack, many parts of which lack
sexiness, is not owned by corporations. "All cleanly-written software,"
Michael says, "is sexy." Our bosses will hear that copyright assignment is
required for companies to get outside investment; it's the only way they
can pursue the famous MySQL model. But we should not let monopolistic
companies claim that their business plans are good for free software;
beyond that, Michael suggested that the MySQL model may not look as good as
it did a year or two ago. Managers will be told that only assignment-based
projects are successful. One need only look at the list of successful
projects, starting with the Linux kernel, to see the falseness of that
claim.
Instead, Michael says, having a single company doing all of the heavy
lifting is the sign of a project without a real community. It is an
indicator of risk. People are figuring this out; that is why we're seeing
an increasing number of single-company projects being forked and
rewritten. Examples include xpdf and poppler, libart_lgpl and cairo, MySQL
and Maria. There are a number of companies, Novell and Red Hat included,
which are dismantling the copyright-assignment policies they used to
maintain.
At this point, Michael decided that we'd had enough and needed a brief
technical break. So he talked about Git: the LibreOffice project likes to
work with shallow clones because the full history is so huge. But it's not
possible to push patches from a shallow clone, which is a pain. Michael
also noted that git am is obnoxious to use. On the other
hand, he says, the valgrind DHAT
tool is a wonderful way of analyzing heap memory usage patterns and finding
bugs. Valgrind, he says, does not get anywhere near enough attention.
There was also some brief talk of "component-based everything" architecture
and some work the project is doing to facilitate parallel contribution.
The conclusion, though, came back to copyright assignment. We need to
prepare for the marketing push, which could cause well-meaning people to do
dumb things. It's time for developers to talk to their bosses and make it
clear that copyright assignment policies are not the way toward successful
projects. Before we contribute to a project, he said, we need to check
more than the license; we need to look at what others will be able to do
with the code. We should be more ungrateful toward corporations which seek
to dominate development projects, and we should get involved with more open
alternatives.
One of those alternatives, it went without saying, is the LibreOffice
project. LibreOffice is trying to build a vibrant community which
resembles the kernel community. But it will be more fun: the kernel,
Michael said, "is done" while LibreOffice is far from done. There is a lot
of low-hanging fruit and many opportunities for interesting projects. And,
if that's not enough, developers should consider that every bit of memory
saved will be multiplied across millions of LibreOffice users; what better
way can there be to offset one's carbon footprint? So, he said, please
come and help; it's an exciting time to be working with LibreOffice.