By Jake Edge
May 5, 2010
There are thousands of embedded devices running Linux today, with more
released hourly it seems. Many of those are in full compliance with the
licenses for the free software that they ship, but some, sadly, are not.
In most cases, it is probably due to ignorance, but sometimes arrogance or
even malfeasance plays a role. A new Apache-licensed Binary Analysis Tool from
Armijn Hemel and Shane Coughlan is meant to help developers and others
interested in GPL compliance determine whether Linux or BusyBox is
present in a particular device.
There are multiple levels to GPL compliance investigations. If a
device ships with neither source nor an offer to provide it, the vendor is
implicitly claiming that it contains no GPL code. In that case, just detecting the
presence of the Linux kernel or BusyBox is enough to identify a problem.
For devices that do ship or offer source, there is another step:
determining whether the source code and configuration that was provided
corresponds to the code on the device. That process was described by Hemel
and Coughlan in a series of LWN articles (part 1, part 2, and part 3).
The first step is to extract any filesystems that exist in
a firmware image, so that they can be investigated further. The Binary
Analysis Tool provides the bruteforce.py script to detect various kinds of
filesystems, including those that are compressed, and to extract them from
the image. It then digs down inside the filesystem to find "interesting"
files. Right now, the output is terse, but that is slated to change "in the
near future", according to the README file.
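As a rough illustration of what detecting filesystems in a firmware image can involve, the sketch below scans a blob for a few well-known magic numbers and reports the offsets where they occur. It is not taken from bruteforce.py; the function names and the small signature table are assumptions made for this example, and a real tool would need many more signatures plus validation of each hit.

```python
# Illustrative sketch only; NOT the Binary Analysis Tool's actual code.
# The signature table is a small, hand-picked sample (an assumption).
MAGIC_SIGNATURES = {
    "gzip": b"\x1f\x8b\x08",
    "squashfs (little-endian)": b"hsqs",
    "squashfs (big-endian)": b"sqsh",
}


def find_signatures(data):
    """Return a sorted list of (offset, name) for every signature found."""
    hits = []
    for name, magic in MAGIC_SIGNATURES.items():
        start = 0
        while True:
            offset = data.find(magic, start)
            if offset == -1:
                break
            hits.append((offset, name))
            start = offset + 1  # keep looking past this hit
    return sorted(hits)


def scan_image(path):
    """Read a firmware image from disk and scan it for known signatures."""
    with open(path, "rb") as image:
        return find_signatures(image.read())
```

Each reported offset is only a candidate: short magic numbers turn up by chance in compressed data, so a hit would still need to be confirmed by actually trying to unpack the data found there.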
Beyond that, there are scripts to look at BusyBox and kernel binaries to
extract configuration information. Running:
python busybox.py --binary=/path/to/busybox
on a BusyBox binary results in a list of configuration options that shows
which of the applets were built into the binary:
CONFIG_ADDGROUP=y
CONFIG_ADDUSER=y
CONFIG_ADJTIMEX=y
...
BusyBox configuration is important because it can be a clue as to whether or not
the source corresponds to the binary. In fact, the tool provides an
automated way to compare the configuration found in a binary with one that
is included in the source: busybox-compare-configs.py.
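The comparison itself is conceptually simple. The following sketch, which is not the code in busybox-compare-configs.py, parses two configurations in the CONFIG_NAME=y form shown above and reports the options enabled in one but not the other; parse_config and compare_configs are hypothetical names chosen for the example.

```python
# Illustrative sketch only; NOT the tool's busybox-compare-configs.py.
def parse_config(text):
    """Map option name -> value for lines such as CONFIG_ADDUSER=y."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks and comments such as "# CONFIG_FOO is not set".
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep:
            options[name] = value
    return options


def compare_configs(from_binary, from_source):
    """Report options enabled on one side only."""
    binary = parse_config(from_binary)
    source = parse_config(from_source)
    enabled_binary = {n for n, v in binary.items() if v == "y"}
    enabled_source = {n for n, v in source.items() if v == "y"}
    return {
        "only_in_binary": sorted(enabled_binary - enabled_source),
        "only_in_source": sorted(enabled_source - enabled_binary),
    }
```

Any entry in either list suggests that the provided configuration is not the one used to build the binary, which is worth a closer look.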
The tool uses a database of sorts for BusyBox configurations going back to
the 0.52 release. The busybox-version.py command can be used to
manually determine the version of a binary, or the other tools will do so
automatically—though it can be overridden on the command line. In
addition, the busybox.py script can check for applets in a binary
for which there is no configuration option in the official BusyBox sources,
which would indicate that additional code (for which source must be
released) has been added.
There are also scripts to extract configuration and strings from a Linux
kernel. extractkernelstrings.py is used on a provided kernel
source tree and generates a database of strings that should be present in
the kernel image. findkernelstrings.py then uses that database and
the kernel image file to find matches, and, more importantly, things that do
not match. Once again, this can lead to a determination that the source
code and shipped binaries are either not the same, or not configured in the
same way.
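The matching idea can be sketched in a few lines. The example below is an assumption-laden simplification: it swaps the tool's Lucene-backed database for a plain Python set, and the function names are invented for illustration. It pulls printable ASCII strings out of a binary image and lists those that do not appear among the strings gathered from the source tree.

```python
# Illustrative sketch only; the real tool stores source strings in a
# Lucene index, which is replaced here by an in-memory set.
import re


def printable_strings(data, min_length=6):
    """Return the set of printable ASCII runs of at least min_length bytes."""
    pattern = re.compile(
        rb"[\x20-\x7e]{" + str(min_length).encode("ascii") + rb",}"
    )
    return {match.group().decode("ascii") for match in pattern.finditer(data)}


def unmatched_strings(image_data, source_strings, min_length=6):
    """List strings found in the image that the source tree does not explain."""
    found = printable_strings(image_data, min_length)
    return sorted(found - set(source_strings))
```

Strings that turn up in the image but not in the source are the interesting ones, since they hint at code or configuration that the provided source does not account for.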
Due to various reverse engineering laws worldwide, the Binary Analysis Tool does
not do any kind of decompilation or disassembly of the code that it finds.
It strictly looks at the symbol tables and strings stored in the binaries
to do its work. For much the same reason, it does not try to "crack" any
encryption or DRM that might be protecting the firmware image or its contents.
The tool is still a bit rough around the edges, but does come with fairly
extensive documentation, both as PDF Quick Start and User guides and as
various documentation files in the source tree.
It comes as a tarball or can be grabbed from an svn repository. The list of
dependencies seems a bit large for a program of this type. For the kernel
strings database, it includes the PyLucene Python library for accessing the
Java-based Lucene text searching and indexing engine, which necessitates
installing OpenJDK and Ant. More obvious dependencies, such as python-magic
for magic numbers, e2tools and squashfs tools for accessing filesystems, and
various compression utilities, are required as well.
The development of the Binary Analysis Tool was supported by the NLnet
Foundation and the Linux Foundation, and it was created by Hemel as part of
his work at Loohuis Consulting and by Coughlan at OpenDawn. It is still
being actively developed, with releases scheduled for May and July.
Contributions of bug reports, development time, or money to continue
development are welcome.
While the scripts will be useful as a starting point for those who are
investigating GPL compliance, there is still quite a bit of work to be
done. The tool provides a framework for looking at two of the most common
GPL-licensed components appearing in embedded devices, but there are
others. It's no coincidence that the tool focuses on BusyBox and the
Linux kernel, which have been the most successful at
enforcing license compliance in the last several years. As other projects
are used more widely in embedded devices, there will be a need to
expand the coverage of tools like this.
There are uses for the tool beyond those of developers trying to ensure
that their code is used properly.
Embedded device manufacturers will also find it useful. There have been
numerous cases of OEMs getting code from their suppliers without the proper
source files—or even notice that it contains GPL code. Companies can
also test their competitors' products for compliance to help level the
playing field. Any tool that makes it easier to spot license compliance
problems is a boon for developers, users, and device makers.
May 5, 2010
This article was contributed by Nathan Willis
Koha is the
world's first open source system for managing libraries (the books and
periodical variety, that is), and one of the most successful. In the ten
years since its first release, Koha has expanded from serving as the integrated
library system (ILS) at a single public library in New Zealand to more
than 1000 academic, public, and private libraries across the globe. But
the past twelve months have been divisive for the Koha community, due to a
familiar source of argument in open source: tensions between community
developers, end users, and for-profit businesses seeking to monetize the
code base. As usual, copyrights and trademarks are the legal sticks, but
the real issue is sharing code contributions.
Koha was originally written in 1999 by New Zealand's Katipo Communications, spearheaded by developer Chris Cormack. Katipo was contracted to build an ILS for the Horowhenua Library Trust (HLT) to replace its aging (and Y2K-bug-vulnerable) system, and to release the code under an open source license. The name Koha is a Māori word for a reciprocal gift-giving custom.
The first public release was made in 2000. Over the years, Koha usage grew, and several businesses popped up to provide support and customization services for Koha-using libraries; as with many infrastructure applications, the ongoing support of an ILS is the real expense. An ILS not only serves as an electronic "card catalog" system for library patrons, but handles acquisitions, circulation tracking, patron account management, checkout, search, and integration with other cataloging systems for inter-library loan. Libraries do not change ILS vendors quickly or lightly.
One of these support businesses was US-based LibLime, founded in 2005 by Koha developer Joshua Ferraro. In 2007, LibLime purchased Katipo Communications' assets in Koha, including its copyright on the Koha source code, and took over maintenance of the koha.org web site. For several years, life continued on as it had before; koha.org was the home of the project, and LibLime participated in Koha's ongoing development as did several other support-based businesses, many individuals, and many libraries.
The fork
The first signs of trouble began to appear in mid-2009, when LibLime
announced that it would be providing its customers with a version of Koha
built from a private Git repository, instead of the public source code
maintained by the community as a whole. Many in the community regarded
this as an announcement that LibLime was forking the project, a claim that
Ferraro denied.
The company cited several factors as its reasons for maintaining a separate
code base, including the need to deliver on Koha contract work on its own
deadlines, lack of quality control in community code contributions, and
customer data it could not make public.
Ferraro stated that LibLime would publish its enhancements to Koha, that it was "100% committed to the open-source movement", and that its integration with the main code repository would be "seamless." However, no such publication took place; as of today, the most recent source code for LibLime's products that is available on the web site is from June of 2009, and the LibLime source code repository remains inaccessible to the public.
LibLime's enhanced version of Koha is named LibLime Enterprise Koha
(LLEK), runs on Amazon's EC2 cloud platform, and sports a list
of features not present in the 3.0.2 "community" release. Meanwhile, the community
has continued to develop Koha, making
point releases to the 3.0.x branch, and is readying a major update in
version 3.2.
Enough people in the Koha community were concerned about the project's
future and about practical matters like the web site and Git repository
that they decided to migrate to a new domain, koha-community.org, to be managed
by a committee and legally held by Koha's original sponsors, HLT. Those
migrating included Cormack, many other core developers, and several of the
other Koha support vendors.
2010 started off with a ray of hope for commercial and community reconciliation, when Progressive Technology Federal Systems, Inc. (PTFS), another Koha support vendor, announced in January that it was acquiring LibLime. PTFS was a relatively recent convert to the Koha community; it started out as a proprietary-only ILS vendor catering to government and military institutions. But it selected Koha as its open source product of choice in 2008, in part for its ability to integrate with PTFS's profitable digital content management products. PTFS engineers had been active on the mailing list and IRC channel, and submitted patches back to the community, so the community was optimistic that they would continue to participate, and the LLEK fork would be merged back into the main branch.
In April, PTFS asked the community (developers, documentation and translation teams, release managers) to return to the koha.org domain, and set up a new repository with the intent of merging the code. As community members explained in the thread, they did not like those terms and instead asked PTFS to either turn the koha.org domain over to the community or to bring its code and participants to the koha-community.org site.
Unfortunately, what could have been a simple disagreement over hosting and domain name relevance deteriorated further. PTFS asked HLT's Koha committee for a conference call under a non-disclosure agreement, but the committee asked for a public email or IRC discussion instead. PTFS then responded with a press release (copied to the Koha mailing list) publicly criticizing the committee, calling it "new to business matters," "one-sided," and "inaccurate," and touting its own version of Koha as superior. Judging by the responses on the list, that action served only to further alienate the already-suspicious Koha community at large.
Code, Trademarks, Copyrights, and Names
Koha is far from the first project to go through such a divisive conflict. In fact, forks of free software projects are not wrong in and of themselves, and can lead to improvements in the code. What caused the major split between the Koha community and LibLime was the company's decision to keep its fork private and not give back. It promised to do so, but instead withdrew from the Koha community altogether.
Naturally, there is no way to prevent individuals or companies from acting with hostility, but the Koha project was vulnerable to LibLime's behavior on a couple of fronts. First, as the community recognized, LibLime controlled the ostensibly community-run koha.org site, prompting the community to re-launch the content in a new location.
What is more troubling is that, based on its actions, LibLime evidently believed that
it had the right to create a closed-source fork of Koha due to its
acquisition of Katipo Communications's Koha assets, including the latter
company's copyrights. But whether or not Katipo's copyrights constituted
the whole of Koha in 2009 when LibLime forked the project is questionable.
Cormack and other developers point to the Git repository's commit
statistics, which show the percentages by individual authors. How to
interpret those statistics is an open question, but there was no copyright
assignment required to participate in Koha development. In the absence of
such an agreement, Koha contributors retain copyrights for their work; as a
result, taking the code proprietary is not an easy option for
anybody.
It is still unclear whether or not LibLime provided the full source
code for its LLEK product to its paying customers, as is required by the
upstream Koha project's GPLv2+ license. Koha is written mostly in Perl,
which is presumably distributed in source form, but the GPL source
requirement does include all the source necessary to build the software,
including supporting libraries and compilation scripts, a
requirement that might affect support libraries needed to support LLEK's EC2
environment.
Muddying the waters still further is the issue of who can legally call their code "Koha" at all. LibLime filed for a registered US trademark on the name in October 2008; it was granted in May of 2009. European support vendor BibLibre filed for an EU trademark on "Koha" in December of 2008; it is still undergoing review. Finally, LibLime filed for the Koha trademark in New Zealand itself in February of 2010; it too is still undergoing review. Yet "Koha" has been used as the name of the open source project itself, not a vendor package or support product, since 2000.
The Software Freedom Law Center's Karen Sandler said that such
trademark-based disputes are common, enough so that SFLC has published a
primer on the subject for projects. Without commenting on the specifics of
the Koha situation, she noted that although registration constitutes
"legal presumption of ownership," if another party can prove it
was using the mark first, it retains the right to use the mark. In addition,
she added,
Others can use a mark in a manner that does not imply
an official relationship or sponsorship so long as there's no likelihood of
confusion on the part of consumers. Factually referring to unmodified
software by a particular name, for example, is likely to be considered
clearly within permitted usage. This kind of use is called nominative
use.
The community's unstructured approach to the project in past years does not make up for PTFS's very public missteps, however. The company may indeed have meant to put the community back together into a functioning whole when it initiated talks about the web site, but it clearly underestimated the ire that LibLime had earned through its actions over the previous year, and the derisive press release would be considered a mistake under any circumstances. If there was any hope of drawing the larger Koha community back to koha.org, it probably died when that message went out.
Cormack observed on his blog that any vendor has the right to try to turn its Koha offering into a superior product for customers in order to increase sales; the harm was inflicted because of the way LibLime chose to carry out that business decision. Whether you agree with that or not, however, it seems that the project would have been better equipped to cope with LibLime's withdrawal from the community had the domain name, trademarks, and perhaps even copyrights been held by a trusted entity such as HLT. Taking those legal steps is something few projects seem to consider when things are running smoothly. They are no doubt time-consuming and tedious, perhaps even expensive. But so is trying to do them in a hurry, ten years after the project launches, with hostile players going after your name.
[ Thanks to Lars Wirzenius for pointing us toward this topic. ]
By Jonathan Corbet
April 30, 2010
On April 29, the University of Colorado held a conference on
patents and free software. Your editor, having spent the morning
getting some significant dental work done, figured that an afternoon
devoted to software patents would appropriately continue the day in the
same theme - only without the anesthetic. The following is not a
comprehensive report of the event; instead, it focuses on a few of the more
interesting moments.
Pamela Samuelson is
a professor of law at the University of California at
Berkeley; she also serves on the boards of organizations like the
Electronic Frontier Foundation, the Electronic Privacy Information Center,
and Public Knowledge. At the conference, she presented some results on her
research into the idea of software patents as an incentive for innovation.
A survey was done back in 2008, with 15,000 surveys sent out to a large
number of firms. 1,333 of them - representing over 700 companies - came
back. The numbers that came out were interesting, if arguably
unsurprising.
According to this survey, 65% of software companies have no interest in
software patents; they do not see patents as an important part of doing
business. That compares with 82% of non-software companies which said they
were working toward the acquisition of patents. It is worth noting that
companies with venture capital backing had a higher level of interest in
software patents than those without.
When companies do go for software patents, their motivations tend to be to
enhance their reputation and make it easier to secure investments.
Preventing litigation was also cited as a reason. But, when it comes to
the question of what makes a software business successful, patents were at
the very bottom of the list. Being first to market was the most important
success factor. In summary: software patents are a weak incentive - at
best - toward innovation.
So, do software patents matter for new companies? Lawyer Jason Haislmaier
said that they can be important, especially with venture-backed companies, because
they are relatively attractive to investors. Venture capitalist Jason
Mendelson disagreed, though, saying that he didn't care about patents in
the companies that he evaluates. In fact, if a company is focused on
getting patents, he sees it as a reason not to invest: the company
should be putting resources into its products instead.
Stormy Peters, director of the GNOME Foundation, noted that community
developers tend to be strongly anti-patent; a company with a patent-heavy
focus may find it hard to work with the community or hire developers.
Stormy also worries that the current trend toward cloud computing may make
the issue of open source software moot. The convenience of free web
services has, she says, distracted the community from the issue of
freedom. There needs to be a means by which truly free and open services
can be defined.
Patent litigation was the subject of a different panel.
Lucky Vidmar started
with the observation that patent suits against open source software still
tend to be rare, and that suits against individual developers are not
really happening. In general, he says, the lawsuits which have come about
have little to do with open source; they are just more in a long series of
software patent suits. But suits against open-source companies do tend to
get a lot of negative attention, something which potential plaintiffs may
well keep in mind.
Julie DeCecco, a litigator for Oracle (by way of Sun), noted that patent
litigation is very expensive. That alone makes it unlikely that open
source projects will be sued; the exposure to legal action is proportional
to the amount of money being made. "Follow the money," she says, and
you'll see where the lawsuits are happening. Attorney David
St. John-Larkin suggested that open source might be more vulnerable to
these suits due to the public nature of its development.
Jason Schultz and Jennifer Urban are both from the Samuelson Law,
Technology and Public Policy Clinic at Berkeley; Schultz previously did a
stint at the EFF. They presented a concept they have been working on as a
way of mitigating the software patent threat called the Defensive Patent
License, or DPL. This work is in an early stage, and the DPL text is not
yet available, but it should be forthcoming in the near future.
The core idea behind the DPL is that software patents can serve in a
useful, defensive role. They can be used to negotiate cross-licensing
agreements, and they can be used for countersuits if need be. But
defensive patents are not as heavily used as they could be, especially in
the open source area. There are a couple of possible reasons for this:
defensive patents require a concentration of resources that doesn't always
exist in our community, and there tends to be a certain amount of distrust
toward the acquisition of patents for defensive purposes.
The DPL would promote the defensive use of software patents in a way which
reinforces the free software community's norms; it is meant to be similar
in spirit to the GPL. A company which buys into the DPL will put
all of its patents under that license. Any other DPL licensee could
then automatically obtain a royalty-free license for any of those patents.
The license is irrevocable - unless the licensee sues another DPL licensee
or withdraws from the pact. Withdrawal is possible with advance notice
(six months was suggested), but any licenses granted to others would remain
valid.
If this idea takes off, it will encourage the creation of a growing network
of cross-licensed patents; eventually, the value of joining the pool will
be far higher than remaining outside of it. Since patents in this scheme
cannot be used to attack other participants, they will be limited to
defensive uses only. Among other things, that should keep DPL-covered
patents out of the hands of patent trolls.
There are a lot of details to be worked out yet, and it is far from clear
that the idea will really take off. It is hard to imagine that large
companies with extensive patent portfolios would be willing to commit the
entire set to the DPL. The concept is interesting, though; we will see
where it goes.
The discussion danced around a number of issues, including patent
shakedowns that are settled without the filing of lawsuits, current
litigation, and the general problem of low-quality patents. With regard to
the last two, your editor asked about Apple's attack against HTC,
which is using some highly dubious patents as a weapon against Linux.
Nobody wanted to talk about the Apple case, but Julie DeCecco said that the
best weapon against low-quality patents is reexamination actions in the
patent office. They are relatively cheap (at a mere $20K or so) and are
often at least partially successful.
Jason Schultz said that he participated in a number of these actions while
at the EFF. They can be effective, but there are a lot of bad
patents out there, and there's no way to challenge them all.
Your editor would note that, when talking with people more directly
involved in the defense of free software, he has found the reexamination
option to be held in relatively low repute. The actions are risky and
might serve to make the patent stronger; this has happened with the VFAT
patent. And, in the best of scenarios, it is still not possible to truly
kill a patent this way; they can always come back after further rewriting
by the patent holder.
There was a panel on the intersection of open source, patents, and
standards; much of it was about as exciting as sitting on one of the
standards committees themselves. The audience did hear an interesting
presentation from Steve Mutkoski of Microsoft, who asserted that patent-encumbered
standards are entirely compatible with most open source licenses. In fact,
"only the GPL family of licenses" is truly problematic in this regard. It
is, he suggested, more of a problem with the GPL than with patents.
Also, Steve made the claim that a lot of people who complain about
patent-encumbered standards really just don't want to pay royalties. That
may well be true, but it's not relevant to the larger discussion.
Unfortunately, there did not seem to be anybody on the panel who understood
free software well enough to try to correct that point of view.
There was an interesting suggestion that, perhaps, we need some concept of
"fair use for patents." That is especially true in situations where the
government has mandated the use of a patent-encumbered standard in some
situation. Nobody tried to fill in the idea of how fair use might work in
this setting, though.
In summary, your editor found the event to be somewhat frustrating. It was
dominated by lawyers of the academic variety with a small venture capital
presence; Stormy Peters was the only community representative on the
panels. Even so, it is
interesting to see how the problem is viewed by people who are a few steps
removed from it.
By Jake Edge
May 5, 2010
As part of our "media kit" project, we put together a reader survey that
ran for the last two weeks of April.
Over 1800 readers filled out the survey—our thanks to all of
them—and, as promised, here is a summary of the responses.
The vast majority (90%) of respondents were subscribers, and almost all of
those folks intend to continue. Fewer than 5% of respondents either never
planned to subscribe or may not resubscribe.
Three-quarters of subscribers were likely to continue at their current level
if there were a subscription price increase, with 8% overall likely to drop
to a lower subscription level and 16% being less likely to subscribe or renew.
As for LWN content, the weekly edition front and kernel pages are by far
the most popular, with 90% reading them frequently. The daily news page
(71%), weekly development (70%), security (61%), and distributions (52%)
pages were all fairly popular as well. Less so were the yearly timeline (33%),
weekly announcements page (27%), and the events calendar (10%).
Pages and features that readers could live without had responses that,
unsurprisingly, mirrored those above. No more than 25% of readers could live
without any of the daily or weekly pages, with the exception of 45% who
would be fine without the announcements page. The events calendar (57%) and
timeline (34%) didn't fare as well.
The clear winner among areas where readers would like to see more coverage is
"Languages and development tools" at 57%. Roughly 40% would like to see
more system administration and desktop Linux coverage, while approximately
one-third saw embedded systems and virtualization as areas for expanded
coverage. "The business of Linux and free software" was only chosen by 25%
of respondents and it would seem that we, perhaps, have the right amount of
coverage of legal issues and conferences as only 20% thought those should
increase.
Formatting LWN for mobile device display was the most popular choice for
that question, with 30% saying that they would personally use it. A PDF
version of the weekly edition was next at 17%, but EPub (7%) and Kindle
(2%) were not particularly interesting to respondents.
The question about regularly used distributions led to some interesting
results, with Ubuntu (54%) and Debian (44%) far ahead of any of the rest.
The next tier was led by Fedora (24%), followed by Red Hat Enterprise Linux
(21%), other OS (20%), CentOS (19%), and other Linux (15%). All of the rest
came in at less than 10%: Gentoo, openSUSE, SUSE Linux Enterprise Server,
Mandriva, and Oracle Unbreakable Linux (with 13 respondents), in that order.
In the single-choice "primary desktop" question, GNOME came out way ahead
with 50%. KDE had a 23% share and the numbers drop off quickly from
there. 8% use some Linux desktop environment that we didn't list and 7%
use another OS entirely for their primary desktop. No desktop environment
(5%) was just ahead of Xfce (4%), while LXDE is only used by ten of our
readers who responded.
As we move forward, and look at changes we might make—for content,
features, and coverage—we will definitely keep these answers in
mind. There are some things, like the events calendar, that we do as a
service to the community and are likely to stay, even if they are somewhat
sparsely used. But when thinking about article assignments and where to
focus our efforts, these answers will come in very handy. Thanks again to
all who responded.
Page editor: Jonathan Corbet