By Jonathan Corbet
October 22, 2008
Maps are cool; there's no end of applications which can make good use of
mapping data. There is plenty of map data around, but it's almost
exclusively proprietary in nature. That makes this data hard to use with
free applications; it's also inherently annoying. We, as taxpayers, own
those streets; why should we have to pay somebody else to know where the
streets are?
Your editor likes to grumble about such things; meanwhile, the OpenStreetMap project (OSM) is busily
doing something about it. OSM has put together a database and a set of
tools making it easy for anybody to enter location data with the intent of
producing a free mapping database with global coverage. It is an ambitious
project, to say the least, but it's working:
Right now on each and every day, 25,000km of roads gets added to
the OpenStreetMap database, on the historical trend that will be
over 200,000km per day by the end of 2009. And that doesn't include
all the other data that makes OpenStreetMap the richest dataset
available online.
OSM data is not limited to roads; just about any point or
track of interest can be added to the database. If current trends
continue, OSM could well grow into the most extensive geolocation database
anywhere - free or proprietary. And those trends could well continue; one
of the nice aspects of this kind of project is that no particular expertise
is needed to contribute. All you need is a GPS receiver and some time; some OSM
local groups have even acquired a set of receivers to lend out to
interested volunteers. This is our planet, and we can all help to map it.
All this work raises an interesting question, though: under what license
should this accumulated data be distributed? Currently, the OSM database
is covered by the Creative Commons
Attribution-ShareAlike 2.0 license. It is a copyleft-style license,
requiring that derived products be made available under the same license.
So, for example, if a GPS navigator manufacturer were to include an
enhanced version of the OSM database in its products, it would have to
release the enhanced version under the CC by-SA license.
The OSM project is not happy with this license, though, and is looking to
make a change. The attribution requirement is ambiguous in this context;
do users need to credit every OSM contributor? Does making a plot of OSM
data with added data layered on top create a derived product? But the
scariest question is a different one: can the CC by-SA license cover the
OSM database at all?
Copyright law covers creative expression, not facts. The information in
the OSM database is almost entirely factual in nature; one cannot copyright
the location of a street corner. So what OSM is trying to protect is not
the individual locations, but the database as a whole. Copyright law does
allow for the protection of databases, but that law is far more complex
than the law for pure creative works, and it varies far more between
jurisdictions. Europe has a specific (though much-derided) database right,
the US has far weaker
database protections, and other parts of the planet lack this
protection altogether. So it may well be that, if some evil corporation
decides to appropriate the OSM database for its own nefarious, proprietary
purposes, there will be nothing that the OSM project can do about it.
So the project is thinking of making a switch to the Open
Database License (ODbL), which is still being developed. It, too, is a
copyleft-style license, but it is crafted to make use of whatever database
protection is available in a given jurisdiction. To that end, the ODbL is
explicitly structured as a contract between the database owner and the
user. In any jurisdiction where database rights are not recognized under
copyright law, the
contractual nature of the ODbL should provide a legal basis to go after
license violators.
But the use of contract law muddies the water considerably; there are good
reasons why free software licenses are carefully written to avoid that
path. Contracts are only valid if they are explicitly and voluntarily
entered into by all parties. If OSM cannot show that a license
violator agreed to abide by the license, it has no case under contract
law. The project has
a plan to address this problem:
To ensure that potential users are aware of and agree to the
contract terms, we are proposing to require a click-through
agreement before downloading data. (All registered users would
agree to this on signing up so will not need a further
click-through on each download.)
Registration and clickthrough licensing are obnoxious, to say the least.
But, in any case, the only people who will go through that process are
those who obtain the database directly from OpenStreetMap. The ODbL allows
redistribution, naturally, and it does not require that explicit agreement
be obtained from recipients of the database. So it is hard to see an
outcome where copies of the database lacking a "signed" contract do not
proliferate. Additionally, reliance on contract law makes it
very hard to get injunctive relief, weakening any enforcement efforts
considerably.
The ODbL includes an anti-DRM measure; if a vendor locks down a copy of the
database with some sort of DRM scheme, that vendor must also make an
unrestricted copy available. The license also tries to distinguish between
"collective databases" (which are not derived works) and "derivative
databases" (which are). Drawing layers on top of an OSM-based map creates a
collective work; tracing lines from such a map creates a derivative work. The
ODbL is, in general, a complex bit of work.
It is complex enough that a number of OSM contributors are wondering if
it's all worth it. Jordan Hatcher is one of the authors of the ODbL, and
he supports its use with OSM, but even he understands the concerns that some people
have:
The [Science Commons] point is that all this sort of stuff can be a
real pain, and isn't what you are really doing is wanting to create
and manipulate factual data? Why spend all the time on this when
the innovation happens in what you can do with the data, and not
with trying to protect the data in the first place.
There is an active group within OSM which is opposed to this kind of
licensing and would, in fact, rather just get down to the task of
collecting and distributing the data. They express
themselves in terms like this:
One thing I really love about OSM is the pragmatic, un-political
approach: You don't give us your data, fine, then we create our own
and you can shove it.
Not: You don't give us your data, fine, then we create a complex
legal licensing framework that will ultimately get you bogged down
in so many requests by prospective users who would like to use our
data and yours but cannot and you will sooner or later have to
release your data according to the terms we dictate and then we
will have won and the world will be a better place.
These contributors would rather that OSM release its data into the public
domain - or something very close to that. Rather than put together a
complicated license, they prefer to just publish their data for anybody to
use as they see fit. There have been all of the usual discussions which
resemble any "GPL vs. BSD" licensing flame war one has ever seen - except
that the OSM folks appear to be a very polite crowd. It comes down to the
usual question: will the OSM database become more complete and useful if
those who extend it are forced to contribute back their changes?
The public domain contingent clearly does not believe that any improvements
to the database obtained via licensing constraints will be worth the
trouble. So it seems likely that there will be some sort of fork involving
the creation of a smaller, purely public-domain OSM database. It may well
be an in-house fork, with the public domain data being merged into the
larger, more restrictively licensed database for distribution. Regardless
of how that goes, this split raises issues of its own: how are the two
databases to be kept distinct in the face of cooperative additions and
edits?
Any relicensing of the database also brings up another interesting
question: what to do about all of the existing data, which may or may not
be copyrighted by those who contributed or edited it? The license change
may well require a process of getting assent from all contributors and
purging data obtained from those who do not agree. The project's
proposed timeline shows how it is thinking about working
through this task. It is hard to imagine this process going entirely
smoothly.
The OSM community clearly has a set of thorny issues to work out. Given
that, it's not surprising that this process has already been dragged out
over the better part of a year. How this issue is eventually resolved will
certainly serve as an example - not necessarily a good example - for other
projects working on free compilations of factual data.
Let us hope that OSM can come to a
solution which lets this project continue to grow and generate a valuable
database that we all will benefit from.