March 6, 2009
This article was contributed by Tom Chance.
In my last article on OpenStreetMap I looked at the
recent mass imports of public data — everything from British oil wells to
the entire road network for the United States. But for those interested in
more than an alternative to Google Maps, the ability to extract or add data
to the project is what really makes OpenStreetMap shine. Whether you want
to get an SVG of a campus map or import a local government's database of
every building in the city, Linux users will find plenty of tools that
cater to their needs.
The export tab on the
web site provides the simplest way to access data. Users can draw an
area on the main map view and then grab an image (in PNG, JPEG, PDF or PS
formats); some HTML to embed the map into your web site; or the raw XML
data. To further modify the data, either in the OpenStreetMap database or a
local copy (stored as an XML .osm file on your disk), download the data
using an editor like JOSM (the 'Java
OpenStreetMap editor'). To make life easier when selecting the area to
download, open up the preferences dialog and install the namefinder and
slippy_map_chooser plugins.
Grabbing larger amounts of data would be difficult, slow and clumsy with
these methods. More advanced users can get data directly through the API. Check the
latitude and longitude coordinates for the area you want — an easy method
for this is to use the export tab to draw an area, then note down the
coordinates it records — then fire up wget or curl and download the
data:
wget http://api.openstreetmap.org/api/0.5/trackpoints?bbox=left,bottom,right,top
The main API only lets you grab 5,000 points per request; you have to page
the request to get the additional data. To pull out a really large chunk of
data, or to filter it (for example, to download just the pubs in the
city), use the extended OSM API (XAPI, or
'zappy'). Access to really enormous amounts of data, such as the entire
planet or a country, can be found in the frequently updated dumps listed on
the Planet.osm wiki
page.
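The paging described above can be sketched in Python. This is a minimal illustration rather than an official client: the base URL and the 5,000-point limit come from the text, while the `page` query parameter (numbered from zero), the GPX-style `<trkpt` elements in the response, and the helper names `trackpoints_url`, `http_get` and `fetch_all_pages` are assumptions made for the example.

```python
from typing import Callable, Iterator
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "http://api.openstreetmap.org/api/0.5"  # base URL as quoted in the article
PAGE_SIZE = 5000  # per-request limit quoted in the article


def trackpoints_url(left: float, bottom: float, right: float, top: float,
                    page: int = 0) -> str:
    """Build the request URL for one page of trackpoints in a bounding box."""
    # Assumption: pages are selected with a zero-based "page" parameter.
    query = urlencode({"bbox": f"{left},{bottom},{right},{top}", "page": page},
                      safe=",")
    return f"{API_BASE}/trackpoints?{query}"


def http_get(url: str) -> str:
    """Download a URL and return the body as text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def fetch_all_pages(left: float, bottom: float, right: float, top: float,
                    fetch: Callable[[str], str] = http_get,
                    max_pages: int = 20) -> Iterator[str]:
    """Yield response bodies page by page until a short page is returned."""
    for page in range(max_pages):
        body = fetch(trackpoints_url(left, bottom, right, top, page))
        yield body
        # Assumption: each point appears as a <trkpt> element, so a page with
        # fewer than PAGE_SIZE of them is taken to be the last one.
        if body.count("<trkpt") < PAGE_SIZE:
            break
```

Calling `list(fetch_all_pages(-1.0, 51.4, -0.9, 51.5))` would request successive pages for that box; the `max_pages` cap is there only to keep a sketch like this from hammering the server.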
Once you have the data there are all manner of uses: loading it onto your GPS navigation
device, rendering your own
maps for the web or print, or converting the data into another standard
GIS format with tools like the Ruby osmlib. The documentation for each
tool varies enormously, but the toolchains tend to be relatively
straightforward.
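As a small taste of working with a downloaded extract, the sketch below pulls every node carrying a given tag (pubs, say) out of .osm XML using only Python's standard library. It assumes the usual layout of `node` elements with `id`, `lat` and `lon` attributes and child `tag` elements with `k` and `v` attributes; the function name `find_nodes_by_tag` and the sample data are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up sample data in the assumed .osm layout.
SAMPLE = """<osm version="0.5">
  <node id="1" lat="51.4551" lon="-0.9787">
    <tag k="amenity" v="pub"/>
    <tag k="name" v="Example Arms"/>
  </node>
  <node id="2" lat="51.4560" lon="-0.9700">
    <tag k="amenity" v="cafe"/>
  </node>
</osm>"""


def find_nodes_by_tag(osm_xml: str, key: str, value: str) -> list:
    """Return id, position and name for every node tagged key=value."""
    root = ET.fromstring(osm_xml)
    matches = []
    for node in root.iter("node"):
        # Collect the node's tags into a plain dictionary.
        tags = {tag.get("k"): tag.get("v") for tag in node.findall("tag")}
        if tags.get(key) == value:
            matches.append({
                "id": node.get("id"),
                "lat": float(node.get("lat")),
                "lon": float(node.get("lon")),
                "name": tags.get("name"),  # None when the node is unnamed
            })
    return matches


pubs = find_nodes_by_tag(SAMPLE, "amenity", "pub")
```

This reads the whole document into memory, which is fine for a city-sized extract but not for a planet dump; those call for a streaming parser or one of the dedicated tools.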
Of course, extracting data is only half the story. Not only should all good
open source citizens be contributing back, but you will get the most value
from the data if you collaborate with others in developing a rich data set
that will lead to tools and use cases you can later replicate.
OpenStreetMap abounds with methods and tools for entering data. You might
like the "old school" method of tracing a breadcrumb GPS trail —
much more fun in the early days when I mapped much of Reading
with some friends from a completely blank slate. Many mappers have traced
basic road layouts and buildings from aerial imagery donated by Yahoo! so
that others can go in and identify street names and points of interest. The
main editing tools are Potlatch, a Flash interface on the main web site
(just click on the 'Edit' tab once you're zoomed into your local area), and
the previously-mentioned JOSM. The wiki has
plenty of guidance.
When importing large sets of existing data, things get a little more
complicated. The first step is to step back and have a good think. Imports
can cause two kinds of headaches for other contributors if done wrong: you
might put a load of new data over the top of somebody else's efforts and
make a complete mess in the process; or worse, you might import data
without proper permission, causing legal difficulties for the project and
technical difficulties in taking the data back out again.
It's always best to begin by asking a few questions on the relevant mailing list;
there are localized lists for many areas, a general (high traffic)
"talk" list, and a "legal-talk" list for legal issues
such as licensing for imports. It's especially important to avoid
convenient interpretations of web site notices regarding copyright and
database rights when deciding if you can import the data. You need to get
written confirmation so that the OpenStreetMap project is protected from
legal challenges. There are some nice general
guidelines on the wiki, which are worth a read.
If you have data with written permission to use it, you can begin the
import process. The first, and most laborious, step is to map out the data
against standard OSM tags, as in this UK public
transport example or this really comprehensive exercise for CanVec
data. You'll notice that oftentimes source-specific data (like unique
IDs for features and really niche data) is retained under namespaced keys like
"CanVec:FID" and "naptan:StopAreaCode". This can also
be useful where you don't want the data to appear until volunteers have
gone through checking it against existing data in the database, for example
to merge two bus stops (one crowdsourced, the other from the import).
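The mapping step described above often boils down to a lookup table. The sketch below is one possible shape for it, not the code used by any real import: the source field names `CommonName`, `StopAreaCode` and `Notes` and the helper `map_record` are hypothetical, and the choice of `highway=bus_stop` as the fixed tag is an assumption. Only the `naptan:StopAreaCode` key comes from the text.

```python
# Tags applied to every imported feature (assumed for this example).
FIXED_TAGS = {"highway": "bus_stop"}

# Hypothetical source fields that translate directly to standard OSM keys.
FIELD_TO_TAG = {"CommonName": "name"}

# Hypothetical source-specific fields to retain under a namespace prefix.
NAMESPACED_FIELDS = {"StopAreaCode"}


def map_record(record: dict, namespace: str = "naptan") -> dict:
    """Translate one source record into a dictionary of OSM tags."""
    tags = dict(FIXED_TAGS)
    for field, value in record.items():
        if value in (None, ""):
            continue  # skip empty source values
        if field in FIELD_TO_TAG:
            tags[FIELD_TO_TAG[field]] = str(value)
        elif field in NAMESPACED_FIELDS:
            tags[f"{namespace}:{field}"] = str(value)
        # Fields listed in neither table are deliberately dropped.
    return tags


stop_tags = map_record({"CommonName": "Station Road",
                        "StopAreaCode": "039G",
                        "Notes": ""})
```

Keeping the tables as data rather than code makes it easy to post them to the mailing list for review before anything touches the database.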
For large chunks of data, importers have tended to write custom scripts to
bring the data in. If the data is already in the OpenStreetMap format and
in a state suitable to go straight into the database, this bulk import
script makes the process quick and painless. The Canvec2osm code
shows how to pull in more complicated data; it converts 11 different
shapefiles into themed .osm files with correct tagging, which can then be
worked into a suitable state for importing.
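To give a feel for what such a conversion script produces, here is a minimal sketch that writes a list of tagged points as .osm XML. It is not taken from Canvec2osm or the bulk import script. The function name `build_osm_document`, the `generator` value and the input layout are invented for the example, the `version` default simply mirrors the 0.5 API mentioned earlier, and the negative ids follow the convention (used by editors such as JOSM) of marking objects that do not yet exist in the database.

```python
import xml.etree.ElementTree as ET


def build_osm_document(points: list, version: str = "0.5",
                       generator: str = "import-sketch") -> str:
    """Serialise points of the form {"lat", "lon", "tags"} as .osm XML."""
    root = ET.Element("osm", version=version, generator=generator)
    for index, point in enumerate(points, start=1):
        # Assumption: negative ids mark new, not-yet-uploaded objects.
        node = ET.SubElement(root, "node", id=str(-index),
                             lat=str(point["lat"]), lon=str(point["lon"]))
        # Sort the tags so the output is stable between runs.
        for key, value in sorted(point["tags"].items()):
            ET.SubElement(node, "tag", k=key, v=value)
    return ET.tostring(root, encoding="unicode")


document = build_osm_document([
    {"lat": 51.4551, "lon": -0.9787,
     "tags": {"highway": "bus_stop", "name": "Station Road"}},
    {"lat": 51.4560, "lon": -0.9700, "tags": {"highway": "bus_stop"}},
])
```

A file like this can be opened in an editor and checked against existing data before anything is uploaded, which fits the cautious approach the import guidelines recommend.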
A more cautious approach can be appropriate in areas with a lot of existing
data. One quite technically challenging route is to set up your own Web
Map Service (WMS) using a tool like MapServer, and then set up the JOSM WMS
plugin to pull those maps in as a layer underneath your map data so it
can be traced. This Map Warper
tool is in beta and tries to make this process easier. If the data is quite
simple you could just put the source and editor side-by-side on your screen
and use your judgement to copy over points of interest.
However you want to proceed, you're probably best off getting in touch with
some local or more experienced community members. Interested people could even
just lobby local government officers and public institutions to get the
data, then pass it along to somebody with more of an appetite for the
technical stage. Given six months to study, process, and import the data, you
should find richly detailed maps and underlying data available under a
Creative Commons BY-SA license; the license, incidentally, may soon change
to one more suitable for databases. Whatever you do, just remember to have
fun.