
The Overture open-mapping project

By Jonathan Corbet
October 31, 2024

OpenStreetMap tends to dominate the space for open mapping data, but it is not the only project working in this area. At the 2024 Open Source Summit Japan, Marc Prioleau presented the Overture Maps Foundation, which is building and distributing a set of worldwide maps under open licenses. Overture may have a similar goal to OpenStreetMap, but its approach and intended uses are significantly different.

Once upon a time, not too long ago, Prioleau began, map making was mostly done by surveying — sending somebody out to measure where things were. That has changed over the last couple of decades with the advent of location-aware mobile devices; map making is now driven by sensors, not surveyors. That has changed the nature of maps and how they are used; Overture Maps was created to take advantage of (and support) those changes. The project is still in its early days, but it has the support of a long list of companies and is already being used by some of them.

Overture Maps was created to change the model for map production. The amount of available mapping data has exploded, he said, and the types of that data have changed. Maps that were once concerned with just the road network have grown to include information about speed limits, road signage, lane usage, and more. There may never be a world where all of this data is open, but there is space for a set of common base layers to tie it all together. These base layers might contain information about roads (their route, geometry, and directionality, for example) while letting others attach value-added data like traffic information. As has been seen in other areas, companies can cooperate in the creation of the foundational layers, while adding their special products on top.

The companies that initially sponsored Overture Maps had all been working on OpenStreetMap previously, but they had needs that were not being fully met there. For example, they want to use all of the available mapping data, much of which does not appear in OpenStreetMap; this data includes some government mapping data and the increasing amount of data generated by machine-learning systems. There is a need for a high degree of validation of this data; maps reflect facts, and there need to be protections to keep people from changing those facts. Among other things, Overture Maps uses machine-learning systems to validate map data.

That data also must be presented in an organized scheme, in a way that makes it easy for others to attach additional data to it. This process is called, for better or worse, "conflation". Overture Maps supports this functionality via a mechanism called the Global Entity Reference System (GERS). Every item of interest in a map is assigned a permanent GERS ID, which is a 128-bit identifier. GERS, he said, might be the most significant part of the entire Overture Maps effort; it makes conflation simple, easing the use of map data.
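As a rough illustration of why a shared identifier makes conflation simple, consider the minimal Python sketch below. It assumes that a GERS ID can be held as a 128-bit `uuid.UUID` value and that each layer is a dictionary keyed by that ID; the layer contents and the names `base_layer`, `traffic_layer`, and `conflate()` are hypothetical and are not taken from Overture's own tooling.

```python
import uuid

# Hypothetical 128-bit identifiers standing in for GERS IDs (assumption:
# a UUID-like value is a reasonable container for a 128-bit ID).
road_a = uuid.UUID(int=1)
road_b = uuid.UUID(int=2)
unknown = uuid.UUID(int=3)

# Hypothetical open base layer: one record per road segment.
base_layer = {
    road_a: {"name": "Example Street", "oneway": False},
    road_b: {"name": "Sample Avenue", "oneway": True},
}

# Hypothetical add-on layer from another provider, keyed by the same IDs.
traffic_layer = {
    road_b: {"speed_kmh": 12},
    unknown: {"speed_kmh": 50},  # refers to a feature the base layer lacks
}


def conflate(base, addon):
    """Attach add-on attributes to base features that share an ID.

    Returns the merged records and the list of add-on IDs that had
    no counterpart in the base layer.
    """
    merged = {}
    unmatched = []
    for feature_id, attrs in addon.items():
        if feature_id in base:
            merged[feature_id] = {**base[feature_id], **attrs}
        else:
            unmatched.append(feature_id)
    return merged, unmatched


merged, unmatched = conflate(base_layer, traffic_layer)
```

With a common key, attaching one data set to another reduces to a dictionary lookup; without one, the same step would need geometric or name-based matching between the two data sets.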

Mapping data can be divided into over a dozen data types, called "themes"; Overture Maps currently supports six of them. The transportation theme covers the road network, including information on public transit, cycling, and more. Places covers points of interest — primarily businesses. This is the most annoying data type to manage, since it is highly volatile and businesses often don't bother to notify the world when they cease to exist. Divisions are administrative boundaries, from national borders to neighborhood boundaries. Buildings is exactly what it seems, as is the addresses theme. Finally, the base theme covers geographical features, including ground cover, water features, and more.

One of the key features of Overture Maps, he said, is integrating data from multiple sources. As an example, the latest release included over 20 million new Japanese addresses, along with building data from both OpenStreetMap and the Microsoft building footprints database, which was machine-generated from satellite data. It turns out that the OpenStreetMap data is more complete in the cities, where the contributors live, while the Microsoft database is more complete in rural areas.
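The talk did not describe how the building sources are actually combined. Purely as a sketch of the general idea, the Python below merges two sets of building records, keeping the preferred source where both have an entry and falling back to the other source elsewhere. It assumes the records have already been matched to a common key (real footprint matching would be geometric); the keys, field names, and `merge_sources()` are hypothetical.

```python
def merge_sources(preferred, fallback):
    """Union of two record sets; the preferred source wins on overlap."""
    merged = {key: dict(rec) for key, rec in fallback.items()}
    for key, rec in preferred.items():
        merged[key] = dict(rec)
    return merged


# Hypothetical, already-matched building records from two sources.
community_buildings = {
    "b1": {"source": "community", "height_m": 12.0},
    "b2": {"source": "community", "height_m": 30.5},
}
machine_buildings = {
    "b2": {"source": "machine", "height_m": 29.0},
    "b3": {"source": "machine", "height_m": 4.0},
}

buildings = merge_sources(community_buildings, machine_buildings)
```

Here `b2` keeps the community-sourced record, while `b3`, which only the machine-generated set contains, is filled in from the fallback.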

Returning to GERS, Prioleau said that mapping is being driven by an explosion of data types. But only some types are "exploding". The road network, while constantly changing, is relatively easy to keep up with; the same is true of addresses and such. This kind of data is manageable as a shared base layer that all can contribute to and use. The attached layers, though, which might include traffic data, restaurant reviews, opening hours, or any other sort of add-on data, are far too volatile to be managed in an open layer. This data will, he said, continue to be a proprietary product indefinitely.

For years, these add-on data types have been fragmented into numerous incompatible formats. Integrating the data into a useful product has often been more expensive than acquiring it in the first place. If everybody could agree on a global ID for mapping data, though, a lot of these problems would go away; GERS is meant to be that ID.

Being in Japan, Prioleau took a moment in his conclusion to talk about the automotive industry, which is collecting vast amounts of data as its cars phone home. (He did not address the privacy implications of this data collection.) Manufacturers see this data as valuable, but have not always been able to realize that value. Identifying this data with GERS IDs would make it easy for these manufacturers to contribute some of the data to Overture Maps, and for them to use the rest to add value to their products. One manufacturer, for example, is looking at detecting a car's wheels slipping on ice, and sending an icy-roads notification to other cars operating in the area.

Once the talk was done, I could not resist asking the obvious question: why create a new project rather than focusing these resources on making OpenStreetMap better? Prioleau answered that he has been an OpenStreetMap contributor for nearly 20 years; it is far from clear, he said, that the project wants much of the work that is being done in Overture Maps. OpenStreetMap is focused on creating a community of mappers; dumping a bunch of AI-generated data into the project is not the best way to encourage that community.

The companies that launched Overture Maps first tried hard to turn OpenStreetMap into the sort of project they needed, he said; the OpenStreetMap contributors "successfully fended that off". So Overture Maps was created with a focus on the end users of the data rather than on the contributors. While a bit over half of the project's data comes from OpenStreetMap now, he said, he expects that proportion to fall in the future.

An interesting way to look at these two projects, perhaps, is to see OpenStreetMap as being analogous to a typical free-software development project, while Overture Maps is more like a distributor. The former is focused on its development community, while the latter is working on integrating the results of various mapping projects, performing quality control, and producing something that is easily usable by others. So, while the two projects might appear to be in competition at one level, there may actually be a useful role for both of them in the end.

[ Thanks to the Linux Foundation, LWN's travel sponsor, for supporting our travel to this event. ]


What license are those data available under?

Posted Oct 31, 2024 19:14 UTC (Thu) by ceplm (subscriber, #41334) [Link] (4 responses)

Is it something better than embrace-extend-extinguish by the commercial owners of the project?

What license are those data available under?

Posted Oct 31, 2024 19:21 UTC (Thu) by shironeko (subscriber, #159952) [Link] (2 responses)

yeah, I was also wondering if there's a CLA for contributions.

What license are those data available under?

Posted Oct 31, 2024 21:29 UTC (Thu) by corbet (editor, #1) [Link] (1 responses)

Sorry, I meant to include the license info from their FAQ:

Generally, Overture data is licensed under the Community Database License Agreement – Permissive v2 (CDLA) unless derived from a source that requires publishing under a different license, such as data derived from OpenStreetMap, that constitutes a “Derivative Database” (as defined under ODbL v1.0), which will be licensed under ODbL v1.0.

What license are those data available under?

Posted Nov 1, 2024 4:58 UTC (Fri) by ringerc (subscriber, #3071) [Link]

Does that mean there is not and will not be some copyright assignment, CLA or perpetual transferrable copyright grant?

Because it can have whatever license for now, but with assignments that can be changed going forwards. Not retroactively at least.

What license are those data available under?

Posted Nov 2, 2024 5:26 UTC (Sat) by mirabilos (subscriber, #84359) [Link]

Likely illegal, given they use ML/LLM…

City vs rural data

Posted Oct 31, 2024 20:21 UTC (Thu) by SLi (subscriber, #53131) [Link] (6 responses)

> It turns out that the OpenStreetMap data is more complete in the cities, where the contributors live, while the Microsoft database is more complete in rural areas.

Yes, this is a funny aspect of OSM that I wish more people were aware of, just because it would drive better choices in what to use and when. It's my experience that OSM generally beats commercial offerings in at least western cities, whereas in rural areas it's lacking.

In fact, I seem to remember reading long ago that for funny reasons (lots of US soldiers with too much time) Baghdad is probably the best mapped place on earth.

City vs rural data

Posted Oct 31, 2024 20:24 UTC (Thu) by JoeBuck (subscriber, #2330) [Link] (3 responses)

Maps of trails in the mountains surrounding the SF Bay Area are very good in OSM, thanks to lots of hikers contributing.

City vs rural data

Posted Oct 31, 2024 21:14 UTC (Thu) by atnot (guest, #124910) [Link] (2 responses)

Yes, it very much depends. For example, in the German/Austrian Alps, some regional tourist boards seem to have official OSM accounts and regularly update the map with route closures, path difficulties and other information. But other things like benches are very poorly mapped. It all depends on who's there and what data they care about, which varies a lot.

City vs rural data

Posted Oct 31, 2024 22:29 UTC (Thu) by JoeBuck (subscriber, #2330) [Link] (1 responses)

I should say that while the trail maps are accurate, facilities like rest rooms aren't marked reliably; often, if one is marked on the map, it just means there was an outhouse at that spot at one time.

Still, the map is useful.

City vs rural data

Posted Nov 1, 2024 10:56 UTC (Fri) by njh (subscriber, #4425) [Link]

That chimes with the observation in the article that the network of ways is easier to keep up-to-date than the volatile point-of-interest layers.

City vs rural data

Posted Nov 1, 2024 11:15 UTC (Fri) by LtWorf (subscriber, #124958) [Link] (1 responses)

Try hiking on Etna.

On OSM you get the trails; on Google Maps the whole of Etna is a uniform green field.

City vs rural data

Posted Nov 1, 2024 14:16 UTC (Fri) by mathstuf (subscriber, #69389) [Link]

A lot of Google "in the wilderness" is just swaths of green: "Thank you, I know I am among the trees." But OSM is also *able* to be improved by dedicated hikers, so incremental improvement gets us to where we're at today.

What is this, really?

Posted Nov 1, 2024 11:17 UTC (Fri) by LtWorf (subscriber, #124958) [Link] (7 responses)

Is this a proprietary project intended to sell extra data over what OSM has?

Did I misunderstand?

I don't get it.

What is this, really?

Posted Nov 1, 2024 14:18 UTC (Fri) by mathstuf (subscriber, #69389) [Link] (5 responses)

It looks like their interest is in selling layers on top of OSM. Managing things like infrastructure maps for local government entities and the like that don't really make sense to be public (e.g., where are the sewer/water lines, what are they made of, installation dates, inspection schedule, etc.).

What is this, really?

Posted Nov 5, 2024 6:07 UTC (Tue) by DemiMarie (subscriber, #164188) [Link] (4 responses)

Why shouldn't information about sewer and water line locations be public?

What is this, really?

Posted Nov 5, 2024 10:33 UTC (Tue) by mathstuf (subscriber, #69389) [Link] (3 responses)

In what I've seen (not necessarily through Overture), the worry seems to be about providing information to Nefarious Actors™. The obvious instance is terrorism-related (domestic or otherwise) fears, but I suspect lawyers might also be considered such if there are situations like Flint, MI out in the open (rather than behind a FOIA request which might give some notice as to what is being looked for). I'm in America if that wasn't apparent; such institutional fears might be different elsewhere.

What is this, really?

Posted Nov 5, 2024 14:40 UTC (Tue) by LtWorf (subscriber, #124958) [Link] (1 responses)

Manholes are commonly labelled to say what's down there.

What is this, really?

Posted Nov 5, 2024 16:32 UTC (Tue) by mathstuf (subscriber, #69389) [Link]

Yes, but not how they're connected to each other, the materials down there, maintenance status, etc. that might want to be tracked by those using the layer day-to-day.

What is this, really?

Posted Nov 5, 2024 15:33 UTC (Tue) by excors (subscriber, #95769) [Link]

In the UK, I believe that specific information is the opposite of confidential - water companies are legally required to make maps of public sewers and water mains available for free if you visit their office in person. Many of them also offer maps online, usually for a fee (maybe £50 for a small area; cheap enough that it's worth paying instead of going in person). I think they're often examined when buying a property, so probably a million times a year. And I guess the main reason they're not made available as open data is simply that it would take significant effort and would cut off a source of revenue for the water companies, so there's no benefit for them to do so.

The water regulator has promoted open data and says "We found widespread public support for water companies opening their data, but that water companies have made little progress in opening datasets" (https://www.ofwat.gov.uk/regulated-companies/open-data-in...), so maybe there will be some change but probably not soon, and I see no indication that pipeline maps are a priority (the current progress seems to be primarily about storm overflows, so you can tell which rivers and beaches contain more sewage than normal).

On the other hand there is an in-progress National Underground Asset Register which is "building a digital map of underground pipes and cables", mainly to help people avoid digging into them. It sounds much more comprehensive than just water pipes, and it's based on an open-licensed data model (https://geospatialcommission.blog.gov.uk/2024/08/13/an-in...), but it's explicitly not open data for national security reasons: "We're dealing here with data that can't be seen by everybody from above ground as it's sensitive data relating to gas, electricity and, more importantly, digital data between key parts of our infrastructure. We are security minded in our approach to stop bad actors from accessing this data." (https://www.government-transformation.com/innovation/nuar...)

What is this, really?

Posted Nov 1, 2024 14:29 UTC (Fri) by corbet (editor, #1) [Link]

If I understand it at all, they will not be in the business of selling proprietary data. Part of their reason for existence is certainly to make life easier for companies that do traffic in such data, but the base layers they create will be freely licensed.


Copyright © 2024, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds