ODbL progress

From Richard Weait:

We’re planning the final stages of the switch over to the Open Database License for OpenStreetMap data. The OpenStreetMap Foundation Board discussed the license upgrade process and many other aspects of the project at their recent board meeting, and we’ll have more information about that from the board shortly.

One item that came out of the board meeting was the deadline to complete the license upgrade by 01 April 2012 and to publish the first OpenStreetMap planet file under the ODbL by 04 April 2012. The License Working Group supports this target date as a reasonable goal.

New OSM License, Some Practical Implications And Gray Areas

Occasionally I make digital maps for customers based on OpenStreetMap data. Sometimes they request the maps in the form of bitmaps, and sometimes they need a vector format like SVG or PDF so that they can edit the maps in Adobe Illustrator. Of course, the issue of licensing always pops up, and I often have to explain the stipulations of CC BY-SA 2.0 to them.

Soon OSM will switch to the new ODbL license. I have to admit I mostly stayed away from the numerous legal talks that were going on in various OSM channels, simply because I feel much more productive coding than participating in endless strings of emails. But now that the new license is here, I need to get acquainted with it from the perspective of someone trying to make (some) living out of OSM data. “I’m not a lawyer” is the usual phrase you can see in OSM legal discussions, but waiting for one to give you some solid information is like waiting for Godot, so I’ll make judgements based on my own understanding and some common sense instead, and simplify things when I feel like it. If anyone objects, they can tweet their objections at me and I’ll try to correct things.

So let’s say a customer requests an SVG map of my home town and I decide to use OSM data for it. For the sake of simplicity the map will be based purely on OSM data, so no other sources. Let’s first look at some of the important definitions in ODbL (emphases are mine):

“Database” – A collection of material (the Contents) arranged in a systematic or methodical way and individually accessible by electronic or other means offered under the terms of this License.

“Derivative Database” – Means a database based upon the Database, and includes any translation, adaptation, arrangement, modification, or any other alteration of the Database or of a Substantial part of the Contents. This includes, but is not limited to, Extracting or Re-utilising the whole or a Substantial part of the Contents in a new Database.

“Contents” – The contents of this Database, which includes the information, independent works, or other material collected into the Database. For example, the contents of the Database could be factual data or works such as images, audiovisual material, text, or sounds.

“Produced Work” – a work (such as an image, audiovisual material, text, or sounds) resulting from using the whole or a Substantial part of the Contents (via a search or other query) from this Database, a Derivative Database, or this Database as part of a Collective Database.

“Substantial” – Means substantial in terms of quantity or quality or a combination of both. The repeated and systematic Extraction or Re-utilisation of insubstantial parts of the Contents may amount to the Extraction or Re-utilisation of a Substantial part of the Contents.

So the first open question: is an SVG map a Produced Work or a Derivative Database? Or both? An SVG map is an XML file that contains projected geographical data (together with visual styling attributes). An OSM XML file can safely be said to be a database. If you say SVG is not a database, where do you draw the line? What about KML or GML files?
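To make the "individually accessible" argument concrete, here is a minimal sketch of querying an SVG map as if it were a small database. The SVG fragment, its ids and its class names are all made up for illustration; real map renderers structure their output differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written SVG fragment standing in for a rendered map.
SVG_MAP = """<svg xmlns="http://www.w3.org/2000/svg" width="200" height="100">
  <path id="way-1" class="highway-residential" d="M 10 10 L 90 40"/>
  <path id="way-2" class="highway-primary" d="M 0 50 L 200 50"/>
  <text id="label-1" x="40" y="45">Main Street</text>
</svg>"""

SVG_NS = {"svg": "http://www.w3.org/2000/svg"}


def elements_by_class(svg_text, css_class):
    """Return the ids of all top-level path elements carrying the given class."""
    root = ET.fromstring(svg_text)
    return [
        path.get("id")
        for path in root.findall("svg:path", SVG_NS)
        if css_class in path.get("class", "").split()
    ]


def label_texts(svg_text):
    """Return the text of every top-level label in the map."""
    root = ET.fromstring(svg_text)
    return [label.text for label in root.findall("svg:text", SVG_NS)]


if __name__ == "__main__":
    print(elements_by_class(SVG_MAP, "highway-primary"))
    print(label_texts(SVG_MAP))
```

Each road and label can be pulled out on its own with a simple query, which is the property the "Database" definition seems to care about. Whether that is enough to make the file a Derivative Database in the legal sense is exactly the open question.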

This question is important because of the next clause:

Access to Derivative Databases. If You Publicly Use a Derivative Database or a Produced Work from a Derivative Database, You must also offer to recipients of the Derivative Database or Produced Work a copy in a machine readable form of:

a. The entire Derivative Database; or

b. A file containing all of the alterations made to the Database or the method of making the alterations to the Database (such as an algorithm), including any additional Contents, that make up all the differences between the Database and the Derivative Database.

If the SVG map file is not considered a Derivative Database, then you have the option of either supplying the original OSM data (an OSM XML file, a PBF file or even a database snapshot) together with the SVG file, or providing a description of how you derived the Derivative Database.
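Option (b) of the clause quoted above, "a file containing all of the alterations", could look something like the following. This is only a sketch under my own assumptions: it compares nodes only, the two tiny OSM XML snippets are invented, and the output is a simplified summary of my own devising rather than any change format that OSM tools actually exchange.

```python
import xml.etree.ElementTree as ET

# Invented sample data: the original extract and an edited copy of it.
ORIGINAL = """<osm version="0.6">
  <node id="1" lat="46.55" lon="15.64"><tag k="name" v="Old Cafe"/></node>
  <node id="2" lat="46.56" lon="15.65"/>
</osm>"""

DERIVED = """<osm version="0.6">
  <node id="1" lat="46.55" lon="15.64"><tag k="name" v="New Cafe"/></node>
  <node id="3" lat="46.57" lon="15.66"/>
</osm>"""


def index_nodes(osm_text):
    """Map each node id to a comparable (lat, lon, tags) tuple."""
    root = ET.fromstring(osm_text)
    nodes = {}
    for node in root.findall("node"):
        tags = tuple(sorted((t.get("k"), t.get("v")) for t in node.findall("tag")))
        nodes[node.get("id")] = (node.get("lat"), node.get("lon"), tags)
    return nodes


def alterations(original_text, derived_text):
    """List the node ids that were created, deleted or modified."""
    before = index_nodes(original_text)
    after = index_nodes(derived_text)
    shared = set(before) & set(after)
    return {
        "created": sorted(set(after) - set(before)),
        "deleted": sorted(set(before) - set(after)),
        "modified": sorted(i for i in shared if before[i] != after[i]),
    }


if __name__ == "__main__":
    print(alterations(ORIGINAL, DERIVED))
```

A file like this is tiny compared to the full extract, which is the practical appeal of option (b). It only helps the recipient, though, if they can also get hold of the exact original it was computed against.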

On the other hand, I can argue that SVG is a Derivative Database because it is “arranged in a systematic or methodical way and individually accessible” and “includes any translation, adaptation, arrangement, modification, or any other alteration” of the original OSM data. So in that case simply publishing the SVG file (and only that file) would cover the license requirements.

I should note that the SVG map has to be released under the ODbL or a compatible license.

Now let’s go one step further. Let’s assume (as I do) SVG is a Derivative Database. What if I then generate a PNG bitmap (or a Web map, for that matter) from the SVG file using Adobe Illustrator and want to publish that, too? One could argue that a bitmap is a Produced Work and since we already published the Derivative Database that produced this Work, we are covered.

But what if I didn’t generate the Web map from the SVG, but used a tool like Mapnik or Maperitive and generated it directly from an OSM extract instead? Let’s say that for practical purposes I don’t want to publish 1 GB of OSM data and I choose to go down the path of describing the “method of making the alterations” I did to generate the bitmap. What are the options here?

  1. I could write a detailed description of steps I performed to generate the Web map. Osmosis, Mapnik with all the batch scripts etc. I could even post the source code of the program(s) I used.
  2. On the other hand, I could just describe the process in a sentence or two. I could also say I used a special filter in Photoshop.

I can partly understand the spirit of the “method” clause - to enable access to the interesting derivations of the original data. But I see several holes in the “method” definition:

  1. What if I produced the map by arranging a lot of the map elements manually, by hand? This is quite a common case when you have to place map labels in order to avoid label conflicts. How would I describe the “method” other than saying that I did it by hand? How would that help anyone?
  2. What if I used expensive proprietary software (like Illustrator or Photoshop)? Or even a piece of code that I haven’t released to anyone else? In that case nobody else would be able to reproduce the method. Does the “Contents” cover source code as well? It doesn’t mention the source code explicitly. If it does, then that implies you can only use open source software with OSM data, which would be silly.
  3. What about complex algorithms? How detailed would the description have to be for someone to be able to reproduce the algorithm? I’ve tried reproducing various algorithms from long scientific articles and I can tell you it’s not an easy task even if you have a detailed description.

Frankly, I don’t see how the “method” clause could be enforced in practice.

(UPDATE: now that I have thought about it once more, the clause only talks about describing the method of arriving at the Derivative Database and not at the Produced Work itself. So I could just say “I downloaded the OSM extract from Geofabrik” and that would be it.)

One final question: does extracting OSM data for a city amount to a “substantial part” of the original OSM database?

Creative Commons: a 2011 focused on data

The blog post “CC and data[bases]: huge in 2011, what you can do” (http://creativecommons.org/weblog/entry/26283) begins with this image

[Image: “DATABASE at Postmasters, March 2009” by Michael Mandiberg / CC BY-SA]

and with this introduction:

You may have heard that data is huge — changing the way science is done, enabling new kinds of consumer and business applications, furthering citizen involvement and government transparency, spawning a new class of software for processing big data and new interdisciplinary class of “data scientists” to help utilize all this data — not to mention metadata (data about data), linked data and the semantic web — there’s a whole lot of data, there’s more every day, and it’s potentially extremely valuable.

The topic then shifts to the importance of having licenses for data (while recommending CC0, the public domain dedication, at least for scientific data) and to the limits that the Creative Commons licenses currently have in this area.

2011 will therefore be a year in which CC works toward version 4.0, tackling the problem of “Database Rights”.

The same article explains that the CC licenses (with the exception of CC0) are not well suited to data:

“(2) However, where CC0 is not desired for whatever reason (business requirements, community wishes, institutional policy…) CC licenses can and should be used for data and databases, right now (as they have been for 8 years) — with the important caveat that CC 3.0 license conditions do not extend to “protect” a database that is otherwise uncopyrightable.”

This new direction from Creative Commons also caught the interest of the OpenStreetMap Foundation’s License Working Group, which promptly got in touch.

So, on 18 January the working group met with Mike Linksvayer, Vice President of Creative Commons, and also asked that version 4.0 of the CC licenses be compatible with the ODbL where data is concerned:

Mike Linksvayer, Vice President, and General Counsel Diane Peters of Creative Commons joined us to discuss where they at with BY-SA and data. We were greaty encouraged that they will be looking at this very seriously during a CC suite 4.0 version review process, which will take about two years once started. Mike L particularly emphasised the importance of inter-operability. The meeting was very cordial and both groups look forward to mutual co-operation. Mike said that OpenStreetMap is very happy to share our experience as a pioneer in trying to implement an Open IP license for data and databases that incorporated attribution and share-alike and looked forward to participating in the CC 4.0 input process. He also asked if Creative Commons could consider whether it could make any public comments on the issue of compatibility with current CC-BY licenses on data and the ODbL.

From item 4 of the meeting minutes (http://www.osmfoundation.org/image/2/2e/20110118_LWG_Meeting_Minutes.pdf) of the OpenStreetMap Foundation Licensing Working Group