
Show HN: Free database of geographic place names and geospatial data - marco1
https://github.com/delight-im/FreeGeoDB?hn=2015-09-25
======
rmc
"RESOURCES.md" ( [https://github.com/delight-
im/FreeGeoDB/blob/master/RESOURCE...](https://github.com/delight-
im/FreeGeoDB/blob/master/RESOURCES.md) ) includes OpenStreetMap, which is
licenced under the Open Database Licence (ODbL). You cannot relicence it under
the Apache licence. I've raised an issue ( [https://github.com/delight-
im/FreeGeoDB/issues/1](https://github.com/delight-im/FreeGeoDB/issues/1) )

~~~
maze-le
What if you make extracts or reprojections, or create a new dataset from an
ODbL-licensed resource (or from several licences, depending on how many
datasets you combine to create the new one)? Can you relicense it then?

~~~
Doctor_Fegg
This is known as a "Derivative Database" and so ODbL still applies. It's a
classic share-alike/copyleft licence, with the wrinkle that it has to use
copyright, database rights, and contract law because database protection
varies across jurisdictions. See
[http://opendatacommons.org/licenses/odbl/](http://opendatacommons.org/licenses/odbl/)
.

~~~
rmc
> It's a classic share-alike/copyleft licence

A big difference from a CC-SA licence is that it is possible to make a
produced work from OSM data and all you have to do is attribute OSM, there is
no share-alike requirement. In this way, the OSM ODbL is _less_ restrictive
than a standard share-alike licence.

The main example of that is making a map image. You can make a map image from
100% OSM data, and that image doesn't have to be share-alike.

If you create a _database_ as is the case of FreeGeoDB (and perhaps if you use
OSM to geocode another database), then share-alike applies.

~~~
jumperjake
I don't understand. If I produce a map image with all the details I'm
interested in, and publish it, then use opencv to extract the data from that
image into a database, would I be free to license the resulting database as I
wish?

~~~
rmc
I don't know. Ask a lawyer or judge. You can read the licence
[http://opendatacommons.org/licenses/odbl/1.0/](http://opendatacommons.org/licenses/odbl/1.0/)
and look at the definitions of "Produced work" (a map) or "Derivative
Database".

In practice, no-one's really done that or is likely to do it. Either you'd do
something silly like make the map an SVG with all data encoded as textual
attributes and your "Computer Vision Algorithm" is basically grep (in which
case it would probably be seen as a Derivative Database), or do real CV on a
real image, which is very hard to do and will give bad results. It's
sufficiently hard that no-one's worried about it.

If you really don't like the OSM licence, you are free to go to another map
data provider, pay them what they charge and agree to whatever they want, and
get something else. If you want OSM, agree to OSM's terms.

------
Quai
The first three Norwegian cities I looked up were given the wrong names
(Plesund (should be Ålesund), Bodi (Bodø) and Tdnsberg (Tønsberg)). So, I
think it is safe to say that there are encoding issues in the data set.

~~~
bhaak
There's definitely something wrong.

Looking at the SQL file, "Zürich" is listed as "Zdrich", Munich is only shown
as "Munich" without the German name "München" anywhere, and Cologne is listed
as "Cologne" with the wrong "Koln" as alternative name.

Doesn't look like a reliable data source.

~~~
seszett
French names have the same problem, with "La Réunion" being shown as "La
Rcunion" and "Rhône-Alpes" as "RhAne-Alpes".

It's strange, I can't really understand how these accents can get mangled as
these particular single letters.

~~~
vcarl
Almost looks like OCR errors, especially Rhône -> RhAne.

~~~
bhaak
I don't think OCR errors are the reason. It's too consistent for that.

I think I also have seen such character set corruptions before, but I can't
remember how exactly you can get these particular corruptions.
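
One way to narrow this down is to brute-force the usual suspects. Below is a
minimal Python sketch (not from the FreeGeoDB repository; the codec list and
the function name `find_round_trips` are assumptions chosen for illustration)
that encodes a correct name with one codec, decodes it with another, and
reports which pairs reproduce a given corrupted spelling:

```python
from itertools import permutations

# Candidate codecs to try; this list is an assumption, extend it as needed.
CODECS = ["utf-8", "latin-1", "cp1252", "cp437", "cp850", "mac_roman"]


def find_round_trips(original, corrupted, codecs=CODECS):
    """Return (encode_codec, decode_codec) pairs that turn original into corrupted."""
    matches = []
    for enc, dec in permutations(codecs, 2):
        try:
            candidate = original.encode(enc).decode(dec)
        except (UnicodeEncodeError, UnicodeDecodeError):
            # This pair cannot represent the text at all, so skip it.
            continue
        if candidate == corrupted:
            matches.append((enc, dec))
    return matches


if __name__ == "__main__":
    # Spellings reported in this thread: (expected, found in the data set).
    reported = [
        ("Zürich", "Zdrich"),
        ("La Réunion", "La Rcunion"),
        ("Rhône-Alpes", "RhAne-Alpes"),
    ]
    for good, bad in reported:
        print(good, "->", bad, ":", find_round_trips(good, bad))
```

An empty list for a pair would suggest that something other than a plain
encode/decode mismatch between these codecs happened, for example a lossy
transliteration step somewhere in the conversion.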

------
Vespasian
So this is basically Natural Earth repacked as JSON, CSV and SQL?

It's a nice start. I like it. Would it be possible to add the
program/instructions you used for converting the data from their original shape?

I would love to see an additional SQL version for PostGIS and the like so
that I can use its large number of spatial functions to work with this
data.

~~~
maze-le
It seems to be the Natural Earth Dataset -- I think the 1:50m-resolution, but
with a slightly reduced set of properties (columns). I haven't tried it, but
from the looks of it, the sql-files should work with PostGIS as they are.

~~~
Vespasian
From what I've seen, the data type of the columns is always varchar and not
geometry.

Not a big issue, it's merely more convenient otherwise ;)
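
If the varchar columns hold WKT text (an assumption, not verified against
every table), the conversion could be done in place with a statement along
these lines. The Python sketch below only builds the SQL string; the table
name `cities`, the column name `coordinates`, the helper name and SRID 4326
are hypothetical placeholders, while `ST_GeomFromText` is the PostGIS
function for parsing WKT:

```python
import re

# Accept only plain SQL identifiers so the names can be interpolated safely.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def wkt_column_to_geometry_sql(table, column, srid=4326):
    """Build an ALTER TABLE statement that turns a WKT text column into geometry.

    Assumes every row in the column is valid WKT; table and column names
    are hypothetical and must be adapted to the actual schema.
    """
    for name in (table, column):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    srid = int(srid)
    return (
        f"ALTER TABLE {table} "
        f"ALTER COLUMN {column} TYPE geometry(Geometry, {srid}) "
        f"USING ST_GeomFromText({column}, {srid});"
    )


if __name__ == "__main__":
    print(wkt_column_to_geometry_sql("cities", "coordinates"))
```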

~~~
marco1
We should definitely fix this in the future! Thank you :)

------
legulere
Apache seems a strange license for data. Why not CC0 like, for instance,
Wikidata?

~~~
marco1
Thank you!

We know that Apache, MIT, etc. are usually for code and the Creative Commons
licenses are for content (e.g. writing, images).

We thought that the data sets (CSV, JSON, SQL) are rather in the intersection
of code and content, so the Apache license would be okay.

Is there anything specifically wrong with the Apache license for this type of
project? We couldn't find any tangible downsides but we'd love to hear about
any pros and cons.

~~~
pki
data sets are simply that - data - content

your code would be the scripts you write to process it, and your content would
be your data in my opinion

------
vezzoni
Congrats, nice job! Really interesting and helpful. I'll try to use it.
Nowadays I am using sollo atlas (which is used by
[http://www.findmyninja.io](http://www.findmyninja.io)):

[http://atlas.sollo.io/atlas/api](http://atlas.sollo.io/atlas/api)

It makes it easy to navigate through resources (place data), and a helpful
feature is its support for synonyms.

------
aembleton
This would have been useful if the railroads and roads actually had names
attached to them.

~~~
marco1
We didn't get those from our sources but we'd love to add them!

------
Crocode
Data like this has been available for free (and updated by the community!) for
years. See [http://www.geonames.org/](http://www.geonames.org/)

Numerous websites use the GeoNames dataset for their work. (I also participated
in this madness: [http://www.wemakemaps.com/](http://www.wemakemaps.com/))

~~~
datamongers
I've not seen [http://www.wemakemaps.com/](http://www.wemakemaps.com/) before,
looks like a cool site, but what's with consistently referring to OpenStreetMap
as OpenMap? And how about adding proper attribution on the OSM maps?

------
krick
There are free airport lists that are much more complete. Also, two columns
(lon, lat) for points would be better, as it's much easier to transform "lon,
lat" into "POINT(lon lat)" than otherwise.
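
To illustrate the asymmetry, here is a minimal Python sketch (the function
names are made up for this example, and the parser only handles plain decimal
2D points, not the full WKT grammar). Going from two columns to WKT is a
one-line format string, while going back needs a parser:

```python
import re

# Matches a plain 2D WKT point such as "POINT(8.54 47.37)".
# Exponent notation and Z/M coordinates are deliberately not handled.
_POINT_RE = re.compile(
    r"^\s*POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)\s*$",
    re.IGNORECASE,
)


def to_wkt_point(lon, lat):
    """Easy direction: two numeric columns into a WKT point string."""
    return f"POINT({lon} {lat})"


def from_wkt_point(wkt):
    """Harder direction: parse a WKT point string back into (lon, lat)."""
    match = _POINT_RE.match(wkt)
    if match is None:
        raise ValueError(f"not a simple WKT point: {wkt!r}")
    return float(match.group(1)), float(match.group(2))


if __name__ == "__main__":
    wkt = to_wkt_point(-73.78, 40.64)
    print(wkt)
    print(from_wkt_point(wkt))
```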

And, yeah, although I don't really care for licenses usually, here it makes me
feel uneasy.

~~~
marco1
Thank you!

What makes you feel uneasy about the license? We'd love to fix this!

Regarding the airport lists, we're definitely open to merging in more complete
and accurate data.

------
IanCal
Looks interesting, how would you compare it to GeoNames?

~~~
marco1
Admittedly, it's similar! But we wanted to include more data, _other_ data,
and just offer an alternative in general.

Three things we definitely wanted were (1) complete boundaries, (2) easiest
programmatic access and (3) efficient collaboration.

------
mileszim
Cool! Do you have any plans to track historical changes of geospatial areas?
What reference frame are you using for labeling areas? (places America
considers countries vs what China considers, for example)

~~~
marco1
Thanks! Tracking historical changes is a great idea, but this requires
_perfect_ data from day one, doesn't it? Otherwise, how will you differentiate
between historic changes and mere factual/technical corrections? Thus, right
now, there are no plans to track this. But we're open to ideas and
contributions on how to do this.

Regarding the reference frame, that's definitely an issue. Right now, it
follows the source's guidelines (see "SOURCE.md"), which means "boundaries of
sovereign states according to defacto status. We show who actually controls
the situation on the ground. For instance, we show China and Taiwan as two
separate states. But we show Palestine as part of Israel."

------
electriclove
How will this be kept up-to-date?

~~~
spacemanmatt
Yeah, a quick perusal (VERY quick) revealed no provenance, maintenance, or
currency information. There's a lot more to data than a snapshot of some
tables.

