

Ask HN: Database with country names down to the city, etc. - wenbert

Hello HN,

I remember reading about these on HN a few months ago. Apparently they were
free. They had coordinates, country names, cities, towns, etc.

Can anyone point me to a reliable site with a good database of this kind?

Thanks!

-Wenbert
======
jacquesm
Hey Wenbert,

I've got one, let me dig it up for you, just a moment.

edit:

OK, ready for download. It's a MySQL dump, so you can simply load it into any
MySQL server:

<http://autotagger.com/geoinfo.dump.gz>

41 megabytes, 2.67 million towns and cities. The fields are country, region,
name, accented name, latitude, longitude, verified, current, prefered and
population.

The 'verified', 'current' and 'prefered' fields are all internal to an
application I wrote, so you can safely drop those.

I've also added tables with countries, country aliases and regions.
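
For anyone wondering what the coordinate fields are good for, here is a minimal
sketch of a nearest-town lookup using the haversine formula. It assumes the
rows have already been read out of the dump into dicts keyed by the field names
listed above; the `haversine_km` and `nearest_town` helpers, the sample rows and
the two-letter country codes are illustrative assumptions, not part of the
dump.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, good enough for a sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_town(rows, lat, lon):
    """Return the row closest to (lat, lon). Linear scan, no index."""
    return min(
        rows,
        key=lambda row: haversine_km(lat, lon, row["latitude"], row["longitude"]),
    )


# Hypothetical sample rows with approximate coordinates; the real table
# has more fields (region, accented name, population, ...).
SAMPLE_ROWS = [
    {"country": "NO", "name": "Oslo", "latitude": 59.91, "longitude": 10.75},
    {"country": "PH", "name": "Manila", "latitude": 14.60, "longitude": 120.98},
    {"country": "NL", "name": "Amsterdam", "latitude": 52.37, "longitude": 4.90},
]

if __name__ == "__main__":
    # Look up the sample town nearest to a point on the Norwegian west coast.
    print(nearest_town(SAMPLE_ROWS, 60.39, 5.32)["name"])
```

A linear scan over 2.67 million rows is slow for repeated lookups, so in
practice you would probably narrow the candidates first, for example with a
bounding-box query on latitude and longitude in MySQL.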

edit2: holy guacamole, easy guys, that server is only on a 100 Mbit uplink.

I figured there'd be one download...

~~~
wenbert
Thanks! What about updates? How do you regularly update your data?

LOL. It downloaded in about a minute. Nice server!

Again, thank you so much!

~~~
jacquesm
This is a hybrid of a public dataset and a _lot_ of manual polishing to get
rid of duplicates (no doubt there are still quite a few of those, especially
in Asia).

I don't actively maintain the dataset, but if you want I can create a small
website around it so people can update it.

The reason I built this was a project called 'autotagger'. The idea behind it
is to identify people, places and subjects in texts automatically.

It's one of my pet projects but I don't have enough funds to continue to work
on it full time without a way to monetize it, so it's been sitting there
rusting for the last year or so.

I think I have most of the data structures and algorithms worked out, but it
needs a massive rewrite to get it production ready and fast enough.

The fields most subject to change are the population and the names. Cities
and towns don't get created too often, but merging of smaller towns does
happen.

They rarely move around, so I figure the longitude/latitude should be good for
a while ;)

~~~
wenbert
Hehe. I am from Asia (Philippines), but in Norway right now for a short
training. Things are a little bit unpredictable right now (schedule, etc.) for
me but I find your project very interesting. It would be good if I can find
something else to do since I have a few hours at night after the training.

Would be good if you can give me some more info about it.

~~~
jacquesm
Autotagger is one of the more interesting things I've worked on in the last
couple of years. It takes a pretty weird approach to the problem, but it worked
surprisingly well.

When I have some time I'll do a write-up on it and post it.

The basic idea behind it is 'google backwards'. If you can figure out how it
works from those two words, drop me a line ;)

~~~
wenbert
I have no idea what you mean by "google backwards".

Anyway, thanks! I will keep an eye on this project.

------
ratsbane
Not exactly to your requirements but related: we use the MaxMind GeoLite city
database to correlate IPs with lat/lon and city/country names. It's free (the
GeoLite version; the pay version has a little more detail but doesn't cost
very much) and simple to update and use.

<http://www.maxmind.com/app/geolitecity>

------
gyardley
The one we used for our analytics product came from here:

<http://www.geonames.org/export/>
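
In case it helps with comparing datasets, here is a rough sketch of reading a
GeoNames dump file into rows with country, name, coordinates and population.
It assumes the tab-separated, 19-column layout described in the readme of the
GeoNames export directory, with the column order as I remember it, so check it
against the readme before relying on it. The `parse_geonames` helper and the
column labels are my own naming, not anything GeoNames ships.

```python
# Column order assumed from the GeoNames export readme; verify before use.
GEONAMES_COLUMNS = [
    "geonameid", "name", "asciiname", "alternatenames",
    "latitude", "longitude", "feature_class", "feature_code",
    "country_code", "cc2", "admin1_code", "admin2_code",
    "admin3_code", "admin4_code", "population", "elevation",
    "dem", "timezone", "modification_date",
]


def parse_geonames(lines):
    """Yield one dict per populated place from tab-separated dump lines."""
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != len(GEONAMES_COLUMNS):
            continue  # skip malformed or truncated lines
        row = dict(zip(GEONAMES_COLUMNS, parts))
        if row["feature_class"] != "P":
            continue  # keep only cities, towns and villages
        yield {
            "country": row["country_code"],
            "name": row["name"],
            "latitude": float(row["latitude"]),
            "longitude": float(row["longitude"]),
            "population": int(row["population"] or 0),
        }


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python parse_geonames.py dump.txt
    with open(sys.argv[1], encoding="utf-8") as handle:
        for place in parse_geonames(handle):
            print(place["country"], place["name"], place["population"])
```

Because it is a generator, the whole file never has to fit in memory, which
matters for the larger dump files.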

~~~
wenbert
Thanks again! I will compare this with jacquesm's file.

------
bquinn
<http://www.geonames.org/> (download area at
<http://download.geonames.org/export/dump/>)

Used as the placename source for <http://www.dopplr.com> among others.

HTH,

Brendan.

------
wenbert
I found the link on HN: <http://news.ycombinator.com/item?id=530086>

