Folks... he gives us the database for FREE! We can use it or leave it. I would hope that someone downloads the damn thing, compares it against his GeoIP and tells us his findings. Otherwise, the discussion about accuracy is silly.
I was trying to find a free SQL database of IP geolocation with country, city, region, latitude and longitude for a project but none was accurate and up to date so I decided to create my own and now I offer it for free.
He makes it sound as though this isn't particularly difficult or expensive. Does anyone want to speculate on the method he is using to compile this data? For his zipcode database he seems to be running mass queries on the Google Maps API.
Simply doing mass WHOIS queries to the regional registrars only gets you data about the corporate headquarters of the ISPs, which I suppose could be enough depending on the level of granularity that is acceptable to you.
(presently a similar question on his forum, posted by someone else, is unanswered: http://blogama.org/node/105)
I checked his ip->city database against a dozen class C's with which I am intimately familiar, and in every case they were either correct or as correct as possible (i.e. someone in that /24 is located in city X, even if most people in the block are located 25-100 miles away). I specifically looked at cases where the ARIN information would lead one astray, so it appears that is not the primary source of location data.
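A spot check like the one above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `(first_ip, last_ip, city)` row layout, the `cities_for_block` name, and the sample rows are all hypothetical and not taken from the actual database.

```python
import ipaddress

def cities_for_block(rows, block):
    """Return every city the rows assign to any address inside `block`.

    `rows` is an iterable of (first_ip, last_ip, city) tuples; this layout
    is an assumption, not the real schema of the database being discussed.
    """
    net = ipaddress.ip_network(block)
    found = set()
    for first, last, city in rows:
        lo = ipaddress.ip_address(first)
        hi = ipaddress.ip_address(last)
        # Two inclusive ranges overlap when each one starts before the other ends.
        if lo <= net.broadcast_address and hi >= net.network_address:
            found.add(city)
    return found

# Made-up rows using reserved documentation ranges, not real allocations.
SAMPLE_ROWS = [
    ("192.0.2.0", "192.0.2.127", "City X"),
    ("192.0.2.128", "192.0.2.255", "City Y"),
    ("198.51.100.0", "198.51.100.255", "City Z"),
]

if __name__ == "__main__":
    # A block you know well counts as "correct or as correct as possible"
    # if the city you expect shows up in the result.
    print(cities_for_block(SAMPLE_ROWS, "192.0.2.0/24"))
```

Running this against a handful of blocks whose real location you know gives a quick, if unscientific, sanity check.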
.. so perhaps he's pulling the data from some of the same sources as Maxmind does and hence is on a similar update schedule, is what you're implying, right?
Anyway, I'd still like to know what those sources are.
Or perhaps the implication is that he's pulling some of his data from the Maxmind database. (Which would be illegal, unless I'm misremembering Maxmind's licensing terms.)
I'm guessing that you're merely pretending not to understand what's being implied, but I don't understand why...
I don't believe this is what Maxmind does; I have no direct knowledge but I believe their data is based on IP registry data (location of the registering org) along with inference from per-user registration data (surveys, social networks, panels, etc.)
Akamai has a (much more expensive) geolocation product that's based on doing triangulation, since they have a lot more information on routing (all those edge servers) but I'm not sure it's appreciably more accurate.
Sorry: I was actually joking. I should have capped it with an emoticon. But it's cool to know Akamai actually has a triangulation DB: I didn't know it was feasible. So I gave you a karma point :)
Here's one I made (and maintain) that doesn't need a database: http://chir.ag/projects/geoiploc/ but it only does Country name, not city/region.
My file is updated once a day and is basically one giant PHP array. It works quite well for parsing through logs and I've used it on many of my high-demand sites with some caching.
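For readers wondering how a database-free lookup of this kind can work, here is a minimal sketch: a sorted in-memory list of ranges, a binary search, and a small cache. It is written in Python rather than PHP and is not the code from the project above; the `RANGES` rows, the country names, and the `country_for` function are made up for illustration.

```python
import bisect
import ipaddress
from functools import lru_cache

def _n(ip):
    """Convert a dotted-quad string to its integer value."""
    return int(ipaddress.ip_address(ip))

# Hypothetical sample rows: (first address, last address, country),
# sorted by first address. These are reserved documentation ranges
# with invented country names, not real allocations.
RANGES = [
    (_n("192.0.2.0"), _n("192.0.2.255"), "Exampleland"),
    (_n("198.51.100.0"), _n("198.51.100.255"), "Testonia"),
    (_n("203.0.113.0"), _n("203.0.113.255"), "Docstan"),
]
STARTS = [row[0] for row in RANGES]

@lru_cache(maxsize=65536)  # repeated IPs in a log file are answered from the cache
def country_for(ip):
    """Return the country for `ip`, or None if no range covers it."""
    n = _n(ip)
    # Find the last range whose start is <= n, then check its end.
    i = bisect.bisect_right(STARTS, n) - 1
    if i >= 0 and n <= RANGES[i][1]:
        return RANGES[i][2]
    return None

if __name__ == "__main__":
    for ip in ("198.51.100.7", "10.0.0.1"):
        print(ip, country_for(ip))
```

The binary search keeps each lookup logarithmic in the number of ranges, and the cache helps with logs, where the same addresses tend to repeat.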
This seems very interesting. I definitely think there's a need for competition to MaxMind's free geoip database. I registered openipdb.(org|com) a year ago with the intention of doing that but never got around to it.
Is anyone interested in taking this data and turning it into a legit open source project? Let me know and I'll donate the domain name :)
You'd have to contact the author and make sure the data has an open license, but my guess is that he's OK with it.
I spent a lot of time a few years back looking for something like this, and the existing solutions were just terrible. Like the author says, there just wasn't a usable dataset out there. Mostly there were a bunch of bad web APIs wanting to charge money for wildly inaccurate data.
So yeah, this might still be inaccurate, but it's a huge step forward.
I've been using it for about 9 years. If all you need is country resolution, it seems pretty much spot on, and very useful.
I haven't seen anything that is all that accurate beyond that. Most geolocation (for the UK anyway) pegs people at the ISP's headquarters rather than their actual location, which isn't really very useful.
I'd like to know the source of the data and how it compares to other databases. It might be good for a prototype, but there's no telling the real quality of the data.
I'm all for competition in this space but I'm left with some questions.
"How accurate is the data?
Very accurate. The database is updated during the first week of each month."
To me this is not acceptable. Firstly, I want numbers and not a vague assurance. Secondly, I'm dubious whether a brand new service in this space is going to be what I would consider very accurate.
This isn't to say that the service is without merit, just that I want a little more info and a little less marketing hyperbole.
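One way to turn "very accurate" into a number is to take addresses whose true location you already know and measure how far off the database's coordinates are. The sketch below assumes you have such pairs; the `share_within` helper and the choice of radius are illustrative and do not describe anything the database author publishes.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def share_within(pairs, radius_km):
    """Fraction of (claimed, known) coordinate pairs that lie within radius_km.

    `pairs` is an iterable of ((lat, lon), (lat, lon)) tuples: first the
    location the database claims, then the location you know to be true.
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one pair to compute a share")
    hits = sum(
        1 for claimed, known in pairs
        if haversine_km(*claimed, *known) <= radius_km
    )
    return hits / len(pairs)
```

A statement such as "N% of a test set resolved to within 40 km" would be far easier to evaluate than "very accurate", provided the test set and radius are disclosed as well.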