Hacker News
LocationIQ — Free and Fast Geocoding Service (locationiq.org)
205 points by gopi_ar 459 days ago | 73 comments



I'm glad to see more geocoding offerings, but if you need more than a few thousand addresses, you'll find that no offering will do. Running your own geocoding really shouldn't be that scary of a thing to do.

The biggest pain in running your own OSM geocoder (Nominatim) is that it is kind of a bear to set up. I imagine there are Docker images that could make things a bit easier and allow you to grab a region extract, but I could be wrong there.

On a side note, I was working on an open source geocoder based on OSM that was easy to build as part of a mapping util I've been working on (https://github.com/buckhx/diglet). I lost some traction on the geocoding side of things and have been focusing more on map building/serving, but did get a stable version for US addresses. There are lots of interesting problems that come along with geocoding.


> Running your own geocoding really shouldn't be that scary of a thing to do.

> The biggest pain in running your own OSM geocoder (Nominatim) is that it is kind of a bear to set up.

You said it shouldn't be scary, then immediately called it a bear. I am scared of bears.


Here is the Docker image for the geocoding stack that Foursquare uses, with all world data already set up (would take you about a day to set up yourself):

https://github.com/doda/docker-twofishes-geocoder


Why does it take so long?


Processing the raw data and generating a search index is quite resource (IO/RAM/CPU) intensive. Some geocoders (not twofishes) need 100GB of raw data.


Not sure about your first sentence.

I'm founder of the OpenCage geocoder, we'll gladly work with anyone large or small, we have clients doing millions of requests per day. We use OpenStreetMap but also other open geo data sets. https://geocoder.opencagedata.com


Looks pretty good. We currently use Google but would love to switch.

One thing our app needs is a way to only get street address matches. Basically, I already know that an address is a street address or at least a very small side road. It's not an entire county or city, and never a very long road like a highway. Google lets me filter out such bogus results by returning a "bounds" field in the results. Using this bounding box I can calculate the area, and ignore anything larger than 200x200m, regardless of the type of feature returned (which is sometimes a "route" even though it's very small). I see your API returns a "bounds" in the results, but it's not documented. Is that what it's for?
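
For anyone wanting to try the same trick, here is a minimal sketch of a bounds-based filter that discards results whose bounding box is too big to be a street address. It assumes the bounds come back as south/west/north/east corners in degrees; the function names, the constant, and the `max_side_m` parameter are illustrative and not part of any geocoder's API. It uses a flat-earth approximation, which should be fine at street scale, and it does not handle boxes that cross the antimeridian.

```python
import math

# Rough average length of one degree of latitude; an assumption that is
# good enough for estimating the size of small boxes.
METERS_PER_DEGREE_LAT = 111_320.0


def bbox_size_m(south, west, north, east):
    """Return the approximate (width, height) of a lat/lng box in meters."""
    mid_lat = math.radians((south + north) / 2.0)
    height = (north - south) * METERS_PER_DEGREE_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    width = (east - west) * METERS_PER_DEGREE_LAT * math.cos(mid_lat)
    return width, height


def looks_like_street_address(south, west, north, east, max_side_m=200.0):
    """Keep a result only if its bounds fit inside max_side_m x max_side_m."""
    width, height = bbox_size_m(south, west, north, east)
    return width <= max_side_m and height <= max_side_m
```

A box of 0.001 degrees on each side near Missoula passes the check, while a box covering a whole state does not.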

I'm also looking for a good autocompleter. Basically, if my app is about Montana, then the user should be able to type "M" and get "Missoula" as the top hit within 50ms. Not Mississippi or Montreal or anything outside Montana. It should also be possible to misspell names, type partial street addresses and so on. Google's geocoding API is useless for this since it doesn't do fuzzy matching. They have an API called Google Places, but it has restrictions and doesn't seem designed exactly for this stuff.


The Google Geocoder API has viewport biasing[1], and will work perfectly for your example of typing "M" to get Missoula, and should handle typos. Not sure what you mean exactly by fuzzy matching in this context. Note that you still need to create your own clientside autocomplete widget, and an average user would potentially require numerous queries (instead of just 1), which would hit your quota.

Places API autocomplete can be limited to only return results of type 'geocode', which will give you the autocomplete of just geocodable locations, and was designed for this purpose.

[1] https://developers.google.com/maps/documentation/geocoding/i...


We used to use viewport biasing, but I forget why it's no longer enabled; it's been a long time since I worked on this. I know we do use component filtering, and I can tell you that with "components=administrative_area:MT|country:US", typing "M" will yield "Montana, MT" and that will be the only result. You have to type as far as "Missou" to get Missoula as a result.

Generally, the geocoder's substring matching has limited range, which is why it is fairly useless for autocompletion. Its partial matching seems limited to misspellings or nearly-complete terms. For example, I just tried typing "mead", and asked the API for 10 results. I got 5 results, all things like "Mead Lane" or "Meade Ave", but no "Meadowood". So it stopped at 5 even though there are more sharing that prefix. If I type "mea", I get just _one_ match, about some random place with "Mea" in the name. In other words, it's doing something else (something more sophisticated, but less appropriate for this use case) than mere prefix/substring matching.

Again, it's been a while, but from the last time I looked, I recall the Places API being better for this type of incremental autocompletion, but it had some issues related to geofencing that made it almost as bad.
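
To make "mere prefix matching" concrete, here is a minimal sketch of an incremental autocompleter over a sorted list of names. It assumes you already hold the list of place names for your region (for example, only Montana names), so the region restriction happens when the index is built. The function names are made up for illustration, and there is no typo tolerance here; that would need something like edit-distance matching on top.

```python
import bisect


def build_index(place_names):
    """Sort names case-insensitively so each prefix maps to one contiguous run."""
    return sorted((name.casefold(), name) for name in place_names)


def complete(index, prefix, limit=10):
    """Return up to `limit` names that start with `prefix`, ignoring case."""
    key = prefix.casefold()
    # (key,) sorts before every (key..., name) pair, so this finds the
    # first entry whose folded name is >= the prefix.
    start = bisect.bisect_left(index, (key,))
    matches = []
    for folded, original in index[start:]:
        if not folded.startswith(key):
            break
        matches.append(original)
        if len(matches) == limit:
            break
    return matches
```

With an index built from names such as "Mead Lane", "Meade Ave" and "Meadowood", typing "mead" returns all three, which is the behaviour the geocoding API did not give me.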


The only way to know if our service will meet your need is to try it. Please give the free trial a go.

We don't offer an autocomplete.


Geofabrik can set up a nominatim geocoding service on your own servers for a one off fee, if you don't want to do it yourself. http://www.geofabrik.de/services/server.html


Just wanted to call out that we regularly process datasets in the 10s of millions and 100s of millions range at geocod.io. Our service was specifically built with batch geocoding in mind.


FWIW, Mapzen also has been providing OSM-based geocoding at a rate of 30,000 a day, as well as reverse geo and routes: https://mapzen.com/documentation/search/

The advantage of OSM geocoding is that, unlike Google Geocoder, there are no terms of service barring you from using the data on a non-Google product or for storing the data for your own use and analysis.


> The advantage of OSM geocoding is that, unlike Google Geocoder, there are no terms of service barring you from using the data on a non-Google product or for storing the data for your own use and analysis.

Basically correct. However the OpenStreetMap licence (ODbL) may apply to geocoded data. So it might not be as simple as "no terms and conditions".


Mapzen also ingests data from OpenAddresses.

(which for many regions will have much better coverage than OSM)


This is fantastic. I talked to Google for their geocoding API a couple years ago and was quoted $17,500 per year for a pretty basic package that included up to 10 requests per second and 100,000 geocodes per day if I remember correctly.

I looked at hosting OSM myself but it seemed like a lot of work. Huge data files for the initial import and setting up daily increment jobs. Glad to see a managed service emerging!


You may also enjoy: https://geocoder.opencagedata.com


Once you get past the initial import, the incremental updates are essentially no-maintenance. The situation has improved since a few years ago, IIRC - the toolchain got significantly better.

But yeah, for the occasional geocode, running your own instance is overkill.


If I recall correctly MapBox used to have better pricing when they first started. The government agency I worked for was quoted a reasonable rate per 1,000 addresses but the minimum required usage was too prohibitive (paying tens of thousands even though the actual usage was < $1,000/year).


I think MapBox's rates are slightly better. To store geocoded addresses, we pay ~$12,000/year I believe. That gets you like 1 million requests/day I think.

We also felt more comfortable with MapBox given Google's history of sudden API-breaking changes...

Also, we required the geocoding to be very accurate. MapBox and Google's geocoding results came out to be pretty much identical. Just about every other service, free or paid, was not accurate enough.


We were quoted 10M minimum geocodes @ $2.50/1,000 if we store any of the data. If my sleep-deprived brain is doing that math correctly, the minimum buy would be $25,000/year. Our geocode needs are ~ 5,000/month so the minimum per year is way out of line with our actual usage. Other than that it is a great product. I just wish their pricing for storing the data was more realistic.
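
The arithmetic checks out. A quick sketch using only the figures quoted above, and assuming the 10M minimum applies per year:

```python
MIN_GEOCODES = 10_000_000   # quoted minimum purchase
PRICE_PER_1000 = 2.50       # quoted price in USD per 1,000 geocodes
ACTUAL_PER_MONTH = 5_000    # stated monthly need

minimum_buy = MIN_GEOCODES / 1000 * PRICE_PER_1000
actual_per_year = ACTUAL_PER_MONTH * 12
# What the real usage would cost at the same per-1,000 rate.
actual_cost = actual_per_year / 1000 * PRICE_PER_1000

print(f"minimum buy: ${minimum_buy:,.0f}")
print(f"actual usage: {actual_per_year:,} geocodes, ${actual_cost:,.0f} at the same rate")
```

So the minimum buy is $25,000, against roughly $150 worth of actual usage at the quoted rate.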


Indeed, and we need SSDs on a RAID! I remember we tried importing this on HDDs a year ago and it took 2 weeks and the average response time was 2000ms!

Our current config allows an import in 8 hours and responds within 20ms (not including network latency). It's not cheap though.


Cool! Google's is super-cheap (I would rather pay for one for client needs; paid-for feels like it'll be around for the long term), but it would be cool to use this on a personal project.

Just saw the company is Hyderabad based too, I always enjoy seeing new Indian startups on the radar!


This actually started out as a free-offering for our customers once MapquestOpen started charging unreasonably. Our devs put up LocationIQ as a standalone project that we expect to support fully for a long time to come.

Thanks for your wishes!


Geocodio [1] is another really good service to look into too:

* Good free and premium paid service

* They've gone a bit above and beyond to provide examples/demos in all languages in their docs. Super easy for users

* Can process bulk/batch requests super fast in parallel on your behalf

* Have a super cool front-end CSV upload tool built in React so non-programmers can geocode data in seconds

* Forward/reverse geocoding

* Also give Census data, congressional districts, state legislative districts, and even school districts

Full disclosure: I know the husband-and-wife founders. Very nice people

[1]: https://geocod.io


LocationIQ wasn't meant to compete with paid offerings. It was merely a way for the team at Unwired Labs to give back to the OSM community. We think the good folks at OSM deserve a great free tier.

If geocoding needs are enterprise-grade / you are OK with spending a bit, you should look at Mapzen, OpenCage, and now, Geocodio.


I don't understand. You are "giving back" by creating a service where you make money.

Are you publishing your source code?


Excellent pricing on this. But I just ran my house on it and it was off by 10 houses while proclaiming perfect accuracy. The Google answer was exact. The other party was off by 1 house. Is this because of the underlying data from OSM?


Hi Tim. Mathias from geocod.io here. Can you please drop us a line at hello AT geocod.io? We would love to look into what happened there.

We're using Tiger/Line and Rooftop-level (through OpenAddresses) datasets under the hood. We do not use OSM as it's generally not optimized for geocoding.


It's a little odd to mention Tiger/Line and optimized for geocoding in the same sentence. Tiger/Line is specifically obscured to protect the anonymity of individuals that live in low density areas. It's de-optimized for geocoding before it is published for public access. It seems pretty likely that's the issue above.

Of course OSM is not a great source of addresses either, as it simply doesn't have data for so many regions.


USA and Canada only :( Ah well, can't have everything.


Kind of funny that they trust Google Maps more for their contact page ;) https://unwiredlabs.com/contact


I'm more worried that OpenStreetMap is misspelled several times and the OSM license listed as CC-BY while it should be ODbL (http://www.openstreetmap.org/copyright).


We'll check that out right away, thanks!


I need to correct myself. The license in the footer references the map tiles in the background. Those were created from pre-2012 data, and at that point OpenStreetMap was still using the CC-BY-SA license.


Would like to see this default to HTTPS. Addresses can often reveal lots of personal information.


Sorry for the simple question here, but what would a person use this for? What is the use case?

Thank you


Here's an actual workflow I have worked on:

"I'm at 50.00000N, 15.00000E (a GPS coordinate). How do I get to the Foo Bar in Baz City (a text input or selection), by public transit (a choice of transport modes)?"

- Reverse geocode "bus stop or train stop or tram stop near 50 N, 15 E" - "there's a bus stop named Xyzzy at 50.0012 N, 15.0003 E"

- Geocode "Foo Bar, Baz City" - "51 N, 14 E"

- Reverse geocode "bus stop or train stop or tram stop near 51 N, 14 E" - "there's a train station named Baz City Central at 50.99998 N, 14.001 E"

(plus routing and scheduling on top of that - but that is beyond the scope of geocoding, which is one part of the toolchain)
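
The geocoding part of the steps above can be sketched as follows. This is a toy version: the two lookup tables are hypothetical sample data standing in for a real geocoding database, the function names are made up, and the routing step is left out.

```python
import math

# Hypothetical sample data standing in for a real geocoding database.
PLACES = {"Foo Bar, Baz City": (51.0, 14.0)}
STOPS = [
    ("Xyzzy", "bus", 50.0012, 15.0003),
    ("Baz City Central", "train", 50.99998, 14.001),
]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def geocode(query):
    """Forward geocode: text to (lat, lon). Raises KeyError if unknown."""
    return PLACES[query]


def nearest_stop(lat, lon):
    """Reverse geocode restricted to transit stops: name of the closest stop."""
    return min(STOPS, key=lambda s: haversine_km(lat, lon, s[2], s[3]))[0]


def plan_endpoints(here, destination_text):
    """Return (origin stop, destination stop) to hand to a routing step."""
    origin_stop = nearest_stop(*here)
    dest_stop = nearest_stop(*geocode(destination_text))
    return origin_stop, dest_stop
```

Calling `plan_endpoints((50.0, 15.0), "Foo Bar, Baz City")` yields the Xyzzy bus stop as the origin and Baz City Central as the destination; the router then works between those two stops.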

Other example: "I'm at 50 N, 15 E; get me a list of restaurants around here" (optionally: non-smoking, currently open - not sure if Nominatim directly supports filtering like that)


Thank you for the detailed reply. I understand it would be normally used for an app which provides travel planning abilities now.


That is one of the possible uses, indeed; I'm pretty sure someone else has used it in a more creative way :)


To geocode means to transform an address into a location.

So say you have a textual address and want to know where it is located. You feed it into a geocoder and it returns a location.

The same database can often be used to go the other direction, taking a location and returning the name and other details about a place.
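
As a rough illustration of both directions, here is a sketch that only builds the request URLs, assuming the public Nominatim instance and its `/search` and `/reverse` endpoints. The helper names are made up, and the public instance has a usage policy, so this is not meant for bulk work.

```python
from urllib.parse import urlencode

BASE = "https://nominatim.openstreetmap.org"


def search_url(address, limit=1):
    """URL for forward geocoding: free-text address to candidate locations."""
    return f"{BASE}/search?" + urlencode({"q": address, "format": "json", "limit": limit})


def reverse_url(lat, lon):
    """URL for reverse geocoding: coordinates to the nearest known place."""
    return f"{BASE}/reverse?" + urlencode({"lat": lat, "lon": lon, "format": "json"})


print(search_url("Statue of Liberty"))
print(reverse_url(50.0, 15.0))
```

Fetching the first URL returns a JSON array of candidate places; the second returns a single place description.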


Here's a use case I've encountered: Find me all the auto repair shops within X miles of this street address, where the database of repair shops is populated from street addresses as well.


I've worked on a few projects where I've needed to know pairwise distances among a set of addresses.


Wow, I had no idea that Geocoding was so big around here, nice to see. So a question for all of you geocoders, how are you testing and determining accuracy? ie, how many "failed to geocode records" out of an arbitrary number could I expect given that the address is properly formatted?

By the way, (again, at least for the US), the best (fastest, very accurate) geocoder I've ever used was created by Alteryx. I've always been curious if it's actually their own geocoder, or they are using another service in the background. (edited to add: though of course, this is for Alteryx's proprietary system, and though it provides decent ways to get the data in/out, it's not simply a plug and play system if you're writing your own software.)

ESRI's is one of the worst; relatively slow, not all that accurate and worst of all (at least this was the case) it'll choke on anything over 300,000 records.


Hi, co-founder of the OpenCage Geocoder here.

I can't speak to the opinions of others, but for me your question is a lot like asking "what's the best programming language?" The only realistic answer is that it depends on your task. We're continually facing new customer requirements, and what one customer thinks is absolutely essential, the next guy couldn't care less about.

A good example is speed. For some clients every millisecond is critical (imagine real time bidding systems), for others they are running a batch process to geocode their database in the middle of the night and couldn't care less if it takes one hour or two. Likewise huge differences in requirements in terms of accuracy. Some clients will accept only perfection, meanwhile the next guy intentionally wants a vague answer so that consumer privacy is maintained. Then obviously there are big differences across countries, forward and reverse, etc, etc. Some clients must have the attributes that using an open data source like OpenStreetMap allows, others care only about price.

So there is almost certainly a perfect answer for your specific geocoding needs, but there is no perfect geocoder.


Thanks for replying, your data sources are amazing, it must have taken quite a bit of work to put them all together.

And I get that different users have different needs, but I'm still curious about the accuracy (it's geography after all, I don't care how fast the results are returned if they're wrong.) And especially given the multi-data sets that OpenCage uses, how do you know that you're returning the right results? (obviously there is the spatial aspect, i.e., within 100 yards of the true location; but I'm most curious about the percentage of returned addresses with a greater than 90% probability of being the "correct" address.) I wouldn't expect it from most geocoder services, but that's what "ground truthing" is for. And what happens if you happen to come across conflicting results when you're using the multiple data sets?

So again, all these new geocoders provide some nice services, but how are they measuring their accuracy of results? I could also shrink this question down to a business question, what makes your service better versus all the others? Who can prove to me that they provide the "best" (most accurate) results? (I'm not in the market, sorry, it's a hypothetical.)
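
To make the question concrete, this is roughly the kind of report I have in mind. It is only a sketch: it assumes you already have, for each test address, the distance in meters between the geocoder's answer and a ground-truth point (or None when nothing came back). The function name and the dictionary keys are made up.

```python
YARD_M = 0.9144  # meters per yard


def accuracy_report(errors_m, threshold_m=100 * YARD_M):
    """Summarize geocoder results against ground truth.

    errors_m holds, per input address, the distance in meters between the
    returned point and the true point, or None if the address failed to geocode.
    """
    total = len(errors_m)
    if total == 0:
        raise ValueError("no results to evaluate")
    matched = [e for e in errors_m if e is not None]
    within = sum(1 for e in matched if e <= threshold_m)
    return {
        "failure_rate": (total - len(matched)) / total,
        "within_threshold_rate": within / total,
    }
```

For errors of 5 m, 80 m, 400 m and one failure, this reports a 25% failure rate and 50% of addresses within 100 yards.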


I still think you're putting too much weight on accuracy as the key feature. We have plenty of customers who only care about having the correct town or postal zone or neighbourhood, and some even who actually do NOT want accuracy (due to privacy implications).

Nevertheless, yes of course I get what you are asking. Fundamentally all geocoders rely on someone having verified the input data, be it a government surveyor, a car taking pictures that are then evaluated (by humans and/or image processing software) or an OpenStreetMap volunteer, etc. We are at the end of a long data chain and have to trust the inputs we get.

In my 20% time I'm working on a world map at 1:1 scale which will solve this problem, hoping to launch next quarter ....


Are they just running Nominatim? Is Nominatim reliable? Looking at the Nominatim project on GitHub, it does not look like well-maintained software (e.g. there is even a pull request from 2012, issues with basic use cases)...


Nominatim is actively developed, and in active use by many organisations to handle large load. It is reliable.


We use a more stable version of Nominatim internally and pushed that out with locationIQ. :-)


Nothing I have found so far provides the accuracy and data of http://opencagedata.com/

And I've searched everywhere.


Other geocoders and lots and lots of open data.

There's Nominatim, Data Science Toolkit and the Two Fishes geocoders.

All of this is built on open geospatial data including OpenStreetMap, Yahoo! GeoPlanet, Natural Earth Data, Thematic Mapping, Ordnance Survey OpenSpace, Statistics New Zealand, Zillow, MaxMind, GeoNames, the US Census Bureau and Flickr's shapefiles plus a whole lot more besides. Here's the full list of datasources. https://geocoder.opencagedata.com/credits

I am about to use this for a project so thought I would recommend the find when I saw this post.


Genuine question: What does this offer that http://open.mapquestapi.com/nominatim/ doesn't?

Today I use the MapQuest API, and it's been stable and fine for the 2+ years I've used it, and the search has always been very good, intuitively selecting the right entity... i.e. London, UK rather than London, OT, and finding the right lat:lon for pubs and not getting confused by other places with similar names.


Variety is a good thing.

Make sure to pay close attention. MapQuest recently announced that they are going to discontinue their direct access OSM tile offering:

https://lists.openstreetmap.org/pipermail/talk/2016-June/076...

This is part of an ongoing reorganization of their offerings; it wouldn't be surprising if they flipped the switch on their other open data stuff.


Jeez, I had no idea that MapQuest will kill their free access. Thanks so much for sharing.


This is actually why we had to set up our own servers in the first place!


Would you consider adjustments so that subjectively the most important place is returned?

i.e. the London example I gave.


Cool, that's the missing bit of info I needed.


The home page looks more swank.


At a quick first glance: MapQuest has 15K transactions/month for free, LocationIQ claims 30K/day currently.


Is there a specific format for an address to work?

For instance on locationiq.org

Department of Food Safety and Zoonoses (FOS) World Health Organization Avenue Appia 20 Geneva

Does not work.

On maps.google

  Department of Food Safety and Zoonoses (FOS) 
  World Health Organization Avenue Appia 20
  Geneva
Doesn't work

but

  World Health Organization 
  Avenue Appia 20. CH-1211 Geneva 27
works on google.


At least for the US, you can check out USPS CASS. I can't tell you about the rest of the world, but I'd bet that the EU has something similar.


What's the caveat? How do they plan to keep the lights on? I hope they don't resort to dark patterns to keep afloat.


It's a value add anyway for our existing customers @ unwired labs. There's a lot of unused capacity though, so we thought why not give it away..


I see "free tier has these limits" - seems to imply there are paid tiers for higher usage.


Geofabrik also provides OpenStreetMap based Nominatim geocoding as a service, and can set it up for you on your own machine if you want.

http://www.geofabrik.de/services/server.html


When I run the example of 'Statue of Liberty', I get an array of various places of the statue (didn't know there's one in Pakistan). Then I put 'Statue of Liberty, New York' and I get an empty array. Am I missing something here?


No it's a bug or data error. The query is "explained" here:

http://nominatim.openstreetmap.org/search.php?q=Statue+of+Li...

Note that it places the statue in New Jersey. So it isn't properly handling the fact that Liberty Island is an exclave of New York (so most likely a bug).


I love the "Get Started" section/workflow. I will copy it and look like a pretty creative person at work when I propose something similar. Thanks


Haha, sure. :-)


This looks awesome! Is there also a places autocomplete API?


No.


Are you using Meteor for this website?



