
Show HN: Ridiculously cheap bulk geocoding - thecodemonkey
http://geocod.io
======
rubyn00bie
Just for those curious, it's not that cheap actually compared to Google's
enterprise level Geocoding. Nor, I'm guessing, is it able to geocode
internationally. In which case you might as well use Mapquest, as it's
completely free.

Currently, the company I work for uses Google for geocoding and we have 1.1mil
a day which ends up costing around $5k (22%) per year more than these folks...
but! It includes international geocoding, google maps, etc.

Simply using census data to Geocode US addresses is easy; and there are
directions how to do it here in the comments... but setting up Nominatim (from
open street maps) is a serious amount of effort (and not cheap for a 32GB
server) but /is/ capable of global level geocoding.

One great use case for this service though: using mapbox, which is currently
forbidden by Google's TOS...

While I'm stoked to see competition in this space, I wish the competition was
a bit more robust (but everyone has gotta start somewhere, right?)

I hope you all continue forward with this, and hopefully add international
capabilities as well as price drops. I for one would do away with your free
offer altogether as the free users ROI will probably always be an expensive
crap-fest and allocate those resources to driving the price down for your
paying customers.

If/when you all can do ~1.1mil international geocodes per day for less than
$10k a year, LET ME KNOW! :)

~~~
mjwhansen
Other founder of Geocodio here -- thanks for your feedback! You bring up a lot
of good points.

Building off of what's above, Geocodio is intended to be accessible to
developers who don't have $10k to drop on geocodes. We found that this is a
big need in the community (and for ourselves for our other projects). All of
the other non-major-mapping geocoding services we found, including CSV upload
instead of API, were more expensive than $0.001/each (oftentimes much more --
$0.25+)

Also, we don't have limitations to how you use the data. No requirements that
you use a specific brand of map with it, no attribution requirements, etc.

We priced it at this point, with a free tier, so that people can give it a try
first. No, our data isn't quite as good as Google's -- we get about 90% of
addresses within 1 mile, and most within a tenth of a mile -- and we want
people to be able to play around with the service and get to know it before
they have to give credit card info.

With that said, we definitely plan to continue improving the product and add
international support.

PS. We are HUGE fans of Mapbox, so we're pretty excited that you listed that
as a potential use case :)

------
joelverhagen
Neat website! Very clean, simple pricing. Thank you for batch geocoding to
minimize network traffic...

How does the accuracy (as well as address parsing capabilities) compare to the
completely free solutions such as Nominatim[1] or DSTK[2]?

Both services provide capabilities for local installations, obviously with no
query limits and minimal latency.

[1]
[http://wiki.openstreetmap.org/wiki/Nominatim](http://wiki.openstreetmap.org/wiki/Nominatim)

[2] [http://www.datasciencetoolkit.org/](http://www.datasciencetoolkit.org/)

~~~
thecodemonkey
Thanks! Yes, I haven't really seen a lot of other services that provide batch
geocoding as an API endpoint.

We have mostly been running tests against the Google Maps API, and from a
totally random sample of 100 addresses, 90 of them were within a mile of the
location returned by the Google Maps API (most of them were actually within
0.01 mile).

I'm not sure how we would compare to OpenStreetMap and Data Science Toolkit
since our data source is different (US Census Bureau). But the obvious
reason why we provide this as a SaaS is that you don't have to host anything
yourself or juggle gigabytes of boundary data. We handle all the mess.

~~~
maxerickson
They both pull in Tiger data + other sources (so the expected outcome would be
that they are more complete in some areas).

------
dougmccune
For those looking to roll your own, the Ruby implementation of a TIGER
geocoder released by GeoIQ a while back is a pretty solid starting point:
[https://github.com/geocommons/geocoder/](https://github.com/geocommons/geocoder/)

We ended up using that as a base and then making some customizations for our
US-based geocoding solution. As these guys are figuring out, there's no great
int'l option. Google is bad from a licensing perspective (but their tech is
fantastic). MapQuest is great but can get really expensive. We've had decent
luck with TomTom I think, but if I remember correctly there are a lot of
caveats.

~~~
chippy
That one is a very smart geocoder, and I have contributed to it. It is a bit
old now and not that easy to get started with.

The state of the art in open source geocoders would be TwoFishes:
[https://github.com/foursquare/twofishes](https://github.com/foursquare/twofishes)
written in Scala and developed and used by Foursquare

~~~
dougmccune
sounds like twofishes doesn't do addresses, only city names. That puts it in a
different category of product.

------
cjauvin
I've been doing freelance geocoding gigs for a couple of institutions over the
past few years (Canadian addresses), with only open source tools and data.

I also wrote a primer explaining the basic geocoding ideas:

[http://cjauvin.blogspot.ca/2012/04/lean-geocoding.html](http://cjauvin.blogspot.ca/2012/04/lean-geocoding.html)

~~~
thecodemonkey
Great blog post! We obviously also use address interpolation to determine the
exact location.
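
For readers unfamiliar with the term, address interpolation can be sketched roughly as below. This is a minimal illustration, not Geocodio's actual implementation: it assumes a street segment with a known house-number range and known endpoint coordinates, and places the house number proportionally along a straight line. The function name and arguments are hypothetical.

```python
def interpolate_address(house_number, range_start, range_end, start_point, end_point):
    """Estimate (lat, lon) for a house number on a straight street segment.

    range_start/range_end are the house numbers at the two ends of the
    segment; start_point/end_point are the (lat, lon) of those ends.
    """
    low, high = sorted((range_start, range_end))
    if not low <= house_number <= high:
        raise ValueError("house number outside the segment's address range")

    if range_start == range_end:
        fraction = 0.5  # single-number range: fall back to the midpoint
    else:
        fraction = (house_number - range_start) / (range_end - range_start)

    # Linear interpolation between the two endpoints.
    lat = start_point[0] + fraction * (end_point[0] - start_point[0])
    lon = start_point[1] + fraction * (end_point[1] - start_point[1])
    return lat, lon
```

Real segments are rarely straight and houses are rarely evenly spaced, which is one reason interpolated results land on the right block rather than on the rooftop.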

------
lzhou
For an API, you can also try
[http://open.mapquestapi.com/nominatim/](http://open.mapquestapi.com/nominatim/)
(which is kinda free -- and uses OpenStreetMap data).

The biggest problem we've had is changing non well-formed addresses /
ambiguous addresses into canonical addresses with lat/lng. Google Maps wins on
that front.

~~~
thecodemonkey
We obviously can't beat Google in that case :) That's also why it's priced to
be way more affordable. It does however happen that Geocodio is more accurate
than Google Maps - try for example "8895 Highway 29 South, 30646" (Address of
a CVS store) on Google Maps and Geocodio.

~~~
me_bx
I'm using the MapQuest geocoding API[1], which basically does what you do for free,
without the rate limits.

Setting it up was quite a pain because they don't use semantic HTTP codes, and
I had to play with it a lot to handle their undocumented error codes (they
store them inside body.info.statuscode). Good to read that you return semantic
HTTP codes.
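
A rough sketch of what handling that looks like on the client side. The `info.statuscode` field is the one mentioned above; treating 0 as success and everything else as failure is an assumption for illustration, and `GeocodeError` and `parse_response` are hypothetical names.

```python
import json


class GeocodeError(Exception):
    """Raised when either the HTTP layer or the response body reports failure."""


def parse_response(http_status, body_text):
    # A semantic API would let us stop at this check.
    if http_status != 200:
        raise GeocodeError(f"HTTP {http_status}")

    body = json.loads(body_text)
    # Assumption: 0 means success; any other value is an error that was
    # returned alongside an HTTP 200.
    status = body.get("info", {}).get("statuscode", 0)
    if status != 0:
        raise GeocodeError(f"statuscode {status} reported in body")
    return body
```

With semantic HTTP codes the second check disappears and generic HTTP client error handling is enough.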

If you want to differentiate from the competition, I would suggest that you
improve the address parsing and support more patterns. Think of us having to
geocode user-typed location fields from Twitter. Enjoy it :)

[1]:
[http://open.mapquestapi.com/geocoding/](http://open.mapquestapi.com/geocoding/)

------
jrheard
UI suggestion: 'street addresses' currently has a box around it, so I thought
it was an <input type="text"> field, thought "how cute", tried to click on it
to enter an address to geocode, and was disappointed to find out it was just
some bolded text. Might be a fun little feature to have that actually be an
entry point into trying out a demo of the API (I thought I was supposed to
enter an address to have geocoded).

~~~
atlbeer
Agreed. Did the same thing.

Would actually be neat though for that to be a quick demo of your software.

------
appleflaxen
Your pricing page is not as clear as it could be.

When people read "$0.001 each" they sometimes understand it to be one
thousandth of a _cent_ rather than one thousandth of a _dollar_.

Even though you are completely correct/accurate, people find it confusing [1].

Wouldn't it be clearer to say "1 cent for every ten uses" (or "10 calls for a
cent" or "a tenth of a penny per call")?

Admittedly, your audience is semi-technical, and should parse it correctly,
but why not simplify it?

[1] [http://verizonmath.blogspot.com/](http://verizonmath.blogspot.com/)

~~~
cmaggard
The part I was unclear about, actually, is whether or not one is charged for
the first 2500 if one were to hit that threshold.

~~~
mjwhansen
The first 2,500 are always free. So 2,501 would be $0.001. I'll make that
clearer on the site -- thanks!

------
stfp
Congrats on the product! Just a couple website level things:

- [http://geocod.io/contact/](http://geocod.io/contact/) says DC but shows me
a map centered somewhere south of Topeka.

- Random $0.02 suggestion: stop using "ridiculous".

~~~
mimiflynn
I'm getting a map centered in Brooklyn. I'm in Midtown Manhattan... so, maybe
it's trying to find our location?

Are you close to Topeka?

~~~
thecodemonkey
Seems like our embed code was bad, I've updated it now. Thanks for the
feedback!

------
mholt
I work at SmartyStreets, where we've learned that geocoding is very, very
difficult, so I definitely feel your pain! We started with basic Census Bureau
stuff and it's definitely complicated, and accuracy can be spotty. (We've
since worked with other data vendors to improve the accuracy.) It's too bad we
don't all have little cars to roam the country with and manually collect
rooftop-level data like Google does.

+1 on the versioned API endpoint... when we released ours nearly 8 years ago,
versioning APIs wasn't really a thing yet. We're paying that technical debt
off now as we vigorously rewrite and improve our service.

Quick feedback: Links on the FAQ page are hard to distinguish from regular
text.

Good luck with the project!

~~~
thecodemonkey
Thanks! Yes, it is definitely not easy, a lot of edge cases to take care of
too. Luckily we are not trying to directly compete with any of the big guys
out there, which makes us able to keep the price low and the output high.

We'll update the FAQ links, thanks!

------
thecodemonkey
We were tired of dealing with the often steep pricing on geocoding when you
reach your daily free limit (e.g. Google Maps starts at $10k/year). So I built
this service so I can use it myself and hopefully it would be useful for
others too.

------
Jemaclus
Love it. I'll keep using my current service for now (SmartyStreets), but I'll
let you know two things I noticed:

1) Most services will accept shortcuts for names, like "SF" for San Francisco
or NYC for New York, but in both cases, I got error messages instead of
geocodes.

2) Addresses that aren't "properly" formatted (i.e., without commas or
something) often return very incorrect information. Here's an example:

2680 NW 8th Pl, Fort Lauderdale, FL 33311 - returns correct info

2680 NW 8th Pl Fort Lauderdale FL 33311 - returns incorrect info (see suffix,
formatted_address)

For what it's worth, SmartyStreets mangles even the first address that you got
correct, but on the other hand, they're very good at correctly returning data
for improperly formatted addresses like the second one.

Anyway, good luck. Great tool.

~~~
thecodemonkey
Thanks for the feedback! We don't currently support shorthands for city names
- only states. But this is definitely something that's on the todo list now.

Our address parser will try to pick up the address even if it isn't formatted
correctly with commas, but it obviously won't work in all cases. Address
parsing is indeed a very complex problem.

~~~
jwnacnud
Address parsing is actually very easy. Knowing when you got it right (or
wrong), that's the hard part, and that's where address validation comes in
handy.

If you can start with a list of all the following, you've got a great start:

prefix abbreviations street names street types suffixes city names state names

Add to that all the possible misspellings and then factor in levenshtein and
soundex to account for misspellings you didn't know about and you've got a
pretty dang good address parser. Figure out how to do that lickety-split fast,
and you've got gold.
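
To make the Levenshtein part concrete, here is a minimal sketch that maps a possibly misspelled token to a known street type. The word list is a tiny illustrative sample, the distance cutoff of 2 is an arbitrary assumption, and soundex is left out to keep it short.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


# Illustrative sample only; a real parser would load full reference lists.
STREET_TYPES = ["street", "avenue", "boulevard", "place", "road", "drive"]


def closest_street_type(token, max_distance=2):
    """Return the nearest known street type, or None if nothing is close."""
    token = token.lower()
    best = min(STREET_TYPES, key=lambda known: levenshtein(token, known))
    return best if levenshtein(token, best) <= max_distance else None
```

Scanning every list linearly like this is the slow way; doing the same lookup fast over large reference lists is the hard part mentioned above.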

------
ColinWright
Is it possible to delete an account? I created one before discovering it was
US only.

~~~
thecodemonkey
Sure! Just send us an email at support@geocod.io and I'll remove your account
right away.

~~~
ColinWright
An admirably quick response, both to my question here, and to the email I
sent.

People, this is a lesson. If you post a "Show HN" then be ready to respond to
people's questions and comments. Posting and then going silent for hours is
not a good message to send to people who you want using your service. It says
you haven't thought enough about your level of service.

Kudos to GeoCod.io.

------
carlosdaniel
Will your service be expanding to provide reverse geocoding? We would be
interested if it did both (we need both).

~~~
thecodemonkey
If there's enough interest, we'll definitely be working on this next. It would
just require a slight restructuring of our data to make the lookups as
efficient as possible.

~~~
dexterbt1
+1 to reverse geocoding support. Our app runs around 25k/day reverse geocode
calls to OSM and Mapquest's Nominatim. We are projecting up to 4x growth
within the year so an accurate, bulk and cheap service will help ease our
pain. And oh, we're based and operating in the Philippines (which hopefully
you can add soon as well).

------
davidcelis
Cool project. Like others have said, not particularly convinced that it's
cheaper than Google's enterprise geocoding, but I'm more than glad to see the
competition.

I wrote you guys a Ruby client:
[https://github.com/davidcelis/geocodio](https://github.com/davidcelis/geocodio)

The code's maybe a bit rough, but it's worked in my limited usage. Maybe you
can take it for a test run before I push version 1.0.0 to RubyGems?

~~~
thecodemonkey
Thanks! This looks great! Would you mind if we possibly mentioned this in our
documentation?

As for the pricing, we are indeed much cheaper than Google's geocoding
offerings (given the nature of our product). If you are looking to do a high
amount of geocoding requests, just contact us[1] and we'll work out a pricing
model for you.

[1] hello@geocod.io

~~~
davidcelis
Definitely feel free to mention it in the docs! Thanks again for an awesome
new alternative.

Version 1.0.0 is on RubyGems now, by the way!

------
rurabe
I wonder how the "choose your own api key" policy is going to work in
practice... given that people don't usually make very secure passwords and
that the example is "Real estate website" you're going to get some pretty easy
to guess api keys.

~~~
thecodemonkey
That's actually just a name to identify the API key, the actual API key is a
40-character automatically generated string. The idea is that you will be able
to create an API key for each of your projects and revoke them individually as
necessary.
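
A small sketch of the idea of labelled, individually revocable keys. How Geocodio actually generates or stores keys isn't stated here; using `secrets.token_hex` and an in-memory dict is purely an assumption for illustration.

```python
import secrets

# label -> key; one entry per project so each can be revoked on its own.
keys = {}


def new_api_key():
    # 20 random bytes rendered as hex give a 40-character string.
    return secrets.token_hex(20)


def create_key(label):
    """Create a key under a human-readable label such as a project name."""
    keys[label] = new_api_key()
    return keys[label]


def revoke_key(label):
    """Drop one project's key without touching the others."""
    keys.pop(label, None)
```

The label is only a name for bookkeeping, so a guessable label like "Real estate website" reveals nothing about the key itself.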

------
suanmeiguo
I tested this website's API on 2000 randomly selected home addresses, and it's
not accurate enough. It's 4000 feet away on average from Google's lat/lng.
That is less accurate than Bing's 1000 and datasciencetoolkit's 2200.

------
brokenbeatnik
Geocoda ([http://geocoda.com](http://geocoda.com)) launched last year, does
point storage as well as geocoding, and should be comparable for low amounts
of geocoding, and cheaper for large amounts per month (> 250K).

------
pkh80
TIGER (dataset that this is based on) has some giant holes in it, and is based
on block faces not building footprints like Google Maps. It's also U.S. only...
why not base on OSM, which should include TIGER as well as all the other
contributions.

------
CalRobert
Gah! This is awesome. Where were you when I was trying to get an idea launched
and the cost of geocoding was the wall I kept hitting??? Seriously this makes
my week, maybe it's time to dust off some old projects...

~~~
thecodemonkey
Feedback like this is exactly why we released this (and made it so cheap) :)

------
joahua
IIRC Google's TOS prohibits saving geocoded points. "Caching" is allowed, but
I think this has value/is different insofar as it would let you store points
permanently without breach of contract.

~~~
yummysoup
I wonder if, given APIs for both Google Maps and Joe's Free And Permissive But
Sometimes Wrong Maps, you could:

* query both services for each address

* if the [lat,lon] are equal (within a threshold), store Joe's result as correct

* store nothing otherwise

Are you storing Google's results in that case?
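
The "equal within a threshold" step in the list above could be sketched like this, using the haversine great-circle distance. The 50 m default is an arbitrary assumption and the function names are hypothetical; whether doing this is compatible with anyone's terms of service is a separate question the code does not answer.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def result_to_store(reference_point, permissive_point, threshold_m=50.0):
    """Return the permissive service's point if the two agree, else None."""
    if haversine_m(reference_point, permissive_point) <= threshold_m:
        return permissive_point
    return None
```

Only the permissive service's coordinates are ever returned for storage; the reference point is used for the comparison and then discarded.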

------
matthuggins
Where the pricing says $.001/ea for 2501+ geocodes, are the first 2500/day
prior to that still free? Or am I paying $2.50 for the day as soon as I make
that 1 extra request above the free limit?

~~~
thecodemonkey
Yep, the first 2500/day will always be free. So if you geocoded 2510 addresses
in one day, you would pay $0.01
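
As a quick sanity check of that arithmetic, a minimal sketch using the numbers quoted in this thread (2500 free per day, $0.001 for each lookup after that); the names are made up for illustration:

```python
from decimal import Decimal

FREE_PER_DAY = 2500
PRICE_PER_LOOKUP = Decimal("0.001")  # one tenth of a cent


def daily_cost(lookups):
    """Dollars owed for one day: only lookups beyond the free tier are billed."""
    billable = max(0, lookups - FREE_PER_DAY)
    return billable * PRICE_PER_LOOKUP
```

`daily_cost(2510)` bills 10 lookups, which is one cent, and `daily_cost(2501)` bills a single lookup rather than the whole day's usage.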

------
nl
This is great.

Also great is Pete Warden's
[http://www.datasciencetoolkit.org/](http://www.datasciencetoolkit.org/)

 _Street Address to Coordinates: Street Address to Location calculates the
latitude/longitude coordinates for a postal address. Currently only the US
and UK have street-level detail._

 _Google-style Geocoder: Are you currently using Google's geocoding API and
want to switch? Replace maps.googleapis.com with the address of a DSTK server
and your code should work without changes._

Free to use, also available as a (free) self-hostable VM.
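
If the host swap works as the quote describes, the client side can be as small as making the host a parameter. A sketch under assumptions: the path is the one Google's geocoding web service uses, `dstk.example.internal` is a placeholder for wherever your DSTK server runs, and API keys and other query parameters are ignored.

```python
from urllib.parse import urlencode

GEOCODE_PATH = "/maps/api/geocode/json"


def geocode_url(address, host="maps.googleapis.com", scheme="https"):
    """Build a Google-style geocoding URL against a configurable host."""
    query = urlencode({"address": address})
    return f"{scheme}://{host}{GEOCODE_PATH}?{query}"


# Placeholder host; point this at your own DSTK instance.
dstk_url = geocode_url("1600 Pennsylvania Ave",
                       host="dstk.example.internal", scheme="http")
```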

------
aabalkan
Why don't you put a demo query page so I can try addresses in my country
without signing up?

edit: signed up. does not work outside us. Why not bother documenting that?

~~~
mjwhansen
Good idea about the demo page!

And apologies that we didn't make the US-only part more prominent before.
We've added it to the front page and moved it to the top of the FAQ.

------
ericd
Neat! If you can get your address parsing up to Google's level or anywhere
close, you should do quite well.

For others looking for a solution you can play with yourself, here's a VM
image with a pretty good geocoder you can set up yourself (iffy address
parsing, though):
[http://www.datasciencetoolkit.org](http://www.datasciencetoolkit.org)

~~~
pkh80
Parsing isn't the hard part, it's the source data, which if you do the math on
what Google has done (drive around the world taking 360 video and LIDAR of
streets) is literally billions of dollars worth of work.

TIGER is a pretty bad starting point, geocoding based on block faces is really
inaccurate if you want to zoom in to the street level. And it's U.S. only.

OSM Nominatim should be a better place to start.

I'd love to see open sourced Street View data collection / processing as part
of the OSM project. Then there is a chance to compete with Google.

~~~
ericd
What you're talking about (massive ground-level driving effort to pinpoint
where along streets specific addresses are located) would boost accuracy.
Without a Google level address parser, though, you don't get _usability_ for a
lot of use cases, which is frankly much more important for a lot of companies.
One of the best things about Google's geocoder is that you can throw various
location names, as humans type them, and Google will return _something_, and
it's usually the right thing. For many applications, this is the desired
behavior, rather than precision.

------
basicallydan
I really like the look of your API. I work a lot with location-based apps,
I'll probably be giving this a go :)

~~~
thecodemonkey
Thanks! Please let us know what you think.

------
scrabble
What's preventing people from simply signing up for Google Maps API for
Business then sending your requests that way and returning the results?

Thereby spreading out the bulk cost of an API license amongst your customers
who have to pay a significantly smaller amount, but adding up to profit?

~~~
haney
I would imagine what you're talking about is a violation of their terms of
service.

------
chippy
Also free [http://geocoder.us/](http://geocoder.us/)

~~~
thecodemonkey
I believe they have a daily limit as well. They also charge $50/20k addresses
for bulk geocoding which is way more than what we charge :)

------
3pt14159
I oversaw a project like this elsewhere, where we had reams and reams of geo
coordinates but needed text-searchable tags (like "Canada", "Toronto",
etc.).

We had millions of them though, so maybe an API isn't really the way to go.

------
kevingibbon
How does the dataset compare to google? Would love to see some side by side
comparisons.

~~~
thecodemonkey
See my previous answer [1], obviously it's impossible to compete directly with
Google and especially not at this price point. Our goal is to return a geo
coordinate that is at least on the correct block and as close to the street
number as possible.

[1] [https://news.ycombinator.com/item?id=7095467](https://news.ycombinator.com/item?id=7095467)

------
pitzips
Would love it if you could get integrated into
[http://www.rubygeocoder.com/](http://www.rubygeocoder.com/). That would
make my switch much easier. Would love to support you guys.

------
verelo
I think this is great. We use Google Maps for geocoding today, we paid around
10k for this year's license.

If you guys can do the same without the rate limiting restrictions they place
on us, we'd switch over in a heartbeat.

~~~
thecodemonkey
We actually don't have any rate limiting currently (we can handle a pretty
high amount of concurrent requests and will hopefully be able to scale up
hardware before we hit any performance issues).

------
snake_plissken
Very cool. I'm in the telematics industry and forward geo-coding is something
in which I am always interested, since it can be quite the bitch of a task.
How did you go about assembling the shape files?

------
neovive
Great to see something new in this space. I remember having to rewrite quite a
bit of backend code when SimpleGeo shut down.

Note to self: code back-end API consumers with Interfaces and drivers instead
of hardcoding API calls.
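
One way that note-to-self could look in practice. Everything here is a hypothetical sketch: `StaticGeocoder` stands in for a real vendor driver so the example runs without network access, and the application code only ever sees the `Geocoder` interface.

```python
from typing import Optional, Protocol, Tuple

Point = Tuple[float, float]


class Geocoder(Protocol):
    """The only surface application code is allowed to depend on."""

    def geocode(self, address: str) -> Optional[Point]: ...


class StaticGeocoder:
    """Stand-in driver backed by a dict; a real driver would call a vendor API."""

    def __init__(self, table: dict):
        self._table = table

    def geocode(self, address: str) -> Optional[Point]:
        return self._table.get(address)


class FallbackGeocoder:
    """Tries each driver in order and returns the first result found."""

    def __init__(self, *drivers: Geocoder):
        self._drivers = drivers

    def geocode(self, address: str) -> Optional[Point]:
        for driver in self._drivers:
            point = driver.geocode(address)
            if point is not None:
                return point
        return None


def describe(geocoder: Geocoder, address: str) -> str:
    """Application code: knows nothing about which vendor answered."""
    point = geocoder.geocode(address)
    return "not found" if point is None else f"{point[0]:.4f}, {point[1]:.4f}"
```

When a provider shuts down, only its driver class is replaced; `describe` and everything like it stays untouched.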

------
mmahemoff
Small thing: I would drop the "bulk" as the tagline is too much of a mouthful
and "bulk" is unnecessary. It's free at smaller volumes anyway, so certainly
not deceptive to drop it.

------
nicolsc
The ability to understand how the input was parsed is an interesting feature,
but I think it would be better as an option.

Most of the time users will only care about the results, so you'll be sending
useless data.

~~~
thecodemonkey
Good point, we might want to add that as an optional parameter. Also note that
our address parsing API endpoint is free and doesn't count towards the
usage statistics and billing :)

------
beagle3
Is there a similar service for reverse geocoding? US and international?

------
wikwocket
Very nice! Can't wait to try it out.

What does the "accuracy" value in the return mean? Maybe I am missing
something but I don't see it in the FAQ or docs.

------
HillRat
Great job, guys! This definitely opens up some nice options. Reminds me how
much I miss the _old_ TIGER/Line file formats, though.

------
jsumrall
Interesting. What interesting things can you do once you geocode a street
address? How are you (your business) using this?

~~~
alepper
I used it to take a contact database for a series of seminars around the
country and allocate participants to the geographically-appropriate venue.

------
renegadedev
I like the price point of about $1/1000 records. Just curious to hear how you
arrived at this price point.

~~~
thecodemonkey
Our infrastructure is pretty efficient, making us able to keep our operating
costs low. We wanted to have a pricing point that was below any other similar
services we could find.

------
vitalyny
This is very cool! We use SmartyStreets because of the price. Where did you
get the addresses database?

~~~
jxf
In another comment OP says the source is US Census Bureau.

------
martin1b
Very nice. Pricing is very attractive. May need to use this in the future.

------
jaque
so.. us addresses only?

~~~
thecodemonkey
Yep! Unfortunately we will manually have to add support for each country
(including getting data, normalizing it, etc.) which is quite some work. We're
planning to add support for additional countries if demand is high enough.

~~~
TillE
Separating a street name into "street" and "suffix" is a baffling decision
which probably has a few issues even in the US, and definitely won't work
elsewhere.

~~~
thecodemonkey
Nope, it won't work outside the US. But US addresses actually have a list of
suffixes that all ordinary addresses comply to. See
[https://www.usps.com/send/official-abbreviations.htm](https://www.usps.com/send/official-abbreviations.htm)
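
A minimal sketch of what splitting on a suffix list looks like, and where it stops working. The table is a small illustrative subset of the USPS list, not the full thing, and `split_street` is a hypothetical name rather than anything from the Geocodio API.

```python
# Small illustrative subset: spelled-out and abbreviated forms map to the
# standard abbreviation.
SUFFIXES = {
    "STREET": "ST", "ST": "ST",
    "AVENUE": "AVE", "AVE": "AVE",
    "BOULEVARD": "BLVD", "BLVD": "BLVD",
    "PLACE": "PL", "PL": "PL",
    "ROAD": "RD", "RD": "RD",
    "DRIVE": "DR", "DR": "DR",
}


def split_street(street):
    """Split 'NW 8th Pl' into ('NW 8th', 'PL'); suffix is None if not recognised."""
    words = street.split()
    if len(words) > 1:
        key = words[-1].rstrip(".").upper()
        if key in SUFFIXES:
            return " ".join(words[:-1]), SUFFIXES[key]
    return " ".join(words), None
```

A street like "Highway 29 South" comes back with no suffix at all, which is the kind of case the parent comment is worried about.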

------
hydralist
next up, geofencing-as-a-service?

