
Show HN: ProximityHash – Geohashes in Proximity - ashwinnair
https://github.com/ashwin711/proximityhash
======
kaushalmodi
Reminds me of [https://plus.codes](https://plus.codes) . Check it out on a
phone to get the Plus Code for your current location. (The Google Plus Code
fetching feature is awesome on Google Maps Beta on Android.)

------
haney
Are you familiar with the S2 library for mapping lat/lon into a 1D space?
([http://blog.christianperone.com/2015/08/googles-s2-geometry-...](http://blog.christianperone.com/2015/08/googles-s2-geometry-on-the-sphere-cells-and-hilbert-curve/))
Any idea what the tradeoffs would be between this and S2?

~~~
ashwinnair
Hi, yes, I am familiar with the S2 cell library. The whole intention was to
bring a similar feature to Geohash, which by default covers only
rectangular regions. :)

------
goodells
I'm a bit confused about the "hash" part of this... wouldn't you want to be
able to go back from the hashed string to the original location? I can't think
of much of a use for only being able to test locations to see if they match a
proximity hash.

Internally, the space-filling curves are super interesting. I highly recommend
this writeup [1] on how they're used in things like planet-sized maps. I've
heard rumors that Google adjusts the offset of their space-filling curves to
minimize large discontinuities near land and push them over the ocean instead.

[1] - [http://blog.notdot.net/2009/11/Damn-Cool-Algorithms-Spatial-...](http://blog.notdot.net/2009/11/Damn-Cool-Algorithms-Spatial-indexing-with-Quadtrees-and-Hilbert-Curves)

~~~
ashwinnair
You can decode the hash back to the location coordinates. In fact, if you look
at the code, you'll see that the output is a set of points which are then
geohash-encoded to get the hashed values.
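
A minimal sketch of that round trip, assuming the standard geohash algorithm (interleaved longitude/latitude bisection, base-32 alphabet). The function names `encode` and `decode` are illustrative and are not taken from the ProximityHash source:

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def encode(lat, lon, precision=9):
    """Encode a latitude/longitude pair as a geohash string."""
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    chars, bits, bit_count, even = [], 0, 0, True  # even bits refine longitude
    while len(chars) < precision:
        rng, value = (lon_range, lon) if even else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half
        else:
            bits <<= 1
            rng[1] = mid  # keep the lower half
        even = not even
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


def decode(geohash):
    """Return the centre (lat, lon) of the cell named by a geohash."""
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    even = True
    for ch in geohash:
        idx = BASE32.index(ch)
        for shift in range(4, -1, -1):
            rng = lon_range if even else lat_range
            mid = (rng[0] + rng[1]) / 2
            if (idx >> shift) & 1:
                rng[0] = mid
            else:
                rng[1] = mid
            even = not even
    return (lat_range[0] + lat_range[1]) / 2, (lon_range[0] + lon_range[1]) / 2
```

Decoding returns the centre of the cell rather than the exact input point, so the round trip is accurate only to the cell size at the chosen precision.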

I'll definitely read the blog post you mentioned. Thanks.

------
speeq
> data size reduces considerably, thereby improving speed and performance.

By how much? Could you please give us an example?

Would something like Redis' GEOHASH command be any faster if it used
ProximityHash?

~~~
ashwinnair
It's not ProximityHash that does the optimization; there's an option to use
GeoRaptor, which does it. Please have a look at GeoRaptor
([https://github.com/ashwin711/georaptor](https://github.com/ashwin711/georaptor)).

On your second question, it completely depends on the problem at hand. I have
been using GEORADIUS on Redis for around a year now at more than 30k QPS. The
case there is to find locations that are closest to our input. ProximityHash
gives you all the geohashes in a circular area, something I haven't found
another library for yet. Most of them provide geohashes in a bounding box.
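
To make the circle-versus-bounding-box difference concrete, here is a rough sketch that samples points on a grid, keeps those within the radius, and collects their geohashes. It is an approximation written for illustration and is not necessarily how ProximityHash computes its result; `circle_geohashes` and `haversine_m` are hypothetical names, and the code ignores the poles and the antimeridian:

```python
import math

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def encode(lat, lon, precision):
    """Standard geohash encoding of a latitude/longitude pair."""
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    chars, bits, bit_count, even = [], 0, 0, True
    while len(chars) < precision:
        rng, value = (lon_range, lon) if even else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        bits = (bits << 1) | (1 if value >= mid else 0)
        rng[0 if value >= mid else 1] = mid
        even = not even
        bit_count += 1
        if bit_count == 5:
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres on a spherical Earth."""
    r = 6_371_000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def circle_geohashes(lat, lon, radius_m, precision=7):
    """Approximate set of geohash cells covering a circle, found by sampling."""
    lat_bits = (5 * precision) // 2        # bits spent on latitude
    lon_bits = (5 * precision + 1) // 2    # bits spent on longitude
    step_lat = 180.0 / (1 << lat_bits) / 2  # half a cell, so each cell is sampled
    step_lon = 360.0 / (1 << lon_bits) / 2
    dlat = radius_m / 111_320.0             # rough metres-per-degree conversion
    dlon = radius_m / (111_320.0 * math.cos(math.radians(lat)))
    n_lat, n_lon = int(dlat / step_lat) + 1, int(dlon / step_lon) + 1
    cells = set()
    for i in range(-n_lat, n_lat + 1):
        for j in range(-n_lon, n_lon + 1):
            p_lat, p_lon = lat + i * step_lat, lon + j * step_lon
            # The distance test is what drops the corners of the bounding box.
            if haversine_m(lat, lon, p_lat, p_lon) <= radius_m:
                cells.add(encode(p_lat, p_lon, precision))
    return cells
```

Because only sampled points are tested, a cell that merely clips the edge of the circle can be missed; a production implementation would test cell geometry rather than sample points.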

------
polskibus
What's this for?

~~~
haney
I haven't used this specific library, but it's common to want to convert
lat/lon into a single dimension where sorting the hash gives nearby locations.
This is helpful for lookups/indexing/caching geospatial data. This appears to
solve some of those problems.
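
As a rough illustration of the "single sortable dimension" idea, the sketch below interleaves quantised longitude and latitude bits into one integer (a Z-order or Morton key, the same bit layout a geohash encodes in base 32). The function names and sample coordinates are made up for the example:

```python
def morton_key(lat, lon, bits=16):
    """Interleave quantised lon/lat bits into one integer (Z-order key)."""
    x = int((lon + 180.0) / 360.0 * ((1 << bits) - 1))  # longitude -> 0..2^bits-1
    y = int((lat + 90.0) / 180.0 * ((1 << bits) - 1))   # latitude  -> 0..2^bits-1
    key = 0
    for i in range(bits - 1, -1, -1):
        key = (key << 1) | ((x >> i) & 1)  # longitude bit first, as in geohash
        key = (key << 1) | ((y >> i) & 1)
    return key


def shared_prefix_bits(a, b, total_bits=32):
    """Number of leading bits two keys have in common."""
    return total_bits - (a ^ b).bit_length()


# Illustrative sample points: two close together, one on another continent.
places = {
    "portland_a": (45.5152, -122.6784),
    "portland_b": (45.5231, -122.6765),
    "sydney": (-33.8688, 151.2093),
}
keys = {name: morton_key(lat, lon) for name, (lat, lon) in places.items()}
ordered = sorted(keys, key=keys.get)  # nearby points tend to sort next to each other
```

A long shared prefix implies the points are close, but the reverse does not always hold: two neighbours on opposite sides of a large cell boundary can have very different keys, which is why range lookups usually check adjacent cells as well.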

~~~
SOLAR_FIELDS
Correct. A common use case and straightforward example is reverse geocoding.
Using this hash scheme, if your lat/long lies in one of the hashed areas
(determined with a trivial bounding-box computation), only a very small set of
addresses needs to be checked to figure out which is closest. A naive
implementation using this indexing scheme might compute the distance to each
lat/long in a geospatial database for everything in the indexed bounding box,
then return the address with the shortest distance.
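
That naive lookup might be sketched as follows. A fixed-size grid key stands in for the geohash cell here (an assumption made to keep the example short), and the function names and sample addresses are hypothetical:

```python
import math
from collections import defaultdict


def cell_key(lat, lon, size_deg=0.01):
    """Hypothetical stand-in for a geohash: index of a fixed-size grid cell."""
    return (math.floor(lat / size_deg), math.floor(lon / size_deg))


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres on a spherical Earth."""
    r = 6_371_000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def build_index(addresses):
    """Group (name, lat, lon) records by the cell they fall in."""
    index = defaultdict(list)
    for name, lat, lon in addresses:
        index[cell_key(lat, lon)].append((name, lat, lon))
    return index


def reverse_geocode(index, lat, lon):
    """Return the nearest address in the query's cell, or None if the cell is empty."""
    candidates = index.get(cell_key(lat, lon), [])
    if not candidates:
        return None
    # Only the handful of addresses in this cell are distance-checked.
    return min(candidates, key=lambda a: haversine_m(lat, lon, a[1], a[2]))[0]


# Made-up sample records for illustration only.
index = build_index([
    ("address A", 40.7128, -74.0060),
    ("address B", 40.7150, -74.0020),
    ("address C", 40.7300, -74.0500),
])
```

This version looks only in the query's own cell, so a closer address just across a cell boundary would be missed; checking the neighbouring cells too is the usual remedy.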

Spatial indexing is quite like regular database indexing. If the scheme is
clever enough that computing a high-precision index is inexpensive, you spend
less time in the indexing phase of your computation. This becomes relevant
when you regularly want to spatially index the entire world, for example.

The only special thing I can see about this particular approach is that it's a
circular index, which isn't a common way to approach spatial indexing. The
more common approaches, such as building an R-tree, are rectangular, which may
not suit all use cases. Many spatial analyses buffer a circle around a point
(I see this a lot in health fields such as disease analysis), where this
specialized approach might eliminate a post-processing step and net some
performance gains for the end user.

------
stumptownkiwi
This has existed for years:

[http://www.geodna.org/](http://www.geodna.org/)

Demo here: [http://www.geodna.org/docs/google-maps.html](http://www.geodna.org/docs/google-maps.html)

GeoHash is cool - but the hashes don't lend themselves well to stemming for
use in text-based searching. This improves on that.

Full disclosure: I am responsible for GeoDNA.org.

~~~
ashwinnair
> hashes don't lend themselves well to stemming for use in text-based
> searching

I didn't get this completely. :)

This was something that hadn't been solved using geohashes. The intention was
just to bring a geohash-based solution to proximity search.

