Hacker News new | past | comments | ask | show | jobs | submit login
Google’s S2, geometry on the sphere, cells and Hilbert curve (christianperone.com)
166 points by perone on Aug 15, 2015 | hide | past | web | favorite | 29 comments



The linked presentation slides give a bit better explanation IMO: https://docs.google.com/presentation/d/1Hl4KapfAENAOf4gv-pSn...

Too bad there aren’t more general docs for the library.

I still don’t quite understand the benefit of using a single number instead of a pair of numbers for describing a cell in a two-dimensional space. Seems like a pair of 32-bit fixed point numbers would be just as descriptive as a single double-precision float along the hilbert curve.

The way they project the sphere onto a cube also leads to cell shapes that aren’t especially relevant to typical human purposes, whereas there are many shapes on a map which align with the latitude/longitude grid.


The purpose of the space filling curve is to preserve locality when you store the two dimensional data on a one dimensional storage medium. If you used x and y coordinates and stored the data like an image row by row, then data of cells adjacent in the x direction would get stored close together but data of cells adjacent in the y direction would get stored quite a bit away because of the other cells of the rows in between. On the other hand the Hilbert curve keeps cells adjacent in both directions closer together and therefore preserves locality better which in turn means less read head repositioning on spinning disks or a better hit probability when you prefetch adjacent data.


It also allows you to do the opposite, and turn a single-dimensional quantity into two-dimensional coordinates in a useful manner.

The most famous example I know is, of course, XKCD's Map of the Internet:

https://xkcd.com/195/


It isn't a matter of representation, but also how data structures could take advantage of such representations. B-trees for instance are very efficient with one-dimensional data. Take a look here: http://www.drdobbs.com/database/space-filling-curves-in-geos...


I think that if you used a pair of numbers you wouldn't be able to do the fast "checking of containment" described by the author, which is just a bitwise comparison apparently. I don't think they use a floating point representation for the 1D hilbert space coordinate.

The checking of containment, to verify if a cell is contained in another cell, all you just have to do is to do a prefix comparison. These operations are really fast and they are possible only due to the Hilbert Curve enumeration and the hierarchical decomposition method used.


If you had a pair of numbers you could do two bitwise comparisons instead. If you packed both 32-bit fixed point numbers into one 64-bit integer, you might even be able to handle both parts with the same few instructions. In the worst case it would be within a factor of two of the Hilbert-curve version in terms of instructions, and I highly doubt this operation is ever a bottleneck in practice. Would be interesting to see some benchmarks.


Correct me if I'm wrong but isn't locality unidirectional? Two close points in the plane might by chance be encoded far away on the line if they're unluckily next to a seam.


Theoretically you can try and find projection where most major seams are in un-navigable oceans and it won't hurt that much.

My take is that they should allow height to be coded in too.


I had the fortune to be introduced to this by a Google employee and while they were not from the maps team, they had soaked up enough of the general Google way of doing things that their explanation was able to "hit home" very quickly.

As a result one of the first things I asked them was "where are the corners?" and they confirmed my suspicions that they have "tilted" the cube slightly (the vertical edges are not parallel to meridians at the equator) and that a very rough approximate location for two of the corners is one is roughly in the great Australian bight on the South Australian coast line, and another one is roughly in the sea of Okhotsk between Japan and the far eastern half of Siberian region of Russia.

TL;DR. Yes, yes they do try to put the vertices in areas where their distortion will be least noticed.


Right. Article text says so too.


Would someone please explain to me how you one convert between the 1D and 2D coordinates of a Hilbert curve? Is there a formula for it? The drawings look nice but they don't tell you how to actually do the conversion, which seems to be the crucial piece of the data structure.


Here is Google's code for doing it:

https://code.google.com/p/s2-geometry-library/source/browse/...

The lookup table it references is generated up here:

https://code.google.com/p/s2-geometry-library/source/browse/...


Thanks!


If you don't mind worse order-preserving behavior, you can use z-order curves. Encoding and decoding coordinates can then be done by (de)interlacing the bits of (X,Y), which is quite simple: https://fgiesen.wordpress.com/2009/12/13/decoding-morton-cod...


I was looking for the same thing, this article below has an incredible explanation of the algorithm using Python:

http://blog.notdot.net/2009/11/Damn-Cool-Algorithms-Spatial-...


This is a great reference that I relied on while struggling with my own implementation:

http://blog.notdot.net/2009/11/Damn-Cool-Algorithms-Spatial-...


In contrast there is an incomprehensible bit of code by Ken Musgrave in Graphics Gems II: http://www.realtimerendering.com/resources/GraphicsGems/gems...

(peano not hilbert but very similar idea)


There is a sample of code in the Wikipedia page about it. https://en.m.wikipedia.org/wiki/Hilbert_curve

The links the authors provide to learn more about the HSFC probably discuss it too!


Thanks!



I suspect that there are more efficient ways to do this, but simple recursion should work. Divide the curves into quadrants[0]. We know that a given point should be in the same quadrant in both the 1D and the 2D curve. Due to the fractal nature of the curve, you can repeat this procedure on the given quadrant.

[0] For the 2D version, split divide the quadrants with the center x and y axis. For the 1D version, just break it into four equal length pieces.


This is called a quadtree. Here's a comparison: http://blog.notdot.net/2009/11/Damn-Cool-Algorithms-Spatial-...


Is it faster for spatial queries to store the S2 cells in a Btree index in a database like the article mentions or to just use something like PostGIS with its gist spatial indices -- specifically, for determining whether points are inside or outside of a polygon?


I think this is also a great example how Google abandoning the Google Code will affect the availability of previously published work.

What I want to tell is that this library is not (yet according to some googling) migrated to the new platform even when it is from people from Google.


There is also Healpix and other libraries


There's also a Go port of S2: http://godoc.org/github.com/golang/geo/s2


Yeah, it's really awesome. However not complete, but I've added a lot of functionality especially the coverer, there's a pull request for it if you want to take a look.



Thanks. I solved the problem, lots of visits and missing swap space lol.




Applications are open for YC Winter 2020

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact

Search: