I've read all the comments here and still don't understand the rationale for hexagons as opposed to squares.
In every sense, squares seem to be much easier to reason about and easier to hierarchically partition than hexagons are.
There are certain advantages to hexagons in certain contexts, like six degrees of movement instead of four in board games, but I don't see how any of those advantages translate here for geographical indexing.
I'd love to understand why hexagons as opposed to squares in this context are a superior solution rather than unnecessary complexity?
Squares have a contiguity problem: does a square in a grid have 4 neighbors (rook contiguity) or 8 (queen contiguity)? Hexagons do not have this problem. All neighbors of a hexagon share a full edge, not just a vertex. All neighbors of a hex also have their centers equidistant from that hex. A massive number of spatial problems turn on neighborhood definitions, and hexes are almost always a better representation of reality than squares.
But if you assume rook contiguity then it seems equivalent to hexagons. All neighbors share a full edge, all neighboring centers are equidistant.
I get that you're saying hexes are almost always a better representation. I still don't see a concrete example of why, for geographical indexing specifically.
[Edit: sibling reply explained that at the end of the day, it's not about indexing but rather route planning.]
Let's say you are looking for the closest gas station. One that is in the close corner of a "diagonal neighbour" would be closer than most points in the "edge neighbours". So if you want to find something nearby, you'd usually want to look at all 8 neighbours. The hexagonal neighbours look more like a circle centered in the original hexagon, thus more convenient for that purpose.
>The cell shape of that grid system is an important consideration. For simplicity, it should be a polygon that tiles regularly: the triangle, the square, or the hexagon. Of these, triangles and squares have neighbors with different distances. Triangles have three different distances, and squares have two different distances. For hexagons, all neighbors are equidistant
And
>This property allows for simpler analysis of movement. Hexagons have the property of expanding rings of neighbors approximating circles
>Hexagons are also optimally space-filling. On average, a polygon may be filled with hexagon tiles with a smaller margin of error than would be present with square tiles.
(closer to) equal distance between all the neighbouring cells is really helpful when you're modelling a problem as a network - eg analysing and planning routes and capacity.
Hierarchical partitioning with exact containment is useful when aggregating but that's not always the most critical property.
Ah OK thank you that is making more sense -- that since Uber is primarily about navigation and route planning, hexagons minimize the number of cells or total land coverage you need to retrieve/analyze between two points? I think that was the missing piece of information I was looking for!
Key reasons for the choice of hexagons in h3
- hexagons have have low distortion as compared to squares when you attempt tessellation of a spherical shape. You just require 12 pentagons to complete this tessellation.
- all six neighboring cells are equidistant from the center of a cell.
One of the use case Uber was trying to solve here, was to show which regions had surge pricing - on the uber app for the drivers. With h3 and hexagons you can argue it looks a teensy bit better with smoother gradients https://h3geo.org/docs/comparisons/s2.
I am not sure whether Uber uses h3 for other use cases like - finding the closest drivers next to a location.
I created an h3 polars implementation and I think one of the big one's is flow modeling. As other's have mentioned, each hexagon has 6 equidistant neighbors. As a result, it lends itself to analyzing telematics data (hence Uber's backing of h3) - I attached a colab notebook where I do some of this. Trying to do this sort of analysis any other way would be incredibly painful.
This will seem like a nit at first but it's really a key driver to why you'd look at other shapes: a hexagon is more comparable to a quadrangle, a square is more comparable to what is called "a regular hexagon". Regular meaning the sides and angles which make up the shape are all equal. In the 2D world, such as on board games, equilateral triangles, squares, and regular hexagons can all tile a plane perfectly. This is not the case for the surface of a sphere, there is no tiling regular polygon in that case.
From there you're just trying to optimize uniformity in distance to neighbors, how big the adjustments to the irregular polygons need to be to get them to tile on the surface are, how easy the polygon is to split up into smaller similarly shaped variants of itself as sub tiles, and trying to be somewhat close to a circle in shape as that means the average distance to the center of the area defined by the index is as close to as it can be.
If you chunk through those you'll find quadrangles aren't attractively simple anymore and hexagons tend to optimize the parameters very well. H3 actually uses both hexagons and the occasional pentagon (12 total, no matter the zoom level). It all comes down to "tiling isn't going to be perfect - what is the most optimal answer for the purpose of the tiling".
For the purposes of visualization, you want each cell to enclose approximately equal surface area. These are your "pixels". H3 is a way of rendering any subset of a sphere for display.
While squares have superior properties for analytical geospatial data processing that H3 doesn't have, such as congruency, they really only work for Euclidean spaces and the surface of a 2-sphere is non-Euclidean. Any system using squares will be a poor approximation of "equal area" relative to hexagons, which makes them poor for visualization. To use squares for indexing, you need an extra step that allows non-Euclidean space to be addressable from Euclidean space. There are two main ways of doing this.
First, one can project the surface of the sphere onto the surface of a Euclidean cube. Second, one can use an embedding, indexing the 2-sphere in Euclidean 3-space. Both of these can be trivially projected to a hexagonal system like H3 for visualization purposes and commonly are.
If you primarily need visualization and your data is small, using H3 eliminates the step where you need to figure out how to map non-Euclidean data models to Euclidean data models. If you are doing large-scale geospatial processing, it becomes worth the effort for both scalability and performance reasons.
In practice, there is little to no advantage. Any spacing filling index is trying to fit into an ordered index like B-tree, so existing horizontal scaling solution can be directly applied. The problem is it's much harder and inefficient to implement computational geometry algorithms on spacing filling index than a real spatial index like R-tree. On the other hand, scaling can be easily solved by table partition on different area.
You can more smoothly interpolate a function for points on a surface using hexagons than with squares. For Uber, think about things like surge pricing. If you compute a surge % for each hexagon, you can interpolate it for each rider location.
IIRC it has to do with tiling the globe. You can't correctly tild a sphere with swuares. On a local level, it seems easier to reason about squares on a flat surface, but the Earth is round and Uber is international.
Thanks! I'm still a bit confused because you can't tile the globe with hexagons either, if I understand correctly -- everything I can find online says you need a mix of hexagons and pentagons, and you can think of a soccer ball as an example.
Has Uber figured out a way to do it with just hexagons?
iirc it’s exactly 12 pentagons required to tile the sphere with hexagons, independent of hex-tiling size. So it’s almost fully consistent, and uber tries to put most of those pentagons in the ocean
Since cars don't need to cross water, maybe this isn't actually relevant. Uber aren't trying to model the planet, just the routes within the geographies they support.
I think what GP is saying is if you zoom to the max level 15 there remain only 12 cells which come out as pentagons and all 12 of those are in the ocean (albeit sometimes very near shore). If you zoom to the global scale those 12 get inherited into ever larger parent cells which do cover large areas of the Earth though, as you found.
In every sense, squares seem to be much easier to reason about and easier to hierarchically partition than hexagons are.
There are certain advantages to hexagons in certain contexts, like six degrees of movement instead of four in board games, but I don't see how any of those advantages translate here for geographical indexing.
I'd love to understand why hexagons as opposed to squares in this context are a superior solution rather than unnecessary complexity?