Hacker News new | past | comments | ask | show | jobs | submit login
SpatiaLite: A Spatial Extension to SQLite (gaia-gis.it)
190 points by chippy 7 days ago | hide | past | favorite | 21 comments

It had been quite a while since a major release, but then SpatiaLite 5.0 came out a couple of months ago with some very significant new features - the K-Nearest-Neighbor stuff is particularly interesting. https://www.gaia-gis.it/fossil/libspatialite/wiki?name=5.0.0...

I've built a few fun demos using SpatiaLite and Datasette. Here's an API that tells you the timezone for a latitude longitude point:

https://timezones-api.datasette.io/timezones/by_point?longit... - implementation here: https://github.com/simonw/timezones-api

And here's the same thing for which US county a point is within: https://us-counties.datasette.io/counties/county_for_latitud... - implementation here: https://github.com/simonw/us-counties-datasette

I've also built an experimental Datasette plugin that lets you draw a shape on a Leaflet map to generate a GeoJSON polygon, then uses SpatiaLite to show you geometries that are contained by that drawn polygon. I wrote about that here: https://simonwillison.net/2021/Jan/24/drawing-shapes-spatial...

I've been playing with it since yesterday and I have to admit it's great

What are the general solutions to handling spatial data and in particular, 'nearness' in huge databases?

Last time I needed something along those lines, I implemented geohash [1] in the application level ontop of a time-series database. I assume there is a better, more sound way to achieve that, as geohash doesn't handle boundaries well.


For the uninitiated, geo-hash uses a space filling curve multiple times at different precision to map quadrants into strings. So imagine a map, top left is North America, bottom right Australia, we can name the quadrants 0,1,2,3, we then take each quadrant and again use a z filling curve to split it to more quadrants (well, quadrant of a quadrant) and keep the name we gave it, we repreat the process up to desired precision, finally, we take the all the names of quadrants with the same order and concatenate them, to create a string that repretents the position. Strings with the same prefix are in the same quadrant and depending on the length of the prefix, the quadrant they are in is either bigger or smaller, longer prefix, smaller quadrant.

We found that geohash, even at the application level, provided much faster queries than searching for coordinates within range on two fields.


[1] https://en.wikipedia.org/wiki/Geohash

A kd-tree is the generalized data structure that can do log(n) lookups of nearest in any dimensions. I wrote an implementation for nearest to a geographical point years ago (since ported to many other languages): https://github.com/AReallyGoodName/OfflineReverseGeocode

A less generalized solution is to grid things up (quad tree and oct trees) as you implied above but the kd-tree is the generalized solution without edge cases (what if the majority of points of interest end up in a single grid square in your example?). A kd-tree is essentially what a sort list is for one dimensional data but instead supports multiple dimensions.

> what if the majority of points of interest end up in single grid square in your example?

Geohash provides arbitrary precision, at 8 character length, we can measure difference in in the order of decameters iirc.

The edge cases are on the literal edges, where you could have two points, one at say -79.995, 35.0 and -80.001, 35.0, so the area is split at -80 and the first point has different prefix than the second even though they are very very close.

I did some pretty effective geospatial clustering with quad trees. Had to do some smoke and mirrors to take the grid out of it, but worked great overall.

I recently ported the wonderful geohash computation approach of https://mmcloughlin.com/posts/geohash-assembly into an OCaml CGI conversion webservice. There are so many subtle gems in that article, a pleasure to read. The bit-fiddling reminded me of invsqrt.

Spatialite has some nice spatial indexing features built in that work very well. This is an older article series that talks about Spatialite spatial indexes, but most of the queries in it still work today: https://northredoubt.com/n/2012/01/18/spatialite-and-spatial...

For those looking in this space, two other options to consider are:

SQLite R*Tree Module - https://www.sqlite.org/rtree.html

The Geopoly Interface - https://www.sqlite.org/geopoly.html

I used the r*tree module for some gis stuff. It was super fast, but the workarounds to deal with curved earth and projections were not pretty

PostGIS is the equivalent for PostgreSQL:


Spatialite is the defacto standard to pair with Postgis. It is is a phenomenally useful piece of software and probably the first thing to reach for if you are doing GIS work.

What is a good framework for building GIS applications, preferably desktop ones presently, which supports interfacing with geodatbases and also provide functionality for spatial computations and UI components.All this without going the proprietary route.

Building QGIS extensions seems to be the only viable option I'm seeing.

Make sure you observe its license. Evaluated spatialite and absolutely loved it, but its license was incompatible with the target project. We would have needed to statically link it which would have in practice meant open sourcing the whole shebang. No bueno.

How does the api differ from say mongodb geo spatial api?

Mongodb is pretty limited with geospatial stuff.

It's great for basic things, but the geo index is considered a "special" index.

You can only use one "special" index at a time in a query, so (for example) you can't combine a full-text search with geo queries.

Sort of superseded in the GIS world by the geopackage (https://www.geopackage.org/): another spatial standard built with SQLite databases. Geopackages are less about manipulation/analysis at the sql level and more about storage—the idea is you’re already using something like QGIS that can perform analysis

They’re not really comparable at all. There are lots of use cases where a desktop GIS system is not appropriate or relevant.

If by "they" you mean spatialite and geopackage then they are very much comparable, in that they are both ways to store spatial data in sqlite databases.

From the geopackage FAQs[0]:

> What is the relationship between GeoPackage and SpatiaLite? > > SpatiaLite was a major influence on the vector portions of GeoPackage but after extensive > discussions, the working group chose to diverge from the SpatiaLite format in a few subtle ways. > SpatiaLite now supports GeoPackage as of version 4.2.0.

[0] https://www.geopackage.org/#faq

Sure, but Geopackage is a transport format: it describes itself as "a format for transferring geospatial information".

Spatialite is a spatial analysis database: it describes itself as "a complete and powerful Spatial DBMS (mostly OGC-SFS compliant".

They both use the same container, SQLite. And indeed there are other geospatial formats built on SQLite, such as mbtiles. But the intents are very different.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact