> We set the interpolation process running in QGIS on a Mac Pro and, a mere 11 days later, found ourselves the proud owners of a raster 250m2 grid layer
Probably done in minutes to hours with a properly set up PostGIS database and one query. "Data journalists" seem to love it when things take a long time, thinking they are doing something super innovative and novel. Often it's just that they were using the wrong hammer...
> 1. Don’t use GeoTIFFs for geometric visualization
What? I am not sure what exactly the image is supposed to show but it seems like some rescaling/filtering. Not the fault of GeoTIFF.
Why not render the raster tiles locally? gdal2tiles or TileMill can do that for you.
Why does Mapbox recommend LZW for compression? DEFLATE is usually much smaller, and with the horizontal differencing predictor it should be smaller still.
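To illustrate why the predictor helps, here is a minimal sketch using only Python's standard library. It is not GDAL and not a real GeoTIFF: the raster is made-up smooth data, and the helper names are mine. It only shows the idea behind horizontal differencing, which is to store each sample as the difference from its left neighbour before handing the bytes to DEFLATE.

```python
import math
import zlib

WIDTH, HEIGHT = 256, 256  # size of the synthetic raster (assumption)


def make_raster():
    """Build a smooth synthetic 8-bit raster, one bytes object per row."""
    rows = []
    for y in range(HEIGHT):
        rows.append(bytes(
            int(round(128 + 100 * math.sin(x / 37.0) + 20 * math.sin(y / 11.0)))
            for x in range(WIDTH)
        ))
    return rows


def horizontal_difference(row):
    """Keep the first sample, then store differences to the left neighbour (mod 256)."""
    out = bytearray(row[:1])
    for i in range(1, len(row)):
        out.append((row[i] - row[i - 1]) % 256)
    return bytes(out)


def undo_horizontal_difference(row):
    """Invert horizontal_difference by accumulating the differences again."""
    out = bytearray(row[:1])
    for i in range(1, len(row)):
        out.append((out[i - 1] + row[i]) % 256)
    return bytes(out)


rows = make_raster()
raw = b"".join(rows)
predicted = b"".join(horizontal_difference(r) for r in rows)

# zlib implements DEFLATE; compare the compressed sizes of both byte streams.
raw_size = len(zlib.compress(raw, 9))
predicted_size = len(zlib.compress(predicted, 9))
print("plain DEFLATE:", raw_size, "bytes")
print("with horizontal differencing:", predicted_size, "bytes")
```

On smooth data the differences are small and repetitive, which is exactly what DEFLATE likes. Noisy or categorical rasters may gain much less.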
> 2. Your GeoJSON can probably be way smaller
Does Mapbox not support TopoJSON? That helps a lot for local data with quantisation.
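For a rough feel of how much precision costs in plain GeoJSON, here is a small sketch that rounds coordinates with the standard library only. This is simple rounding, not TopoJSON's quantisation (which also delta-encodes and shares arcs between neighbours). The function names and the sample feature are made up for illustration, and it assumes every feature has a geometry with a `coordinates` member (so no `GeometryCollection` and no null geometries).

```python
import json


def round_coords(coords, ndigits):
    """Recursively round every number in a nested coordinate array."""
    if isinstance(coords, (int, float)):
        return round(coords, ndigits)
    return [round_coords(c, ndigits) for c in coords]


def shrink_geojson(collection, ndigits=5):
    """Return a copy of a FeatureCollection with rounded coordinates.

    Five decimal places of a degree is roughly one metre at the equator.
    """
    features = []
    for feature in collection["features"]:
        geometry = dict(feature["geometry"])
        geometry["coordinates"] = round_coords(geometry["coordinates"], ndigits)
        features.append({**feature, "geometry": geometry})
    return {**collection, "features": features}


# Made-up sample data with far more precision than any map needs.
sample = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "properties": {"name": "example"},
        "geometry": {
            "type": "Polygon",
            "coordinates": [[
                [-0.1276254321, 51.5073509876],
                [-0.1266254321, 51.5073509876],
                [-0.1266254321, 51.5083509876],
                [-0.1276254321, 51.5073509876],
            ]],
        },
    }],
}

small = shrink_geojson(sample)
before = len(json.dumps(sample, separators=(",", ":")))
after = len(json.dumps(small, separators=(",", ":")))
print(before, "->", after, "bytes")
```

Run over a file with millions of polygons this is a plain script rather than a field-calculator session, so it can be streamed or split per file if memory becomes a problem.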
> Because nothing in GIS is straightforward, attempting to round the values of 2.7m polygons using the field calculator in QGIS invariably resulted in the application crashing, even on a high-end Mac Pro. After several attempts using QGIS 2.16, 2.18 and 3.2,
Did they ask the community, or at least file a bug report? A crash like this should not happen and points to a fault in QGIS.
Unfortunately, mapping documentation is absolutely abysmal. 90% of what's available online is either eight years out of date, riddled with TODO:'s, or a mish-mash of incompatible versions.
As someone who doesn't do this for a living, but has to build 70,000 maps on a weekly basis, I know it's a nightmare. Especially when you're going in for the first time.
I think they did a pretty good job considering they have to be jacks of all trades.
Absolutely! Solving what you set out to do is a great result, and I am super happy for them and how it turned out!
But this state of documentation (as you call it; I would also, or rather, count all the required background knowledge and lingo as a hurdle) is all the more reason to just document their goal, their data, their abilities and capabilities, and ask an expert for 15 minutes of their time for input and pointers. That is so much more efficient!
Like in the article: if you have a problem that takes 11 days to run in QGIS, then you shouldn't be using QGIS, but a tool that is designed for processing large amounts of data. 1-2 million points might be a lot from a GIS tool's perspective, but it is absolutely nothing from a big data/HPC perspective, so be sure to check what those guys are doing.
As a general rule, always ask yourself: what am I actually trying to do, mathematically? Then ask how someone would go about solving that math problem if you didn't tell them it was a GIS problem.
Much of raster analysis, for example, is just a combination of matrix math and convolutions. So look up how the numerical analysis people do matrix math and convolution on huge matrices. In Python, for example, you have tools like numexpr for fast element-wise transformation of matrices, and via numpy/scipy you can call BLAS and LAPACK. If you're dealing with rasters that don't fit in memory, take a look at solutions like Dask.
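As a small illustration of "a raster is just a matrix", here is a sketch of a 3x3 mean filter (a simple convolution) written with plain numpy array slicing. The function name and the toy raster are made up, and edge pixels are handled by repeating the border values, which is only one of several reasonable choices.

```python
import numpy as np


def mean_filter_3x3(raster):
    """Smooth a 2-D raster with a 3x3 mean kernel using whole-array operations."""
    height, width = raster.shape
    # Pad by one pixel on every side, repeating the edge values.
    padded = np.pad(raster.astype(np.float64), 1, mode="edge")
    total = np.zeros((height, width), dtype=np.float64)
    # Sum the nine shifted views of the padded raster; each addition is
    # a vectorised operation over the whole grid, with no per-pixel Python loop.
    for dy in range(3):
        for dx in range(3):
            total += padded[dy:dy + height, dx:dx + width]
    return total / 9.0


# Toy raster: all zeros with a single spike in the middle.
raster = np.zeros((5, 5))
raster[2, 2] = 9.0
smoothed = mean_filter_3x3(raster)
print(smoothed)
```

The same pattern scales to rasters with millions of cells, and for larger kernels or out-of-core data the dedicated tools mentioned above are the better fit.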
The same principle applies when dealing with vector data. Many problems are 'just' graph theory, so find out which library graph theory people use to solve that sort of problem on large graphs and use that instead. Or if a problem reduces to a line/polygon intersection problem, well, that's just raycasting, and the games industry has spent a lot of effort on making that really, really fast.
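The simplest GIS flavour of ray casting is probably the even-odd point-in-polygon test: shoot a horizontal ray from the point and count how many polygon edges it crosses. A minimal pure-Python sketch, assuming a single ring without holes and ignoring points that lie exactly on an edge (the function name is mine):

```python
def point_in_polygon(x, y, ring):
    """Even-odd test: cast a ray from (x, y) towards +x and count edge crossings.

    `ring` is a list of (x, y) vertices; the closing edge back to the first
    vertex is implied. Points exactly on the boundary are not handled specially.
    """
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the ray's y value can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge meets the horizontal line through y.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


square = [(0, 0), (4, 0), (4, 4), (0, 4)]
l_shape = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]

print(point_in_polygon(2, 2, square))   # point in the middle of the square
print(point_in_polygon(3, 3, l_shape))  # point in the notch of the L
```

In practice you would batch this over arrays or use a spatial index first, but the core test really is this small.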
Finally, learn to use the underlying libraries from the command line and scripts rather than via the GUI, and learn how to divide that work across several processors/machines. GDAL + GNU parallel from the command line will transform 1000 rasters faster than QGIS could ever hope to do.
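If you would rather stay in a script than use GNU parallel, the same fan-out can be sketched with Python's standard library: build one `gdal_translate` command per raster and run several at once. The directory names, the `_deflate` suffix and the worker count below are placeholders of mine, and the sketch assumes `gdal_translate` is on the PATH.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def build_command(src, out_dir):
    """Build a gdal_translate call that rewrites one raster as a DEFLATE GeoTIFF."""
    dst = Path(out_dir) / (Path(src).stem + "_deflate.tif")
    return [
        "gdal_translate",
        "-of", "GTiff",
        "-co", "COMPRESS=DEFLATE",
        "-co", "PREDICTOR=2",  # horizontal differencing
        str(src),
        str(dst),
    ]


def convert_all(sources, out_dir, workers=4):
    """Run one gdal_translate process per raster, `workers` at a time.

    Threads are enough here because the heavy lifting happens in the
    child processes, not in Python.
    """
    commands = [build_command(src, out_dir) for src in sources]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda cmd: subprocess.run(cmd, check=True), commands))


if __name__ == "__main__":
    # Placeholder paths: adjust to wherever your rasters live.
    rasters = sorted(Path("input").glob("*.tif"))
    Path("output").mkdir(exist_ok=True)
    convert_all(rasters, "output")
```

Splitting across machines is the same idea with the list of rasters divided up front.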
My company recently hit the need for maps showing several of our own proprietary layers at various zoom levels, and considering how much of the data is already in GeoJSON, converting that to the Mapbox Vector Tile binary format (MVT) was a breeze to implement on our own servers, compared to rasterizing all of the layers and re-rendering whenever any part of our dataset changes.
As noted in the sections:
4. Maps aren’t truly responsive by default (but are annoying on mobile by default)
5. Performance, particularly on mobile, will need all the help it can get
As someone who worked for me used to say: "It's better than good. It's done."
Can you try to categorize what you hate about it? That is, if it is specific things rather than the general hurdle of figuring out what goes where and what means what.