Protomaps – A free and open source map of the world (protomaps.com)
1001 points by rcarmo on Oct 23, 2023 | 163 comments



I just used their pmtiles tool to grab a map of just the area around Half Moon Bay, south of San Francisco.

I grabbed the latest macOS Go binary from https://github.com/protomaps/go-pmtiles/releases

I found a rough bounding box using http://bboxfinder.com/#37.373977,-122.593346,37.570977,-122....

Then I ran this:

    pmtiles extract https://build.protomaps.com/20231023.pmtiles hmb.pmtiles \
      --bbox=-122.593346,37.373977,-122.400055,37.570977

The result was a 2MB hmb.pmtiles file which, when dragged onto the "drop zone" on https://maps.protomaps.com/, showed me a detailed zoomable map with full OpenStreetMap metadata (airport polygons etc.), all the way down to individual building outlines.
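
One detail worth spelling out: the bboxfinder URL lists each corner as lat,lon, while the `--bbox` value in the command is in min_lon,min_lat,max_lon,max_lat order. A minimal Python sketch of that reordering, using the numbers from the command above (the function name is made up for illustration):

```python
def bbox_arg(south: float, west: float, north: float, east: float) -> str:
    """Format a bounding box as min_lon,min_lat,max_lon,max_lat."""
    if south > north or west > east:
        raise ValueError("expected south <= north and west <= east")
    return f"--bbox={west},{south},{east},{north}"


# Corner values taken from the extract command in this comment.
print(bbox_arg(37.373977, -122.593346, 37.570977, -122.400055))
# prints --bbox=-122.593346,37.373977,-122.400055,37.570977
```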

All that in 2MB! I'm really impressed.

My file is here: https://gist.githubusercontent.com/simonw/3c1a5358ca8f782f13...



Beat me to it. I was working on this last night but had to sleep before writing it up. I was trying to use parcel as a bundler and was thrown because whatever dev server it uses doesn't seem to support HTTP range requests properly. I was getting "invalid magic number" on the client side.
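
For anyone hitting the same error: a PMTiles v3 archive begins with the ASCII bytes `PMTiles`, so one way to narrow it down is to look at what the dev server actually returned for the first range request. A rough Python sketch, where the function name and messages are illustrative and not part of any PMTiles library:

```python
PMTILES_MAGIC = b"PMTiles"  # first 7 bytes of a PMTiles v3 archive


def check_range_response(status: int, body: bytes) -> str:
    """Classify the response to a range request for the start of the archive."""
    if not body.startswith(PMTILES_MAGIC):
        # e.g. the dev server answered with an HTML fallback page instead
        return "invalid magic number: body is not the start of a PMTiles archive"
    if status != 206:
        # 200 means the Range header was ignored and the whole file was sent
        return f"magic ok, but the server ignored the Range header (status {status})"
    return "ok"


print(check_range_response(200, b"<!DOCTYPE html>"))
print(check_range_response(206, b"PMTiles\x03" + bytes(8)))
```

This only inspects a status code and bytes you already have, so it can be pasted into whatever fetch wrapper you are debugging.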


That's really cool. Could be really useful for local businesses. They often embed Google Maps, but I feel like having a map of the local area would be sufficient, and cheaper.


Wouldn’t they need to keep updating this from time to time to stay current? That would likely kill any savings pretty quickly.


It depends what you use it for. I plan to use it for blog posts where I show the track of where I walked. I actually like that the map won't update. It wouldn't make sense to show me walking across a map of the future, would it? It's a record of the past and the map should be frozen just like the writing.


Sure, but in GP's context of local businesses, they certainly will want to keep up-to-date with any changes to make sure customers can still find them.


Yes, absolutely. I think, in that case, a managed tileserver (using PMTiles or otherwise) is probably the best solution.


Protomaps generates daily world files, I believe [1], but I haven't confirmed that they're used by the download tool yet. In theory you can download the data every day to pick up updates.

[1] https://maps.protomaps.com/builds/

Edit: re-reading top comment it looks like they're using a daily file so you'd just have to create a cron job


I did the same and got a map of Gran Canaria which is 15MB! And it's not like a user is ever going to download the entire file. It's really quite cool and I'm quite excited to put maps on my static site.


I realise this isn't tech support but I have no idea why it doesn't work for me. The pmtile file you linked works, but creating my own smaller one doesn't.

But the concept is great to get a map for a project


I've been an avid Mapbox user since getting alpha access to Tilemill. From my perspective, this is the most important contribution to the open mapping space since the introduction of vector tiles and Mapbox GL / Map GL. Mapbox in my opinion left the door open in how they approached tile baking with MTS. While powerful, it was way too confusing and expensive. While you could always bake tiles with Tippecanoe, you still had to struggle with vector tile hosting. With PMTiles, vector tile hosting is arguably easier, more convenient and an order of magnitude cheaper than going with Mapbox (especially if you don't need fancy projection / 3D support and are OK with MapLibre). Felt is one of the major services that I believe uses the PMTiles approach behind the scenes, and they won't be the last. (Notably, they have continued to invest in Tippecanoe, which is awesome.) With Mapbox's focus shifting, it appears, to car navigation, it's great to see disruptive innovation coming from new places. Even more amazing that it has come from a one-person team.

Also to note: the emergence of PMTiles, coinciding with DuckDB spatial query support, will unlock a lot of innovation. The ability to query and process large GeoParquet files and quickly stream them into baked PMTiles unlocks a lot of really compelling use cases.


At the bottom of this post is an example of creating PMTiles from GeoParquet via tippecanoe + gpq. Thanks to Tim Schaub for making this possible!

    gpq convert Cairo_Governorate.parquet --to=geojson | tippecanoe -o Cairo.pmtiles

* https://cloudnativegeo.org/blog/2023/10/where-is-cog-for-vec...

* https://github.com/planetlabs/gpq


It's kinda sad how we need open source alternatives to MapBox, the formerly open source company.


Mapbox is still the best choice where a polished suite of mapping APIs is a better fit for a project.

Mapbox is a venture-backed company with a SaaS business model, and has never been open source in total - it used to be open core with a FOSS frontend and proprietary backend. This SaaS model is absolutely the best way to fund huge companies and give investors a return. Mapbox has also done the bulk of innovation in open source web mapping over the past 10 years - the Protomaps project uses MapLibre (fork of Mapbox GL 1.0) and the MVT tile format. Both required teams of full-time developers - easily tens of millions in salaries and stock - and they have given away version 1.0 for free as a gift, even though 2.0 is not open source.

The ideal software economy is one in which innovators capture a good portion of the wealth they create. This is why it's important for Protomaps to focus on use cases underserved by SaaS, instead of just being a cheaper map API. The sibling comment on wildfire mapping https://news.ycombinator.com/item?id=37989059 is a good example of the applications I want the project to support.


> The ideal software economy is one in which innovators capture a good portion of the wealth they create.

Beg to differ: the ideal software economy maximally empowers end-users at the absolute minimal cost. Innovators can and should leave substantial cash on the table. They should see themselves as stewards of a public good.


> They should see themselves as stewards of a public good.

I think that goes a bit far. They don’t absolutely _have_ to. It’s just nice if they do.


I agree in principle but being a steward doesn't put food on the table


Ideal software economy would be all free (as in freedom) software. The current economy is hugely wasteful with lots of redundant work and generally bad outcomes.

A private ownership economy doesn't work for zero-marginal-cost products even in theory. It's a huge waste of our resources and a big hindrance to progress.


I lurk in an online community of journalist coders and some folks there are excited about PMTiles as a cost-saving way to host your own custom-styled map tiles.

See, for instance, "How The Post is replacing Mapbox with open source solutions" https://www.kschaul.com/post/2023/02/16/how-the-post-is-repl...


How do the publications know this data is accurate?


Most of these piggyback on the OSM for the data tier, which $T companies work on. Afaict the discussion is more about compute on top - tile generation, interactive renderers, ... .


$T companies aren't a source of truth. Much of journalism is reporting on their errors, including the intentional, reckless, and lazy kinds.


What would you suggest the publications use instead? Maps from billion dollar companies!


I asked the question because it's an interesting question. In our world flooded with misinformation and cheap data, there's often little accurate, high-quality data.


I'm not sure what answer you are looking for / what alternative you are suggesting?

OSM is the biggest community effort (NGO, volunteer, corporate, etc.) to solve data quality in GIS. The participants do everything to improve quality, from individuals walking around with GPS devices to companies launching low earth orbit satellites into space and running self-driving cars on the ground with AI error detection.

There is a more corporate and, afaict, anti-Google effort more recently by TomTom and Google competitors (Microsoft, Meta) called Overture, which seems to be attempting a more closed, big-corporate-governed fork and ecosystem replacement of OSM, even if they phrase it as complementary. Your questioning of OSM-as-misinformation seems interesting in the comparative context of alternatives like Google Maps, Apple Maps, and Overture.


> I'm not sure what answer you are looking for / what alternative you are suggesting?

None of the above. People might just be curious and interested!

Thanks for sharing all that. I didn't know much about the quality. Much crowd-sourced data has accuracy problems.


It is an interesting question. I've worked with a fair bit of geo-data, and found that pretty much any sufficiently large geo-dataset, regardless of whether it's crowd-sourced or purchased from a commercial provider, tends to have a fair number of data accuracy issues.

If you spend any time on either OpenStreetMap or Google Maps in the rural US, you are very likely to find both missing streets and streets that have been completely hallucinated out of nothing -- mostly because they were both originally sourced from the US Census TIGER data -- a US govt data set that tends to be full of errors.

OpenStreetMap contributors often improve these issues over time (though it largely depends on whether the area you care about has some sufficiently dedicated mapper).


Google maps also isn't always accurate. There's no source of maps that's guaranteed to be 100% accurate all the time.


> 100% ... all the time

Because nothing meets that standard, it's not an applicable measure of quality. Also, it tends to lead to binary thinking: 100% or null.

But we live in a world of less than 100%, and there is a lot of variation: Five nines, 80%, datasets with just a few valuable records, etc. That's where the real-life questions are.


Sounds like a neat community. Can you share where it is?


Sure thing, check out NICAR-L: https://www.ire.org/resources/listservs/


If anyone else was curious, their sample OSM derived PMTile file (protomaps-basemap-opensource-20230408.pmtiles) is 103GB.

https://docs.protomaps.com/basemaps/downloads


It's 107 GB now.

It would be great to be able to download only diffs and apply them to a daily, weekly, monthly baseline.


I wonder if the compressed Hilbert ordering inside the tiles makes them hard to diff efficiently.


The ordering applies across tiles, and not within - it should be efficient to diff two archives because there is a single, well-defined ordering over the entire archive, and writing algorithms over the tileset is all 1D operations instead of 2D (a single zoom level plane) or 3D (multiple planes)
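
To make the "all 1D operations" point concrete, here is a hedged Python sketch. As I understand the PMTiles v3 spec, a tile ID counts every tile on the lower zoom levels first and then follows a Hilbert curve within the zoom level; the code below is the textbook Hilbert xy-to-index routine, not copied from the PMTiles source. The `(tile_id, digest)` lists and `diff_sorted` are hypothetical, just to show that diffing becomes a merge-walk over two sorted lists:

```python
def zxy_to_tile_id(z: int, x: int, y: int) -> int:
    """Map a Z/X/Y tile to one integer: lower zooms first, then Hilbert order."""
    n = 1 << z
    if not (0 <= x < n and 0 <= y < n):
        raise ValueError("x/y out of range for this zoom")
    tile_id = sum(4 ** i for i in range(z))  # tiles on all zooms below z
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        tile_id += s * s * ((3 * rx) ^ ry)
        if ry == 0:  # rotate/flip the quadrant so the curve stays continuous
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s >>= 1
    return tile_id


def diff_sorted(old, new):
    """Merge-walk two lists of (tile_id, digest), each sorted by tile_id."""
    added, removed, changed = [], [], []
    i = j = 0
    while i < len(old) and j < len(new):
        (old_id, old_digest), (new_id, new_digest) = old[i], new[j]
        if old_id == new_id:
            if old_digest != new_digest:
                changed.append(old_id)
            i, j = i + 1, j + 1
        elif old_id < new_id:
            removed.append(old_id)
            i += 1
        else:
            added.append(new_id)
            j += 1
    removed.extend(tile_id for tile_id, _ in old[i:])
    added.extend(tile_id for tile_id, _ in new[j:])
    return added, removed, changed


print(zxy_to_tile_id(1, 1, 0))
print(diff_sorted([(1, "a"), (2, "b"), (4, "d")], [(1, "a"), (2, "x"), (3, "c")]))
```

The diff is a single pass over both lists, with no 2D or per-zoom bookkeeping.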


Hmm I was just going off the Firefox estimate, maybe there's some browser difference?


The newest major version has moved to a daily build channel (the last major version was last updated in April)

Today 10/23 is 107GB.

https://maps.protomaps.com/builds/

I'll update the documentation page today.



Exactly the kind of thing that could be downloaded using BitTorrent!


I assume that would require BitTorrent v2 (and I think more specifically something along BEP-46 - Updating Torrents Via DHT Mutable Items) or your map is gonna be out-of-sync and sad after a while.


are there torrents in the wild that use this? and are there bittorrent clients that implement this? how do they surface the mutable aspect to the user?

(is it like, your torrent client will silently update files, and if you don't want to update anymore you need to pause? can you still seed if you don't want to update?)


I've been testing a setup that automatically generates OpenStreetMap/OpenMapTiles formatted pmtiles and mbtiles, then makes them available by torrent. The torrents get placed in an RSS feed at https://planetgen.wifidb.net/

Basically what I did was set up the qbittorrent client to watch the OpenStreetMap weekly torrent RSS feed. When it downloads a new pbf file, it runs this script I made at https://github.com/acalcutt/PlanetilerTorrent , which was based on the OSM torrent creation process.

The script creates pmtiles and mbtiles using planetiler, then makes torrents and starts seeding them in qbittorrent.

In qbittorrent I have options like how long I want to seed for (30 days), what to do when done seeding (delete the files), and also speed controls so I can limit bandwidth during my working hours.

I have this running on an old laptop with 2TB nvme/64GB memory. It seems to work pretty well so far. It would be nice if my internet speeds were a little better for initial seeding, but at least the torrents share the load with other people who are downloading.


That's interesting. But it doesn't use BEP-46, right?

https://www.bittorrent.org/beps/bep_0046.html (discussion on HN at the time, https://news.ycombinator.com/item?id=12282601)


I don't think so, because it is not updating the same torrent; it is making new ones. I am transmitting the torrent over BitTorrent/DHT like they mention, not HTTP, but without it being the same file I'm not sure that is the same.


You can't modify files, because the torrent file itself is just a sequence of checksums. But there is an RSS extension that allows you to pipe in updates as new files.


But with BEP-46 you can, right? (BEP-46 apparently doesn't use RSS)

I was asking like, how widely is BEP-46 adopted in practice. It's a spec from 2016!

https://www.bittorrent.org/beps/bep_0046.html (discussion on HN at the time, https://news.ycombinator.com/item?id=12282601)


I use it in anacrolix/btlink. There's support in anacrolix/dht.


That's very cool!

Does this serve a decentralized "web app" through bittorrent?

Do I need your client to seed it, or can something like qbittorrent seed it?

Will I seed only the last version of the page or all versions? Can someone download an earlier version or list all past versions?


What are they using to compress tiles? Could they get even smaller with AVIF? Or AVIF+SVG overlaid for text?


Yes thank you I was wondering exactly this. Quite reasonable size one might say.


thanks for the info, weird that this isn't answered in the FAQ


> The Google Maps API has a generous free tier, but high fees past 30,000 map loads. Protomaps is designed from the ground up to be a system you run yourself - as a 100% static backend on cloud storage - so you only incur the usage costs of storing and serving that data, which can be pennies for moderate usage.

It would be interesting to see a cost comparison: recommended hosting setup vs Google Maps, and at what point the lines intersect.


I haven't found this info in the FAQ, but it would be interesting to know how large that PMTiles file is for a medium-sized country.

It seems to be a really attractive way of self-hosting maps that don't have to be 100% up to date and are only used in a specific region. "Recompiling" that file once a year and uploading it to a static file hoster would be an easy process with very few external dependencies.

Edit: Found the official download link[1] for the whole world, which is a bit over 100 GB. So I guess the answer to my question would be "hundreds of MB to a few GB", which seems totally fine.

[1] https://docs.protomaps.com/basemaps/downloads


Geofabrik already does this for vector tiles of each country, but as individual mvt files rather than in a container like pmtiles or mbtiles.

http://download.geofabrik.de/


That's true - I suppose the cost changes depending on the size of the area you're mapping. Maybe a representative set? A city (e.g. Johannesburg), a small country (e.g. UK), and a large country (e.g. USA)?


Monaco is 500k, Ireland +NI is 400M, and Europe is 40G, for their basemap output.

(Representative numbers from my runs)


Cloudflare: 100GB hosting is $1 and $.36 for 10 million requests.

If you read the Protomaps docs it explains how to cache the requests on Cloudflare CDN so you only have to pay for each tile request once per cache period. It's quite cheap.

Edit: previous submission title for Protomaps: Serverless maps at 1/700 the cost of Google Maps API
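
Plugging the numbers quoted in this comment into a quick Python sketch gives a feel for the monthly bill. The rates are the commenter's figures ($1 per 100GB stored, $0.36 per 10 million requests), not checked against a current price list, and the function name is made up:

```python
def monthly_cost_usd(archive_gb: float, requests: int,
                     storage_per_100gb: float = 1.00,
                     per_10m_requests: float = 0.36) -> float:
    """Storage plus request cost for one month, using the rates quoted above."""
    storage = archive_gb / 100 * storage_per_100gb
    traffic = requests / 10_000_000 * per_10m_requests
    return round(storage + traffic, 2)


# The 107 GB daily world build mentioned elsewhere in the thread, 10M requests.
print(monthly_cost_usd(107, 10_000_000))
```

CDN caching would lower the request count that reaches storage, so this is closer to an upper bound for a given amount of traffic.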


Okay but the question was what it actually costs, not what our favorite mitm provider offers as a commercial service (they have the worldwide hardware and sufficiently deep pockets to offer this at a rate that guarantees gaining market share, below the cost price of normal hosting providers)


Are you asking how much it would cost to run this on self-hosted bare metal servers? Otherwise I'm also not sure what you're asking for.


I mean, one of the examples uses cloudflare to host the thing, so that is the pricing parent is talking about. What else do you want?


Brandon, the founder of Protomaps, was interviewed on the Geomob podcast a few months back: https://thegeomob.com/podcast/episode-176



For me the most interesting part is “PMTiles”:

> PMTiles is a single-file archive format for pyramids of tiled data. A PMTiles archive can be hosted on a storage platform like S3, and enables low-cost, zero-maintenance map applications.

> PMTiles is a general format for tiled data addressed by Z/X/Y coordinates. This can be cartographic basemap vector tiles, remote sensing observations, JPEG images, or more.

> PMTiles readers use HTTP Range Requests to fetch only the relevant tile or metadata inside a PMTiles archive on-demand.

> The arrangement of tiles and directories is designed to minimize the amount of overhead requests when panning and zooming.

https://docs.protomaps.com/pmtiles/
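
The range-request part is simple enough to sketch in a few lines of Python. This assumes the reader has already looked up a tile's byte offset and length in the archive's directory; the helper names and the sample numbers are hypothetical, and `read_range` just simulates what a range-capable server would send back:

```python
def range_header(offset: int, length: int) -> dict:
    """Build the HTTP Range header for `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length > 0")
    # Range is inclusive on both ends, hence the -1.
    return {"Range": f"bytes={offset}-{offset + length - 1}"}


def read_range(blob: bytes, offset: int, length: int) -> bytes:
    """Simulate the body of the 206 response for that header."""
    return blob[offset:offset + length]


print(range_header(1000, 500))
print(read_range(b"0123456789", 2, 3))
```

Only the requested slice crosses the network, which is why the archive can sit on plain static storage.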


PMTiles is amazing. We discovered it when we were building a Canadian wildfire hotspot map this summer.

We were taking daily CSV files, turning them into vector tiles, syncing them with S3, and displaying in mapbox-gl map. The biggest cost was S3 file operations. We had to aggressively cluster to reduce our cost down and got to a point where the tiles numbered ~75,000. At the frequency we wanted to update this was costing $500/mo in S3 ops.

Now we process the files into a single PMTiles file, deploy that one file and cut the costs down to pennies.

For a map that we make public, it was incredibly enabling. You can check out the Canadian hotspot data for all of 2023 if you're curious: https://lens.pathandfocus.com


PMTiles is the vector counterpart to Cloud Optimized GeoTIFF (COG), allowing clients to use mapping data efficiently when the server supports HTTP range requests.

Previous iterations of this on the vector side have been either a ton of small files (pbf) or a large file that needed a front end to serve the individual tiles (mbtiles).

At this point, you can take an OSM dump and convert it to a country level basemap in minutes on a stout machine, or hours for a continent, which you can then self host on a plain old web server with a custom style sheet.


Generating country level basemaps from scratch is a great approach!

I've made an even easier path which is to just extract the relevant tiles from a daily build: https://docs.protomaps.com/guide/getting-started

This can take seconds to minutes for small-medium areas and doesn't need a powerful computer at all. The drawbacks are that the build is only daily, and the lower-zoom tiles aren't clipped so include extra information beyond your specified area.


I've been using TileServer GL for a while and it looks like there is support for pmtiles coming soon.

https://github.com/maptiler/tileserver-gl/pull/1009


pmtiles are only the counterpart to the "overviews" part of COG. pmtiles doesn't give you analysis-ready data. For that, look to FlatGeobuf or GeoParquet (once it gets a spatial index in v1.1 of the spec)


Yeah, I'm aware of that. I love what you're doing with GeoParquet; it's certainly quite nice for analysis and some viewing. But there are still going to be a lot of datasets and clients that aren't going to be able to handle that, where the tiled approach will work.

I'm looking at this currently for a couple of clients, and I think that for what we're doing -- we're going to wind up serving both a view and analysis version separately.


So, could PMTiles be used as a cheap/dirty/fast way to host other forms of geospatial data that aren't strictly "map tiles"? Such as a large-ish, infrequently updated POI database?


If the geometry is points/lines/polygon/multi* + properties on the features, yes. (In other words, if it's representable as geojson), with the caveat that it's more for viewing and less for analysis.

It's essentially what the internal layers are in the base layer -- there are land use, buildings, transit, physical, and border features that are combined with a stylesheet to make the base layer map tiles.

There are some subtleties for packing the overview data for the wider zooms, and what happens when you have tiles that are too big because of number of features or metadata size -- where you can drop on density or coalesce on density.
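
For the POI case, "representable as geojson" can be as simple as the following Python sketch. The function name and the sample point are made up for illustration; the output is a plain GeoJSON FeatureCollection that a tiling tool could then take as input:

```python
import json


def pois_to_geojson(pois):
    """Turn (name, lat, lon) tuples into a GeoJSON FeatureCollection dict."""
    features = [
        {
            "type": "Feature",
            # GeoJSON coordinates are [longitude, latitude], in that order.
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"name": name},
        }
        for name, lat, lon in pois
    ]
    return {"type": "FeatureCollection", "features": features}


# Hypothetical sample point, not real POI data.
collection = pois_to_geojson([("Example cafe", 37.47, -122.45)])
print(json.dumps(collection, indent=2))
```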


These being vector files, does it mean they can be styled on the fly?


(the answer is yes)


Maplibre[1] + PMTiles + Felt's "tippecanoe"[2] (it can output .pmtiles) are an awesome combination for self-hosted web maps if you're ok with being locked into a Web Mercator projection

for pretty much any geospatial source you can convert to .pmtiles via GDAL[3] and tippecanoe (.shp .gpkg ...) | ogr2ogr -> .geojson | tippecanoe -> .pmtiles

for OpenStreetMap data there's planetiler[4], and openmaptiles[5] styles that work with Maplibre

with those combinations you've got a great start to something you can host for pennies on AWS S3+CloudFront or Cloudflare R2, with an open source data pipeline

[1] https://maplibre.org/

[2] https://github.com/felt/tippecanoe

[3] https://gdal.org/

[4] https://github.com/onthegomap/planetiler

[5] https://openmaptiles.org/styles/

P.S. I find the GDAL/ogr2ogr documentation pretty hard to parse, so here's an example to get you started:

  ogr2ogr -f GeoJSON counties.json -t_srs EPSG:4326 -nln counties -sql "SELECT STATEFP, COUNTYFP, NAME FROM tl_2022_us_county" /vsizip/tl_2022_us_county.zip

  https://www.census.gov/geographies/mapping-files/time-series/geo/cartographic-boundary.html
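
The two-step pipeline described in this comment (ogr2ogr to GeoJSON, then tippecanoe to .pmtiles) could be scripted roughly like this in Python. It is a sketch under assumptions: the function names and file names are hypothetical, and it needs GDAL and a tippecanoe build with PMTiles output on your PATH to actually run:

```python
import subprocess


def build_pipeline(src: str, layer: str, out_pmtiles: str):
    """Return the two argv lists: source -> GeoJSON, then GeoJSON -> PMTiles."""
    geojson = f"{layer}.geojson"
    # ogr2ogr takes the destination before the source.
    to_geojson = ["ogr2ogr", "-f", "GeoJSON", "-t_srs", "EPSG:4326", geojson, src]
    # -l sets the layer name inside the tiles, -o the output archive.
    to_pmtiles = ["tippecanoe", "-o", out_pmtiles, "-l", layer, geojson]
    return to_geojson, to_pmtiles


def run_pipeline(src: str, layer: str, out_pmtiles: str) -> None:
    """Run both steps in order, stopping at the first failure."""
    for argv in build_pipeline(src, layer, out_pmtiles):
        subprocess.run(argv, check=True)


for argv in build_pipeline("counties.shp", "counties", "counties.pmtiles"):
    print(" ".join(argv))
```

Building the argument lists separately from running them makes it easy to print and inspect the commands first.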


I'm a student trying to upskill in geospatial/compsci. What kinds of projects could you make with all of this? Any good starting points you'd recommend?


Not OP, but almost anything where "a map" is the output will be a huge learning opportunity for you. Static maps, print-quality maps, interactive web maps -- the experience of building any of those will bring lots of learning. (That's how I got started.)


ty!


you can check out https://github.com/maplibre/awesome-maplibre#users for some examples of what you can do with Maplibre


ty!


I’ve contributed to this project, and we use it in production at work! It’s great. Brandon is very nice, and I really appreciate all the work that has been put into this.


We've had quite some success with deploying "serverless" maps, leveraging PMTiles. Have a look at https://github.com/serverlessmaps/serverlessmaps if you're interested in deploying this on AWS CloudFront/Lambda@Edge/S3...


I also wrote a blog post as introduction: https://tobilg.com/serverless-maps-for-fun-and-profit


I wonder if Github Pages supports HTTP range. Obviously putting a multi-GB blob on there would probably be abuse, but a small blob limited to a small region with limited detail would be really nice for having a map to accompany a blog post or something.


I used it for this hospital accessibility map of Maryland. Sorry the legend looks terrible on mobile though

https://wcedmisten.github.io/nextjs-protomap-demo/isochrone


It does, but there is a limit of something like 500 MB for single files.


It would be cool to have an automatic tool to generate a blob of any size and it would include as much detail as possible within your given blob size limit...


You can kind of do that already with the pmtiles CLI:

pmtiles extract INPUT.pmtiles OUTPUT.pmtiles --bbox=BBOX --maxzoom=Z --dry-run

--dry-run will output the total size of the final archive without downloading any tiles, so you can adjust the maxzoom from the default of 15 (for basemaps) until it fits under your limit
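
The "adjust the maxzoom until it fits" loop could be automated along these lines. This Python sketch deliberately leaves out how the size is obtained: `size_at` is a hypothetical callable that would wrap a `--dry-run` invocation and read the reported size, whose exact output format I am not assuming here. The stand-in size function in the example is made up:

```python
from typing import Callable, Optional


def pick_maxzoom(size_at: Callable[[int], int], limit_bytes: int,
                 start: int = 15, floor: int = 0) -> Optional[int]:
    """Return the highest maxzoom in [floor, start] whose archive fits, else None."""
    for zoom in range(start, floor - 1, -1):
        if size_at(zoom) <= limit_bytes:
            return zoom
    return None


def fake_size(zoom: int) -> int:
    """Made-up stand-in: pretend the archive roughly quadruples per zoom level."""
    return 1000 * 4 ** zoom


print(pick_maxzoom(fake_size, 500_000_000))
```

Starting from 15 matches the default basemap maxzoom mentioned above, and walking downward stops at the first zoom that fits.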


I am curious about maps and the software to generate them and play with them, but when I tried - a few years ago, admittedly - to dabble in it (I wanted to generate those art-sy nice city maps pictures for my hometown [1]), I was overwhelmed with the ecosystem, all the steps involved, all the options etc.

Does anyone have a good introduction to paint a high level picture of this, so someone like me can navigate?

[1] https://www.etsy.com/nz/listing/1541532385/dusseldorf-print-...


Worth mentioning this project (https://github.com/onthegomap/planetiler) that lets you create OSM mbtiles and pmtiles pretty easily!


The protomaps basemap is also built on planetiler: https://github.com/protomaps/basemaps - and Brandon is one of the main contributors to planetiler!


planetiler is such a sophisticated piece of software. Hats off to Michael!

Therefore I'm a bit proud that a tiny part of the fast and memory-efficient import pipeline is using code from GraphHopper :) (see Acknowledgement)


We actively use this at Street Art Cities (https://streetartcities.com). Really helped keep hosting costs down for a project that is more community than commercial, and created a more cohesive experience.

Little blog post here: https://schof.co/moving-over-street-art-cities-to-pmtiles-an...


Maybe I am overlooking something, but where does Protomaps get its data?

Also, is the data open source? The FAQ says the code, etc. is open source, but doesn't mention data:

> Is Protomaps open source? / Yes! All core software libraries and formats are open source under permissive licenses, including: ...


It's OpenStreetMap (ODbL) and Natural Earth (public domain) currently

* http://openstreetmap.org

* http://naturalearthdata.com


What does naturalearth have that OSM doesn't? I checked the site and it's showing what looks like a subset of OSM


Natural Earth is generalized (low zoom) data and has hand-curated scaleranks on features to determine what features appear at low zooms. None of that is in OSM.


Ah, thanks!


How is that different from OpenStreetMap? Am I missing something here?


This looks like another way to deliver OpenStreetMap data, which I think is exactly in the spirit of the OSM project.


Not sure if op is affiliated, but I recommend increasing the font size of the street names as you zoom in. Look forward to using this in the future


In theory you could do that yourself with a vector basemap. Wouldn't it be great if browser minimum font size applied to absolutely everything (ie. minimum font size actually means "I cannot read anything below this size so never render it like that").


That is a constant problem on Google's and Apple's phone maps. The names are almost unreadable to many people I've encountered with vision in the normal range of issues (e.g., wear glasses, etc.).


Apple Maps does respect the "text size" system setting. All the text including street names will go very big if you want them to!


What's the relationship between this and OpenStreetMap? Is OSM the "runtime viewer" and Protomaps is the maps and query data?


OpenStreetMap is a database that you can use to produce maps from. OpenStreetMap the organization does not produce maps as a service directly. They do provide a map editor, which of course also renders the tiles, and they host the central database that people contribute to with that editor. But their tiles are not otherwise distributed via APIs or dumps. The only thing that they provide is a database dump in various formats.

Mostly people either generate their own tiles from this (using a variety of OSS and proprietary tools available for this) or use companies such as mapbox, maptiler, etc. that do this for them. Typically the commercial options bill using a metered per request model. This can make using maps quite expensive. Ever since Google raised their prices substantially a few years ago, things like Mapbox have become a lot more popular. Mapbox and similar products are also not cheap.

What pmtiles does is dramatically lower the price and complexity of self-hosting your own maps to basically the raw CDN network cost plus a small amount of overhead for storage and computation. The tiles are generated using e.g. lambda functions from a single pmtiles file on demand and then cached in the CDN. Any subsequent loads of the same tile are cache hits. So, especially for busy websites and apps, the savings can be substantial.


I really wish there was a PMTile implementation for MapLibreGL Native like there is for the JS side (I believe the JS side actually has a way to add arbitrary data protocols). Of course I could stand up some sort of worker or function to sit in the middle, but having this all work client side on Android and iOS would be a game changer for me!


Can’t really include 100gb of maps with your app though. How would you use it?


A PMTile is basically a single giant file that can sit on a CDN somewhere, and the client can ask for a certain part of that file using the byte range header. It eliminates having to either have a server that is serving tiles, or paying through the nose for IO if serving static files (uploading literally millions of tiles costs a lot, and not just because of their size).

If you want to embed a map file in your app, you can already use MBTiles for that. An MBTiles file is conceptually like a PMTiles file, in a way, but it's a SQLite database containing the tiles. Mapbox/MapLibre already supports local MBTiles. You can serve them online, but of course you need something taking in the x, y, z coordinates and serving back the tiles.

Most of (if not the entire) value proposition of PMTiles is that you can dump a single multi-GB file on a CDN somewhere, and clients can just request a little slice of it. No servers or any logic required (besides a CDN that supports byte-range headers).

I am hoping that MapLibre Native will one day support sources that are PMTiles: `source: pmtile://my.cdn.com/bigOleFile.pmtile`, and then it just works as if pointing at a traditional slippy tile source.
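
For contrast, the "something taking in the x, y, z coordinates" that MBTiles needs is roughly the following Python sketch. MBTiles keeps tiles in a SQLite `tiles` table and stores rows in the TMS scheme, so the y coordinate has to be flipped; the function name and the in-memory database are just for illustration:

```python
import sqlite3


def get_tile(db: sqlite3.Connection, z: int, x: int, y: int):
    """Look up an XYZ tile in an MBTiles database; returns bytes or None."""
    tms_row = (1 << z) - 1 - y  # MBTiles rows use the TMS scheme (y flipped)
    row = db.execute(
        "SELECT tile_data FROM tiles "
        "WHERE zoom_level = ? AND tile_column = ? AND tile_row = ?",
        (z, x, tms_row),
    ).fetchone()
    return row[0] if row else None


# Tiny in-memory stand-in for a real .mbtiles file.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE tiles (zoom_level INTEGER, tile_column INTEGER, "
           "tile_row INTEGER, tile_data BLOB)")
db.execute("INSERT INTO tiles VALUES (1, 0, 1, ?)", (b"fake-tile",))
print(get_tile(db, 1, 0, 0))  # prints b'fake-tile'
```

With PMTiles this lookup moves into the client, which is what removes the need for a server in the middle.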


You download it within the app. Or drop it into iCloud and tell the app where to find it. I have 200 gigs of offlined map data on my phone with a custom native map app.

Not a big deal with terabyte storage on mobile these days.


These PMTiles don't seem to work well with users behind corporate proxies. The proxy seems to have to download the entire file before releasing it, which doesn't make for a nice user experience. I wonder if there is a way around this?


Switch to POST request?

If not, then it can probably be done by an additional proxy that converts range headers to GET params. But TBH a corporate proxy that behaves this way doesn't seem to be working properly. For example, all other header-based authentication would be broken if the proxy doesn't respect additional headers.


I'm currently self-hosting a planet OpenMapTiles mbtiles file with several styles on top of it. I've been meaning to look into Protomaps closer to see if I could use it to reduce costs. It looks like a very promising project!


I'm confused, isn't it just OpenStreetMap? I must be missing something.


OpenStreetMap is the data they use. Other software usually fetches tiles (images) to display, but here they built a single and much smaller file from OSM data which is rendered directly by clients.


This is super impressive. Way to go Brandon! Excellent work on PMTiles!


Are there any free (speech or beer) sources of aerial or satellite data that can be used with these free maps?


Wow, that is a huge achievement for an open source project given how much data there is.


How did you deal with the China part of the map?

AFAIK it is illegal to distribute accurate geographical data about China without their authorization.

You can distribute data that is slightly inaccurate, such as what Google Maps or Apple Maps provides to non-Chinese users.


Illegal in what jurisdiction? I suppose only in China?


For a while I've wanted an SVG map with a JS API where you can input latitude and longitude and show the point on it. I hope this will do that!

The reason I want that is, almost all humans have a GPS device in their pocket. You could easily get your approximate location from this, even without cell service. Implementation could be a PWA.
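
The lat/lon to SVG part is only a few lines if the map uses a plain equirectangular projection (an assumption; any other projection needs different math). A minimal sketch in Python, with made-up function names and a default 360x180 viewBox; the arithmetic ports directly to JS:

```python
def latlon_to_svg(lat: float, lon: float,
                  width: float = 360.0, height: float = 180.0) -> tuple[float, float]:
    """Map latitude/longitude in degrees to x,y in an equirectangular SVG."""
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("lat must be in [-90, 90] and lon in [-180, 180]")
    x = (lon + 180.0) / 360.0 * width   # -180 is the left edge
    y = (90.0 - lat) / 180.0 * height   # SVG y grows downward, so +90 is the top
    return x, y


def svg_marker(lat: float, lon: float, width: float = 360.0,
               height: float = 180.0, radius: float = 2.0) -> str:
    """Return a <circle> element that marks the given position."""
    x, y = latlon_to_svg(lat, lon, width, height)
    return f'<circle cx="{x:.2f}" cy="{y:.2f}" r="{radius}" fill="red"/>'
```

The marker string can be appended to the SVG whenever the device reports a new position.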


Your best bet for this is something like OsmAnd or Organic Maps, which download countries' worth of OSM data and make it available offline.

The amount of map data for any reasonable area is going to be too large for a PWA, though it would technically be possible to do with existing map viewers and this format (e.g. MapLibre or Leaflet).


Thanks. I'm OK with a single layer just showing the outlines of coastlines and regions, including the names of regions. I think a suitable simplified world map could fit all this in SVG.

edit: Initially underestimated so did some calculations below

Key Data Points for Estimate of SVG Size

  - Radius of Earth: 6371 km
  - Surface Area of Earth: 510e6 km^2
  - Approximate Number of Countries: 250
  - Land Area of Earth: 29% of Surface Area = 150e6 km^2

Corrections and Estimations Based on Average Country Size

  - Average Land Area per Country: 150e6 km^2 / 250 = 6e5 km^2
  - Country Closest to Average Size: Ukraine, with 603550 km^2, or roughly 6e5 km^2
  - World Resources Institute Measure of Ukraine Coastline: 4953 km, or around 5e3 km

Assumptions for SVG Complexity

  1. Only coastlines matter for the paths.
  2. Resolution set at 10 meters.

SVG Data Calculations and Estimates

  - Coastline Points for Average Country:
    - 5e3 km = 5e6 m
    - Equals 5e5 X,Y points (at 10m resolution)
  - Total Points for All Countries:
    - 250 countries * 5e5 points = 1.25e8 X,Y points
  - Storage Requirement for Each Point:
    - 6 figures per coordinate + delimiters = roughly 15 bytes
    - Total Storage = 1.25e8 points * 15 bytes = ~1.875e9 bytes or about 1.875 GB

Even with a simplified approach, we're looking at an SVG that'd be almost 2 GB, just for coastlines at a 10m resolution. Definitely some hefty complexities in play here.

What about 1000m resolution? 20 megabytes. Much more achievable.
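
The same back-of-the-envelope estimate as a small Python sketch, so the resolution can be varied. The defaults are the rounded figures from this comment (5e3 km of coastline per average country, 250 countries, about 15 bytes per point); the function name is just for illustration.

```python
def svg_coastline_bytes(coastline_km: float = 5e3,
                        countries: int = 250,
                        resolution_m: float = 10.0,
                        bytes_per_point: int = 15) -> float:
    """Estimate SVG size in bytes for coastlines sampled every `resolution_m` metres."""
    if resolution_m <= 0:
        raise ValueError("resolution_m must be positive")
    points_per_country = coastline_km * 1000.0 / resolution_m
    return countries * points_per_country * bytes_per_point


# Compare the two resolutions discussed in this comment.
for resolution_m in (10.0, 1000.0):
    size_mb = svg_coastline_bytes(resolution_m=resolution_m) / 1e6
    print(f"{resolution_m:.0f} m resolution: about {size_mb:.0f} MB")
```

Size scales inversely with resolution, so each 10x coarsening cuts the file by 10x.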


I would expect border length to scale with the square root of country area, so smaller countries have disproportionately longer borders than larger ones.

There also must be more countries smaller than the average country than there are larger ones (https://www.worldometers.info/geography/largest-countries-in... confirms that. Ukraine is 45th of 223, or about 20% down.)

So, I guess yours is underestimating the size.

You'd probably get a better estimate by using Ukraine to estimate the scaling constant, (√603,550 km²) ÷ 4953 km, and then using a list of land areas to estimate border lengths.
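
A minimal Python sketch of that refinement, assuming length = k * sqrt(area) with k calibrated on the Ukraine figures quoted upthread. Note it uses k = 4953 / √603,550, i.e. the reciprocal of the ratio written above, so that multiplying by √area gives a length directly. The function names are made up, and the list of land areas has to come from a real dataset.

```python
import math

# Figures quoted earlier in the thread.
UKRAINE_AREA_KM2 = 603_550
UKRAINE_COASTLINE_KM = 4_953

# Scaling constant in km of border per sqrt(km^2) of area.
K = UKRAINE_COASTLINE_KM / math.sqrt(UKRAINE_AREA_KM2)


def estimated_border_km(area_km2: float) -> float:
    """Estimate border length assuming it scales with the square root of area."""
    if area_km2 < 0:
        raise ValueError("area must be non-negative")
    return K * math.sqrt(area_km2)


def total_border_km(areas_km2) -> float:
    """Sum the estimated border lengths over an iterable of land areas."""
    return sum(estimated_border_km(area) for area in areas_km2)
```

Under this model a country with a quarter of Ukraine's area still gets half of its border length, which is why many small countries push the total up.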


I like your refinements! I may take a look at reviewing the estimate later! Thank you for making this contribution, it's really valuable that you share your intuitive statistical knowledge here :)


Countries with coastlines, especially with fjords, have disproportionately longer borders. Scaling constants aren't going to do it for something like Chile.


SVG is kind-of terrible for maps, but you can get pretty small with GeoPackage (read: sqlite). I recently spent a bit too long on exactly this problem and ended up with the following.

116KB - 5MB for country borders

16MB - 52MB for ~50K city/county level borders based on geoBoundaries

The range of sizes depends on how much custom compression/simplification you put into it. The source files are about 10x bigger, but that's already pretty small.

Topojson might be even smaller though.

Check the repo for details /selfplug https://github.com/SmilyOrg/tinygpkg-data


Thank you, we forked your repo! What makes you say SVG is kind-of terrible for maps?


Thanks!

Well, mostly that it's text / XML you usually have to parse in full and boundaries are data heavy, so if you have anything more than a world map, I don't see that working very well.

In contrast with the OP or GeoPackage, you can query by tiles/range and only extract the boundaries you need if you're zoomed in somewhere. If you use Tiny Well-Known Binary compression, you can also shrink the data quite a bit while keeping the querying capabilities.

But if you only ever need to render the whole thing, topojson is probably the winner as it cleverly encodes only unique borders once, so it tends to be a lot smaller.

And of course if SVG works for your case, go ahead and use it, it's surely the easiest way to render vector gfx in the browser :)


Thank you for that! I love your analysis. Really appreciate you teaching me more about this. :)


You're welcome!


So, you're still going to be better off with a tiled overview sort of system.

FWIW, I've got an official 900MB geojson of borders from a large unnamed multinational NGO type that's particular about their mapping.

At z=4, or 600m resolution, it's a 2.5MB pmtiles; at z=10 (10m resolution) it's 83MB.

That's just the ADM0 borders, adding local borders would obviously increase the size.
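
For reference, the zoom-to-resolution numbers above can be reproduced with a short Python sketch. It assumes Web Mercator tiles measured at the equator and 4096 coordinate units per tile (a common vector tile extent, and an assumption here, though it does match the 600m and 10m figures).

```python
EARTH_CIRCUMFERENCE_M = 40_075_016.686  # equatorial circumference, WGS84
TILE_EXTENT = 4096                      # assumed coordinate units per tile


def tile_resolution_m(zoom: int, extent: int = TILE_EXTENT) -> float:
    """Ground size in metres of one tile coordinate unit at the equator."""
    if zoom < 0:
        raise ValueError("zoom must be non-negative")
    tiles_across = 2 ** zoom  # the world is 2^zoom tiles wide
    return EARTH_CIRCUMFERENCE_M / (tiles_across * extent)


# Print the two zoom levels mentioned in this comment.
for zoom in (4, 10):
    print(f"z={zoom}: about {tile_resolution_m(zoom):.1f} m per unit")
```

Each extra zoom level halves the size of a unit, so z=10 is 64 times finer than z=4.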


I'm a total noob to all this, so I totally get if you can't help--it's not your problem, no worries! But would you be able to share any code to convert this, or any of the data?


I can't share this data, but there are a couple of ways to get started:

1) Just use the overview tiles from the global protomaps dump (OSM source). The getting started guide can walk you through some of it: https://docs.protomaps.com/guide/getting-started but you'd want to use the pmtiles tool to extract the widest zooms from the global dataset.

  pmtiles extract https://build.protomaps.com/20231023.pmtiles overview.pmtiles \
      --maxzoom=4
Then see the intro to pop it in a viewer, or it's a few dependencies and a style sheet and it pops into OpenLayers, Leaflet or MapLibre for a viewer. There may be some extra layers there, but at wide zooms there shouldn't be much data there. (e.g. buildings)

2) If you have/find your own data, use Felt/tippecanoe to convert it.

  tippecanoe -z4 -o overview.pmtiles data.geojson


Thank you! No worries at all, I totally understand. Just thank you for replying!

I really appreciate you showing me how to do that. Thank you very much!


But protomaps is still getting the map data from a CDN, do you mean that the map data would be cached on the device?


Yes. With the tradeoff that it would be radically simplified.


Usually .com domains were used by commercial entities, and protomaps.com seems like it might have cost some money.

I don't fully understand LLC except that they protect owners from company liability.

Just trying to figure out what the goal for this company is.


I'm the owner of said commercial company.

LLCs in this context signal "This company has no outside investors", because almost every venture-backed technology company will be organized as a C Corporation instead.

I am choosing to run this open source project through a commercial company - that enables me to have a bank account, use GitHub Sponsors, pay others for work, and enter in support and development contracts with other companies and the public sector. Non-profit foundations like US 501(c)(3) aren't practical to create for solo developers.


Why not an SPC?


Not op but I believe the processes associated with LLCs are far better understood and streamlined than SPCs. If I'm diy-founding a legal entity myself, I'd try to follow the path of least surprises too.


> Usually .com domains were used by commercial entities

I must disagree. Even in the early days, even though that was the intended use of the .com TLD, it has always actually been used as the default TLD instead of the .net TLD.

I really hate it, but when I see a .com I have never assumed, and never will assume, that it has some sort of commercial angle.


>Even in the early days

I'd go so far as to say "especially in the early days". When I got my first domain in ~'92, it was just a personal domain, and they wouldn't let me get a .net because I wasn't an ISP, or a .org because I wasn't a non-profit. And .edu was right out.


The head-scratcher is more like: If it's a non-profit, they (and I) would have expected a .org, which, unlike .net, has been used consistently over the years. But I get your point in general.


.com has awesome recognition and seo, in my experience always ranking before the same .org, .net, .io, ...


as is often the case, marketing has helped to make the web a worse experience


At the bottom of the frontpage there's a remark on "revenue"

A 100% independent software project

Protomaps is a self-funded, solo developer project with a mission to make interactive cartography accessible to hobbyists and organizations of all sizes. An essential part of that mission is publishing open source software under commercial-friendly licenses.

You can support my full-time work on Protomaps in a few ways: * Downloading the open source world basemap tileset with a support plan on GitHub Sponsors. * Paid development of open source features.


From what I can tell, it's a for-profit company for a single dev that is attempting to get by through nothing but GitHub Sponsorships.

Said sponsorship is a requirement for using the pre-hosted files for commercial uses in a SaaS-like manner.

But some of the wording in the FAQ is a bit confusing. E.g. there's a whole section about "Why don't you just sell plans for the hosted API?" right after detailing how to sign up for said plan for the hosted API.


That is correct. In addition, I have contracts with companies for support as well as development of open source features.

The hosted API is "free" in the sense that there are no tiers. The API costs me money to run, which is why I require a GitHub sponsorship for commercial use. This is ideal for use cases where deploying your own tileset is impractical.

If you are using the hosted API heavily then it makes sense for you to "graduate" to deploying it yourself.


https://protomaps.com/faq

They seem to be explicitly noncommercial


I'd like to correct that impression - Protomaps is an explicitly commercial venture.

The difference from other commercial vendors is the focus on a complete, easily deployable solution - the hosted API is meant for non-commercial and light commercial use, instead of being the main product offering, to avoid the incentive trap of locking-in paying users to the API.

A related concept is the Community Right to Replicate: https://2i2c.org/right-to-replicate/


Cool, but basemaps are a very tricky thing from a geographer's perspective. There are several border conflicts. When it is open source there could be some manipulation.

Immediately getting downvoted. All I'm saying is that official map boundaries these days should be used.

Since I can't post again yet (posting too fast), here is my clarification: I mean official, not commercial or proprietary.


I think the downvotes may be because people take "When it is open source there could be some manipulation" to mean that you think proprietary alternatives are somehow safe from such manipulation—but of course they aren't, least of all Google Maps.


That's quite the edge case, though.

This is exceedingly useful for so many use cases.

E.g. internal tools that display locations of assets, system statuses, webpages that display store locations, and so forth.

Plenty of use cases where the costs for Google's APIs would exceed the dev costs for implementing something like this.


There are no "official map boundaries" when it comes to conflicted borders.


Indeed!

The Protomaps project encompasses the generation of basemap PMTiles and the ecosystem for delivering them to clients. It supplies a standard daily build from OpenStreetMap, and since the generation is open source it can be customized to conform with your specific requirements.


A commercial data provider can also be showing a manipulated border, and is at least making some kind of editorial decision about what they are showing (whether they think so or not).


For instance:

Why isn't my time zone highlighted on the world map?

> In early 1995, a border war broke out between Peru and Ecuador and the Peruvian government complained to Microsoft that the border was incorrectly placed. Of course, if we complied and moved the border northward, we’d get an equally angry letter from the Ecuadorian government demanding that we move it back. So we removed the feature altogether.

> The time zone map met a similar fate. The Indian government threatened to ban all Microsoft software from the country because we assigned a disputed region to Pakistan in the time zone map. (Any map that depicts an unfavorable border must bear a government stamp warning the end-user that the borders are incorrect. You can’t stamp software.) We had to make a special version of Windows 95 for them.

https://devblogs.microsoft.com/oldnewthing/20030822-00/?p=42...


> All I'm saying is that official map boundaries these days should be used.

Pray tell, which official map boundaries should be used?

The whole problem being that officials disagree of course.


>"There are several border conflicts. When it is open source there could be some manipulation."

Manipulation can and does happen with non-open-source data. Just ask any politician. You should be thankful to developers who did a great service for free. If something is incorrect because of conflicts, the world will sort it out eventually. It is not the developers' job.

PS - I did not downvote, as I never downvote any posts.


> PS - I did not downvote, as I never downvote any posts.

Sounds like a good policy.


For all who downvoted: here is the proof that leaving mapping borders open source is not the way: https://www.calcalistech.com/ctechnews/article/r1hyeammt


This is so true. Otherwise all mapping authorities would be useless. I could change my house's ground area and nobody would care.



