Wikimedia Maps Beta (wikimedia.org)
350 points by chippy on Sept 17, 2015 | 82 comments

In case you're wondering what this is about: the scale of Wikipedia makes it difficult to just drop in a third-party OSM tiling service, so they're rolling their own map rendering process using OSM data. This is a beta of the rendered tiles, displayed inside Leaflet.

If you're into maps and this kind of thing, I highly recommend checking out LocalWiki (https://localwiki.org) too!

Is their rendering process documented anywhere?

Yes. From https://www.mediawiki.org/wiki/Maps#Production_maps_cluster:

The implementation [1] has various components including:

* Kartotherian [2]: a server capable of providing map tiles in vector (pbf) or raster (png) formats, as well as static map snapshots of any size for a given location.

* Tilerator [3]: a distributed backend tile generation service with a job queue

* A flexible sources [4] system to set up the needed storage and processing pipeline

1. https://www.mediawiki.org/wiki/Maps/Tile_server_implementati...

2. https://github.com/kartotherian/kartotherian/

3. https://github.com/kartotherian/tilerator/

4. https://github.com/kartotherian/kartotherian-core/

Cool! It looks like they're using Leaflet and hosting their own tile server built from OpenStreetMap data. I really like that they opened up how to connect to their tile server (thanks to chippy for the link: https://www.mediawiki.org/wiki/Maps ). When I was looking at options for a side project using OpenStreetMap data earlier, I ended up using MapQuest's open data API, because Google Maps had API restrictions after a number of calls and MapQuest was easy to integrate with Leaflet. If I were making it again, I would consider using Wikimedia's option. (FYI, the side project is a static site hosted at http://newsatlas.io. It takes a news/RSS feed and attempts to find the location client-side by parsing the news article's summary text, then displays the location on a map. It was written in JS with Angular and Leaflet as dependencies, and is hosted on S3.)

Thanks for all your support! Means a lot. Clipped labels are a big issue, working on it. Any help is welcome with bugs and styling. Info is at https://www.mediawiki.org/wiki/Maps . Will post updates on twitter @nyuriks. Thanks!

One thing I found neat about this was that (at least by default) it seems language-neutral: scrolling around the world, labels over the U.S. are in English, labels in Japan are in Japanese, Korea, in Korean, China, in simplified Chinese, etc. Not even any romanized (or otherwise translated/transliterated) labels in parentheses. Just the native presentation.

Granted that for practicality, you probably also want a mode that makes the labels more useful for a reader that may not understand every writing system, but as a default (when the user doesn't express a language preference), there's something very satisfying about this method of presentation...

In my mind it reflects Wikipedia's aspirations of being a useful tool for the whole world....

This is probably not intentional - the data is from OpenStreetMap, and if you have a look at http://www.openstreetmap.org/ you'll see that there isn't an attempt to make all labels the same language.

OSM has a policy that the base 'name' tag should be the name used on-the-ground locally, i.e. what appears in real life on street signs, building nameplates, and so on. Plus some special considerations for multilingual areas [1]. Items can also have additional language-specific tags (name:ja, name:en, etc.), which can in principle be used by renderers to display names in the language preference of the user, e.g. to produce an all-German-label or all-English-label map, to the extent the data is present [2].

[1] http://wiki.openstreetmap.org/wiki/Multilingual_names

[2] http://wiki.openstreetmap.org/wiki/Map_internationalization
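To make the tag fallback concrete, here's a minimal sketch of how a renderer could choose a label from those tags. The function name and the sample tag values are made up for illustration; this is not how any particular renderer (or Wikimedia's) actually does it.

```python
def pick_label(tags, preferred_langs=()):
    """Return the label to draw for one OSM feature.

    Tries name:<lang> for each preferred language in order, then falls
    back to the plain 'name' tag (the local, on-the-ground name).
    """
    for lang in preferred_langs:
        value = tags.get(f"name:{lang}")
        if value:
            return value
    return tags.get("name")


# Illustrative tag values, not copied from the OSM database.
tokyo = {"name": "東京", "name:en": "Tokyo", "name:de": "Tokio"}

local_label = pick_label(tokyo)                  # no preference given
english_label = pick_label(tokyo, ["fr", "en"])  # no name:fr, so name:en
```

With no preference you get the local name, which matches what the beta map appears to show; with a preference list the result is only as good as the name:xx coverage in the data.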

I wasn't aware of that policy - thanks for noting it.

> but as a default (when the user doesn't express a language preference),

I understand your viewpoint, and think it's noble. But you might want to tweak your parenthetical comment. In practice, I and most others do express a default language preference, which is sent as part of the browser's HTTP request. In my case it's:

    Accept-Language: en-US,en;q=0.5
which means it's ignoring my preference. And I'm fine with that. Language is a mess. Look at Switzerland, which is labeled with all four official languages, or the three languages of Belgium. Yet South Africa, which has 11 official languages (and 11 official names - https://en.wikipedia.org/wiki/Official_names_of_South_Africa ) is labeled only in English.
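For anyone curious what honoring that header would involve, here's a rough sketch of ranking the languages by their q-values. It's simplified on purpose: it ignores wildcards and doesn't treat q=0 as "not acceptable", and the function name is my own.

```python
def parse_accept_language(header):
    """Return language tags ordered by descending q-value (default q=1.0).

    Ties keep the order in which the tags appeared in the header.
    """
    ranked = []
    for position, part in enumerate(header.split(",")):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        quality = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # malformed weight: rank it last
        ranked.append((-quality, position, tag.strip()))
    return [tag for _, _, tag in sorted(ranked)]


my_preference = parse_accept_language("en-US,en;q=0.5")
```

For my header that gives `['en-US', 'en']`, so a server that wanted to respect it would try US English labels first and generic English second.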

Being the edge case of internet users with a different default language than what my geolocation would result in, I know how frustrating it is to expect one language, and get another one "tailored" to you in order to "improve" user experience.

However, in this specific case I agree with the parent commenter. While the preferred language should affect the UI so you can interact with the map, I feel like in this case translating the content right away could take away from the experience and is not necessarily what the language flag should mean. It shouldn't be more than a click away, though.

I'm an immigrant living in a country where English is not the native language. :)

However, I think you've misinterpreted the OP's comment and my response.

The OP likes the choice of native language for geographical names in a given country. You and I agree. The OP parenthetically notes that the reason is because the user has not expressed a language preference. I pointed out that the browser/user-agent actually does give a preference, which the server ignores.

It is this parenthetical logic I disagree with, not the result.

Wikipedia comes in many languages, so ultimately they require the mapping system to display labels based on the language of the user.

This can be complicated - imagine a wiki page about Paris, France, translated into 40 or so languages, each with its own high-performance, cached map served from the same system. Wikimedia maps will be beyond awesome in a few years!

This doesn't seem to be true consistently FWIW; in Thailand the city I looked at had the city name in Thai (แม่สอด) but the street names romanized. Some other city names are romanized as well.

I am pumped about this, especially the Wikimedia Commons use cases described at https://www.mediawiki.org/wiki/Maps/Future_Plans#Commons.

There are actually already ways to browse Commons images on a map, but they need significant work. For example, the tile at [1] depicts an area with easily 20 geotagged Commons images, but, inexplicably, none of them are shown until you zoom in another level. Or, zoom out to see the entire state, only to see Massachusetts shown as completely lacking any geotagged Commons images [2].

There is a ton of potential for awesome applications involving geotagged images and geographic maps. I'm glad to see the Wikimedia Foundation stepping up its investment here.

1. https://tools.wmflabs.org/wiwosm/osm-on-ol/commons-on-osm.ph...

2. https://tools.wmflabs.org/wiwosm/osm-on-ol/commons-on-osm.ph...

More documentation here: https://www.mediawiki.org/wiki/Maps

Including the development setup

Wow, it's so refreshing after using Google Maps. Wikimedia Maps are just fast. Reminds me of how I remember Google Maps being. (My machine is much faster now, but GMaps just _lags_.)

The data quality is a bit weird, as if it doesn't know what things to highlight. But I'm sure this will improve over time.

I was about to comment that it is so slow as to be unusable (seconds before tiles appear after zooming in) but apparently it's not the same for everybody. Or are HN or some other site slashdotting the tile server right now?

I have tried one or two Google Maps mashups, but never bothered seriously because it's... well, not worth the effort of learning in my free time how to use something that is not free. But OpenStreetMap and Wikipedia - how can I resist.

Ok folks, let's look at this properly - you are going to get my donation this time round, and in five years google will look worried and in ten the goto mapping solution will be a free Wikipedia service.

Open always wins in the end.

I wonder if they're going to funnel contributions back to OSM.

They're not inviting separate contributions. There's no edit tab. If you want to edit these maps, you need to edit OSM.

I really like the colour selections.

But it seems to me that the designation of green space is off.

Kew Gardens comes out as a built-up area: https://maps.wikimedia.org/#16/51.4778/-0.2975

The colour scheme is far more legible than OpenStreetMap's. I wish they'd change their styling...

OSM's default road colouring is actually about to change: https://github.com/gravitystorm/openstreetmap-carto/pull/173...

The default mapnik render on osm.org is meant to be for editors rather than map users. For using the map it's often better to use other renders.

Not bad.

There are some pretty bad errors on tile boundaries, though: http://i.imgur.com/ifSXs2H.png


I also noticed if you zoom out all the way the map disappears.

Also, zoom isn't 'idempotent', which can be annoying.

can you explain what you mean by this?

Sorry for the late reply. I suspected my description might have been on the cryptic side. I was referring to the fact that zooming in one direction, followed by the exact same amount in the other, should return you to exactly the same state; these maps don't quite (i.e. they'll sometimes put you in a different location).
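Here's a toy model of what I mean. It is only a sketch of one way the drift could arise (zooming in about the cursor, then out about the viewport centre); I haven't checked that this is what these maps actually do, and the names are mine.

```python
import math

TILE_SIZE = 256  # pixels per tile edge, the usual web-map convention


def zoom_about(center, zoom, delta, offset_px=(0.0, 0.0)):
    """Zoom by `delta` levels, keeping the point under `offset_px` fixed.

    `center` is in world units (whole world = 1.0 wide); `offset_px` is the
    anchor's pixel offset from the centre of the viewport.
    """
    old_res = 1.0 / (TILE_SIZE * 2 ** zoom)            # world units per pixel
    new_res = 1.0 / (TILE_SIZE * 2 ** (zoom + delta))
    anchor = (center[0] + offset_px[0] * old_res,
              center[1] + offset_px[1] * old_res)
    new_center = (anchor[0] - offset_px[0] * new_res,
                  anchor[1] - offset_px[1] * new_res)
    return new_center, zoom + delta


start, zoom = (0.5, 0.33), 10
cursor = (200.0, -120.0)

# In and out about the same anchor: back where we started.
zoomed_center, zoomed_level = zoom_about(start, zoom, +2, cursor)
same_anchor, _ = zoom_about(zoomed_center, zoomed_level, -2, cursor)

# In about the cursor, out about the viewport centre: the view has drifted.
mixed_anchor, _ = zoom_about(zoomed_center, zoomed_level, -2)

round_trip_ok = math.isclose(same_anchor[0], start[0], abs_tol=1e-12)
drifted = not math.isclose(mixed_anchor[0], start[0], abs_tol=1e-9)
```

When both zooms use the same anchor the centre comes back (up to floating-point error); mix the anchors and it doesn't, which is the kind of behaviour I was complaining about.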

Nice effort, but still needs a lot of work. This view of the western US picks out four coastal cities worthy of mention: "Tijuana" (fine), "Los Angele" (sic!), "San Jose" (err..) and "Calgary" (wat). No San Francisco, no Seattle, no Vancouver...


Scripts are also all over the place. Japan is Japanese only, China is Chinese only, India is English, Bangladesh is Bengali, Pakistan is partly Urdu and English, the Arab countries are totally inconsistent...

If you're going on population, San Jose is the biggest city in the Bay Area; in that sense, it makes sense to pick it as the "one that wins" in a very dense area. Calgary is around 1M people and Tijuana is 1.3M, while Seattle and Vancouver are each around 600k. Those choices seem pretty reasonable to me.

In the context of a zoomed out global map it would seem to make more sense to approach it by Metro area. Vancouver metro area is 2.5M vs Calgary's 1.2M. Seattle is 3.6M

Also, I would think national capitals should be included where possible.

Alas, it's an early beta.

But you shouldn't just pick it based on some statistical measure only. Maps are supposed to be a useful tool, and that should consider the cultural importance of a place.

Is the e in San Jose, California really accented? (It shows up as San José for me.)

It's optional I suppose. I'm not aware of any official city documents that use it, but it has historical roots.

> California Governor Felipe de Neve established the first civilian town or pueblo in California on November 29, 1777. Founded near the southern end of San Francisco Bay, it was christened El Pueblo de San José de Guadalupe.

[1] http://web.archive.org/web/20080218051828/http://www.califor...

The official website seems to use the accent when the city name is written in regular running text, but not when it's in all-caps or small-caps (as in the logo): http://www.sanjoseca.gov/

That used to be an unofficial convention of the Spanish language regarding accented letters: only in minuscule text but not on majuscule (capital) text. The Royal Academy has always supported (since at least 1792) that majuscules should be accented, but it was difficult to do so since movable types and typewriters didn't allow room for them: https://upload.wikimedia.org/wikipedia/commons/8/82/Acento_e...

San Jose is bigger than San Francisco only because its city limits occupy far more area. San Francisco is a bigger population center.

It sounds like you argue that because San Jose is bigger both by territory and by population, somehow that makes San Francisco truly bigger. However, if it's bigger in both, it only makes sense the map algorithm chooses it to display on lower resolutions. I don't think going by population density is that common.

> San Francisco: City and county 852,469 | Metro 4,594,060 | CSA 8,607,423

> San Jose: City 1,015,785 | Metro 1,952,872 | CSA 8,607,423

Colloquially, "population" refers to either the metropolitan area population or the CSA population, or something in between. But since different countries have different concepts and definitions for "metro area", it can't be relied on for a global map. The only thing remotely portable in this case is "how many people live within the city limits". And by that measure, San Jose is indeed bigger than San Francisco. San Francisco doesn't even get a mention because its population within city limits is below one million - the cut-off line.
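As a sketch of that rule, using the city-limit figures quoted in this thread (several are only the rounded numbers people gave above, and the one-million cut-off is my guess at what the map does, not something I've confirmed):

```python
# City-limit populations as quoted in this thread; the round numbers are
# the approximate figures commenters gave, not official statistics.
CITY_POPULATION = {
    "Tijuana": 1_300_000,
    "San Jose": 1_015_785,
    "Calgary": 1_000_000,
    "San Francisco": 852_469,
    "Seattle": 600_000,
    "Vancouver": 600_000,
}


def labelled_cities(populations, cutoff=1_000_000):
    """Names at or above the (assumed) cutoff, largest first."""
    keep = [(pop, name) for name, pop in populations.items() if pop >= cutoff]
    return [name for pop, name in sorted(keep, reverse=True)]


shown = labelled_cities(CITY_POPULATION)
```

With these inputs the list comes out as Tijuana, San Jose, Calgary, which matches what the zoomed-out view shows apart from Los Angeles, for which nobody quoted a figure here.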



…do you live in San Francisco?

I do. How can I help you? :)

I'm guessing it's because San Jose and Calgary have over 1M population -- the cities of SFO and Seattle proper do not.

Seems worth noting, it does have the right name for Los Angeles — the label is just running into Phoenix.

Looks like a boundary between tiles there:



Placing labels on a map is quite a hard problem, especially when certain tiles are refreshed.

Milwaukee loses the last letter too at certain zooms. Oddly nothing but the lake to its right though.

Nice color palette - less like OSM's rainbow and more like Yandex or Naver maps. I wonder why Google insist on indecipherable shades of gray.

Some things would benefit from more contrast, but as it happens, almost all map services would.

This is going to be a great enhancement. The maps in Wikipedia articles always seemed very lacking to me.

Is the rendering process open, and what are they using for rendering the tiles? Mapnik?

Everything is completely open. We use https://github.com/kartotherian/kartotherian which is Mapbox+Mapnik based.

Would be nice if the farthest zoom out wasn't Mercator.

TIL it's actually Web Mercator https://en.wikipedia.org/wiki/Web_Mercator

There are many good reasons for choosing mercator, the best one being that it is mostly conformal.

Yes, while zoomed in that property is useful, zoomed out less so.

You can't really switch projections mid-zoom, because the tile coordinates wouldn't match and you'd find yourself in entirely the wrong place.
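To show how tightly the tile indices are tied to the projection, here's the standard slippy-map conversion used by OSM-style tile servers, as a small sketch. I'm assuming the usual z/x/y scheme here; I haven't checked Wikimedia's server specifically.

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Slippy-map tile (x, y) containing a WGS84 point at an integer zoom."""
    n = 2 ** zoom  # tiles per axis at this zoom
    x = int((lon + 180.0) / 360.0 * n)
    # The y index comes straight from the Mercator formula.
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so lon = 180 or latitudes past the Mercator cut-off stay in range.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


# Kew Gardens, from the link posted elsewhere in this thread: #16/51.4778/-0.2975
kew_tile = lonlat_to_tile(-0.2975, 51.4778, 16)
```

The y index is the Mercator formula itself, so tiles rendered in any other projection would cover different ground for the same (z, x, y).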

Eh, you can. You can even do actual 3D rendering in browser.

Not a big problem when zooming out from say, North America.

I think you need to think a little more carefully about the geometry involved.

Sounds like you need to think a little more about the use case of these maps.

This is incredible, it's about 2x as fast as OSM over tor and about 10x as fast as Google Maps.

Funny thing, the other day I was looking for a good map of all Wikipedia articles with geo positional data. I didn't really find one.

Looking at the title of this submission, I hoped that it would offer this. However, it seems it does not. Does anybody know if it ever will?

I would imagine that DBpedia would have something like that: http://wiki.dbpedia.org/

This will be available in the Android app too[0]. It'll debut on beta[1] first.

[0] https://gerrit.wikimedia.org/r/#/c/212922/ [1] https://play.google.com/store/apps/details?id=org.wikipedia....

What's the difference between this and wikimapia? Aside from wikimapia having complete maps and orthographic layers ofc...

This isn't (and was never really directly intended to be) any kind of generic mapping service, or a map-driven interface for Wikipedia. Wikimapia (and other services!) provide interfaces for that. This is the first draft of a tile service that may be integrated into different spots throughout the project -- not an alternative to those kinds of things. It's basically just because Wikipedia can't use a tile service like the default OSM mapnik server without totally crashing it, so they need to build their own tile rendering pipeline, etc.

  * Under what license is the data published at wikimapia?
  * Where can I download the data?
  * [How] can I use it for my own website?


Creative Commons, they have an API if you want to consume the data.

Crimean names are in Russian. :/

Whatever you may think of Crimea's status, people in Crimea predominantly use the Russian language. Why would you expect names not to be in it?

This sort of behavior you're demonstrating is exactly what contributed to recent Ukraine troubles.

P. S. In Italy, cities in Sudtirol - Alto Adige are marked in both German and Italian. That's responsible behavior if you want examples.

Recent Ukrainian troubles were mostly caused by, oh I dunno, a foreign army invading?

> marked in both German and Italian.


> Recent Ukrainian troubles were mostly caused by, oh I dunno, a foreign army invading?

It's not as simple as "they just now got attacked and now their signs have all been changed".

Crimea and its people have had linguistic/social/cultural/economic ties to Russia/USSR for a lot longer than merely the recent conflict....

Well, China has very old ties with Outer Manchuria (also known as Priamursk in Russia [1]), so would it be OK for China to invade?

There is a lot of shared history around the world, like Alsace-Lorraine, for example. Strangely enough Germany doesn't invade France for some reason...

[1] - https://www.quora.com/Is-it-true-that-Russia-Soviet-Union-in...

Eh? I'm not saying it's okay for Russia to invade. Quite the reverse. I'm saying that the mere presence of Russian language signs in Crimea doesn't mean it was caused by the recent events.

Your example shows the exact same property. :-)

Oh, I understand. Good point, actually. :)

Nobody told you? Oh.

You mean of the Russian occupation?

Occupation schmockupation. If Russia was Germany, Poland would be getting nervous.

Nobody cares. Isn't it kind of sad, really? I mean that the world pays attention to Russia only when they either kill, steal, or take hostages?

Seems extremely fast, and actually has a lot more information about my university than google maps!

Looks like they are having some problems with caching. Many labels in major cities are cut off. Caching is hard, especially when you aren't using an out-of-the-box product like ESRI.
