
AI is supercharging the creation of maps around the world - infodocket
https://tech.fb.com/ai-is-supercharging-the-creation-of-maps-around-the-world/
======
yorwba
OpenStreetMap contributors weren't happy about the quality of tags when the
import started:
[https://forum.openstreetmap.org/viewtopic.php?id=63456](https://forum.openstreetmap.org/viewtopic.php?id=63456)

Facebook wasn't the only company adding low-quality data:
[https://forum.openstreetmap.org/viewtopic.php?id=64430](https://forum.openstreetmap.org/viewtopic.php?id=64430)
(Previous discussion on HN:
[https://news.ycombinator.com/item?id=18723138](https://news.ycombinator.com/item?id=18723138)
)

The quotes in this article from what appear to be OpenStreetMap representatives
are generally positive, so maybe that means they fixed all the problems they
caused.

~~~
tmcw
OpenStreetMap contributors haven't been happy with anything, for the last 15
years.

~~~
tuukkah
There are many positive developments, such as good-quality satellite imagery
and government geographic data being offered to OpenStreetMap contributors for
integration. An up-to-date aerial image of a city does wonders for the mapping
experience. Mapillary's service complements OSM nicely and provides an
invaluable on-the-ground data source. There are so many happy things in the
community I can't possibly list them all.

~~~
habnds
Agreed. Transport for London recently built an enormous database of cycling
infrastructure to be integrated into OSM. It's an exciting
project/collaboration.

[https://wiki.openstreetmap.org/wiki/TfL_Cycling_Infrastructu...](https://wiki.openstreetmap.org/wiki/TfL_Cycling_Infrastructure_Database)

------
michannne
Projects like these highlight that as ML becomes more and more complex, an
ever-larger gap opens up that must be filled by manual or semi-manual labor.
If it isn't a team of volunteers combing the rendered results for errors, it's
users who have run into those errors and leave feedback in the hope that the
system improves.

That gets me wondering: is the future of AI really just a semi-autonomous
twilight zone where cheap or free labor props up an already faulty system? If
not, what possible application is there for an expensive, closed automated
system that works 100% of the time with no human input, when other options are
cheaper and leave clear directions for improvement?

~~~
tschwimmer
Yes and no. There has been a huge explosion of 'Ops Plus' startups that take
an existing manual process and build some basic tooling around it (with or
without any substantial ML component). There are mild to moderate efficiency
gains coming from this, but a lot of their valuation is coming from a bet that
in the future they'll be able to fully automate the system and reap efficiency
gains.

In practice, almost nobody is even thinking about building a fully automated
process for every case. The reason is simple: automating the first 60% of the
work takes x effort, automating the next 30% takes 10x, automating the next 9%
takes 100x, and automating the final 1% is essentially impossible. So if you
came to the table with the goal of 100% automation right out of the gate,
you'd spend 10 years developing something with little to show for it.

I think full automation of some systems is possible, but is actually blocked
by generational norms. By and large the systems that "Ops Plus" startups are
attempting to automate were designed by people who are not digital natives.
They're not illiterate, but things like instant messaging, async communication
and structured data are not natural primitives for them. I'm not saying
everyone in the Fortnite generation is a master data modeller, but I think
that when they join the workforce they'll set up systems that are much more
feasible to automate.

~~~
nicolaskruchten
I’m interested in learning more about this “Ops Plus” term, but it doesn’t
seem very Google-friendly... Is this a term of art in some circles?

~~~
tschwimmer
I actually just made it up. Companies that come to mind are Scale.ai (doing
data labelling stuff for ML that historically would have been done by
outsourcing companies, etc), Flexport (freight forwarding, traditionally done
via spreadsheets and emails), Checkr (background checks), Atrium (legal
services), Oscar/Clover (health insurance), Cadre (real estate investing).
There are tons and tons of them in the recruiting (really sourcing) space too:
I'd say Triplebyte is an Ops Plus company, as is Sourceress and a couple more.

------
letcree
Very cool that they are integrating it with the HOT Tasking Manager and making
it easy for anyone to use the editor with the ML-generated proposed objects in
several countries. I think the ML has been there for a couple of years, but
currently it's not easy to take advantage of it.

Hopefully they eventually release the ML pipeline itself as well.

Their RapiD editor has some similarities to a research project I was involved
with:
[https://mapster.csail.mit.edu/maid.html](https://mapster.csail.mit.edu/maid.html)

([https://www.youtube.com/watch?v=i-6nbuuX6NY](https://www.youtube.com/watch?v=i-6nbuuX6NY)
vs [https://tech.fb.com/wp-content/uploads/2019/07/add_ML_road.g...](https://tech.fb.com/wp-content/uploads/2019/07/add_ML_road.gif.-1.gif))

------
Avamander
I guess it's nice in those areas with a low contribution percentage. But I
suspect in many cases, and definitely in Estonia, there's a national
high-quality map database that could be used to augment existing maps instead.
I'm wondering, though: has anyone attempted to do so? A quick Google search
reveals nothing.

~~~
rmc
It is possible to "import" other data into OSM. However, it's very tricky.
Licensing/legal issues aside, it can be very technologically challenging to
merge another database into OSM.

I suggest posting to the OSM talk@ mailing list, or to a local Estonian one if
there is one.

------
rihegher
They say it uses a deep neural network, but we don't know whether it's a
convnet or another type of NN. Does anyone have more technical info about it?

~~~
gwern
This is semantic segmentation, so it has to be a convnet. I've never heard of
trying to do semantic segmentation any other way, and a prototype of another
approach wouldn't go into production at scale.

OP links to [https://ai.facebook.com/blog/mapping-roads-through-deep-lear...](https://ai.facebook.com/blog/mapping-roads-through-deep-learning-and-weakly-supervised-training)
which says that it's D-LinkNet specifically:
[http://openaccess.thecvf.com/content_cvpr_2018_workshops/pap...](http://openaccess.thecvf.com/content_cvpr_2018_workshops/papers/w4/Zhou_D-LinkNet_LinkNet_With_CVPR_2018_paper.pdf)
(more or less a U-Net).

~~~
joshvm
I don't disagree, but I'd be careful with that statement.

A huge amount of landcover segmentation in remote sensing still relies on
simple models - either linear regression (thresholds) or classical machine
learning like random forests or SVMs. For a lot of cases, these techniques
will get you 90% of the way and it's very rare to have ground truth data that
is accurate enough that you can measure the difference with any real degree of
confidence.
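As a rough illustration of that classical baseline (entirely synthetic data;
the bands, labels, and threshold are made up for the sketch), per-pixel
landcover classification is just an ordinary classifier over band values:

```python
# Sketch: per-pixel landcover classification with a random forest,
# the kind of classical-ML baseline common in remote sensing.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Fake 4-band image (e.g. R, G, B, NIR), 64x64 pixels
image = rng.random((64, 64, 4))

# Fake ground truth: call a pixel "vegetation" when an NDVI-like
# index (NIR - R) / (NIR + R) exceeds a threshold
ndvi = (image[..., 3] - image[..., 0]) / (image[..., 3] + image[..., 0])
labels = (ndvi > 0.1).astype(int)

# Each pixel is one sample: its features are just its band values
X = image.reshape(-1, 4)
y = labels.ravel()

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
pred = clf.predict(X).reshape(64, 64)
print("training accuracy:", (pred == labels).mean())
```

No spatial context at all, and yet on many scenes this kind of model is the
90% solution being described.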

A big problem in the field is the lack of good (public) ground truth. There's
so little hand labelled data to work with that without humans in the loop it's
extremely difficult to validate the results meaningfully (unless you have an
army of staff to do it). With something like roads you could also have
heuristics about what a road looks like and where it goes (e.g. it's a
continuous thin line), which can help condition things.

I've seen a lot of papers applying deep learning to semantic segmentation for
satellite mapping, but they evaluate on very limited datasets, they regress to
simpler models without realising it (e.g. effectively learning a linear
model), or they leak data between train and test and report amazing results
because they randomly split data from the same region.
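That last failure mode is avoidable with a grouped split: hold out whole
regions rather than random tiles. A minimal scikit-learn sketch (tiles,
features, and region labels are all made up):

```python
# Sketch: avoid train/test leakage by splitting on regions, not tiles.
# GroupShuffleSplit keeps every tile from one region on the same side.
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

rng = np.random.default_rng(0)
n_tiles = 100
X = rng.random((n_tiles, 16))            # one feature vector per tile
regions = np.repeat(np.arange(10), 10)   # 10 tiles from each of 10 regions

splitter = GroupShuffleSplit(n_splits=1, test_size=0.3, random_state=0)
train_idx, test_idx = next(splitter.split(X, groups=regions))

# No region appears on both sides of the split
assert set(regions[train_idx]).isdisjoint(set(regions[test_idx]))
```

A plain `train_test_split` over tiles would put near-identical neighbouring
tiles on both sides, which is exactly the leak that inflates reported scores.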

I'm not saying that convnets aren't better than simpler models, but
particularly for satellite imaging I'd take them with a pinch of salt and see
what the improvement from a baseline method is. If you look at a random
sampling of papers from the DeepGlobe competition, almost none of them provide
the results from e.g. a cheap linear SVM.

Fun side note - several existing "famous" datasets generalise poorly to the
developing world because most of the imagery is from the developed world (and
even more specifically the West) and infrastructure looks totally different.

~~~
junipertea
How do you assign a class to each pixel in an image using a linear classifier
in a way that uses the surrounding pixels as context/input? I'm genuinely
curious! You are right about the data: it's expensive to make, and startups
based on satellite imagery tend to keep it to themselves, as it's their main
advantage.

~~~
joshvm
In most cases, you can reshape an NxN region to a 1xN^2 vector. This was how
object detection worked long before convolutional inputs were popular.

Have a look at MNIST classification using a linear SVM, for example.
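A minimal sketch of that sliding-window idea, on synthetic data (the image,
labels, and window size are all made up for illustration): each pixel's class
is predicted from its flattened NxN neighbourhood by one shared linear SVM.

```python
# Sketch: per-pixel "segmentation" via a linear SVM over NxN patches.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
N = 5                                    # patch/window size
img = rng.random((32, 32))
truth = (img > 0.5).astype(int)          # fake per-pixel labels

# Flatten the NxN window centred on each interior pixel into N^2 features
pad = N // 2
patches, labels = [], []
for i in range(pad, 32 - pad):
    for j in range(pad, 32 - pad):
        patches.append(img[i - pad:i + pad + 1, j - pad:j + pad + 1].ravel())
        labels.append(truth[i, j])
X, y = np.array(patches), np.array(labels)

# One small classifier shared across all positions
clf = LinearSVC(max_iter=10000).fit(X, y)
print("per-pixel accuracy:", clf.score(X, y))
```

The same weights slide over every position, so there is no N^4 blow-up: the
model has N^2 weights regardless of image size.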

~~~
junipertea
Classification makes sense, because you do a linear (or kernel) combination of
the input and squash it with a sigmoid to get a class probability. For
segmentation you output a pixel mask, so you would have an NxNx3 vector to
predict one class for one pixel, and then you would have to do it for all
pixels, so you'd have to encode the position as well. Alternatively, if you
take a unique weight for each position, you end up with a single FC layer with
NxNx3 inputs and NxN outputs (on the order of N^4 parameters). I guess for me
it's hard to imagine doing segmentation "back then", and I find it very
fascinating.
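The parameter blow-up being described is easy to check with quick arithmetic
(the image and patch sizes here are just illustrative numbers):

```python
# One FC layer mapping an NxNx3 image to an NxN mask needs a weight per
# (input, output) pair, i.e. ~3*N^4 of them. The sliding-window trick
# instead shares one small patch classifier across every position.
N = 256          # image side
k = 5            # patch size for the shared classifier

fc_params = (N * N * 3) * (N * N)    # dense layer: inputs * outputs
shared_params = k * k * 3 + 1        # one linear patch classifier + bias

print(f"full FC layer: {fc_params:,} parameters")      # ~12.9 billion
print(f"shared patch classifier: {shared_params:,}")   # 76
```

That gap is why pre-convnet segmentation leaned on per-pixel or per-patch
classifiers rather than one position-aware dense layer.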

------
dr_faustus
Is it open source? It is somehow implied, but I can't find the source.

------
m4eta
Maybe AI can help with Lyft/Uber. I'm sort of tired of being dropped off in
alleyways.

~~~
jdblair
The great thing with Lyft/Uber is that I'm in the car and I can say "please
take me around front to the main entrance, not here in the alleyway."

~~~
andygates
That works for regular cabs too! /s

------
workingpatrick
My first thought is wondering what data FB hopes to acquire from making this
tool available.

~~~
gwern
FB is far from the only big tech company which has sponsored OSM as a
competitor to Google Maps
(Microsoft/Facebook/DigitalGlobe/Telenav/FourSquare/Craigslist have all
sponsored OSM to some degree; Apple, of course, went its own way and created
Apple Maps).

It's a reaction to Google Maps: a monopoly on high-quality, up-to-date global
maps with business locations is dangerous to everyone else as a chokepoint on
mobile applications. It's less about 'acquiring data' and more about not being
extorted by Google Maps. Classic 'commoditize your complement' dynamics:
[https://www.gwern.net/Complement](https://www.gwern.net/Complement)

~~~
gashad
Yes, see also "Corporate Editors in the Evolving Landscape of OpenStreetMap"
[https://www.mdpi.com/2220-9964/8/5/232/htm](https://www.mdpi.com/2220-9964/8/5/232/htm)

Figure 3 demonstrates the scale of corporate OSM edits.

