
Terrapattern: a visual search tool for satellite imagery - bufo
http://www.terrapattern.com
======
kcimc
I helped out with this project, happy to answer any questions. I was involved
from the beginning, but my biggest contribution was on the deep learning side
that does the tile-matching. I helped with the initial prototype using DIGITS,
Caffe, and a bunch of Python. Then Aman Tiwari moved us to a more accurate
34-layer ResNet trained in TensorFlow, and a more efficient nearest neighbor
lookup using a new implementation of CoverTree search.

~~~
danso
Since you've opened yourself up to questioning :) ... Thanks for sharing, and
I apologize for this very layman's question:

I clicked on a swimming pool in New York City...there aren't a ton of them in
NYC, but very few of the matches have even a spot of blue in them...I know the
algorithm is more than just "look for more blue patches"...if I were to
explain this to another layperson, what is the most obvious explanation for
something that _seems_ more non-intuitive than expected?

Screenshot of my panel: [http://imgur.com/H0wo5jK](http://imgur.com/H0wo5jK)

[http://nyc.terrapattern.com/?_ga=1.84865689.1830936426.14642...](http://nyc.terrapattern.com/?_ga=1.84865689.1830936426.1464201975&lat=40.7256043&lng=-73.97539890000002)

~~~
workergnome
Not Kyle, but another of the team involved in the project. When we built the
training model, we used OpenStreetMap data to find locations of ~1,000,000
"things". A thousand churches, a thousand water towers, a thousand
playgrounds, and so on. For each of those locations, we downloaded a satellite
image.

The neural net was then trained to look for the things that make each one of
those things distinct; what makes a playground different than a church? It
could be patterns, it could be colors, it could be any number of things. (For
more precise details, you'll need to talk to Aman or Kyle.) It compares lots
of things to lots of things, makes some guesses, and then sees whether those
guesses help it correctly determine what we told it was in each tile.

Once the model is trained, it's identified the 1024 "features" that are most
significant in correctly distinguishing types of things from each other. We
then run every tile of a geographical region through that feature determiner,
which converts each tile into a point in our 1024-dimensional space. The
search function then takes the tile you picked, looks up its location in that
space, and finds the 100 tiles closest to it within the 1024-dimensional space.
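
As a rough illustration of that lookup step (not the project's actual code),
here is a minimal brute-force sketch in NumPy. It assumes Euclidean distance,
and the random vectors merely stand in for real tile features; the function
name `nearest_tiles` and its parameters are made up for this example.

```python
import numpy as np

def nearest_tiles(features, query_index, k=100):
    """Return indices of the k tiles closest to the query tile.

    features: (num_tiles, num_dims) array, one feature vector per tile.
    Distance is Euclidean here, which is an assumption of this sketch.
    """
    query = features[query_index]
    dists = np.linalg.norm(features - query, axis=1)
    dists[query_index] = np.inf          # never return the query tile itself
    k = min(k, len(features) - 1)
    nearest = np.argpartition(dists, k - 1)[:k]   # k smallest, unordered
    return nearest[np.argsort(dists[nearest])]    # order them by distance

# Placeholder data: 5000 random "tiles" in a 1024-dimensional space.
rng = np.random.default_rng(0)
features = rng.normal(size=(5000, 1024)).astype(np.float32)
matches = nearest_tiles(features, query_index=42, k=100)
```

This scans every tile on each query, which is fine for a demo but slow for a
whole city; a tree index such as the CoverTree search mentioned upthread
avoids the full scan.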

So, TL;DR: It's not looking for colors, it's looking for computable features,
which may or may not be color-specific. (Actually, they're highly
non-color-specific: the training model randomly "wiggles" the color to make
sure that it doesn't get too tied to a very precise color.)
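
To make the color "wiggle" concrete, here is a hedged sketch of one common
way to do it: a random per-channel scale and shift applied to each training
tile. The exact augmentation used in the project isn't stated here, so the
function `jitter_color` and its `max_shift` / `max_scale` values are
assumptions for illustration only.

```python
import numpy as np

def jitter_color(tile, rng, max_shift=0.1, max_scale=0.1):
    """Randomly scale and shift each RGB channel of a tile.

    tile: float array of shape (height, width, 3) with values in [0, 1].
    The jitter ranges are illustrative guesses, not the project's settings.
    """
    scale = 1.0 + rng.uniform(-max_scale, max_scale, size=3)
    shift = rng.uniform(-max_shift, max_shift, size=3)
    # Broadcasting applies one scale/shift per channel; clip back to [0, 1].
    return np.clip(tile * scale + shift, 0.0, 1.0)

rng = np.random.default_rng(0)
tile = rng.uniform(size=(256, 256, 3))   # placeholder for a satellite tile
augmented = jitter_color(tile, rng)
```

Because the network sees a slightly different tint of the same tile each
time, it is pushed toward shape and texture cues rather than one exact shade.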

~~~
dockd
In my experience in the USA, OpenStreetMap data that comes from TIGER (?)
imports, like heliports, is close but wrong. Playgrounds and water towers are
usually accurate because someone added them by hand. Churches are mixed. Other
items are confusing; a hospital is typically a single point, but they tend to
be some of the largest buildings in small towns.

Any thoughts on how that affects your model?

~~~
workergnome
We found the same: using OSM data that just had points was problematic, both
because of accuracy and because the point often marked the front door, not
the centroid of the object.

I believe we limited our model generation to places that had outlines, and
computed the centroid of each outline. One of the benefits of our technique
is that we didn't need to be comprehensive: we can throw out lots of places
and still have enough to be useful for the model.
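
For anyone curious what "centroid of the outline" means in practice, here is
a small sketch using the standard shoelace formula. It treats the outline's
coordinates as planar (x, y) pairs, which is an assumption that is reasonable
for building-sized shapes; `polygon_centroid` is a name made up for this
example, not something from the project.

```python
def polygon_centroid(points):
    """Area-weighted centroid of a simple polygon given as (x, y) vertices.

    Works for clockwise or counter-clockwise outlines. A repeated closing
    vertex (first point listed again at the end) is harmless, since that
    extra edge has zero length and contributes nothing.
    """
    area2 = 0.0          # twice the signed area
    cx = cy = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("degenerate outline with zero area")
    return cx / (3.0 * area2), cy / (3.0 * area2)

square = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(polygon_centroid(square))  # → (1.0, 1.0)
```

Unlike averaging the vertices, this weights by area, so an outline with many
points crowded along one wall doesn't drag the center toward that wall.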

------
polartx
I see that the satellite pictures come from OpenStreetMap, but I'm not
familiar with how often those pictures are updated. Anybody know?

When they are updated, does Terrapattern recognize the change and update the
corresponding photos?

~~~
Bedon292
The imagery in their demo is coming from Google Maps, but if you look at the
copyright you can see what companies it is coming from. The USDA shoots, if I
remember correctly, the whole country every 3 years, and that data is public
domain. The commercial sites are constantly shooting new stuff, but it depends
on how often Google wants to pay for it. With their purchase of Skybox though,
they will likely start getting imagery updated on an extremely frequent basis.

------
Bedon292
This is quite fun, thanks for sharing. I would be interested in trying it out
with different tile sizes too. I keep trying to pick objects that are on the
corners of tiles.

------
tudorw
Very awesome, works great on sports fields
[http://sf.terrapattern.com/?_ga=1.257993994.1025291016.14642...](http://sf.terrapattern.com/?_ga=1.257993994.1025291016.1464207852&lat=37.6480227&lng=-122.4246675)

