
Counting solar panels in the U.S. with machine learning and satellite images - myinnerbanjo
https://www.engineering.com/DesignerEdge/DesignerEdgeArticles/ArticleID/18348/How-Do-You-Count-Every-Solar-Panel-in-the-US-Machine-Learning-and-a-Billion-Satellite-Images.aspx
======
dmurray
Why spend a month of computer time to evaluate all one billion images? A
randomly chosen million would seem to give you excellent data, more than
enough to make the statistical conclusions the article talks about. Maybe
you'd run the chance of misrepresenting large-scale commercial installations,
but you could correct for that manually, and that's already a risk since the
methodology skips some rural areas. Maybe it's simply good advertising to say
you did the whole thing. But more likely the real purpose of the survey is to
sell the gathered information at a much more granular level: zip codes,
streets, or better still, individual households.
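
To put a rough number on "excellent data", here is a minimal sketch of the sampling error for a national estimate from one million random images. The 0.5% panel rate is an assumption for illustration only (the article does not give one), and the formula is the standard normal-approximation interval for a proportion. It ignores classifier error and the finite-population correction.

```python
import math


def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of a normal-approximation confidence interval for a
    proportion p estimated from n independent samples (z=1.96 ~ 95%)."""
    return z * math.sqrt(p * (1.0 - p) / n)


# ASSUMPTION (not from the article): 0.5% of images contain panels.
assumed_rate = 0.005
sample_size = 1_000_000  # the random sample suggested above

moe = margin_of_error(assumed_rate, sample_size)
relative = moe / assumed_rate

print(f"estimated rate: {assumed_rate:.3%} +/- {moe:.4%}")
print(f"relative uncertainty: {relative:.1%}")
```

Under that assumed rate, the sampling error on the national figure is only a few percent of the estimate itself, which is the sense in which a million images would already be enough for aggregate statistics.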

~~~
AznHisoka
What's the usefulness of analyzing just a random sample? They need to analyze
1 billion images so they have _granular_ data on what % of homes in specific
zip codes, towns, etc. have solar panels installed. This is actionable data,
i.e. if usage is lower in a specific zip code, they might take some
action to improve it in that specific zip code.

So of course, the value is all in the granularity. They already trained the
model on random images.
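
A rough sketch of why a random sample loses the granularity: with one million images drawn from one billion, each area keeps only about 0.1% of its images. The zip-code size and panel rate below are made-up numbers for illustration, and the zero-hit probability uses a Poisson approximation.

```python
import math

TOTAL_IMAGES = 1_000_000_000  # from the article
SAMPLE_SIZE = 1_000_000       # the random sample proposed upthread
SAMPLING_FRACTION = SAMPLE_SIZE / TOTAL_IMAGES


def expected_sampled(images_in_area: int) -> float:
    """Expected number of an area's images that land in the random sample."""
    return images_in_area * SAMPLING_FRACTION


def prob_no_panel_seen(images_in_area: int, panel_rate: float) -> float:
    """Probability that the sample contains zero panel images for the area
    (Poisson approximation to the binomial)."""
    return math.exp(-expected_sampled(images_in_area) * panel_rate)


# HYPOTHETICAL zip code: 20,000 images, 1% of which show panels.
zip_images = 20_000
zip_panel_rate = 0.01

print(f"sampled images in this zip code: {expected_sampled(zip_images):.1f}")
print(f"chance of seeing no panels at all: "
      f"{prob_no_panel_seen(zip_images, zip_panel_rate):.0%}")
```

With these made-up numbers, the zip code contributes about 20 sampled images and would most likely show no panels at all, so no per-zip-code rate could be estimated from the sample.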

~~~
Scoundreller
Agreed.

If you can figure out exactly where the residential rooftop panels are, you’ve
found a goldmine of people susceptible to door-to-door sales of “get 10%
returns guaranteed per year by letting us sell you/install/use your ______”.

------
sbdmmg
Thanks! Interesting article... reminds me of planet.com. The project website [1]
has a lot more technical details.

[1]
[http://web.stanford.edu/group/deepsolar/home.html](http://web.stanford.edu/group/deepsolar/home.html)

------
daeken
> As DeepSolar learned to identify the characteristics of panels in the
> images, the system was eventually able to correctly identify that an image
> contained solar panels 93 percent of the time. About 10 percent of the time,
> the system missed that an image had solar installations.

How can these two facts both be true? If it fails to identify that an image
contains solar panels 10% of the time, then it can't possibly correctly
identify that an image contains solar panels more than 90% of the time, and
even that would require a 0% false positive rate. I can only assume
they mean that it has a 7% false positive rate and a 10% false negative rate,
but that is ... not what the first statement says at all.

~~~
desdiv
Going by this graph[0], what they meant was: we had a precision of 93% and a
recall of 90%.

[0]
[http://web.stanford.edu/group/deepsolar/assets2/img/roc.jpg](http://web.stanford.edu/group/deepsolar/assets2/img/roc.jpg)

~~~
floatrock
Wikipedia has a good illustration showing the relation between precision &
recall vs. false positives & false negatives:

[https://en.wikipedia.org/wiki/Precision_and_recall](https://en.wikipedia.org/wiki/Precision_and_recall)

Precision is what percent of the identified-positives are actually positives:
true-positives / (true-positives + false-positives).

Recall is what percent of the actual-positives were identified as positive:
true-positives / (true-positives + false-negatives).

These are useful summary stats over the true-false/positive-negative matrix
because, to quote Wikipedia: "In simple terms, high precision means that
an algorithm returned substantially more relevant results than irrelevant
ones, while high recall means that an algorithm returned most of the relevant
results."
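
As a worked example of the two formulas above, here is a small sketch with a made-up confusion matrix chosen to land near the quoted figures. The counts are hypothetical, not DeepSolar's. It also shows that 93% precision does not imply a 7% false positive rate: the 7% is the share of flagged images that are wrong, while the false positive rate is measured against all actual negatives.

```python
def precision(tp: int, fp: int) -> float:
    """Share of images flagged as positive that really are positive."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Share of actual positives that were flagged as positive."""
    return tp / (tp + fn)


def false_positive_rate(fp: int, tn: int) -> float:
    """Share of actual negatives that were wrongly flagged as positive."""
    return fp / (fp + tn)


# HYPOTHETICAL counts: 1,000 images with panels, 99,000 without.
tp, fn = 900, 100     # panels found / panels missed
fp, tn = 68, 98_932   # wrongly flagged / correctly ignored

print(f"precision: {precision(tp, fp):.1%}")
print(f"recall:    {recall(tp, fn):.1%}")
print(f"false positive rate: {false_positive_rate(fp, tn):.3%}")
```

With these counts, precision is about 93% and recall is 90% at the same time, while the false positive rate stays well under 1% because negatives vastly outnumber positives.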

------
jcvhaarst
Another project in the same field is "Deep Solaris":
[https://www.biss-institute.com/cases/case-5/](https://www.biss-institute.com/cases/case-5/)
and (in Dutch):
[https://www.cbs.nl/nl-nl/onze-diensten/innovatie/project/zon...](https://www.cbs.nl/nl-nl/onze-diensten/innovatie/project/zonnepanelen-automatisch-detecteren-met-luchtfoto-s)

------
kpennell
Several startups including PowerScout (which has since pivoted) have done this
rooftop solar panel counting using machine learning.

------
brianbreslin
I've been curious about using machine learning + satellite images to do sales
prospecting. Grab neighborhoods and map the locations of people's pools, or
certain types of roofs, or size of yards. Spit out a list of addresses and
send them a flyer. Not sure if searching public records would be
faster/cheaper.

~~~
q3k
Please don't.

------
tr33house
Didn't Google have something along these lines or was it purely for choosing
panel placement?

~~~
wingi
[https://www.google.com/get/sunroof](https://www.google.com/get/sunroof)

~~~
josefresco
Sorry, Project Sunroof hasn't reached that address yet

