
How big is the Google Earth database? - edward
http://www.gearthblog.com/blog/archives/2016/04/big-google-earth-database.html
======
Bedon292
This actually seems very low to me. The archive at DigitalGlobe is over 80PB
of data [1], and was growing at over 2PB a year [2] before the latest
satellite went up. There are more bands in the imagery than RGB, so it is
larger because of that.

But somewhere Google is storing the original resolution data. Aerial imagery
is at an even higher resolution, so even larger per square kilometer. They
also are holding different resolution data for different zoom levels. They do
not resample aerial imagery past certain zoom levels, they use a different
source.

At approximately 3225x3225 pixels per square kilometer (31cm pixels in
WorldView-3), that is approximately a 10 megapixel TIFF per square kilometer,
which I believe is roughly 35MB. There are 496 million square kilometers of
land on Earth, not including Antarctica. That brings this to over 17PB of raw
data. And that is not including any historical data either, and it is before
caching the different zoom levels of PNGs that are served out.
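
As a rough check of that arithmetic, here is a minimal Python sketch. The variable names are made up for illustration, and the 35 MB per tile and 496 million km² figures are taken exactly as stated in the comment. (Earth's land area is usually quoted at about 149 million km² and its total surface at about 510 million km², so the 496 figure is closer to the whole globe than to land alone.) Petabytes here are decimal.

```python
# Back-of-the-envelope version of the estimate in the comment.
# All inputs are the commenter's figures, not measured values.
GSD_M = 0.31        # metres per pixel (WorldView-3, per the comment)
TILE_MB = 35        # commenter's guess for one 1 km x 1 km TIFF, in MB
AREA_KM2 = 496e6    # area figure used in the comment, in km^2

pixels_per_side = round(1000 / GSD_M)           # pixels along one kilometre
megapixels = pixels_per_side ** 2 / 1e6         # pixels per km^2, in millions
raw_rgb_mb = pixels_per_side ** 2 * 3 / 1e6     # 8-bit RGB, uncompressed, in MB
total_pb = AREA_KM2 * TILE_MB * 1e6 / 1e15      # decimal petabytes

print(f"{pixels_per_side} px per side, {megapixels:.1f} MP, {raw_rgb_mb:.1f} MB raw RGB")
print(f"total: {total_pb:.2f} PB")
```

With these inputs the total comes out a little over 17 PB, which matches the figure in the comment. Plain 8-bit RGB would be nearer 31 MB per tile, so the 35 MB guess leaves some room for extra bands or overhead.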

I know not all of the world is covered at that high a resolution, but there
are spots with even higher resolution to make up for it. Plus historical data.
And plenty of water included in the images not counted either. I highly doubt
they are holding anything less than 30PB.

[1]
[https://www.digitalglobe.com/platforms/gbdx](https://www.digitalglobe.com/platforms/gbdx)
[2] [http://www.networkcomputing.com/big-data/how-digitalglobe-
ha...](http://www.networkcomputing.com/big-data/how-digitalglobe-
handles-2-petabytes-satellite-data-yearly/767935503)

~~~
DonHopkins
I wonder if they use wavelet compression for all those aerial photos of ocean
water.

~~~
cmurf
Dedup. It's a patch set of ocean. If you mean the ocean topo data, that's
different, more like mountains but far less color info.

~~~
eli
I think that was a pun

------
johnjreiser
For some interesting historical perspective, here's what Microsoft and
partners had to come up with in 1998 to support Terraserver and its 3TB of
imagery.
[http://research.microsoft.com/pubs/68552/msr_tr_98_17_terras...](http://research.microsoft.com/pubs/68552/msr_tr_98_17_terraserver.pdf)

~~~
mmanfrin
Neat that mapping _Terra_ required them to build a computer capable of storing
_tera_ bytes.

~~~
andylynchnz
Not a coincidence - it was built to demonstrate the scalability of their
technology to Terabyte datasets.

~~~
mmanfrin
The 'terra' I was referring to was earth :]

------
stuff4ben
I think a better question is "how" is all of that data stored? One large
NetApp? JBODs all over the world? And how does that data make it into our
browsers so quickly?

~~~
themartorana
And how long until I have 3 PB of storage in my whatever replaces SSDs, and we
all laugh at how at one time that was considered a lot of storage?

~~~
BillinghamJ
I'm not entirely sure that will happen. My anecdotal info is that I actually
have less data stored locally than I did a couple of years ago. Most of my
data is now held online and retrieved on demand.

~~~
ctrl-j
Most consumers won't be the drivers of this. However, gaming, workstations, and
enthusiast needs will cause enough demand for storage sizes to go up.

Modern AAA games are over 30 GB installed. Applications in general are taking
up more and more space locally.

An hour of uncompressed 4k video is approximately 1.72 TB. Photos have also
grown in size as resolution has gone up.

I'm sure there are other industries as well. I know personally between my
various hobbies, my SSDs are always full and I have several 4TB drives for
storage.

~~~
mortenlarsen
I am no expert, but I would think "an hour of uncompressed 4k video" would be
more like 6TB. Maybe even more.

    
    
       >>> 4096*2304*3*60*3600/1e9
       6115.295232
    
       Resolution: 4096*2304
       Colors: RGB (3 bytes)
       Frame rate: 60
       Seconds: 3600 (one hour)
    

Even using a lower frame rate like 25 would leave only two bytes for color
before we get somewhere near 1.72 TB.

    
    
       >>> 4096*2304*2*25*3600/1e9
       1698.69312
    

How did you estimate/calculate the 1.72TB?

~~~
mrpopo
Uncompressed video is typically stored in YUV colour space. A typical pixel
format, YV16, stores one luma sample for each pixel, and one Cb and one Cr
sample for every 2 pixels, which does account for the two bytes per pixel you
got :)
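
A minimal sketch of that accounting, assuming 8 bits per sample and the 4096x2304 frame size used earlier in the thread. The function names are hypothetical, chosen only for this example.

```python
def bytes_per_pixel_422(bits_per_sample=8):
    """Average bytes per pixel for 4:2:2 chroma subsampling (e.g. YV16)."""
    # Every pair of pixels carries 2 luma samples plus 1 Cb and 1 Cr sample.
    samples_per_pixel_pair = 2 + 1 + 1
    return samples_per_pixel_pair * bits_per_sample / 8 / 2


def uncompressed_gb(width, height, bytes_per_pixel, fps, seconds=3600):
    """Size of uncompressed video in decimal gigabytes."""
    return width * height * bytes_per_pixel * fps * seconds / 1e9


bpp = bytes_per_pixel_422()
print(bpp)                                    # average bytes per pixel for 4:2:2
print(uncompressed_gb(4096, 2304, bpp, 25))   # 4:2:2 at 25 fps, one hour
print(uncompressed_gb(4096, 2304, 3, 60))     # 8-bit RGB at 60 fps, one hour
```

The last two lines reproduce the roughly 1.7 TB and 6.1 TB figures from the earlier calculations, so the gap comes down to frame rate and chroma subsampling.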

~~~
mortenlarsen
Thanks.

------
TimGremalm
"So our final estimate for the total size of the Google Earth database is
3,017 TB or approximately 3 Petabytes!"

~~~
comboy
Why do they have to estimate it? Aren't they able to check the exact size of
their database(s)?

~~~
kyberias
I'm sorry to say I laughed out loud after reading your comment, but I actually
wondered about the same thing while reading the article. :)

You have to admit it would be hilarious if they didn't know the size of the
database or wouldn't be able to calculate it easily.

~~~
boxidea
The author of the article does not work at Google. So the article is basically
speculation. He has NO CLUE how Google stores the image data.

~~~
typon
Surely he could just email the Google engineers and they'll tell him.

------
shubb
This is the imagery database but Google knows a lot more than that.

There are also vectors for the roads, with a link between that data and
Street View. You can click on the map and Google will tell you that you're at
24 Burgess Street, and show you a photo, so it has a link between all that
information.

I suspect they also fuse this with information from phones. Do the phones
contribute to live traffic stats? Or the information about when a shop is
busiest (it will show you a bar chart of when a gym or store is busy). Is that
done based on knowing when Android users are at that location? What about all
the Wi-Fi access points it knows the location of?

In some ways, this is much harder to deal with than the image data because
although it's smaller in size, it's denser in information and the links
between are more complex.

~~~
SEJeff
Google's purchase of Waze is one of the main ways they get realtime traffic
data:

[http://www.businessinsider.com/how-google-bought-waze-the-
in...](http://www.businessinsider.com/how-google-bought-waze-the-inside-
story-2015-8)

For serious drivers, there simply isn't anything on the market (and free!)
better than Waze.

~~~
Raphmedia
Are all the features on Waze in Google yet? Or should I download Waze?

~~~
SEJeff
You should likely download Waze if you want the best experience. That being
said, using Google Maps, you'll see traffic updates or maintenance/cops/red
light cameras and it will say "Information from Waze" or something akin to
that.

------
hackbinary
As interesting as this is, the author should make it clearer on his main
landing page that his site is not affiliated with Google.

[http://www.gearthblog.com/about](http://www.gearthblog.com/about)

> This blog is not officially affiliated with Google. Google Earth Blog is
> dedicated to sharing the best news, interesting sights, technology, and
> happenings for Google Earth.

~~~
kzrdude
It's really not needed. This comment is entirely my opinion. Which is also
redundant to say.

------
sievebrain
They seem to have forgotten that the imagery database is calculated at
multiple zoom levels. If a country has high-res imagery for the max zoom, it
also has scaled down versions of that imagery for all the higher altitude
image sets.

So that would expand things quite a bit.

There are also height maps, which were not counted, and metadata.

~~~
brianwawok
Would it?

If full detail zoom took 1 MB... that same area zoomed out 2x would take
256 KB... zoomed out by 2x again is 64 KB... zoomed out by 2x again is 16 KB...
Don't know all the zoom levels, but for a rough guess the zoomed in level
should be pretty close to the total (within 30% or so).

~~~
Bjartr
Depends on if it's precalculated or dynamically generated from the full res
and then cached.

~~~
Retric
Clients are clearly doing some scaling. So worst case is clearly less than 35%
overhead for multiple zoom levels.

PS: Sum (n = 0 to infinity) of (1/4)^n = 1.33333...
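
A quick way to see how fast that series converges is the minimal sketch below. It assumes every zoom level is precomputed and that each zoom-out step halves both width and height, so each level is a quarter the size of the one below it. `pyramid_total` is a name made up for this example.

```python
def pyramid_total(levels, base=1.0):
    """Total storage for `levels` zoom levels, relative to the full-res level."""
    # Level n is (1/4)^n the size of the full-resolution level (n = 0).
    return sum(base * 0.25 ** n for n in range(levels))


for levels in (1, 2, 4, 8, 20):
    print(levels, round(pyramid_total(levels), 5))

# Closed form of the infinite geometric series: 1 / (1 - 1/4) = 4/3
limit = 1 / (1 - 0.25)
print(limit)
```

Even with only four levels the total is already about 1.33 times the full-resolution size, so the extra zoom levels add at most about a third.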

~~~
brianwawok
Thanks for adding the maths, I did the lazy guess ;)

------
erroneousfunk
Minor point: Can people stop using 3D pie charts? The "satellite imagery"
slice in the chart they show is slightly less than a third of the "historical
imagery (satellite)" slice, but it looks like it's about half because of how
the chart is rotated.

"3D imagery" is supposed to be slightly larger than the "Historical imagery
(aerial)" slice, but instead, it looks smaller!

There's just no reason to use this type of visualization.

------
nxzero
Huge database aside, I would be more curious to know what is the most valuable
data/metadata in the database. Anyone have any thoughts?

~~~
mike_hearn
Valuable as in cost? Aerial imagery in general is very expensive, especially
from satellites.

That's why the Earth/Maps team don't have an open protocol (or one reason).
They aren't allowed to just give the imagery away for free: it's licensed
specifically for that application. If you try and download the entire dataset
there are systems in place that will fight you, to defend the rights of the
imagery companies.

------
ant6n
I find it odd that they have to estimate this.

~~~
ucaetano
The blog isn't affiliated with Google.

~~~
persona
You are right - it took me a few secs to realize that. A better title would be:
How big do I estimate the Google Earth database to be?

------
chris_va
Now multiply by the number of serving data centers, and disk replication.

------
IshKebab
Now estimate streetview!

------
peeters
Wow, I had no idea how creepily awesome Google's 3D imagery had gotten. It
used to only be skyscrapers in main cities. Now I can basically see in my own
window.

~~~
JoBrad
I get part of the "creep" factor, but anyone walking down the street has a
much higher resolution image of your house - and with real-time "video", to
boot! Maybe people just don't think about that very often?

------
eva1984
Those are some numbers flying around...

But I have to say the sheer amount of data doesn't surprise me that much. The
meat is in retrieving it efficiently, i.e. the index.

------
zhte415
Our AI has developed a propensity for exploring and documenting regions
focused on cheese production.

We expect a report back in around 4000 years. Depending on results, we may
reset you.

------
wrg
>1024 TB

So... 1 PB?

~~~
elbigbad
There are actually an infinite number of numbers greater than 1024. For the
sake of brevity, I'll refrain from listing them though.

