

NSA to store yottabytes of surveillance data in Utah megarepository - MichaelApproved
http://www.crunchgear.com/2009/11/01/nsa-to-store-yottabytes-of-surveillance-data-in-utah-megarepository/

======
neilc
Did anyone read through to the article's source, the New York Review of Books
piece? (<http://www.nybooks.com/articles/23231>) The yottabyte quote is:

"As the sensors associated with the various surveillance missions improve, the
data volumes are increasing with a projection that sensor data volume could
potentially increase to the level of Yottabytes (10^24 Bytes) by 2015."

Note the qualifiers and the future date; also, this is talking about
_raw_ sensor data. It is not uncommon for raw data to be reduced by a few
orders of magnitude via aggregation, compression, and simply discarding
irrelevant sensor readings before hitting permanent storage (e.g. look at the
LHC).

------
timdorr
Some quick math:

With 2 TB drives, that means 500 billion drives spinning in the building,
assuming no redundancy, filesystem overhead, OS data, etc. (but then again, who
the fuck has this sort of production capacity? Definitely not WD or Seagate or
any of the major players)

If you store them akin to a Sun X4540 server, you fit 12 drives per U of
space. In a 50U rack, if you take away 2U for networking, that's 48 × 12 = 576
drives per rack.

If we throw back in some redundancy, that's close to a billion racks.

Each rack takes up about 10 square feet of space (~19″ × 6′, accounting for
clearances), so that's roughly 8.7 billion square feet of datacenter space, not
accounting for any sort of power, cooling, or logistics space.

Throw back in the extras and you're looking at several hundred square miles of
datacenter space.

Wow.

So, we all understand this is bullshit journalism, right?
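The back-of-envelope arithmetic can be rechecked in a few lines (a sketch under the same assumptions: 2 TB drives, 48U of storage at 12 drives per U, 10 square feet per rack):

```python
# Storage back-of-envelope: how much floor space does a yottabyte
# of 2 TB hard drives need? (Assumptions match the comment above.)
YOTTABYTE = 10**24            # bytes
DRIVE = 2 * 10**12            # 2 TB drive, in bytes
DRIVES_PER_RACK = 48 * 12     # 48U of storage x 12 drives per U (Sun X4540-style)
SQFT_PER_RACK = 10            # ~19 in x 6 ft footprint plus clearances
SQFT_PER_SQMI = 27_878_400

drives = YOTTABYTE // DRIVE              # 500 billion drives
racks = drives // DRIVES_PER_RACK        # ~868 million racks
sq_ft = racks * SQFT_PER_RACK            # ~8.7 billion square feet
sq_miles = sq_ft / SQFT_PER_SQMI         # ~311 square miles

print(drives, racks, sq_ft, round(sq_miles))
```

No redundancy, filesystem overhead, power, or cooling included, so this is a hard lower bound.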

~~~
MichaelApproved
To put that in context: a square mile has 27,878,400 square feet. The
Empire State Building has 2,158,000 square feet of usable office space.

But this also assumes they're going to use HD as standard storage devices.
They would probably be using a different technology. It's clearly not possible
with current HD capacity.

They could also be building for eventual capacity rather than planning to
support that much data immediately. The comments on the article page make
this and other good points.

~~~
jbellis
> They would probably be using a different technology.

not even the NSA can get a yottabyte of some new technology manufactured
without it being all over the news.

~~~
MichaelApproved
The data doesn't have to be actively online. Tape technology is one option to
reduce the amount of hardware and space needed.

~~~
bbgm
and that process is fairly routine at most national labs and supercomputing
facilities. Not highly available, but not sure these data need to be.

------
patio11
10^12 TB is a lotta bytes and, also, a yottabyte.

This reminds me of being in college not too long ago when a CS professor said
"Keep this under your hat but the main server is going to have ONE TERABYTE in
storage as of next month" and I got chills. 10 years ago it was a Big Deal
that a major research institution had a machine with a whole terabyte attached
to it -- not quite a supercomputer, but man, it had a TERABYTE, what else
could you want?

These days teenagers routinely have access to terabytes, some fairly small
companies deal with petabytes, and major government organizations have to go
up a few orders of magnitude when they let their imaginations run wild about
future needs.

------
fnid
Am I the only one who actually thinks this is plausible? First, I don't think
all the storage will be on hard drives constantly spinning and requiring
electricity. Imagine huge tape spools like in the old days but with 10,000x
more data on them and miles and miles long.

Also, every conversation, every satellite image at full res. There's a lot of
data out there. Every website with every update, every database, every
conversation, in every country.

They didn't say they'd be at a yottabyte next year, but no doubt, there will
be that much information to store and the NSA doesn't work on startup
timelines.

------
electromagnetic
I'm actually glad the comments here are a debate on the technological aspects,
with little commentary on whether this is right or wrong. This kind of
commentary is exactly where HN should be and often is. To everyone who
regularly bemoans that HN is turning into Reddit: it clearly isn't!

I find the CG rhetoric to be aged and irrelevant. The paranoia is so prevalent
that they sound like two kids smoking up in a dorm room; they offer no
explanation and cite no evidence for the paranoia other than 'it's the man'.
It's pathetic. It's a tech website, and there's barely even a passing mention
of the technological feasibility of this project; what examination of
feasibility there is, is very weak and based on poor assumptions.

The assumption that the NSA will have thousands of HDDs permanently running is
absurd. Quite literally, 1 cubic meter of 50 GB Blu-ray discs easily stores
~2.75 petabytes of data with an expected lifespan of 10-100 years (this is
based on physical dimensions alone, not storage dimensions). A single 100-disc
stack of 50 GB BD discs should store ~4.8 TB of data. This makes mass storage
dependent solely on write capability rather than on size requirements.
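The density claim can be sanity-checked from physical dimensions alone (a sketch, assuming a standard disc of ~12 cm diameter and ~1.2 mm thickness):

```python
# How many 50 GB Blu-ray discs fit in one cubic meter, by raw geometry?
DISC_GB = 50
layer = (100 // 12) ** 2            # 8 x 8 = 64 discs per 1 m x 1 m layer
per_stack = int(1000 // 1.2)        # ~833 discs in a 1 m tall spindle
discs = layer * per_stack           # ~53,000 discs per cubic meter
petabytes = discs * DISC_GB / 10**6

print(discs, petabytes)             # ~2.7 PB per cubic meter
```

Which lands close to the ~2.75 PB figure; packing, cases, and robotics would eat into it in practice.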

------
MichaelApproved
My math shows they can store 12,641,975,308,641 hours of uncompressed 640x480
video. This is based on 1 hour of video taking about 79 GB of data.

Using basic compression can take the number of hours stored much higher.
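That per-hour figure works out if you assume 24 fps at 3 bytes per pixel (an assumption; the comment doesn't state the frame rate):

```python
# Hours of uncompressed 640x480 video in a yottabyte,
# assuming 24 fps and 3 bytes (24-bit color) per pixel.
bytes_per_hour = 640 * 480 * 3 * 24 * 3600   # ~79.6 GB per hour
hours = 10**24 // bytes_per_hour             # ~1.26e13 hours

print(bytes_per_hour, hours)
```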

~~~
emmett
Which is only about 1.4 billion years of video. Presuming they wanted to video
every person in the United States (let alone abroad), you'd fill up that many
hours in under five years.

Now, you do get a nice multiplier from compression (10-100x, depending on the
video and the quality), but that only brings it up to ~100 billion years of
video, max. What if you wanted to store all the video ever recorded? Bear in
mind that the amount of video being recorded is currently growing
exponentially.
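The years-of-video claim checks out roughly, taking the ~1.26e13-hour figure from upthread and assuming a US population of ~300 million (an assumption; the comment doesn't give one):

```python
# Years of video in a yottabyte, and how fast 300 million people
# being recorded around the clock would fill it.
hours = 12_641_975_308_641                    # yottabyte / ~79 GB per hour
years_of_video = hours / (365.25 * 24)        # ~1.4 billion years
years_to_fill = years_of_video / 300_000_000  # ~4.8 years per population

print(round(years_of_video / 1e9, 1), round(years_to_fill, 1))
```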

~~~
joubert
Depending on your goal, one doesn't need to store every single frame of a
video in order to capture its plot.

~~~
emmett
Right, definitely not, but this is leaving aside all the other kinds of data
you might want to save, and there's a _lot_ of that too. I'm just pointing out
that the quantity of hours we're talking about is not nearly as absurd as
you'd think at first.

~~~
Devilboy
But clearly they want to track everyone and everything everywhere. Scares the
crap out of me.

------
brown9-2
_There is talk of the NSA shutting down altogether or being rolled into
another agency,_

Seriously?

~~~
thwarted
Nah, they're thinking of the CIA.

 _...it approached the point where there was no substantive difference between
the Library of Congress and the Central Intelligence Agency. Fortuitously,
this happened just as the government was falling apart anyway. So they merged
and kicked out a big fat stock offering._

------
jbellis
quote from the article's source, itself quoting a non-online source: "a
projection that sensor data volume could potentially increase to the level of
Yottabytes (10^24 Bytes) by 2015"

vapor.

------
xccx
source: <http://www.metafilter.com/86293/Yotta-vote-against-this>

question: via Moore's law, if it can hold up, what year can I hope to get a
yottabyte on my netbook? thx

~~~
scythe
Assuming your netbook has a 32 GB SSD, and that drive capacity doubles every
two years, 2099. Assuming that SSD prices drop dramatically and you have a 512
GB SSD (still this year), 2091.

In other words: not for a long, long time.

The real question is: how could you possibly generate a yottabyte of data?
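The doubling estimate is easy to reproduce (a sketch, assuming capacity doubles every two years starting from 2009):

```python
import math

# Year a netbook drive reaches a yottabyte, assuming capacity
# doubles every two years from a given starting size.
def year_of_yottabyte(start_bytes, start_year=2009, doubling_years=2):
    doublings = math.log2(10**24 / start_bytes)
    return start_year + doubling_years * doublings

print(round(year_of_yottabyte(32 * 10**9)))    # 32 GB SSD -> ~2099
print(round(year_of_yottabyte(512 * 10**9)))   # 512 GB SSD -> ~2091
```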

------
joubert
All in one place?

------
anonjon
This seems like a hugely unnecessary thing to do; I imagine the next headline
on the news will be "NSA rescued from 2-square-mile apartment filled with 8,000
cats and pizza boxes to the ceiling".

I honestly can't think of any technology that is stable enough to support this
magnitude of long term storage.

Dealing with the amount of bit rot alone leads me to boggle at the notion.

