Hacker News new | past | comments | ask | show | jobs | submit login
Yottabyte (wikipedia.org)
53 points by groundCode on June 10, 2013 | hide | past | favorite | 40 comments



I think I've tracked down this extraordinary claim to this 2007 DoD paper on the "Global Information Grid", which states:

Key target GIG technologies include: Very large scale data storage, delivery, and transmission technologies that support the need to index and retain streaming video and other information coming from the expanding array of theater airborne and other sensor networks. The target GIG supports capacities exceeding exabytes (10^18 bytes) and possibly yottabytes (10^24 bytes) of data.

So someone had a vague plan for a storage centre and couldn't put a finger on how much it would store to within 6 orders of magnitude! I don't count this as evidence that the NSA has more storage capacity than everybody else on the planet put together.

http://www.msco.mil/documents/_7_GIG%20Architectural%20Visio...


Well, if this is some infrastructure that's designed to last they're going to be expecting to migrate to denser sorts of storage in the future. A storage system that stores exabytes today and can be scaled to store yottabytes in 2033 does seem like a hard but reasonable ambition.


Thank you for finding this! While exabyte scale is fairly realistic, yottabyte scale relies upon utterly massive improvements in technology in order to fit in a data center the size of the one in Utah. The article states that the microSD cards needed would fill the volume of the Great Pyramid of Giza. 4 TB drives at maximum density, in full racks of 4U 80-drive enclosures, would occupy 360 square miles of racks. Tape is less dense than hard drives, though it has huge benefits for shipping, transport, and long-term storage.

So yottabytes may be feasible, but it's a ways out there.
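A rough back-of-the-envelope in Python, assuming ten 4U 80-drive enclosures per 42U rack and about 3 m^2 of floor space per rack including aisles (both assumptions are mine, not from the comment), lands right around the 360-square-mile figure:

```python
YOTTABYTE = 1e24                  # bytes
DRIVE = 4e12                      # 4 TB per drive

drives = YOTTABYTE / DRIVE        # ~2.5e11 drives
drives_per_rack = 80 * 10         # ten 4U 80-drive enclosures per 42U rack
racks = drives / drives_per_rack  # ~3.1e8 racks
area_m2 = racks * 3.0             # assumed ~3 m^2 per rack, incl. aisle space
sq_miles = area_m2 / 2.59e6       # 1 square mile ~= 2.59e6 m^2
print(round(sq_miles))            # ~360 square miles
```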


The GIG does not describe one data center, it is an all-encompassing term for the entire set of interconnected computing resources owned and operated by the DoD.


Someone at the DoD did not think this one through...


So, per person on the planet, that would be 10^24 / (7*10^9) bytes... let's say on the order of 10^14 bytes (100 terabytes, or roughly 25,000 DVDs' worth of storage) for each man, woman, and child.
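A quick sanity check of that division in Python, assuming 4.7 GB single-layer DVDs; it comes out a bit above the round figures in the comment, but the same order of magnitude:

```python
per_person = 1e24 / 7e9         # bytes per person: ~1.4e14, i.e. ~140 TB
dvds = per_person / 4.7e9       # a single-layer DVD holds 4.7 GB
print(per_person, round(dvds))  # on the order of 10^14 bytes, ~30,000 DVDs
```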

That would be enough to store a few years of everyone's life in considerable detail: videos, photos, phone audio, GPS, financials, DNA, web surfing, ...

You would probably need to capture the complete visual and audio stream that went into a person's eyes and ears over the course of a year of their life, and do that for every living person, to start to fill that up.

I just wonder how much of that data they really need to sift through in order to find likely terrorists?


This is a back-of-the-envelope calculation rather than a serious analysis, but perhaps it gives some idea of the scale?


The amount of data produced per person, and storage capacity itself, both increase on a regular (annual?) basis as well.

Finding terrorists is a red herring, in this case it's all about mapping and measuring the movements of every person and device.


It says the NSA Utah data center is designed to store data on the yottabyte scale. My mouth literally dropped open when I read that.


So let's check the sources: [2] is a journalist's 'estimate', probably based on the Wired article. [3] contains the word 'yotta', but says 'handle', not 'store' (which makes sense, considering it's a lot easier/cheaper to process something than to store it); it actually references a DoD report. [4] does not contain the word 'yotta' at all.

Maybe the NSA is doing something undiscovered, but this feels more like sensationalism than realism. I'm sure they have a lot of storage and can probably rival Google on data storage, but I doubt they have some sort of magic god-like storage and search facility.


Maybe we should call it a Utahbyte instead?


> To store a yottabyte on terabyte sized hard drives would require a million city block size data-centers, as big as the states of Delaware and Rhode Island. If 64 GB microSDXC cards (the most compact data storage medium available to public as of early 2013) were used instead, the total volume would be approximately 2500000 cubic meters, or the volume of the Great Pyramid of Giza.
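The pyramid figure checks out roughly, assuming a 64 GB microSD card measures about 15 x 11 x 1 mm (the standard form factor):

```python
cards = 1e24 / 64e9                 # 64 GB per card
card_vol_m3 = 15e-3 * 11e-3 * 1e-3  # microSD: 15 x 11 x 1 mm
total_m3 = cards * card_vol_m3
print(total_m3)                     # ~2.6e6 m^3, about the Great Pyramid's volume
```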

So... how do they do it?


Maybe "designed to store" really just means "we have administrative structures and software that will be able to handle such amounts of storage when we get it".


Who is to say they actually did it?

Assuming they were consuming 10 Tbit/sec of inbound data, it would still take roughly 25,000 years to fill a yottabyte of storage.
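A sketch of that fill-time estimate in Python:

```python
bits = 1e24 * 8                         # a yottabyte, in bits
seconds = bits / 10e12                  # at 10 Tbit/s inbound
years = seconds / (365.25 * 24 * 3600)
print(round(years))                     # ~25,000 years
```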


In addition to the absurdity of the scale, I can't imagine you could maintain any amount of secrecy with the number of people required to tackle the technical challenges. Unless the NSA happens to employ 1000 network engineers at the top of their field.


I found an article that claims a new technology could allow 10TB/in^2. http://www.sciencedaily.com/releases/2012/10/121010083826.ht...

With that, it would only require 1.639*10^6 cubic meters.


Key word there being "only". Still a pretty damn large and expensive facility, and that's before accounting for whatever redundancy they'd most likely want.


For reference, WolframAlpha says that is approx. equivalent to 3.3x the volume this can carry (in oil): http://i.imgur.com/jSzgKAB.jpg http://i.imgur.com/dVXdESI.jpg So, roughly four of those.


The Wired article they cite says:

> as a 2007 Department of Defense report puts it, the Pentagon is attempting to expand its worldwide communications network, known as the Global Information Grid, to handle yottabytes (10^24 bytes) of data.

Sounds like the yottabyte is a global/total capacity or a figure they aim to be able to handle in the future instead of a figure that they're currently hitting.


just because it's "designed to" store yottabytes does not mean that it actually does right now. they're just designing the system for the future.


That doesn't make a lot of sense to me. With that logic I could say my laptop is designed to store yottabytes of data, just because I think that maybe at some point in the future it could. Either you are incorrect or that wikipedia article is severely overstating the truth. Edit: I do agree with you though that this seems highly implausible at best.


The difference there is that the NSA building could actually handle that much data in the future while your laptop would require some physics bending in order to store/process that much information. There's a hard limit on the amount of computational power and information you can jam into an area.

That said, the Wired article just cites a DoD report which says their goal was to have the Global Information Grid handle yottabytes by the time Utah was up and running.


> With that logic I could say my laptop is designed to store yottabytes of data, just because I think that maybe at some point in the future it could.

Except you'd probably be wrong. Your filesystem almost certainly can't handle that much space. ext4 maxes out at 1 exabyte. NTFS might barely make it if you use GPT and oversized sectors and don't run into any unforeseen problems.
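To put the ext4 figure in perspective (a sketch; 1 EiB = 2^60 bytes, ext4's maximum filesystem size with 4 KiB blocks):

```python
ext4_max = 2 ** 60                  # 1 EiB, ext4's max filesystem size
yottabyte = 10 ** 24
filesystems = yottabyte // ext4_max
print(filesystems)                  # nearly a million separate ext4 filesystems
```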


I believe it could store about 4,000,000 times the current total data of Facebook.


That is the real zinger. More striking than the definition itself.


Any idea what sort of medium the NSA could be using to store such an absurd amount of data? Maybe robotic tape libraries? I doubt they are doing it on either hard drives or solid state.

Maybe the NSA has access to one of those high-density holographic optical storage systems we've been hearing about these past few years?


I'd wager it's HDDs. The government may have big ideas, but it tends to implement insane scale with normal things. Remember, the budget here is in the $80 billion range. Just because they have all that dough doesn't mean they don't spend it on stupidly large numbers of standardized items.


Wowzers.

Based on data from [http://defensesystems.com/Articles/2011/01/07/NSA-spy-cyber-...], a back-of-the-envelope calculation shows that if the Utah data centre were really storing even a single yottabyte using SD cards, they'd need 90 floors of servers.

Going off the only public images I could find of the Utah centre, it appears to only be 2 or 3 floors. If this is the case, then NSA electronic storage is at least 30 times denser than current commercial tech.
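A sketch of how you might arrive at a figure like 90 floors, taking the ~2.5e6 m^3 microSD volume quoted elsewhere in the thread and assuming roughly 100,000 sq ft (~9,300 m^2) of floor area and 3 m of usable stacking height per floor (both of those numbers are my assumptions, not from the linked article):

```python
microsd_volume_m3 = 2.5e6  # a yottabyte on 64 GB microSD cards
floor_area_m2 = 9300       # assumed ~100,000 sq ft per floor (hypothetical)
stack_height_m = 3.0       # assumed usable stacking height per floor
floors = microsd_volume_m3 / (floor_area_m2 * stack_height_m)
print(round(floors))       # ~90 floors
```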


I'm continually amazed by the human brain's inability to intuitively grapple with exponential growth. E.g. see [http://rense.com/general87/onetrillion.htm]


Do I understand this correctly: with the new NSA facility using yottabyte-sized storage, are we now out of prefixes for sizing data?

I looked around a little bit. Heck if I could find what comes after "yotta"


I suggest we change the spelling to yodabytes and start naming subsequent prefixes after other Jedi: windubytes, kenobibytes, etc...


I propose quads, following Star Trek nomenclature[1], where each quad prefix step spans four of the byte prefix steps.

1 quad = 1 terabyte (10^12 bytes)

1 kiloquad = 1 yottabyte (10^24 bytes)

1 megaquad = 10^36 bytes

1 gigaquad = 10^48 bytes

And so forth.

While in Star Trek no direct comparison between quads and bytes is ever explained, this scale seems appropriate.

[1] http://en.memory-alpha.org/wiki/Quad


Yoda would be the ultimate on that list too, bro.


After yottabytes come hellabytes.


Then hollabackbytes, then yolobytes.


That is just too funny...that would be awesome if they made that the next name.


SI stops at yotta, but there are unofficial prefixes: http://en.wikipedia.org/wiki/Non-SI_unit_prefix



Quick, coin it! I suggest Xottabyte. As a fallback, though, we can always say octillion bytes.


Time to buy stock in Sandisk?!



