Glass discs that can store 360TB and remain intact for billions of years (2016) (disclose.tv)
281 points by elorant on June 25, 2019 | hide | past | favorite | 240 comments

I dunno, but it seems to me universities increasingly focus on press releases and marketing departments...

5D and 13.8 billion years? Phony. I know no sane scientist or engineer that would say they use more than 3D physics. Sounds like catch phrases to sell. By that logic a plain old HDD is at least 4D, because it uses CHS coordinates and magnetic orientation to store data.

To me such a communication style undermines the real science behind it. Do they have nothing better to brag about than excuses to call the invention 5D? Why introduce such noise?

Yes, 13.8Gy is the approximate age of the universe and yes the researchers used that number. But it's not like they made this up ("phony", as you put it)!

Turns out the biggest decay factor is nanograting and the biggest contributing physical quantity is temperature. They plotted what decay would look like on an Arrhenius plot. They both computed estimates and measured to confirm: the measurements are quite accurate. The specific claim is that they computed that it would last 13.8Gy at some reasonably high temperature (462K). They could've picked any other point on the time/temp scale, like "here's how long it would last at room temperature" or "here's how hot you could store it if you only cared about it for a billion years".

They did not, however, simply make up a number with no justification, let alone commit straight-up academic fraud, as you're implying.

The paper is here: https://www.researchgate.net/publication/297892219_Eternal_5....

I don't think they meant it was fraud, just that it's sensationalist drivel. It's choosing a way to interpret results for the sake of sounding impressive, rather than improving understanding.

Choosing a temperature in order to say "this will last as long as the universe's current age", or finding ways to count extra dimensions is the behaviour of marketeers, not academics.

And yet, I pulled those claims out of the paper, which I linked. Are you suggesting Peter G. Kazansky, a research professor with over two decades of experience in optics, does not count as an academic? Or are you saying the paper doesn't make those claims? Or are you saying the paper also isn't a real academic paper and just marketeer noise?

(I'm not quoting the other authors, who are of course also distinguished scholars :-))

The complaint here appears specifically to be a lamentation that academics are behaving this way, so in attempting to prove their priors you're only strengthening the argument. Your tone seems to be adversarial, so I'm guessing that's not your goal...

"The paper isn't a real academic paper and itself also marketeer noise" is one of the options I've outlined as a debate position. But let's be clear: we started by calling research "phony" over something that would've been trivially clarified by reading a short paper. In order to even get to that point, I have to acknowledge that maybe you get to call research "phony" without seriously implying fraud. I have a hard time taking that as a serious, bona-fide argument.

There are actual papers. Press releases and popularizations are not the same thing. If you look you can find the actual papers.

It appears you’re agreeing with me? I’m suggesting the confusion would have been alleviated by reading the actual paper.

While his tone seems a bit adversarial, at least he is providing relevant information to the topic instead of a generic rant. I greatly prefer the former.

The temperature at which it will last the universe's current age is 462K. This is 189 Celsius, or 372 Fahrenheit. They didn't choose an unreasonably low temperature to say it'll last the universe's age. They picked a temperature to say it'll last the universe's age even if you crank up the heat to an absurd degree. They could have given a much larger timespan if they stuck to room temperature.
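For anyone who wants to check the unit conversion, here is a quick Python sketch. It uses only the standard conversion formulas; nothing in it comes from the paper, and the function names are just illustrative.

```python
def kelvin_to_celsius(kelvin):
    """Convert a temperature from kelvin to degrees Celsius."""
    return kelvin - 273.15


def celsius_to_fahrenheit(celsius):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0


# 462 K is the temperature quoted in the thread.
temp_c = kelvin_to_celsius(462.0)
temp_f = celsius_to_fahrenheit(temp_c)
print(f"{temp_c:.2f} C, {temp_f:.2f} F")
```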

"Room temperature" is a pretty meaningful rough measurement for storage of anything.

But the temperature was 462K (about 189C), far beyond room temperature, or what we'd expect to store data at. This is just the temperature at which the expected lifetime is the age of the universe, which has no bearing on the technology itself.

13.8 Gy is the age of the universe so far, in the past. Completely irrelevant to how long the discs will last starting from now.

What would be a good number of years to use? "Literally as old as the Universe is now" seems like as close a number as you can use to put timespans that big into _some_ sort of perspective.

How about an actual estimate? If it's 100 billion years that would be an even cooler number to throw around. 7 times the age of the universe!

A more useful number would be MTBF or a decay rate.

But the point is that decay rate is almost entirely a function of temperature, so you have to pick... something. Unless you feel like log/log years/thermodynamic Beta is a better way to communicate decay than “it lasts as long as the universe has been around”. (Which they do! In the paper! Just not in the summary because nobody has a feel for what log/log beta/time is.)

"They did not, however, simply make up a number with no justification..."

Right, the concept of storing data in crystalline-like structures is not new and there are good technical reasons to believe the technology is a goer. (See my more general comment hereunder giving reasons.)

BTW, this is a truly exciting development, let's hope the practical implementation doesn't stall.

One of these is now orbiting the Sun for 30 million years in Elon Musk’s Tesla - see www.archmission.org

A light field is typically 5D: 3D position + 2D angle. The concept dates back to 1936 according to https://en.wikipedia.org/wiki/Light_field

And it looks like it is the idea behind that storage medium. I haven't read the paper though so I can't really comment on whether or not it is justified to call it 5D.

Also note that 4+ dimensions are relevant even in a 3D world. For example the Lytro camera really outputs 4D light fields (2D position + 2D angle). Of course, there is nothing extradimensional about that camera, it is just a microlens network in front of a regular CCD, with processing that converts a high resolution 2D image into a low resolution 4D light field.

The problem, I think, is when people conflate coordinates and properties. Each dimension is a component of the coordinate and is expected to increase the extent of simultaneously indexable positions, each of which is assigned its own properties configuration.

There is a common conceit to treat a vector of possible properties as one (small) dimension. For example, we can make your light field hold an RGB value at each coordinate. We can alternatively interpret this as a 5D field of 3D luminance vectors or a 6D field of luminance scalars.

However, if someone writes out a set of these light-field measurements as individual measurements, they might append the coordinate vector and the properties vector into one longer point (x, y, z, a1, a2, r, g, b) and claim it is 8D. However, that perspective ignores that a real light field cannot store more than one color combination at a given (x, y, z, a1, a2) coordinate. An arbitrary 8D point cloud can represent nonsensical, sparse superpositions of multiple light fields.
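A small Python sketch of that distinction, with made-up sample values: a field maps each 5D coordinate to exactly one RGB value, while a flat list of 8-tuples is free to contain two colors at the same coordinate. The names `light_field`, `flat_points` and `conflicting_coordinates` are hypothetical, chosen only for this illustration.

```python
from collections import defaultdict

# A light field as a mapping: one (x, y, z, a1, a2) coordinate -> one (r, g, b).
# Writing the same coordinate twice overwrites, so it can never hold two colors.
light_field = {}
light_field[(0, 0, 0, 0, 0)] = (255, 0, 0)
light_field[(0, 0, 0, 0, 0)] = (0, 0, 255)  # replaces the previous value

# The same kind of data flattened into "8D" points (illustrative values).
flat_points = [
    (0, 0, 0, 0, 0, 255, 0, 0),
    (0, 0, 0, 0, 0, 0, 0, 255),  # same coordinate, different color
    (1, 0, 0, 0, 0, 0, 255, 0),
]


def conflicting_coordinates(points, coord_dims=5):
    """Return coordinates that carry more than one distinct property vector."""
    seen = defaultdict(set)
    for point in points:
        seen[point[:coord_dims]].add(point[coord_dims:])
    return [coord for coord, props in seen.items() if len(props) > 1]


print(conflicting_coordinates(flat_points))
```

The point cloud is a legal set of 8D points, but it is not a valid single light field, which is the sense in which the first five components are coordinates and the last three are properties.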

It’s perfectly reasonable to say you’ve increased dimensionality when you find a new physical property to store things with. If a CHS-addressed HD tomorrow starts using opacity or depth as a way to store 2 bits in each, obviously the dimensionality has increased.

In ML, we regularly accept 100+d data (much higher for imaging applications). Even simple mechanical systems have large state vectors.

I’m a regular critic of university systems but this isn’t the university making stuff up because it sounds impressive.

(Additional evidence: one of the authors is a Microsoft employee, not an academic employed by the University of Southampton.)

The criticism that I am also willing to go with is that dimensionality in statistics / linear algebra / machine learning assumes abstract dimensions, where readers of popular science assume physical dimensions.

When the number of dimensions is so close to 3, and the subject of discussion is ways to physically store data, this appears misleading.

I agree that’s probably what’s going on, but “the university is making stuff up to look busy” and “no sane engineer would say this” and "clearly they made this up, 13.8Gy is mighty convenient" isn’t exactly a charitable read.

There's a related Microsoft Research project https://www.microsoft.com/en-us/research/project/project-sil...

(I work in the same lab but on different things)

The one recent (late 2018) publication (1) from that lab hints at the current practical state of the technology:

“We have not (yet) built a full storage system and are currently building out (multiple) prototype read and write heads.”

“the prototype decoder achieves an accuracy of 99.47% across ... voxels written at two micron lateral separation, over 10 layers.”

“we anticipate that in a volume equivalent to a DVD-disk we can write about 1 TB. The technology can potentially get to 360 TB”

1. https://www.microsoft.com/en-us/research/uploads/prod/2018/0...

Then you know of no sane scientist or engineer, at least any that might have a passing familiarity with n-dimensional data or basic mathematical concepts.

Dimensions are the conventional term when talking about aspects of data. It doesn't just have to be spatial data, although that's the common usage of dimensions.

If you're using the 3d position with two additional dimensions of size and rotation of the object at that position, you're literally processing 5 measurable dimensions of input in order to extract data.

It's not like they're claiming to record data in time and a wibbly wobbly 5th dimension. It's a literal mathematical dimension describing the properties of the recording medium and method.

> I know no sane scientist or engineer that would say they use more than 3D physics.

It's completely common for scientists to use more than 3D, since D does not have to mean spatial dimensions. Relativity is 4D (space+time), and string theory/M-theory is 10, 11 or 26 depending on the variant.

Even classical mechanics is formulated in 6D phase space (3 position, 3 momentum). This leads to Hamiltonians and Lagrangians, the cornerstones on which modern physics is built.

As to light, it's easy to find multiple dimensions to store light patterns: 3D space, 1D polarization, 1D frequency, and now we have 5 independent dimensions in which to store data.

5D, if they use 5 independent degrees of freedom to store data, does not undermine the science - it makes it precise.

The title of the paper is "5D Data Storage by Ultrafast Laser Writing in Glass" and is an invited paper at a top conference in their field. I suspect the '5D' being in the title means it is not marketing hype but actual physics.

EDIT: Here [1] is the paper. They do use 5 degrees of freedom as I expected: 3 position, slow-axis orientation, and retardance. So it most certainly is 5D storage.

[1] https://www.researchgate.net/publication/297892219_Eternal_5...

The 5 dimensions refer to the precise control of 5 degrees of freedom of nanogratings. 3 degrees of position in spatial coordinates and two degrees of polarization aspects. It’s in the paper and not phony, although the word “dimension” might be better replaced with some other word that doesn’t sound like 5 spatial dimensions - which is not what is intended. Before saying something is phony you should probably read the paper(s). That is from one of the top optoelectronics experts in the world.

Just speculating, but if you can index in 5 dimensions to get your data, should it not be called 5D?

data = read_bit(x, y, z, yaw, size)

But I agree, I would appreciate more information if they are going to use buzz-words.
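To make the `read_bit` idea a bit more concrete, here is a toy Python sketch in which each (x, y, z) position holds one mark whose orientation and size together encode a few bits. The level counts are assumptions picked for illustration, not figures from the paper, and `write_voxel`/`read_voxel` are hypothetical helpers.

```python
import math

# ASSUMPTION: illustrative numbers of distinguishable levels, not from the paper.
ORIENTATION_LEVELS = 4
SIZE_LEVELS = 2


def bits_per_voxel(orientation_levels=ORIENTATION_LEVELS, size_levels=SIZE_LEVELS):
    """Bits stored at one position when both properties vary independently."""
    return math.log2(orientation_levels * size_levels)


def encode(value):
    """Split an integer symbol into an (orientation, size) pair."""
    return divmod(value, SIZE_LEVELS)


def decode(orientation, size):
    """Recombine an (orientation, size) pair into the integer symbol."""
    return orientation * SIZE_LEVELS + size


def write_voxel(disc, x, y, z, value):
    """Store a symbol at a 3D position as its two physical properties."""
    disc[(x, y, z)] = encode(value)


def read_voxel(disc, x, y, z):
    """Read the two properties at a 3D position and decode the symbol."""
    return decode(*disc[(x, y, z)])


disc = {}
write_voxel(disc, 1, 2, 3, 5)
print(read_voxel(disc, 1, 2, 3), bits_per_voxel())
```

Whether you call the two extra properties dimensions or degrees of freedom, the practical effect in this sketch is more bits per spatial position.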

That seems more like five degrees of freedom than five dimensions.

What's the distinction? Wikipedia seems to think there isn't one:

"In mathematics, the dimension of an object is, roughly speaking, the number of degrees of freedom of a point that moves on this object."


N dimensions can give you 2^N degrees of freedom.

If you have 3 spatial dimensions x, y, z, then you can have 3 degrees of freedom by translation and scaling along those dimensions. You can also have another 3 degrees of freedom by rotation around planes x^y, x^z, and y^z. I'm not sure what degrees of freedom are represented by the dimensionless scalar and pseudoscalar x^y^z, but probably one of them is mana points, and the other is hit points.

Two spatial dimensions give you two translation/scaling axes (x, y), one rotation (x^y), and one stamina points (scalar).

A 4x4 transformation matrix has 16 values, suggesting 4 dimensions, but it doesn't have 16 degrees of freedom, because several values are constrained. Nine values have three degrees of freedom, three other values have three more, and four are fixed, with zero degrees of freedom. Those six degrees of freedom suggest three dimensions, where two of the potential degrees of freedom are ignored.

So a higher-dimensional mathematical space can embed a lower-dimensional model, if the constraints are clever enough, and map the degrees of freedom onto the right dimensions or combination of dimensions. This might be useful for generating smoother animations using linear interpolation, or avoiding gimbal lock, or attempting to unify gravitation with electronuclear forces, or something else.
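The 4x4 matrix count above can be sketched in Python: six free parameters (three rotation angles, three translations) fill all sixteen entries. The Z-Y-X Euler angle convention here is my own choice for the illustration, and `rigid_transform` is a hypothetical helper, not any particular library's API.

```python
import math


def rigid_transform(rx, ry, rz, tx, ty, tz):
    """Build a 4x4 rigid transform (rotation Rz*Ry*Rx, then translation)."""
    cx, sx = math.cos(rx), math.sin(rx)
    cy, sy = math.cos(ry), math.sin(ry)
    cz, sz = math.cos(rz), math.sin(rz)
    rot = [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]
    return [
        rot[0] + [tx],
        rot[1] + [ty],
        rot[2] + [tz],
        [0.0, 0.0, 0.0, 1.0],  # fixed row: zero degrees of freedom
    ]


def rotation_is_orthonormal(matrix, tol=1e-9):
    """Check that the upper-left 3x3 block has orthonormal rows."""
    for i in range(3):
        for j in range(3):
            dot = sum(matrix[i][k] * matrix[j][k] for k in range(3))
            expected = 1.0 if i == j else 0.0
            if abs(dot - expected) > tol:
                return False
    return True


m = rigid_transform(0.3, -1.1, 2.0, 4.0, 5.0, 6.0)
print(rotation_is_orthonormal(m), m[3])
```

The nine rotation entries are tied together by the orthonormality constraints, which is why they only carry three degrees of freedom.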

The difference is that people don't read the titles of press releases with pre-conceived notions of what a degree of freedom is, so it's not as useful for making clickbait.

5D has been a term of art for years, and was not made up by this article. https://en.wikipedia.org/wiki/5D_optical_data_storage

I don't know the optical storage space one way or another, so I was curious to learn more. As far as I can tell, that Wikipedia page you've linked to is solely about the same work by the same researchers as the article. Am I overlooking something? Has there been work by others using the same term? Or is the idea that the process of becoming a Wikipedia title is what makes it a term of art?

Even the Wikipedia page contains information about the original Tokio research team that only has one shared author. Furthermore, 5D is the natural extension of 3D, which was already used to distinguish it from traditional 2D (like, say, CDs).

"Term of art" might be a subjective concept, but the article making it up or not is objective.

To be the devil's advocate, it is often convenient to represent states of a system with N degrees of freedom as a subset of an N-dimensional space. For example, state space of two balls linked by a rigid rod in 3D would be seen as a 5-dimensional space.

That said, the article is heavy on marketing (the 13.8 billion year claim is not scientific, at least in spirit) and looks like a PR department product.
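The two-balls-on-a-rod example can be sketched in Python: five numbers (the centre position plus two angles) determine both ball positions, and the rod-length constraint holds automatically. The rod length and the parametrization are assumptions made for this illustration.

```python
import math

ROD_LENGTH = 2.0  # ASSUMPTION: arbitrary length for the illustration


def ball_positions(cx, cy, cz, theta, phi, length=ROD_LENGTH):
    """Map a 5D state (centre + two angles) to the 3D positions of both balls."""
    # Unit vector along the rod from spherical angles.
    dx = math.sin(theta) * math.cos(phi)
    dy = math.sin(theta) * math.sin(phi)
    dz = math.cos(theta)
    half = length / 2.0
    ball_a = (cx - half * dx, cy - half * dy, cz - half * dz)
    ball_b = (cx + half * dx, cy + half * dy, cz + half * dz)
    return ball_a, ball_b


def distance(p, q):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


a, b = ball_positions(1.0, -2.0, 0.5, 0.7, 2.1)
print(distance(a, b))
```

Six raw coordinates minus one distance constraint leaves five degrees of freedom, which is why the state space is naturally treated as 5-dimensional.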

The 13.8Gy claim is made by estimating, calculating and then confirming by experiment how quickly wear and tear works (the primary mechanism is nanograting, and the primary contributing physical quantity is temperature). It then specifically grabs 13.8Gy as an example data point because that's the age of the Universe, but they could've picked any other point on a time/temperature scale, like, say, "room temperature".

In what sense is this "not scientific"?

Because they first chose the result they wanted — age of universe and worked backwards to derive that result. Is it true? Sure, probably. The method flys in the face of the foundations of science and was done for sensationalism.

What’s special about 462K? Is it a local min/max? Doesn’t seem so.

They computed, verified and measured wear and tear (lifespan in Gy) as a function of temperature. The paper, which I linked, includes the full plot. They had to pick _some number_ (temp or years) in order to communicate it as one data point. There's nothing special about 462K -- it's what happens when you pick the age of the Universe as how long you want it to last from now. Pick a billion years, or 10 billion, and you get a different temp.
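To illustrate how "pick a lifetime, get a temperature" works on an Arrhenius curve, here is a minimal Python sketch. It assumes a simple Arrhenius form, lifetime proportional to exp(Ea / (k_B * T)), anchored at the 462 K / 13.8 Gy point quoted in the thread. The activation energy is a placeholder I made up for the illustration; it is NOT the paper's value, so the outputs show the shape of the trade-off, not real predictions.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K

# Anchor point quoted in the thread: 13.8 Gy at 462 K.
ANCHOR_TEMP_K = 462.0
ANCHOR_LIFETIME_Y = 13.8e9

# ASSUMPTION: placeholder activation energy, not taken from the paper.
ASSUMED_EA_EV = 1.0


def lifetime_years(temp_k, ea_ev=ASSUMED_EA_EV):
    """Arrhenius lifetime at temp_k, scaled from the anchor point."""
    exponent = (ea_ev / K_B_EV) * (1.0 / temp_k - 1.0 / ANCHOR_TEMP_K)
    return ANCHOR_LIFETIME_Y * math.exp(exponent)


def temperature_for_lifetime(years, ea_ev=ASSUMED_EA_EV):
    """Invert the same relation: temperature at which the lifetime is `years`."""
    inv_temp = 1.0 / ANCHOR_TEMP_K + (K_B_EV / ea_ev) * math.log(years / ANCHOR_LIFETIME_Y)
    return 1.0 / inv_temp


for target in (1e9, 10e9, 13.8e9):
    print(f"{target:.3g} years -> {temperature_for_lifetime(target):.1f} K")
```

Under any positive activation energy the qualitative picture is the same: a shorter target lifetime corresponds to a higher temperature, and the anchor point is just one spot on the curve.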

Given that they had a hypothesis, designed an experiment and validated the result, it seems a bit much to say it "flys[sic] in the face of the foundations of science".

And finally the disc will never reach that, because it (or its surroundings) will not physically survive to reach that time.

Not on Earth, not in space, when real considerations like impact resistance are in.

By that definition I can double the dimensionality of the storage by adding a second disk.

Hmm... so the D could be for 'D'egrees of freedom? Who knows... like someone said, it is marketing, and hence a lot could be lost in translation just for easy/popular optics.

What is the difference between dimensionality and degrees of freedom? The standard definition of degrees of freedom is "number of variables that make up the state of the system" and the definition of the dimensionality of the state vector is...

There is no difference. I think people just get hung up on the word 'dimension' in the sense of degrees of freedom in a spatial state vector (to reuse your terms). Perhaps these are people who learned linear algebra only in the context of spatial dimensions / geometry and have never had to apply the concepts to any other problem domain.

Ya but that’s pretty much the same thing. Surely mathematically it’s the same thing.

From a linear algebra point of view, there's no difference.

Neural networks used for AI are represented by N-dimensional spaces of, for example, 1024 dimensions.

Aren't they synonyms?

I'd rather call them parameters or indexes. Certainly not dimensions analogous to 1/2/3D.

People have been storing things in 5D on wood substrate for years. Like clothes in the closet (the 2 extra Ds might be type and color).

The terminology is clearly meant to impress the general audience that would normally not even bother to read a classic "engineering" title.

With 1d data storage I can double the length to double the capacity.

With 2d I can double the length or width to double the capacity.

With 3d I can double the length width or height to double capacity.

That's why 3d storage is interesting.

> I dunno, but it seems to me universities increasingly focus on press releases and marketing departments...

A byproduct of public disinvestment in higher ed. Nowadays, you've got to show the taxpayers, and donors, where their money goes. Which is not all bad in my opinion.

There is a lot of investment in higher ed. Everybody wants more of the limited resources available. Since resources are limited the natural result is marketing trying to show you are worth more.

What bothers me even more is the precision of time they can keep the data: 13.8 billion years. Not 13.7, not 13.9 - no: they figured out 13.8 exactly. That's fake precision and remarkably close to the current age of the universe...

Yes, 13.8Gy is the approximate age of the universe and yes the researchers used that number. But it's not like they made this up and conveniently landed there!

Turns out the biggest decay factor is nanograting and the biggest contributing physical quantity is temperature. They plotted what decay would look like on an Arrhenius plot. They both computed estimates and measured to confirm: the measurements are quite accurate. The specific claim is that they computed that at a specific temperature (462K) the medium will last 13.8Gy: their point is that it would last as long as the age of the universe at that temperature. They could've picked any other point on the time/temp scale, like "here's how long it would last at room temperature".

They did not, however, simply make up a number with no justification, let alone commit straight-up academic fraud.

The actual paper is here: https://www.researchgate.net/publication/297892219_Eternal_5...

Are the error bars on the decay data they're computing from tight enough to justify three significant figures? Three significant figures doesn't seem like very much to me given the precision of modern metrology. Edit: I checked the paper and it looks like in this case the precision is actually ± one order of magnitude, so even one significant figure on the decay lifetime is overstating their precision. They can probably justify 1½ significant figures on the temperature, though.

>I know no sane scientist or engineer that would say they use more than 3D physics.

Are they from flatland? If not, how do they calculate motion?

I think (and it's possible I'm being too charitable in the face of bad science reporting) it's a reference to the number of independent variables that affect a read operation, and not a statement about new spatial physics.

To clarify what I mean when I talk about what I think they mean (yes that sentence is confusing), consider the following example:

The pits on a DVD are subject to one-dimensional analysis because every position targeted by the laser for a read operation can be in one of 2 states (i.e. 2^1)

The pits on a dual-layer DVD are subject to two-dimensional analysis because every position targeted by the laser for a read operation can be in one of four states (i.e. 2^2, because there are two layers and each layer can be in one of 2 states independently of the other)

If I understand the article right, this technology uses 3 physical layers, and adjusts the size and orientation of each air pocket in the glass. So on each individual layer, a dot can be categorized in any one of 8 states (2^3) and there are 3 layers (2^3) of dots. That should give 2^6 possible states per read operation. So why do they say 5 instead of 6? There could be any number of reasons, such as certain adjustments to size and orientation at a higher layer affecting the ability to read from a lower layer, or error correction built into the decoding algorithm.

But yeah, that's my lukewarm take on their silly buzzwords.

It's fairly standard Physics jargon. Physicists don't like saying the word "impossible", because there's always someone out there that can't stand hyperbole. So they often refer to something as "not likely to happen within the age of the universe" rather than "impossible", hence the 13.8bn year estimate here.

Heads is a low resolution dimension, so CHS should probably be more like 2.5D. Magnetic orientation is just binary, so not a dimension.

The 5D in the article is kind of fishy, but in this case size and orientation are more of a low resolution dimension like multi-level cells in a flash drive.

If it's a property of an object that defines the object, it's a dimension. Spatial location gives us 3 dimensions, rotation and size give us 2 more. The dimensions don't all have to use the same units.

I work in Science (optical microscopy) and we use this sort of terminology for multidimensional data all the time. It's common in science.

This account is amazing! Every reply I looked at is a classic middlebrow dismissal (in the sense of pg’s essay)

Thanks to lvh, who pushed me to actually look at the paper.


I think your emphatic view is somewhat ill-informed. For years it's been theorized that glass-like crystals could store information indefinitely—even well beyond the age of the universe. Here's the reasoning:

1. The oldest crystals known on earth are already some 4.4 billion years old and they are still intact. They are zircon crystals found in various places across the planet but the oldest discovered to date were found in Australia.

2. For all intents and purposes these zircon crystals still contain most of their original encoded information from the time they were formed. The reason we know this is that these crystals have remained essentially intact and structurally undamaged over their lifetime of some 4.4 billion years—by definition, if they'd lost all their information they would no longer be intact crystals.

3. Yes, they will have lost a tiny percentage of their information integrity since they were formed. This data loss would have been the result of small amounts of internal and external radiation (including cosmic rays), heat and other environmental effects. Nevertheless, zircon is a very hardy material and this loss has been minimal.

(Essentially, restating the obvious: as zircon has been shown to keep most of its data intact for at least 4.4 Gy, roughly one third of the known age of the universe (4.4/13.8), then by extension the storage time mentioned in the paper [13.8 Gy] for this glassy crystalline structure is highly feasible).

4. If these 4.4 Gy zircon crystals had originally been encoded in ways that allowed lossless data recovery such as Reed-Solomon encoding as used in CDs and DVDs to recover data then today we would be able to recover ALL of the original information from 4.4 Gy ago.

5. We know that under ideal conditions some crystals such as zircon and others have lives of much longer than 4.4 Gy. Why? Well crystal structures are known to be the most stable forms of matter in the universe that goes with the fact that they have the lowest entropy: https://www.livescience.com/50942-third-law-thermodynamics.h... — to quote: “The entropy of a perfect crystal is zero when the temperature of the crystal is equal to absolute zero (0 K).”

6. Theoretically, a perfect crystal with zero entropy in an ideal environment would keep its information for eternity, which is much longer than the age of the universe. Now, as we know zircon and its data/structural information can exist almost intact for at least 4.4 Gy and that a perfect crystal kept under ideal conditions will do so for ever, logic dictates that we cannot rule out that crystals can last the age of the universe, which is ≈13.8 Gy.

7. That said however, making crystals that last the age of the universe is one thing but keeping them sufficiently intact to be able to recover all their data after this length of time is another matter altogether. To do so—as mentioned—we would have to encode the crystals with a data recovery algorithm and ensure that they were safe from cosmic radiation etc. (Even with hardy crystals such as zircon we may have to send them to a part of the universe that's very low in ambient cosmic radiation (but we'd have plenty of time to do so given their intrinsically long and hardy life).)

8. The ultimate life of encoded data in near-perfect crystals would be the product of the universe's environment and the degree of how perfect the crystal was (how close to zero its entropy was). A similar problem arises with ancient DNA: no matter how well DNA may be stored against damage, it will ultimately succumb to damage from cosmic radiation, which on current estimates takes about 1 to 2 million years [right, Jurassic Park is not possible by resurrecting DNA at least]. Keep in mind however that zircon-like crystals are millions of times more hardy and resilient than is DNA.

9. As I said, the concept of storing data in stable crystal structures is not new. Seeing the crystal/glass-like photo in this article I cannot help but be reminded of the remarkable similarity between it and HAL's crystal-like memory in Kubrick and Arthur C. Clarke's 1968 classic science fiction film '2001: A Space Odyssey'. I'm pretty certain, one way or other, crystal data storage of this type will be the norm before long.

10. Oh, BTW, for those who might worry about the concept of 5D not making much sense, I can assure you it's pretty common these days, it even extends down to what's called 5-axis machine tools: http://www.5-axis.org/


Please don't take HN threads into religious flamewar. That is not welcome here.


Edit: please stop posting flamebait generally. You've been doing it repeatedly, and we ban that sort of account regardless of its views.

The university in question is Southampton. Hardly a Christian fundamentalist stronghold. Meanwhile, there is a tradition of using the Bible for new printing media, including one of the first printed works, the Gutenberg bible.


Oh for Pete's* sake you're unnecessarily debasing the discussion. The real issue here is that these works are probably the best known in the 'The Western Canon' and of course you'd naturally use them as benchmark examples as they've rightly done here.

To not use such well known examples on the grounds of a form of political correctness is both absurd and it also denies us our culture and heritage.

I do not believe in the literal truth of the Bible as such, but I certainly do not deny its importance in Western Culture. Moreover, the King James Edition of 1611 is one of the most remarkable texts ever written [perhaps you should read a few bits of it some time], especially so given that it was written by a committee—committees usually produce Lowest Common Denominators but that certainly wasn't so for the KJE!

There is nothing inconsistent with what I have said here.


* I was nearly going to say "for Heaven's sake" but I thought better of it. ;-)

What’s wrong with the Christian bible?

What is it telling me about the researchers!

Absolutely nothing. The only theory I can come up with is that GP is a snobby atheist who associates Bibles with morons.

Using the Bible could be interpreted as ‘this is one of humanity's most important works so let’s preserve that first’. Yet the majority of people on planet earth are not Christian. I’ll go with the Gutenberg homage though.

The Tanakh also has the distinction of being one of the few works of literature that has survived for 3000 years largely intact, and I think it's considerably longer than the older parts of the Egyptian Book of the Dead.

Way more than that many people couldn't care less about Gutenberg though. Point being the bible is the most read book in the world.

Since Christianity is not the largest religion, I find that claim needs additional proof.

Really? It's a google search away https://www.businessinsider.com/the-top-10-most-read-books-i... And it's in the Guinness book of world records.

Also, Christianity is the largest religion... https://en.wikipedia.org/wiki/List_of_religious_populations

The first reference quite explicitly says "number of copies each book sold over the last 50 years."

Sales is an approximation for readership. The second source also claims "figures may incorporate populations of secular/nominal adherents".

So, while I agree that populations that identify with Christianity may be larger, it's pretty obvious a lot of those do not observe the rituals specified by the religions they identify with.

To get a better readership proxy, perhaps, one could use the share in sales of religious texts across the different demographics.

Then we need to also define what "read" means. Does it mean the book was read cover to cover or does it mean the book is read from occasionally? Religious texts are used in very different ways than other forms of literature, to the point that comparison borders on the questionable.

No doubt that is true. I was referring to the scientists' motivation to ‘engrave’ the Bible.

Or the KJ bible is considered by the English their most important literary work, and the university is English!

If researchers at KAUST had decided to etch the Quran as a sign of devotion and humility before the unknown, what would have been the problem?


For the decision of who to honor is solely of the creator(s) and not of some political commissar

A large number of people know roughly how big The Bible is. I think the picture was more about "look, this big book fits on this little thing", and less about some perceived proselytizing.

The image is misleading anyways. The whole Bible fits in 1Mb uncompressed. That image would be billions of Bibles if it is using this new technique.

That image is probably close to a decade old, if I'm recalling correctly the first time I saw it, so it's definitely not the new technique. The image just looks good for a headline, especially since any actual picture of the new technique would likely look like unremarkable frosted glass at a macro scale.

They also claim it will last 13.8 Billion Years and this is the problem you have with it?

Press releases. Whatta-ya-gonna-do?

The 13.8Gy claim is made by estimating, calculating and then confirming by experiment how quickly wear and tear works (the primary mechanism is nanograting, and the primary contributing physical quantity is temperature). It then specifically grabs 13.8Gy as an example data point because that's the age of the Universe, but they could've picked any other point on a time/temperature scale, like, say, "room temperature".

Are you suggesting they pulled this out of their backside? The research very much supports that number.

No, 4MB.

So the fact that the demonstration disc they chose to use in a photo for the article happens to be the Bible (the first book ever printed and most printed) tells you "everything you need" about "them"?

No wonder public discourse is so terrible nowadays, when people are willing to dismiss entire institutions based on an arbitrary image in a press release. But, I guess that's just a step or two removed from asking someone to be fired because one time they interviewed or were seen with someone else whose opinions you don't agree with.

I wish I could automatically append your comment to each similar statement people tend to make nowadays.

Yeah, isn't this '5D' yet another eye-grabbing title? I think size and orientation are still 3D data.

And yet, in machine learning we regularly talk about dimensions 100 and above!

Just because they eventually map to three spatial dimensions doesn’t mean it’s not useful to talk about the different degrees of freedom you’ve managed to use for storing information.

What were y’all expecting? Spider-Man: Into The Spiderverse?

If you make a market out of everything, the marketing departments will rule the day...

Heh, that number is suspiciously close to the age of the Universe. Phony indeed.

"WMAP estimated the age of the universe to be 13.772 billion years, with an uncertainty of 59 million years. In 2013, Planck measured the age of the universe at 13.82 billion years."

Yes, 13.8Gy is the approximate age of the universe and yes the researchers used that number. But it's not like they made this up and conveniently landed there: they computed a range of options and worked back from that specific number to prove it'd work for as long as we could possibly care.

Turns out the biggest decay factor is nanograting and the biggest contributing physical quantity is temperature. They plotted what decay would look like on an Arrhenius plot. They both computed estimates and measured to confirm: the measurements are quite accurate. The specific claim is that they computed that at a specific temperature (462K) the medium will last 13.8Gy -- but their point is that it would last as long as the age of the universe at that temperature, not that they pulled a number out of their backsides.

The actual paper is here: https://www.researchgate.net/publication/297892219_Eternal_5...

I don't think that's as long as we could possibly care, but it's a good concrete number. The Stelliferous Era is predicted to last some 10¹³ years; looking at their Arrhenius plot, we probably need to keep the temperature below about 400 K to last that long. (However, I suspect that over such long time scales, other decay mechanisms such as cosmic-ray damage will be more important.)

Most likely, for long-term archival, we need to make copies of the data from time to time. But this technique will likely allow those intervals to be longer than the current age of the universe, and it offers higher density than you can get with techniques like Norsam's planar discs, which is a useful improvement over existing media.

Ok, we get it. No need to copy-paste the same comment multiple times.

There's a long and mostly bad history of claims around 3D storage. About twenty years ago I followed a company called Constellation 3D, later Terastor. They made many strikingly similar claims, smaller absolute numbers but a similar multiplier vs. what was already in the market. At least they only claimed three dimensions. I can sort of accept orientation as an effective fourth dimension, but size seems like a real stretch. In any case, C3D/Terastor struggled along for a few years, with more claims and more excuses, before they turned into the predictable smoking crater. I've seen several more just like them come and go since then, so I think I'll wait for more concrete proof that this particular technology can work at scale in the real world instead of just once in a lab.

BTW, if you want to go even further back, who else remembers the promises about bubble memory?

Or what if it's a niche technology that works just fine at scale in the real world but nobody cares enough for super long lifetimes to pay the higher price for the equipment?

No "bad history of claims," just niche technology.

Not everything has to be "fake news" or "phony" just because it doesn't take over the world.

I can't help but think we're cheapening the idea of fraud when we accuse every company or technology of fraud just because they don't wildly succeed.

I think you're overreacting, or perhaps going off on a bit of a tangent from what I actually said. I didn't accuse anyone of fraud. I even offered kudos. I'm just trying to adjust expectations because the history is indeed bad even if nobody did any wrong. Would it have been better if I'd said "sad" or "unfortunate" instead?

A medium that really has these kinds of density and survivability traits is indeed a great thing, but "niche" is a bit of an understatement. Adding it to the payload of a multi-million dollar rocket is definitely a publicity stunt, so I think it's entirely fair to point out that it might be good for little else depending on how further development plays out.

It wasn't a publicity stunt by the University. Preserving knowledge over extreme periods of time is the whole point of the Arch organization (which is not for profit). It's weird, maybe not practical, but it's the whole point.

Also, such rockets often just use a mass simulator (i.e. block of concrete or metal) on inaugural flights, so there's nothing wrong with giving it a shot.

If you don't launch it into space, it will be destroyed when the sun engulfs the earth in only six billion years, if not earlier. There's a dismayingly large chance that the only thing surviving from our entire culture in only a few million years will be a mass extinction in the fossil record, a halo of geosynchronous metal debris, and whatever data is encoded in such stable forms as these glass discs.

In that context, describing it as "a publicity stunt" seems short-sighted to the point of self-parody, like a small child who thinks that the main distinguishing feature of money is that you can buy candy with it. In a very short time, it is likely that the only things humanity has done that are even detectable are the launching of satellites, a mass extinction, and the launching of such archival media.

I know it's fun to call people short-sighted and compare them to children, but grow up yourself. Sending something in this particular medium was a publicity stunt, and it worked. You think we're talking about it here for any other reason than Elon Musk was (tangentially) involved? Sending some sort of beacon or memorial into space is such a great idea we've done it many times before, with better-tested media. If we wanted to try something newer, a Rosetta Project disk would have been a much more obvious choice.

Practicality wasn't the point. Publicity was.

"Practicality wasn't the point. Publicity was."

The real issues involved here are the Laws of Thermodynamics, Entropy and 'Glass' (Crystals) being the most stable state of matter in the universe. All of these indicate that such longevity is possible (see my main post).

Clearly, the reason that '13.8 Gy' is used here is that it's a well known time interval and it puts the longevity of this technology into perspective in ways that many will understand.

If actually achieved in practical terms then we ought to be hailing this work as a remarkable effort—not quibbling about trivia and silly incidentals.

Glass is the opposite of crystals. Crystals would presumably be longer-lived, but their anisotropy makes them somewhat trickier to work with. Otherwise I agree, and like you, I'm profoundly disappointed by the level of "notacoward"'s comments in this thread so far.

You're right of course. I've assumed the stuff would necessarily be crystalline (and would have to be to have such longevity). The word 'glass' here being used for easier digestion by the public. (See my longer post for more details.)

The Rosetta Project disks have a much shorter expected lifespan and weigh more, so they're less practical.

Since I've been talking about the Rosetta Project and related initiatives for about 15 years, long before Elon Musk was involved, you're also mistaken to assert that "we're [not] talking about it here for any other reason." I mean, I'll take your word that it's true of you, but it's not true of me. Maybe you've got a mouse in your pocket?


Hey, please don't break the site guidelines by crossing into personal attack, no matter how provocative another comment may be or feel.


Nice to see you again, dang. Keep up the good work.

Publicity stunt by whom? The whole point of the Arch Mission is stuff like this: https://www.archmission.org/

Using this exact media was a full test of the technology, to get it out of the lab and into the environment they intend for it long-term.

But I can see you're committed to this narrative, so please don't let me interrupt it with the actual organization responsible for it.

Are you assuming publicity is bad? I certainly never said so, and would appreciate it if you wouldn't interpret my words as negative based on your own misapprehension. Arch Mission seems like a good cause, and publicity in a good cause is a good thing.

To me it feels like a similar (or is it reverse?) problem to deep space telescopes, such as the one they used to 'image' that black hole. You need to resolve tiny fractions of an arc-second in order to 'see' the distant object.

Only in this case you also have to arrange the other objects so they don't occlude the ones behind them.

Imagine trying to lay out a warehouse so you can do inventory without walking through the building. Just a binoculars and a gantry crane, or a bicycle to go around the perimeter.

How densely are you gonna be able to pack things, really?

I mean, there is one option I know of. The guy who discovered NiMH battery chemistry had an optical storage format he was pushing where you use one laser to excite phosphors in a 3d matrix and a second to read the state. The bits are all transparent until excited. But I think you're limited by how fast you can activate the cells, so linear scans would be the slowest.

You don't have to avoid occlusion with nanogratings and other interference phenomena; occlusion only adds a little noise.

According to the paper, their laser pulse rate is 500kHz; if they're writing a single 1 bit per pulse, and half the bits are 1 bits, and there's no coding overhead from something like 8b/10b, that's 125 kilobytes per second, so 10 seconds per 1.2 megabyte 5¼" DSHD floppy. At this rate, filling the 360-terabyte capacity of the disc would take about 91 years (11 years if the capacity were 360 terabits). You would not expect such a slow storage system to be able to compete in the market with 50-megabyte-per-second SD cards, unless data retention of centuries or more was a key selling point, so the fact that such storage systems have failed in the market provides no evidence against the claims made in the paper.
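
For anyone who wants to redo the write-speed arithmetic, here is a rough sketch, not anything from the paper. It takes the same assumptions as above (one laser pulse per 1 bit, half the bits being 1s, no coding overhead); the function names are made up for illustration. The fill time depends heavily on whether the capacity is read as 360 terabits or as the headline's 360 terabytes, so it computes both.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def write_rate_bytes_per_s(pulse_hz: float, ones_fraction: float = 0.5) -> float:
    """Bytes written per second if only 1 bits cost a laser pulse (assumption)."""
    bits_per_s = pulse_hz / ones_fraction  # 0 bits are "free" under this assumption
    return bits_per_s / 8


def fill_time_years(capacity_bytes: float, rate_bytes_per_s: float) -> float:
    """Years needed to write capacity_bytes at a constant rate."""
    return capacity_bytes / rate_bytes_per_s / SECONDS_PER_YEAR


rate = write_rate_bytes_per_s(500e3)      # 500 kHz pulse rate quoted above
floppy_seconds = 1.2e6 / rate             # 1.2 MB floppy, decimal megabytes
years_if_terabits = fill_time_years(360e12 / 8, rate)
years_if_terabytes = fill_time_years(360e12, rate)

print(f"{rate / 1e3:.0f} kB/s, floppy in {floppy_seconds:.1f} s")
print(f"360 Tbit: {years_if_terabits:.1f} years; 360 TB: {years_if_terabytes:.1f} years")
```

Either way the conclusion is the same: at one bit per pulse this is an archival medium, not something that competes on throughput.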

Bubble memory, similarly, does work; it was just too slow to compete with DRAM and not dense enough to compete with spinning rust, except in a few niche applications.

Strawman. I never said there was anything wrong or untrue about the claims in the paper. I'm just trying to put that into perspective. You're not new to HN. You must know that people here mostly see "360TB super-durable media" and immediately jump to thinking about how they'll put their private copy of the npm repository on it tomorrow. Or their transient ML training data. Or their porn stash. Pretending perspective is irrelevant so you can score some internet points is - to borrow your own phrase - childish.

You said, "There's a long and mostly bad history of claims…" which I interpreted — I think reasonably — as asserting that these past claims were false. (In what other sense can a claim be bad?) I was rebutting that reasonable interpretation of what you said, not a strawman. I am pleased to see that you are not claiming that the paper is false.

False conclusions hypothetically jumped to by finance-obsessed man-children and brogrammers, in JWZ's immortal phrase, are not my problem. Mere assertion that their perspectives are relevant does not constitute an argument for their relevance. In fact, nobody in this thread has so far suggested storing their replica of the npm archive, their ML training data, or their porn collection on one of these discs, so your concern seems to be entirely without an empirical foundation.

> Mere assertion that their perspectives are relevant

Is it your reading comprehension that's lacking, or just your integrity? I said perspective is relevant. I didn't say their perspective.

What would it even mean for perspective per se to be relevant or not? We aren't discussing art history! I did indeed read your comment as saying that the perspectives of the hypothetical conclusion-jumpers were relevant, which would at least have been coherent. I can't make heads or tails of your comment about perspective being relevant — you apparently meant to accuse me of some kind of dishonesty, but it isn't even clear what I'm being accused of. In any case, your comments so far provide very little reason to take seriously anything you say.

"Perspective is relevant" means that it's important to have some. This is a cool scientific advance. I said so before you even showed up. But it's not something that's immediately useful to most readers. That's perspective: like this from one POV, like that from another. Is that clear enough for you?

Seems a bit of a waste to dedicate an entire 360TB disc to a single text document like a bible, which is probably just a few KBs... /s

More seriously, they don't talk about reading capabilities (retrieval speed etc). And what if it gets scratched? What is the error tolerance? At that density, a single speck of dust could have dramatic implications...

I hope this reaches industrial viability, because we desperately need a digital format that can approximate the lifespan of simple paper. At the moment we are chained to a maintenance nightmare of periodic hops between formats, with deadly consequences any time we miss a single jump.

What would IMO be a better solution is something like the pyramids - something huge, with a lot of data written down / chiseled in without any weirdness like parity bits or binary code, and a lot of redundancy. Imagine the pyramids but made out of modern materials, resistant to erosion and earthquakes. Ceramics? Silicon?

Either way it'd have to be legible without the need of a microscope or computers. Write the same text down in a few dozen vastly different languages, so you've got a modern day Rosetta Stone.

And of course, build it somewhere remote and seal it. Bonus points for a stable atmosphere, fill it with a heavy noble gas or make it a vacuum.

(I'm not well versed in any related sciences, I'm just sounding off some random ideas)

There is an ongoing project to carve knowledge on clay tablets and seal it in a salt mine in Austria. Sounds like it checks some of your boxes.




> The geology of the mountain will allow the MOM archive to fully close itself by a natural phenomenon: the salt “flows” with a speed of 2 cm/year into any void. This will protect the archive from the greatest threat; man himself.

> The pressure which results from the weight of the mountain and a hypothetical ice shield of 5 km thickness is approximately a fifth of the burst pressure of the used materials.

It depends on your goal. There's long-term storage that assumes we'll retain the technology and knowledge required to read it (or that it will be easily reproduced, i.e. the reader technology is trivial to produce in the future because technology has advanced really far), and then there's the civilization-restart archive that assumes we'll be anywhere from cave people up to basically where we are today. For the second type, I think a combination of macro scale and micro scale (though not digitally encoded for most of it), along with something like the Rosetta Disk [0] from the Long Now project, is probably the way to ensure it's actually readable. The large-scale text would provide a bootstrap to get optics set up that could read the smaller etched text on archives, which could then set up the tech to read the even finer etched stuff, and so on, maybe eventually reaching disks like the one in the article, where we're able to store digital files and the knowledge to read them was introduced in more human-readable formats.

[0] http://rosettaproject.org/disk/concept/

Yeah, actual museum-grade archival is so much more than just having an aging-resistant medium. You also need to be able to read and decode it many years into the future, which is really tricky to guarantee - you can't assume that future generations will know how to make a reading device and remember your formats/encodings. So it's a bit like the unzip.zip problem.


>For the extreme longevity version of the Rosetta database, we have selected a new high density analog storage device as an alternative to the quick obsolescence and fast material decay rate of typical digital storage systems. This technology, developed by Los Alamos Laboratories and Norsam Technologies, can be thought of as a kind of next generation microfiche. However, as an analog storage system, it is far superior. A 2.8 inch diameter nickel disk can be etched at densities of 200,000 page images per disk, and the result is immune to water damage, able to withstand high temperatures, and unaffected by electromagnetic radiation. This makes it an ideal backup for a long-term text image archive. Also, since the encoding is a physical image (no 1's or 0's), there is no platform or format dependency, guaranteeing readability despite changes in digital operating systems, applications, and compression algorithms.

>Reading the disk requires a microscope, either optical or electron, depending on the density of encoding and could be combined with an Optical Character Recognition system to read the text back into digital formats relevant at the time of reading. We are keeping our encoding at a scale readable by a 1000X optical microscope, giving us a total disk storage capacity of around 30,000 pages of text.

Wouldn't one have to encase this in a diamond shell or something to avoid scraping it?

By a delightful coincidence, there's another article on the front page of HN today that is relevant to the interpretability problem you're alluding to; see the discussion at https://news.ycombinator.com/item?id=20264848

This is where things like microfilm have an advantage- the only knowledge needed to decode it is magnification and the language it’s written in.

I’ve been wondering lately how well you could do with a microfilm strip full of 2d barcodes prefixed by an optical copy of the relevant specifications.

relevant: http://www.vpri.org/pdf/tr2015004_cuneiform.pdf talks about archiving computer programs so they may be read by future civilizations

>You also need to be able to read and decode it many years into the future, which is really tricky to guarantee

Not really, those future people don't just come out of nowhere. It's a gradual, incremental change. There are still systems using tape storage; the technology and formats are not lost but migrated.

That depends on the time scales we're talking about. 30 years? Probably (and even then there are problems with abandoned/little-documented formats like Word 3.x). 300 years, and a historian will have a hard time decoding a digital file from the year 2019, even assuming the media is still intact. 3000 years and a nuclear war can make a weird shiny disc appear as a currency or a cult item for an archaeologist, not as a knowledge storage.

This assumes an unbroken chain of people devoting time and resources to migrating the data. Any gap would mean that data is lost or requires much more expense to recover.

Find a 3.5" floppy disk from 1995 with WordPerfect files on it and see how much effort it takes you to get the documents back. Now imagine 100 years have passed.

Think nuclear war that wipes out 99.999% of Earth's human population. Even the few computers that remain would become useless because there's no power, would fall apart due to degradation/weathering, and would be ignored because people will have much higher priorities, like finding food and surviving, than preserving what would then be the previous generation's technology.

Just put the instructions in a plain txt file as the first file on the disc.

You'd have to know the (physical) reading mechanism (plus encoding) to read the README.txt itself.

Well, encoding is not an issue. Any civilisation that has the physical means to read the disc will be able to figure out the encoding fairly simply.

It isn't that simple, many lost languages have only been restored because of the dual language books.

Similarly, I think what matters here is the higher-order redundancy. A single encoding that can be lost, is not straightforward to figure out, and holds the clue to a massive storage, isn't the most future proof decision; it's a single point of failure.

That's why you put the ASCII encoding as well, plus the blueprints for a digital computer capable of running EMACS to decode it.

> And what if it gets scratched? What is the error tolerance? At that density, a single speck of dust could have dramatic implications...

My understanding is that it doesn't matter. The data is embedded three dimensionally in glass. Assuming there was some small margin of glass around the patterned region, you could simply grind / polish off the scratches. Of course if you scratched it deep enough to destroy the data region then you'd be out of luck, but this can be made arbitrarily difficult.

Furthermore, you can scale error tolerance arbitrarily with error-correcting encodings: data can be distributed so that whatever is damaged in one or a few areas can still be recovered from elsewhere, similar to RAID 5 and 6.
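
To make the RAID 5 comparison concrete, here is a minimal sketch of single-parity striping. It is only an illustration of the general idea, not how the disc in the article is encoded (the article says nothing about that), and the helper names are hypothetical. One XOR parity block lets you rebuild any one lost block in a stripe.

```python
from functools import reduce
from typing import Optional


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def add_parity(blocks: list[bytes]) -> list[bytes]:
    """Return the data blocks plus one parity block (single-erasure protection)."""
    return blocks + [xor_blocks(blocks)]


def recover(stripe: list[Optional[bytes]]) -> list[bytes]:
    """Rebuild at most one missing block (marked None) from the survivors."""
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) > 1:
        raise ValueError("single parity can only repair one lost block")
    if missing:
        survivors = [block for block in stripe if block is not None]
        stripe = stripe.copy()
        stripe[missing[0]] = xor_blocks(survivors)
    return stripe[:-1]  # drop the parity block, return data only


data = [b"In the b", b"eginning", b" was the"]
stripe = add_parity(data)
stripe[1] = None  # simulate a region destroyed by a deep scratch
restored = recover(stripe)
print(b"".join(restored))
```

Tolerating more than one lost block per stripe needs a stronger code (RAID 6 style dual parity or Reed-Solomon), at the cost of more capacity spent on redundancy.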

Just to follow up, here's a list of the problems I know about in getting 3D (or 5D) storage to market.

* Materials. You need something that can be manipulated or deformed at reasonable energy levels into two or more stable states to store data, then maintain that state for a useful lifetime without power. That's already a bit of a Goldilocks act, but it must also be non-reactive, not too brittle, not too expensive, etc.

* Media manufacture. Recording even a single layer at these feature sizes is a non-trivial challenge. As layers increase, yield drops exponentially - even before you consider how the layers interact.

* Read/write mechanisms. Focusing on a single layer on a moving medium, with nothing in the way, is also a challenge. Put other data-bearing layers in the way and it becomes much more difficult. Also, the "blast radius" for a single focus/alignment error becomes much larger, so you're going to need some serious error correction over and above what already exists for 2D.

* Transfer speeds. These are already problematic even at 10TB. 360TB without a corresponding increase in transfer speeds would be a nightmare.

A prototype involving a novel set of materials and/or mechanisms working once under ideal conditions is great. Science advances. Kudos for that. But that's only solving about one fifth of the problems that need to be solved to produce something of actual value in the market. You see a similar thing with battery technology. Everyone knows we need better batteries. There are always several companies claiming to have found the next breakthrough. Almost invariably the new wonder material turns out to be a poor fit for real-world scale, economics, or conditions. Obviously we should keep trying, and keep learning, but nobody should get their hopes up too much too early.

I don't disagree, but it's not really different with spinning rust. The solution would likely be similar: seal it in an enclosure to prevent intrusion of dust, scratches, etc.

> spinning rust



> Spinning Rust – term for conventional hard disk drives with motors and using ferrous-based platters for data storage. Probably derogatory now that Flash Drives are beginning to replace conventional drives.

TIL? Goddamnit, how old am I?

But the rust which is supposed to be spun, doesn't live long when not being spun?

So far the most viable "solution" is to store data in a cloud service like Backblaze or AWS S3, where it's someone else's headache to maintain high enough duplication and swap the drives as they reach their EOL.

I believe HDDs slowly lose stored data if you don't rewrite it periodically? I remember reading some data recovery forum a few years back with the local experts saying something about 10 years of reliable data storage for an HDD lying on a shelf. After that, all bets are off.

and/or add some degree of redundancy and error correction to the storage codec

BD-ROM is already pushing sane limits. It's why an important if under-reported "feature" of BD-ROM over DVD was a harder, more scratch-resistant plastic coating.

If you really need the lifespan of paper... http://ollydbg.de/Paperbak/

Thanks for that link. It's clearly a joke and obsolete, but I wouldn't mind having something like that to back up a few small things (ssh keys, passwords meant to survive me, etc etc). After all, we will likely always have some sort of image-capturing device to leverage - unlike with discs of any sort.

What if HDD platter gets scratched?

What if SSD memory gets scratched?

What if <any storage> gets <X mechanical damage>?

The point of this tech is precisely to overcome the limitations of previous ones.

> What if <any storage> gets <X mechanical damage>?

It's more like "How much X damage can storage Y tolerate before losing Z amount of data?" You evaluate the different tech then you pick the best option.

Maybe some of this could be mitigated with logical techniques (e.g. make data redundant over 2 or 3 different parts of the disc, sacrificing capacity for durability), but it really depends on whether the process allows for this or completely borks out as soon as a scratch happens. They don't say anything about that and I think it's a pretty important aspect of any long-term storage solution.
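
As a sketch of the "redundant over 3 parts of the disc" idea: store three full copies in different regions and take a byte-by-byte majority vote when reading. This is just an illustration with a made-up helper name; nothing in the article says the real format works this way. It trades two thirds of the capacity for tolerance of damage that hits any one copy at a given position.

```python
from collections import Counter


def majority_vote(copies: list[bytes]) -> bytes:
    """Recover data byte-by-byte from an odd number of equal-length copies."""
    if len({len(copy) for copy in copies}) != 1:
        raise ValueError("copies must have equal length")
    return bytes(Counter(column).most_common(1)[0][0] for column in zip(*copies))


original = b"archive me"
copy_a = bytearray(original)
copy_b = bytearray(original)
copy_c = bytearray(original)
copy_a[0:3] = b"\x00\x00\x00"   # scratch on one part of the disc
copy_c[7:10] = b"\xff\xff\xff"  # unrelated damage on another part

recovered = majority_vote([bytes(copy_a), bytes(copy_b), bytes(copy_c)])
print(recovered == original)
```

If two copies are damaged at the same byte position the vote can pick the wrong value, which is why real archival formats usually layer a proper error-correcting code on top of (or instead of) plain replication.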

Those are not touted as a very-long-term archival solution. This one is.

A small prototype of this technology already flew in the glovebox of the Tesla Roadster launched to deep space on the Falcon Heavy’s inaugural launch (third launch just succeeded this morning):


...it (appropriately) contained Asimov’s Foundation Trilogy.

> The Arch library that was included on the Falcon Heavy today was created using a new technology, 5D optical storage in quartz, developed by our advisor Dr. Peter Kazansky and his team, at the University of Southampton, Optoelectronics Research Centre.

Link to source (University of Southampton): https://www.southampton.ac.uk/news/2016/02/5d-data-storage-u...

Interesting side effect, this also gives us some clue about what kind of ... information storage devices we should be looking for in other heavenly bodies.

Had we found a bunch of silicate glass pebbles on Europa, we'd have considered them a curiosity. After this, we have good reason to find a way to ship them back and have another look.

Somebody should take another look at those moon glass beads.

I have a question for anyone here who understands how this works: is the glass from the article resilient to physical shocks like falling/shaking/heat/other mechanical stress?

Documents that are meant to last a long time used to be written on vellum because it is a very physically durable material. I understand that this glass method beats existing digital storage methods for resilience, but does it best traditional analog/legacy techniques?

Yes, fused quartz has roughly the mechanical durability of granite, and it doesn't melt until a considerably higher temperature than ordinary glass (though lower than the 1650° temperature at which crystalline quartz melts).

Vellum and other leathers have a lifespan of under 10000 years under ideal conditions, like those under which Ötzi was preserved. Under such conditions, the researchers extrapolate from accelerated-aging measurements that their medium will last 3×10²⁰ years, which is 3×10¹⁶ times as long. That is, this glass disc will last 30,000,000,000,000,000 times as long as a vellum document will, unless it's subjected to high heat.

They also extrapolate that at 462 K (189°, or, in obsolete units, 372°F) it will last the current age of the universe, some 10–100 billion years. At 189° I think vellum's lifetime is a few minutes.
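
If you want to play with the extrapolation yourself, here is a rough two-point Arrhenius sketch. It fits ln(lifetime) = a + b/T through the two figures quoted in this thread: 13.8 billion years at 462 K and 3×10²⁰ years under cool conditions. The 300 K I use for the second point is my assumption, since the temperature isn't given here, and the function names are made up. The paper's own fit uses measured data, so treat this only as a sanity check.

```python
import math

# Two (temperature in K, lifetime in years) points quoted in this thread.
# The 300 K "room temperature" for the 3e20-year figure is an assumption.
HOT = (462.0, 13.8e9)
COLD = (300.0, 3e20)


def fit_arrhenius(p1, p2):
    """Fit ln(lifetime) = a + b / T through two points; b is Ea / k_B in kelvin."""
    (t1, life1), (t2, life2) = p1, p2
    b = (math.log(life2) - math.log(life1)) / (1 / t2 - 1 / t1)
    a = math.log(life1) - b / t1
    return a, b


def lifetime_years(temp_k, a, b):
    """Extrapolated lifetime in years at temperature temp_k."""
    return math.exp(a + b / temp_k)


def max_temp_for(lifetime, a, b):
    """Highest temperature (K) at which the fitted lifetime still reaches `lifetime`."""
    return b / (math.log(lifetime) - a)


a, b = fit_arrhenius(HOT, COLD)
activation_energy_ev = b * 8.617333e-5    # Boltzmann constant in eV/K
stelliferous_limit_k = max_temp_for(1e13, a, b)

print(f"Ea ~ {activation_energy_ev:.2f} eV")
print(f"10^13 years needs T below ~{stelliferous_limit_k:.0f} K")
```

With these inputs the 10¹³-year limit comes out at roughly 400 K, which agrees with the estimate read off the paper's Arrhenius plot elsewhere in this thread.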

Thanks for the answer. Can the durability of the fused quartz be compared to the durability of this storage format though?

(from a more detailed article [1])

> The information encoding is realised in five dimensions: the size and orientation in addition to the three dimensional position of these nanostructures.

Is the idea that these nanostructures are themselves hyper-resilient? Or would a significant impact alter them so as to render them unreadable?

[1]: https://www.southampton.ac.uk/news/2016/02/5d-data-storage-u...

Well, there are a couple of different questions here. One is about resilience to physical shocks such as falling and mechanical stress, which could destroy the disc but won't do anything to the nanogratings. The other is about gradual, continuous decay processes — over a sufficiently long period of time, random thermal fluctuations will destroy any solid object, and analogous but larger-scale relaxation processes will destroy galaxies as well. Such processes could cause the nanogratings to decay long before the fused quartz itself evaporates.

The nanogratings do have a certain amount of built-in redundancy; they're holographic phenomena.

It looks like somebody just said "it will last as long as the universe" and someone else translated it into an actual number using too many significant digits.

How long can we realistically expect glass to last, judging from e.g. beads of volcanic glass embedded in the geological record?

Not much, regular glass transitions to crystal quite fast compared to those geological timescales. And who knows what the transition will do to the data stored at these densities.

Nothing in the article states what kind of glass is used (which blows the bullshit detector right there).

If they use crystalline quartz, it should last indefinitely. If it's regular glass, then 50 years if you're lucky, controlled conditions, etc, etc.

I just mentioned this to a co-worker last week. This was news here 4 years ago - https://news.ycombinator.com/item?id=11140033

I don't know but the article does not talk about the credibility of the claim. I mean even the way they have described the whole thing feels more like a marketing attempt than an actual scientific breakthrough.

If you look at other articles of this website, they will seem more like a lame attempt at getting traffic than to provide something useful.

They have articles titled like - "Ladies Get dose of Radiation From Government UFO" and "Hackers, UFO's and Secret Space Programs - Oh My!"

I mean, this does not feel like an information source I'll trust.

Edit: As others are mentioning in this thread, from a researcher's perspective they should have also talked about the read/write capabilities.

I remember chatting to someone working on the project at the University of Southampton years ago .. I think it was back in 2004. Great to hear it's come to fruition.

The length of time this type of project takes amazes me.

This is about research published in 2013: https://eprints.soton.ac.uk/364916/

It was recycled for a university press release in 2016: https://www.southampton.ac.uk/news/2016/02/5d-data-storage-u...

Then that press release was recycled for this HN submission.

In that case, I'll be amazed about the lifecycle of a news release instead ;)

These articles pop up every few years, and it always makes me shake my head. Such storage solves no market need. Technology hardly lasts five years let alone fifty years let alone five billion years. Virtually nobody is worried about being able to read their media 500 years out.

And even if that was a reason to pursue a technology, the actual storage capacity is meaningless on its own. 360TB of spinning disks isn't very expensive to buy, rack, or run. And you can read and write to them at a decent clip. Managing failures is fairly predictable. What benefit does a magical glass (or, in previous incarnations, quartz or other crystal) disk have? Optical media isn't known for its amazing random access speeds. Write-once media has very limited use and is almost never fast to write to.

So who's buying this? Who has data that needs to last that long, or needs to store lots of permanently immutable data that's read back sequentially? I honestly can't think of a market for this. The article says "museums" but I don't know of any museums that would prefer glass disks for their backups over an S3 bucket. This is "on prem backup" taken to a comical extreme.

I can see the academic value of exploring the technology, but this space has been exhausted many times over. I remember seeing similar articles in Technology Review and Scientific American about identical developments twenty years ago. It's just not a good idea.

Being able to store data densely on a commodity material such as quartz is great for archival storage (e.g. Amazon Glacier). If the storage media is cheap, then it tends not to matter if it can be reused.

For random access, rapidly created and destroyed data, SSD and HDDs will continue to dominate. But for the growing use case of hoarding data forever, this is a good fit.

Virtually nobody is worried about being able to read their media 500 years out.

While this is true, the lesson I draw from this is the opposite from what you seem to be concluding. I conclude that most people are self-obsessed idiots who waste their lives on meaningless and futile things like "market need" and "survival".

What happens at 13.9B years?

I'm guessing that's the estimated time it would take for entropy in the material to render the data unreadable. They specify at room temperature, so I'm pretty sure they're talking about entropic disruption of the structure due to thermal effects.

Note that 'glass' is a hugely varied class of materials, so without knowing much more we can't make spot judgements about the validity of the claim.

"Note that 'glass' is a hugely varied class of materials, so without knowing much more we can't make spot judgements about the validity of the claim."

As I say in part of my post the key issue here is the intrinsic property of this "glass" which will be of a crystalline-like nature, to quote:

<...> 5. We know that under ideal conditions some crystals such as zircon and others have lives of much longer than 4.4 Gy. Why? Well crystal structures are known to be the most stable forms of matter in the universe that goes with the fact that they have the lowest entropy: https://www.livescience.com/50942-third-law-thermodynamics.h.... — to quote: “The entropy of a perfect crystal is zero when the temperature of the crystal is equal to absolute zero (0 K).” <...>

No, 10–100 billion years is at 462 K, which is a lot hotter than room temperature. At room temperature the estimated lifespan (or, really, the time constant τ) is 3×10²⁰ years.
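As a rough sanity check on those figures, one can fit an Arrhenius-type lifetime, τ = τ₀·exp(Eₐ/kT), through the two points mentioned in this thread and see what activation energy they imply. This is only a sketch under stated assumptions: it takes 13.8 billion years at 462 K and 3×10²⁰ years at room temperature, assumes "room temperature" means 300 K, and uses function names made up for illustration. The paper's own fit may use different values.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def activation_energy_ev(t1_k, tau1, t2_k, tau2):
    """Activation energy implied by two (temperature, lifetime) points,
    assuming tau = tau0 * exp(Ea / (k * T))."""
    return K_B_EV * math.log(tau2 / tau1) / (1.0 / t2_k - 1.0 / t1_k)


def lifetime(temp_k, ea_ev, ref_temp_k, ref_tau):
    """Extrapolate the lifetime at temp_k from one reference point."""
    return ref_tau * math.exp(ea_ev / K_B_EV * (1.0 / temp_k - 1.0 / ref_temp_k))


# Figures taken from the thread; 300 K for room temperature is an assumption.
HOT_K, HOT_TAU_YEARS = 462.0, 13.8e9
ROOM_K, ROOM_TAU_YEARS = 300.0, 3e20

ea = activation_energy_ev(HOT_K, HOT_TAU_YEARS, ROOM_K, ROOM_TAU_YEARS)
print(f"implied activation energy: {ea:.2f} eV")
print(f"lifetime at 400 K: {lifetime(400.0, ea, HOT_K, HOT_TAU_YEARS):.1e} years")
```

The point is only that a modest change in temperature moves the lifetime by many orders of magnitude, which is why the choice of quoted temperature matters so much.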

That's really great, precise information. Where did you get it from? Also which exact material is that for? As I said, there are a lot of different things called 'glass'.

From the paper: https://www.researchgate.net/profile/Ausra_Cerkauskaite2/pub...

They're using fused quartz, as one does. I'm guessing it would be a bit easier to use soda-lime glass, and also reasonably stable, but considerably less stable.

Most likely the defect rate will reach a specific value so that the storage medium is no longer considered reliable.

I don't know, but if it happens at 13.7B years you RMA it!

Nothing, it's roughly the current age of the universe.

Heat death of the universe

Not even close. I've seen figures in the order of 10^1000 years for the Heat Death.

I don't think so.. I think that's the estimated current age of the universe.

Hello people from the future looking back at this thread reminiscing about the old days when this claim was incredible to us.

Does anyone know a resource which explains the theoretical limit for retrievable, durable, information storage?

I would assume the most dense medium possible would be a collection of neutrons, since neutron stars are the most dense object other than a black hole, but retrieving information from them doesn’t seem feasible.

"'Dragon's Egg' is a 1980 hard science fiction novel by Robert L. Forward. In the story, Dragon's Egg is a neutron star with a surface gravity 67 billion times that of Earth, and inhabited by cheela, intelligent creatures the size of a sesame seed who live, think, and develop a million times faster than humans. Most of the novel, from May to June 2050, chronicles the cheela civilization beginning with its discovery of agriculture to advanced technology and its first face-to-face contact with humans, who are observing the hyper-rapid evolution of the cheela civilization from orbit around Dragon's Egg.

"The novel is regarded as a landmark in hard science fiction. As is typical of the genre, 'Dragon's Egg' attempts to communicate unfamiliar ideas and imaginative scenes while giving adequate attention to the known scientific principles involved."



This is one of the few science-fiction books that I still think back on. Good, fun story and hard scifi.

I absolutely loved this book. However, isn't it curious that the humans arrive just as the cheela are reaching the level of technology needed to communicate? In terms of human time, they could have come a year earlier, or a year later.

As I remember the book it was the humans probing the neutron star with something (x-rays?) in order to survey/study the star that triggered the evolutionary advances in the cheela. So no, it wasn't curious timing, the humans precipitated the rise of the cheela.

Great book btw, hope I'm remembering it correctly.

Thank you. This sounds like a book I need to read.

> Does anyone know a resource which explains the theoretical limit for retrievable, durable, information storage?

From an old ZFS weblog post on why they chose 128 bits for their data structures:

> Although we'd all like Moore's Law to continue forever, quantum mechanics imposes some fundamental limits on the computation rate and information capacity of any physical device. In particular, it has been shown that 1 kilogram of matter confined to 1 liter of space can perform at most 10^51 operations per second on at most 10^31 bits of information [see Seth Lloyd, "Ultimate physical limits to computation." Nature 406, 1047-1054 (2000)]. A fully-populated 128-bit storage pool would contain 2^128 blocks = 2^137 bytes = 2^140 bits; therefore the minimum mass required to hold the bits would be (2^140 bits) / (10^31 bits/kg) = 136 billion kg.

* https://blogs.oracle.com/bonwick/128-bit-storage:-are-you-hi...
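The arithmetic in that quote is easy to reproduce. A minimal sketch, using only the numbers quoted above; the 512-byte block size is an assumption inferred from "2^128 blocks = 2^137 bytes". The result comes out at roughly 1.4 × 10^11 kg, in line with the quoted figure.

```python
# Numbers from the quoted blog post; the block size is inferred, not stated.
BLOCKS = 2 ** 128          # fully populated 128-bit pool
BYTES_PER_BLOCK = 2 ** 9   # assumption: 512-byte blocks (2^137 / 2^128)
BITS_PER_BYTE = 8
BITS_PER_KG = 10 ** 31     # Lloyd's limit as quoted: bits per kg in 1 liter

total_bits = BLOCKS * BYTES_PER_BLOCK * BITS_PER_BYTE
mass_kg = total_bits / BITS_PER_KG

print(f"total bits = 2^{total_bits.bit_length() - 1}")
print(f"minimum mass = {mass_kg:.3e} kg")
```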

Interesting. If we ignore whether or not this specific implementation will work, this would actually be a great way to store data for the Permaweb. I would imagine such resilient data would be a great asset when combined with content hashing!

Given that the sun will die in about 5B Years, this is a bit of a gimmick :-D

Sun will become a red giant in about 5 billion years. Space habitats, and habitats on moons of gas giants will still be possible.

By then we could probably tow Sol over to a comparable system using a Shkadov thruster and jump ship to a new sun.

After one billion years, the speed would be 20 km/s and the displacement 34,000 light-years, a little over a third of the estimated width of the Milky Way galaxy.
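Those two numbers are roughly consistent with each other if the thruster provides constant acceleration from rest, in which case the distance covered is d = v·t/2. A quick sketch under that assumption; the 100,000 light-year width of the Milky Way is also an assumed round figure, and the function name is made up for illustration.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
METERS_PER_LIGHT_YEAR = 9.4607e15
MILKY_WAY_WIDTH_LY = 100_000  # assumption: commonly cited round figure


def displacement_ly(final_speed_m_s, years):
    """Distance in light-years covered from rest under constant
    acceleration, given the final speed: d = v * t / 2."""
    seconds = years * SECONDS_PER_YEAR
    return 0.5 * final_speed_m_s * seconds / METERS_PER_LIGHT_YEAR


d = displacement_ly(20e3, 1e9)  # 20 km/s after one billion years
print(f"displacement: {d:,.0f} light-years")
print(f"fraction of galaxy width: {d / MILKY_WAY_WIDTH_LY:.2f}")
```

This lands in the low-to-mid 30,000s of light-years, about a third of the assumed galactic width.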


In 5 billion years, humanity will either not exist anymore, or will be a Type III+ civilization. We certainly won't be impacted by anything that might happen to the sun in terms of its life cycle. A coronal mass ejection might be a problem at some point though.

So if one of these discs was dug up by a civilization with the level of technology we had in, say, the 70's, would they be able to tell what it was, much less make something to read it?

They'd likely know it had information on it, even if they couldn't necessarily read it immediately. Anyone storing information is likely to mark it or arrange it in some non-natural way, otherwise it'll just look like rocks.

So how do you know that the crystalline structure of some rocks isn't encoding some information?

Alternatively, information encoded in rocks could've been encrypted. And encrypted information should be indistinguishable from random noise.

Our bodies are coded with DNA and rocks and geo formations are encoded with the earth's history.

"Hey look, I found a cool coaster!"


"Here we have an extremely rare ritualistic sex device, the male of the species placed this over his member before performing an elaborate dance and inserting his member into the female"

I'm always sceptical about these claims of archive longevity. IIRC when CD-Rs were first available, they were touted as suitable for century-long archives. The reality was far from that.

You can still buy archive CD-Rs rated for 300 years. That didn't prevent everyone from buying cheap crap that barely lasts 5 years before becoming unreadable.

I bought some archival CD-Rs 15 years ago and backed up a bunch of personal documents and photos. Different brands of CD-Rs too. Tested them, all worked fine. 7 years later not a single one works. I tried different drives.

It was a painful loss. I learned my lesson. Now I back everything up into at least 2 different methods: cloud + HDD. Cloud has been the most reliable so far. Some HDDs have failed (humidity killed the circuit boards).

High quality discs are capable of that, the problem is the drives to read them all broke down. For an ultra-long term storage medium like this that's moot. If someone wants to read one of these in a billion years, they'll just have to develop the tech, but that would be true of any such long term mass storage medium. It's not necessarily a strike against this particular implementation.

> High quality discs are capable of that, the problem is the drives to read them all broke down.

What are you referring to? I assume you're not just talking about "high quality" compact discs, since we have plenty of readers for those.

The OP meant the drives would've broken down before the disks would degrade.

I think the original comment about optical disks was talking about very early formats, before CDs were standardised, such as the drives used for the Domesday Project.


> they were touted as suitable for century long archives.

I can't remember this statement from those times. Only people who said that tape is still the better/safer medium for music and backups. I even had a zip drive for that.

Magnetic tapes suffer from print-through though.


Facebook uses Blu-ray RW discs for cold storage (not sure if that is still the case).

The question is: nearly 4 years later, are they anywhere close to production?

I guess I'll have to buy the White Album again.

Is this another tech that's "just around the corner" like holo-memory from the late 90's?

TIL glass is not supercooled liquid after all.

There's a good Veritasium video about that [1]. But the short version is that glass is pretty much a solid, and lead is much more liquid at room temperature than glass.

1: https://www.youtube.com/watch?v=c6wuh0NRG1s

How does glass being amorphous affect the data integrity? What 'glass' specifically is being used?

Presumably the amorphous nature of glass lowers the energy barrier to disrupting the stored information. They're using fused quartz glass, as one does.

Looks like we need a bit more clarity on what they mean by "glass".

If you look at glass in medieval glass windows it's heavily distorted by gravity (fatter at the bottom than the top) and that's after ~900 years, so presumably they're actually meaning something more robust than that!

It turned out that was a mistaken belief. Basically, the thickness variation in medieval glass had to do with the manufacturing process, and the viscous flow of glass is not observable on a human timescale.


That's great - I hadn't heard about that, thanks for putting me right.

Would this offer a safe haven for instructional code on a spaceship?

Why is this here now? The piece of news is from three years ago

The pitch sounds very cool. For all the information given, it could well take 13.8 billion years to write 360 TB of data, and always in the lab environment that needed to be funded. They seem to be avoiding specifics.

I wonder how many years our file systems will remain usable.

So it’s like a Craftsman tools warranty.

I still remember the promise of the CD-Rom. Billions of years.

You could probably build CD-ROMs that last millions of years if you really tried. You can buy regular CD-Rs that are estimated to last hundreds of years. But of course most people buy whatever is cheapest, and the cheapest CD-Rs won't last you a decade.

It might hold data for billions of years, but no one has CD-ROM drives at home these days because the interface and use cases were deprecated years ago.

CDs that you burned chemically with a laser at home were much less durable than those physically pressed in a factory from a glass master.

They were only off by a couple orders of magnitude.. 10 vs 1000000000 is easy to confuse.

CDs used to be sturdier and last longer. Nothing close to billions of years of course.

Minority Report?

please add (2016) to this title, old news

dd if=/dev/data | tee >(dd of=/dev/glass1) | dd of=/dev/glass2

20:20 optical replication ;)

Can I rant about the 5D?

"It's been dubbed five-dimensional (5D) digital data because, in addition to the position of the data, the size and orientation are extremely important too"

Orientation is inherently important anyway: have you ever tried reading a book upside down? And what does size bring, other than lowering density? And obviously position is important; it's the difference between data and randomness.

Books only have one orientation, so they're not actually making use of that dimension to encode data.

The rest of your points... oh my. None of the dimensions count at all, really? So rant away, but it's nice if the sound and fury at least signifies _something_.

No, 2 dimensions are important at least :)

Regarding orientation, are you saying they printed half the book over the first half, but at right angles over the top? I.e. you see different data from different orientations?

P.S. I've seen old (18th century?) letters where they wrote at 0, 90 and 45 degrees to get more on one page. Postage was charged by the page so it made sense. Newspapers also used to be taxed by the page (in the UK at least), so you had origami-like folding of one sheet of paper into a newspaper; I'm not aware of them 'double printing' in this way though. I assume that's why 'broadsheets' were so massive up until relatively recently.

I think they encode the data as asymmetric marks in the material, with different orientations of the marks corresponding to different values.

Imagine you could print a page containing just a grid of 100x100 numbers. If you can orient those numbers so the top of each digit could be facing up, down, left or right, and use those 40 different glyphs to encode in base 40 instead of base 10, now you can fit about 1.6x as much data (log 40 / log 10) into the same two-dimensional area.
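One way to put a number on that gain: each cell of the grid carries log2(alphabet size) bits. A minimal sketch, taking the 100x100 grid, 10 digits and 4 orientations from the example above; the helper name is made up for illustration. Under this counting the capacity gain is log 40 / log 10, about 1.6x, because quadrupling the alphabet adds 2 bits per cell rather than quadrupling the bits.

```python
import math


def bits_per_cell(alphabet_size):
    """Information carried by one grid cell, in bits."""
    return math.log2(alphabet_size)


CELLS = 100 * 100
DIGITS = 10
ORIENTATIONS = 4  # up, down, left, right

plain_bits = CELLS * bits_per_cell(DIGITS)
oriented_bits = CELLS * bits_per_cell(DIGITS * ORIENTATIONS)
gain = oriented_bits / plain_bits

print(f"digits only:      {plain_bits:,.0f} bits")
print(f"with orientation: {oriented_bits:,.0f} bits")
print(f"gain:             {gain:.2f}x")
```

The same logic is why every extra independent property of a mark (orientation, size) is worth counting: each one multiplies the alphabet and so adds bits to every cell.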

Absolute position isn't always important, and position may not always take full advantage of the available space. Take a magnetic tape (or a radio transmission): it's essentially a 1-d data stream (magnetic intensity), not a 2-d matrix. Moreover, depending on the encoding, just taking a specific slice may not let you decode that specific bit of data; you need to know some number of previous states.

Now, that said, the stream itself is defined by a 3-dimensional dataset: amplitude, phase, and frequency. Every modulation technique is some way of twiddling those 3 dimensions.

Yes, I've tried reading a book upside down. Reading speed decreased a bit but the information is the same regardless of orientation.

If orientation truly didn't matter, surely you could read the book whilst it was rapidly changing orientation?
