

Cassette tapes are the future of big data storage - vinkelhake
http://www.newscientist.com/article/mg21628875.500-cassette-tapes-are-the-future-of-big-data-storage.html

======
mistercow
This is sort of like saying that RAM will be replaced by HDDs because HDDs are
so much bigger. It's comparing apples to oranges. It's fascinating that tape
drives are making a comeback, but to suggest that they can replace HDDs is
just a fantasy.

This quote:

 _The downside of tapes is that they are slower to access than hard discs
because they have to be fetched by a robotic mechanism, inserted in a reader
and spooled to the right point. But the Linear Tape File System, which is
being developed, expedites this process to make it comparable to disc drives,
Eleftheriou says._

indicates that the reporter did not understand what he was being told. LTFS
offers an abstraction layer so that software can treat a tape drive like a
hard drive. That makes programming easier, but it will not have any effect on
performance. A file system cannot, after all, magically bestow random access
unto a sequential access medium.

So if tape is to make a comeback in storing data for, say, the web (as implied
by the article's opening paragraph), then it's still going to have an HDD-based
system in front of it as a cache. It would be an interesting future where
viewing a post that hadn't been accessed in a few years took several minutes
as you waited for that old data to be recached from tape.

~~~
brudgers
If tape is cheap enough, multiple tapes could store the same data in
different sequences, e.g. beginning at 0%, 25%, 50%, and 75% of the run
length. Initial access could then be quicker, and predictive caching used for
the remaining data on the tape.
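
A rough sketch of the staggered-copy idea, under assumptions that are not in
the thread: every copy sits in its own drive, rewound to the start of the
tape, requests land uniformly along the data, and robot fetch time is ignored.
The function names are made up for illustration.

```python
def seek_distance(target, copies):
    """Shortest forward wind (as a fraction of tape length) to reach `target`.

    Copy i holds the same data rotated so that it begins at i/copies of the
    run length, so the target sits (target - i/copies) mod 1 from its start.
    """
    return min((target - i / copies) % 1.0 for i in range(copies))


def mean_seek(copies, samples=10_000):
    """Average wind over evenly spaced targets (a stand-in for uniform requests)."""
    targets = [(j + 0.5) / samples for j in range(samples)]
    return sum(seek_distance(t, copies) for t in targets) / samples


if __name__ == "__main__":
    for k in (1, 2, 4, 8):
        print(k, round(mean_seek(k), 4))
```

In this toy model the average wind shrinks in proportion to the number of
copies (half the tape with one copy, an eighth with four), and the media cost
grows by the same factor.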

~~~
mistercow
In other words, a tape drive RAID. But since you can use related techniques to
improve performance with hard drives, I don't think that will make up much
ground for tape drives.

~~~
bduerst
Yes, but the point was to get read speeds with tapes up to those of HDDs.

~~~
mistercow
Which a RAID style system probably won't achieve. HDDs start ahead of tape
drives in performance. So it's reasonable to assume that HDD + RAID is also
going to perform better than tape drive + RAID.

~~~
bduerst
They weren't comparing RAID tapes against RAID HDDs, just RAID tapes against
a normal HDD.

~~~
jdbernard
No, the discussion was whether they would replace HDDs. From the initial
comment:

> It's fascinating that tape drives are making a comeback, but to suggest that
> they can replace HDDs is just a fantasy

Tape + RAID vs. HDD with no optimizations is a useless comparison. Why would
anyone even care about that?

~~~
bduerst
Because brudgers was talking about speed, specifically?

~~~
mistercow
But it still doesn't make sense to compare HDDs _without_ optimization to tape
drives _with_ optimization. Data centers are going to use every affordable
optimization at their disposal, so if we want to know "will tape drives be
competitive against HDDs" we can't handicap HDDs by comparing mirrored arrays
of tape drives to straight, un-optimized HDDs.

With that said, I outlined in another comment[1] why a mirrored array of tapes
would have pretty huge drawbacks regardless. Even if you have an expensive
array of tape _drives_ (as opposed to one drive with many tapes), a mirrored
array will drag your write seek performance toward a constant worst case as
the array grows. The same is true of RAID 1 for HDDs, of course, but the worst
case seek time of an HDD is orders of magnitude less than that of a tape
drive, so HDDs still win.
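
To put a rough number on that drift toward the worst case, here is a minimal
sketch. It assumes (my simplification, not something measured) that a mirrored
write finishes only when the slowest member has finished seeking, and that
each member's seek is an independent, uniformly distributed fraction of the
worst-case seek. Under those assumptions the expected slowest seek among n
drives is n / (n + 1). The function names are hypothetical.

```python
import random
from fractions import Fraction


def expected_slowest_seek(drives):
    """Expected max of `drives` independent uniform(0, 1) seeks: n / (n + 1)."""
    return Fraction(drives, drives + 1)


def simulated_slowest_seek(drives, trials=20_000, seed=0):
    """Monte Carlo check of the closed form above, with a fixed seed."""
    rng = random.Random(seed)
    total = sum(max(rng.random() for _ in range(drives)) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16):
        print(n, float(expected_slowest_seek(n)), round(simulated_slowest_seek(n), 3))
```

The fraction climbs toward 1 as the array grows, for tape and HDD alike. What
differs is what "1" stands for: the worst-case seek of the medium.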

In any case, if you look at the details of the situation, the answer you come
to is pretty boring: tape drives can compete with HDDs in a small slice of
real world cases (often when used in conjunction with HDDs), and higher
density tape drives will slightly widen that slice.

If you have a situation where you are writing a _ton_ of data, but almost
never reading or erasing, then high density tapes will play well to that
situation (sans mirroring).

If you have a situation where you are writing a relatively small amount of
data over a long period of time, reading it back comparatively frequently, and
never erasing anything so that your total storage needs become very large,
then _maybe_ a mirrored array of tape drives would make sense, if it were
behind a sizable HDD array. But it's a stretch.

[1]<http://news.ycombinator.com/item?id=4690877>

------
stephengillie
_Researchers at Fuji Film in Japan and IBM in Zurich, Switzerland, have
already built prototypes that can store 35 terabytes of data - or about 35
million books' worth of information - on a cartridge that measures just 10cm x
10cm x 2cm. This is achieved using magnetic tape coated in particles of barium
ferrite._

Where does the gain in storage density come from, New Scientist? This article
is so breathless to capture the linkbait of using old technology that just
about any useful info has been left out. The prototype, at about 4" x 4" x
0.75", is just larger than a desktop HDD, and can hold about 8.75 times as much
data as a 4TB 3.5" HDD.

Does barium ferrite allow for tighter magnetic fields, which would allow for a
higher data density? If so, is that material being used in spinning disk HDDs?
And how does this extremely physically complex system stack up against SSDs,
as they lower in cost?

~~~
johngalt
Tape has always been denser geometrically than HDDs. More surface area on a
reel vs a platter.

SSDs would be the opposite of tape: high $/GB and low latency vs. low $/GB and
high latency. Obviously SSDs would replace almost all other media if they could
beat tape on $/GB.

~~~
stephengillie
When you consider the $/GB, do you consider the cost of operating these as
mechanical devices - mechanical wear and fatigue, increased electricity usage,
administrative costs (tape storage, space for robot to work, robot repairs,
etc), and possibility of damage (dropped tapes)?

~~~
johngalt
Yes, of course those are considered, and the results may surprise you. One of
the beautiful things about tapes is that you can store them unpowered. All you
power is the robot(s) and the drive(s). 100 tapes sitting idle in a silo will
last much longer than 100 hard drives spinning at 7200rpm. SSD failure rates
are _even higher_ than those of HDDs.

------
arb99
"Current projections by the trade body Information Storage Industry Consortium
show that although hard drives will be able to store 3 terabytes a piece in a
decade's time, that still amounts to at least 120,000 drives a year."

huh?

~~~
aristidb
I think that's for each individual platter, of which a modern 3 TB hard drive
typically has 3 or 4.

~~~
arb99
think it's a typo tbh

------
jlarocco
Off on a tangent, but does anybody know why they store the data from the radio
telescope array for such a long time?

Wouldn't it be more efficient to analyze it, save the interesting data, and
get rid of the rest?

~~~
andrewcooke
am not sure re-analysis is that important myself, but it looks like the SKA
will be used in a pretty traditional way, with groups of astronomers applying
to do particular observations. that means that the processing of data will
likely be different for each group, and will probably take place over time
(some poor grad student will likely spend most of their thesis - several years
- on it).

the other approach - taken with something like the SDSS - is to make a
telescope for just one task (typically a survey). then you _can_ process the
data and throw it away. i am pretty sure that is what SDSS did, and what, say,
the LSST will probably do too (if it ever gets finished).

the advantage of the first approach is that you are much more flexible, which
means more likelihood of making a big discovery (particularly when, as with
this telescope, collecting area is larger than ever before, which means that
you can see fainter and further back in time, making it ideal to study
isolated, unusual objects - in contrast, survey telescopes make different
technical compromises so that they cover a wider field of view), and also more
chances to make and exploit upgrades over time. the downside is dealing with
issues like this (another issue is data transport - traditionally you go to a
telescope and then take the data home with you; i imagine we're now getting to
the point where instead you will process data local to the data storage).

disclaimer: i _was_ an optical astronomer, not radio, and that was years ago,
so this may already be old news / incorrect in details. but the general idea
should be ok.

<http://en.wikipedia.org/wiki/Square_Kilometre_Array> <http://www.sdss.org/>
<http://www.lsst.org/lsst/>

ps often telescopes do make data public after a certain time. but the idea is
not so much to allow reanalysis as to make sure the people who originally took
the data reduce and publish it. it's easy to postpone that kind of work, but
the idea that someone else might do it and publish first is quite a motivator.

~~~
schiffern
SDSS raw data is available online:
<http://www.sdss.org/dr7/algorithms/dataProcessing.html>

------
bitwize
True, but cassette tapes are also the past and present of big data storage. If
you have gobs of data you need to stash somewhere, tape is still massively
more cost-effective than disk, and this is no secret.

------
Evbn
Hard drives use more power because they are usually turned on for use, while
tape drives use less because they are usually turned off and not used?

Thanks for the insight, New Scientist.

The most interesting item in that story was the claim (probably incorrect,
based on the article's overall sloppiness) that LTFS will be "comparable" in
speed to hard drive access.

------
zupreme
I shudder to think of all the guys who will use this article as ammunition the
next time their wives ask them to toss out the old cassette collection once
and for all.

Seriously, though, I think that this is great news. If this really gains any
traction this could represent a resurgence for some Japanese companies which
hold patents on cassette tape technology. After the recent Fukushima debacle,
they could certainly use it.

