
Why the Future of Data Storage Is Still Magnetic Tape - sohkamyung
https://spectrum.ieee.org/computing/hardware/why-the-future-of-data-storage-is-still-magnetic-tape
======
peterburkimsher
My dad's PhD thesis is stored on a magnetic tape formatted for a PDP-11. He'd
like to copy it off, but it's difficult to find a computer that can read it!
Back in 2006 we went from Geneva to Lausanne to visit Musée Bolo, but their
PDP-11 had a problem with its Winchester drive (hard disk) and wouldn't boot.
After a few hours, we gave up. Please get in touch if you have the tools - I
know he'd like his digital copy back!

Even more modern formats are difficult to copy forwards. I'm currently tidying
out my old computer collection, copying off data from SCSI disks, 5.25"
floppies, DD floppies, HD floppies, and CDs. My USB floppy drive can't read DD
floppies.

The SCSI path requires copying to a PowerBook G3, removing the internal hard
drive, putting it into a USB-PATA controller, and copying it to my laptop.
I'll need to do the same for all the 2GB JAZ cartridges. I don't even have a
ZIP disc reader, so I gave those two discs away to another collector so he can
read them for me. The 5.25" floppies are the hardest - they don't mount even
on old Macs, so I'll need to figure out how to set up ADTPro to copy them over
serial from an Apple IIgs.

If you're in the area and want some old hardware/software/magazines/user
manuals, please let me know!

[https://www.reddit.com/r/VintageApple/comments/99223h/peters...](https://www.reddit.com/r/VintageApple/comments/99223h/peters_den_geneva_switzerland/)

The lesson? If you want to keep the data, also keep the computer that can read
it. And make backups.

~~~
oftenwrong
Alternatively, print the data with a laser printer on acid-free paper and
encase it in epoxy resin:

[http://carlos.bueno.org/2010/09/paper-internet.html](http://carlos.bueno.org/2010/09/paper-internet.html)

~~~
tantalor
Missing one more criterion,

4\. Be recoverable

How exactly do you extract the information after you drown it in resin?

~~~
etatoby
If you read the article and look at the pictures, you will notice the resin
does not touch the paper stack to be conserved.

So when you are ready to recover it, you can just break the resin shell to get
at the paper stack inside.

~~~
tantalor
> just break

Elaborate?

~~~
etatoby
Sledgehammer?

[https://open.spotify.com/track/2aDbkEEcAwh1upaMngsvnF](https://open.spotify.com/track/2aDbkEEcAwh1upaMngsvnF)

------
TheAceOfHearts
It would be great if magnetic tape were more accessible for consumers. A while
back I was reviewing local backup and storage options, and I read up a bit on
it. I think it scales better when you have massive amounts of data, but the
initial price is fairly steep.

At the consumer level it makes way more sense to buy HDDs. My suggestion would
be to pick up something like the EasyStore 8TB external HDD [0] (on sale right
now at Best Buy for $160, will probably drop again soon). You can open it up,
extract the drive, and drop it into your computer if you want internal
storage.

If you want to reduce the risk of data loss you can also fill one up with an
encrypted snapshot of your important things and ship it to a family member.
Very important if you live somewhere that's at risk of being destroyed by
natural disasters.

[0] [https://www.bestbuy.com/site/wd-easystore-8tb-external-usb-3...](https://www.bestbuy.com/site/wd-easystore-8tb-external-usb-3-0-hard-drive-black/5792401.p)

~~~
Johnny555
For the majority of consumers, cloud backup makes more sense than tape backup
-- few people have the discipline to stick to a safe backup regimen.

~~~
skookumchuck
Cloud backup will never work for 8T of data :-)

~~~
Johnny555
With a gigabit internet connection, it would take about a day to back up 8TB.
100mbit could do it in 222 hours. At 10mbit, about 90 days.
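
A rough sketch of the arithmetic behind these estimates, in Python. The function name and the 80% link-efficiency figure are assumptions for illustration (the comment does not state how the numbers were derived), and decimal units (1 TB = 10^12 bytes) are assumed throughout.

```python
def transfer_hours(size_tb: float, link_mbps: float, efficiency: float = 1.0) -> float:
    """Hours needed to move size_tb terabytes over a link of link_mbps megabits/s.

    efficiency is the assumed fraction of the nominal link rate that is
    actually achieved (protocol overhead, congestion, and so on).
    """
    bits = size_tb * 1e12 * 8                      # decimal terabytes to bits
    seconds = bits / (link_mbps * 1e6 * efficiency)
    return seconds / 3600


for mbps in (1000, 100, 10):
    ideal = transfer_hours(8, mbps)
    derated = transfer_hours(8, mbps, efficiency=0.8)  # 0.8 is an assumption
    print(f"{mbps:>5} Mbit/s: {ideal:7.1f} h ideal, {derated:7.1f} h at 80% efficiency")
```

At the assumed 80% efficiency, the 100 Mbit/s case comes out near the 222 hours quoted above and the 10 Mbit/s case near 90 days; at full line rate they would be roughly 178 hours and 74 days.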

~~~
s3m4j
Last year I took a subscription to Backblaze, (incidentally) I have around 8TB
of data, I have a gigabit upload (a real one), but Backblaze servers are on
the other side of the planet for me. It took around 3 or 4 months to upload it
all.

That's to say, there's more to it than upload speed. If Backblaze had servers
in Europe, I'd recommend it much more than I do.

~~~
skookumchuck
What about another 3 or 4 months to download a restore?

~~~
Johnny555
The core data that I'd want to restore immediately if I lost my primary
fileserver is only around 100GB or less

I have several TB's of other data (mostly photos, videos, etc) that I'm fine
waiting for weeks or months for if needed, but if you're in a hurry, Backblaze
will sell you a hard drive that they restore your backup to and mail it to
you.

~~~
skookumchuck
At that point, one might as well just wind it off to a hard disk in the first
place, then store it offsite.

~~~
Johnny555
Then I'd need multiple hard drives for redundancy, and I have to bring them
back and read them regularly to make sure the data is all readable. And while
I have a large amount of data that I rarely touch, I have a small amount that
changes frequently, so I need to include that data in my off-site backup too.

Or I could just sign up for a cloud backup service, spend a month or so
uploading my initial snapshot of data, and then backups are automatic and
always offsite, replicated, and scrubbed. For $10/month.

------
gHosts
Ahh, the old old old saying, "Never underestimate the bandwidth of a
stationwagon full of tapes" still holds true.

~~~
samschooler
This actually seems like a fun math calculation... Has anyone done this
before?

~~~
gamegoblin
You could easily fit 10PB in the back of a pickup truck (a thousand 10TB hard
drives). San Francisco to NYC is a 4 day drive at 12 hours a day.

So that’s 10PB/4 days = 231 Gbps

Not bad!
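
For anyone who wants to redo the sum, here is a minimal Python sketch. The function name is made up for illustration, and decimal units (1 PB = 1000 TB = 10^15 bytes) are assumed.

```python
def sneakernet_gbps(payload_tb: float, trip_hours: float) -> float:
    """Effective bandwidth, in Gbit/s, of physically moving payload_tb terabytes."""
    bits = payload_tb * 1e12 * 8
    return bits / (trip_hours * 3600) / 1e9


payload_tb = 1000 * 10  # a thousand 10TB drives = 10PB

# Four full calendar days door to door (nights included).
print(round(sneakernet_gbps(payload_tb, 4 * 24), 1))

# Counting only the 12 hours per day spent driving.
print(round(sneakernet_gbps(payload_tb, 4 * 12), 1))
```

The 231 Gbps figure corresponds to the first case, four full 24-hour days. Counting only time on the road would double it.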

~~~
klodolph
I would expect ~2,000 tapes in the back of a pickup, at 6TB each. Note that:

- Tapes are easier to stack and load in boxes
- Tapes are more resistant to shock and damage from vibrations
- Tapes are generally more resistant to damage from the environment
- Tapes weigh less than hard drives (this is true whether measured per tape or
  per byte)

~~~
derekp7
Don't forget to include the time it takes to write to and read from the tapes
(or HDDs)

~~~
woliveirajr
But if you transmit the same data using networks, you also need to read /
write from the tape/hdd/whatever, unless you're talking about RAM-to-RAM
speed.

IMHO you should just account for the time to mount/connect the HDDs to the
computer.

~~~
mbreese
And you could read/write transported HDDs or tape in parallel. If you’re
transmitting over the internet, it’s much more difficult to have parallel
transmission paths.

~~~
mmt
HDDs, yes, but tapes, no.

That is, HDDs are generally unbounded in the parallelism because disk bays are
cheap [1] and plentiful.

OTOH, tape drives are expensive, multiple thousands of dollars, which makes
them scarce.

[1] $100/bay in quantity, but, in some cases, effectively $0 if counting
unused bays in existing servers.

~~~
mbreese
You can still have multiple drives on either side. Sure they are expensive,
but if we're talking hundreds or thousands of tapes (we have the size of a
station wagon to work with), there's a budget for more than one drive on
either side. Even with drives, if you get too elaborate of an enclosure or
JBOD, you're looking at thousands too.

Hell, you could even bring the drives with the tapes!

~~~
mmt
My point isn't that parallelism isn't _possible_ with tapes, but, rather, that
it's not comparable, due to cost, to within an order of magnitude or two.

> Even with drives, if you get too elaborate of an enclosure or JBOD, you're
> looking at thousands too.

This actually supports my point (except the "too", which is misleading). One
can reasonably expect that $2k-$3k enclosure to support 16-44 disks for that
price [1], compared to a tape drive's singleton.

[1] Full-featured Synology NAS with 8 drives is under ~$1k, for example, while
a CSE-847E1C-R1K28JBOD holds 44 drives for $2800.

~~~
mbreese
If you’re saving on the order of a hundred dollars per tape (12TB), you’re
going to have enough left over for a few readers.

But - this is a silly argument. I think we actually agree on all of this and
we’re talking about an extreme hypothetical that while I’d love to test out,
isn’t something I’m going to do in the near future!

Although, a few years ago, I did have a similar issue. I needed to move about
500TB of data from one data center in Menlo Park to a new one in SF. We tried
everything we could to make sufficient backups, but it was hard to back up
500TB of data over even a 10Gb/s link. We ended up just moving the physical
JBODs in the back of a rental car. Most stressful drive I’ve ever made.

~~~
mmt
> If you’re saving on the order of a hundred dollars per tape (12TB), you’re
> going to have enough left over for a few readers.

That's a good point, and it certainly improves the attractiveness of tape
overall. Still, a $200 saving in medium doesn't make up for a $2600 difference
in drive cost. How well does it improve the appeal of tape for parallelism?

The media+drive cost of LTO-8 for 12TB would be around $2850, while for disk
is only $550. Considering LTO-8 has transfer speeds of 1.6x or so that of
disks, that means only 5 tape units are needed to match 8 disks. That's still
$14.2k compared to $4.4k, over 3x. (With disks, I'd want to do RAID6 or
RAID-Z3 at the very least [1], which further reduces that ratio, but not below
2x).

That's certainly much closer to parity than an off-the-cuff estimate, but it's
still not close enough to be practical. Anyone needing to transfer data in
bulk, fast, would _still_ do well to choose disks, not tape.
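
The comparison above can be sketched in a few lines of Python. The dollar figures and the 1.6x speed ratio are taken from this comment; the constant and function names are illustrative only.

```python
import math

TAPE_UNIT_COST = 2850  # LTO-8 drive plus 12TB of media, per the comment
DISK_UNIT_COST = 550   # 12TB of disk, per the comment
TAPE_SPEEDUP = 1.6     # tape transfer rate relative to a single disk


def tape_units_to_match(disks: int) -> int:
    """Number of tape drives needed to match the throughput of `disks` disks."""
    return math.ceil(disks / TAPE_SPEEDUP)


disks = 8
tapes = tape_units_to_match(disks)
tape_cost = tapes * TAPE_UNIT_COST
disk_cost = disks * DISK_UNIT_COST

print(f"{tapes} tape units: ${tape_cost}")
print(f"{disks} disks:      ${disk_cost}")
print(f"ratio: {tape_cost / disk_cost:.2f}x")
```

This ignores redundancy overhead on the disk side (the RAID6/RAID-Z3 point), which would shrink the ratio somewhat.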

> this is a silly argument. I think we actually agree on all of this and we’re
> talking about an extreme hypothetical

I disagree as to silliness, and as to argument, since I also think we
generally agree. This is a discussion in the spirit of HN, satisfying
intellectual curiosity.

I also disagree that it represents an extreme hypothetical. I believe your own
anecdote is nowhere near as rare as you imply. Even at lesser scale, it's an
issue: the existence of the AWS Snowball is a testament to that.

[1] Ideally true sector-level ECC, though AFAIK, ZFS can provide a close
enough approximation

------
sqldba
The problem is it's near impossible to get a tape backup system at home.

It's all fun to say "oh a single tape can do 15TB" but it must cost in the
thousands and thousands of dollars.

~~~
CaliforniaKarl
I think that's because the market has shrunk so much.

I remember when I was younger, I got a single SCSI external tape drive, and
some tapes, from the Micro Center in Cincinnati Ohio. I then used that to back
up my computer while I was in college. I've forgotten so many of the details,
like the model of drive and tape, and the backup software (it was Mac OS). I
do know it wasn't LTO, though, because it had to be rewound.

These days, were I to do something like that, I'd do BD-R. Yeah, you're only
talking 50 GB or so (uncompressed) per disc, but that's not bad!

~~~
jabl
As a kid, we had one of those home/small business targeted drives that
attached to the PC via the floppy drive controller. QIC, or whatever it was
called. Native capacity was ~120 MB, and the drive had some built-in
compression so it was marketed as 250 MB. It made a horrible screeching noise
when running.

Linux even had a driver for it at some point (ftape), but IIRC it has since
been removed due to lack of use. Never used it though, by the time I got more
into Linux that drive was already obsolete.

But yeah, today the startup cost of a tape drive is so high that it doesn't
make sense for home usage. At home I use borg backup
([https://www.borgbackup.org/](https://www.borgbackup.org/) ) nowadays,
backing up to external USB hard drives.

~~~
agumonkey
I had a HP colorado backup (400/800), I think it's QIC like.

Tried to use it on an old p3 box with win95, and was utterly surprised that
win95 backup application had support for generic backup tape drives that
allowed me to restore the tape content.

I also love the super smooth mechanical sounds of tape drives, even at the
cost of slow seek.

------
oliwarner
I was thinking —like other top-level comments here— that it's a great shame
that consumer level drives have died out. They're thousands of dollars and all
wired for enterprise interfaces.

But I'm really surprised that nobody[1] is offering "tape backup as a
service". Pay $30 per 3TB cassette (slightly over market rate), a $10 loading
fee and then $1/h for 100MB/s write access. When you're done, pay another $10
and it's shipped to you. Or $5/month for fireproof storage.

This might have to be a pre-booked service, or they could buffer upload onto
spinning disks and write to tape after the fact (much faster), but for storing
more than 1TB, this method could be vastly cheaper.

[1]: That one quick Google could find.
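
A back-of-the-envelope sketch of what such a job might cost, in Python. The prices are the ones proposed above; the assumptions that the loading fee applies per cassette, that one shipping fee covers the whole job, and that cassettes are written one after another are mine, as is the function name.

```python
import math


def tape_job_cost(
    data_tb: float,
    tape_tb: float = 3.0,       # capacity per cassette
    tape_price: float = 30.0,   # dollars per cassette
    load_fee: float = 10.0,     # dollars per cassette loaded (assumed per cassette)
    ship_fee: float = 10.0,     # dollars per shipment (assumed one shipment)
    hourly_rate: float = 1.0,   # dollars per hour of drive time
    write_mb_s: float = 100.0,  # sustained write speed
) -> float:
    """One-off cost in dollars of writing data_tb terabytes and mailing the tapes."""
    tapes = math.ceil(data_tb / tape_tb)
    write_hours = data_tb * 1e6 / write_mb_s / 3600  # TB -> MB, then seconds -> hours
    return tapes * (tape_price + load_fee) + write_hours * hourly_rate + ship_fee


for tb in (1, 3, 9):
    print(f"{tb} TB: ${tape_job_cost(tb):.2f}")
```

Under these assumptions a full 3TB cassette costs a little under $60 all in, most of it media and handling rather than drive time.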

~~~
kqr
How would this, in effect, be different from any serious backup service in
existence today? I'm sure they do _their_ backups of your backups on tape, and
at least rsync.net is willing to ship physical media with your data to you.

~~~
oliwarner
Price!

In the simplest version of my post, you're paying for media, the time for
somebody to pop it in the machine, the hire of that machine while you copy
data to it over the network, and the postage of a tiny little LTO back to you
for archival.

On higher density tape, you could be looking at a one-off $100 total for
writing 10TB of data to tape and mailing it back. There's a lot of room
between that price and the nearest incumbent to make a strong profit there.

rsync.net seems very bespoke, so they may be able to give you a tape, but
you're looking at $250 per "incident" plus hardware. They waive that fee if
you store more than 100TB ($2k/month).

Yeah, they're much more available but most of us aren't after cloud storage,
we want a lasting backup in case the NAS dies, or the house burns down.
Restore speed doesn't matter.

~~~
kqr
Ah, true. I didn't consider the "one time dump of a boatload of data" as a use
case. My backups tend to run relatively often, and at that point, $10 for each
tape load would easily make the tape service more expensive!

------
Rafuino
While there's an explosion of data that we keep hearing about, the aspect I'm
curious about is how much of that data will at any one point be "hot" or even
"warm," in that it needs to be processed and analyzed in short order (talking
microseconds or nanoseconds, not in tens of seconds like tape (or more as the
data then needs to be sent over a network to a server somewhere else)). Cold
storage can be archived on tape or HDDs, sure, but as compute continues to
grow, surely more and more data will need to be hot or warm at any given
moment. Just whether that hot/warm data will grow its overall share of data in
existence or simply grow at the same rate is what I wonder about.

~~~
dingaling
When I started working on mainframes at a big financial company there were
jokes about the 'tape monkeys' which loaded tapes when we requested old
datasets. We laughed and dismissed that, surely everything was mechanised.

It was only after a year or so that I was cleared to visit the data centre and
discovered that the mechanical tape silos had failed years ago. Every tape had
to be located and loaded by hand. I felt guilty about all those datasets I had
ordered on whim... and was surprised that the DC guys didn't have a 'Hit List'
of programmers.

~~~
Rafuino
I'd be very paranoid if I were you :)

------
ksec
It suggests there is research showing it is possible to reach 20x the density
of current tapes, which top out at 15TB. But research like this is like
battery breakthroughs: good in theory but not thought through in practice.

Meanwhile, we have a real roadmap and a working solution in MAMR, much more so
than the proposed HAMR from Seagate, that will scale us to 40TB per HDD by
2025, and maybe further, given we don't really know the limits of MAMR yet and
we may manage to put more platters inside helium drives. I don't think that is
possible with HAMR; laser-heating a platter inside a helium-filled drive has
got to be fun inside labs.

NAND flash is also dropping in price; current NAND spot prices are already
back to 2016 levels or maybe even slightly lower. We have fabs from Toshiba,
Samsung and SK Hynix all coming online in 2019, all while 3D NAND is
achieving better yields. The "Ruler" that Intel proposed has now been
standardised as EDSFF, with up to 1PB of storage in 1U. All current roadmaps
suggest we should hit 8PB or 16PB in 1U by 2025. With 48 drives in a 4U
server, HDDs offer only 480TB per 1U by 2025. That is a 16 to 32 times density
difference compared to NAND.
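
The per-rack-unit comparison can be checked with a short Python sketch. The 40TB drive, 48-drive 4U chassis, and 8PB/16PB flash figures come from this comment; the function names are illustrative.

```python
def hdd_tb_per_u(drive_tb: float = 40, drives: int = 48, chassis_u: int = 4) -> float:
    """Terabytes of HDD capacity per rack unit for the given chassis."""
    return drive_tb * drives / chassis_u


def flash_advantage(flash_pb_per_u: float) -> float:
    """How many times denser flash is per rack unit than the HDD chassis."""
    return flash_pb_per_u * 1000 / hdd_tb_per_u()


print(hdd_tb_per_u())  # TB per 1U for HDDs
for pb in (8, 16):
    print(f"{pb} PB per 1U is {flash_advantage(pb):.1f}x the HDD density")
```

With decimal units the ratios come out at roughly 17x and 33x, in line with the "16 to 32 times" quoted above.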

In a system where information is stored across different drives, like
Backblaze's [1], your reliability reaches a point (I think, forgive me if I am
wrong) no different from that offered by magnetic tape. What advantages are
left for magnetic tape when cost, performance, space and reliability no longer
favour it?

[1] [https://www.backblaze.com/blog/cloud-storage-durability/](https://www.backblaze.com/blog/cloud-storage-durability/)

~~~
robdachshund
Tapes can last for 50 years. They are the MOST stable and reliable form of
storage we have.

We have also not even reached any kind of limit on density. Sony has tapes
that hold hundreds of terabytes.

So, they last forever, are extremely reliable, and extremely data dense.

HDDs exploit the same concept (magnetic storage) but trade speed for
reliability. NAND is more reliable but still has a shelf life, and it also
relies on support chips to constitute the overall media. It's also expensive
as hell.

Tape is tape. You could smash a cassette and respool the tape and read it. It's
also cheap as hell per GB.

Tape will be around for decades if not centuries to come. NAND is not a
challenger whatsoever.

A key piece of the puzzle here is that not only are tapes used for backup, but
to free up space from data center HDDs in order to increase their shelf life.

Stop thinking of tape as archaic. Magnetic storage is the foundation of modern
technology and computing, and in past decades we barely scratched the surface
of its potential.

We aren't talking about audio cassettes with analog audio recorded to them. We
are talking about a storage medium that is more data dense than HDD or NAND
could ever be and we have yet to reach any sort of limit.

With tape you can achieve more and more space by manipulating the width and
length of the tape, the tape heads, the angle at which tape is read, and the
way that the data is read and stored. It's also just a plastic cassette with
polyester tape coated with rust particles. It's inherently cheaper, requires
no circuitry, power, or motors within the media itself, and will last 50
years.

Tape is the purest implementation of magnetic storage concepts. As such it is
going to improve faster than HDDs because there are no other design factors
like platters, motors, and heads. If an HDD dies, you might be able to recover
the data through expensive and time consuming procedures. If your tape drive
stops working, you buy another drive and still have an intact tape with all of
your data.

Additionally, while SSDs are wonderful, they are a pretty backwards design
when it comes to efficiency. You are essentially trying to model magnetic
storage by using integrated circuits. You are storing values with digital
logic in a physical state on a chip.

It's fast and more reliable than a spinning metal disk, but it's expensive and
not data dense at all.

The only threat to tape is public perception of it being "old" when it's
really the most robust and best engineered data format we have.

~~~
mmt
> Tape is the purest implementation of magnetic storage concepts. As such it
> is going to improve faster than HDDs

I agree with the former, but I think your conclusion misses a critical piece
of technology development: volume (which drives funding).

Of course, HDD volume is going down, while SSD volume is going up. That may
well give tape enough of a comparative advantage, if it hasn't already.

> The only threat to tape is public perception of it being "old"

For a definition of "public" narrowed a bit to include only/mostly computer
professionals, I'd add "slow" as a mis-perception.

Other perceptions I, personally, have are that it's expensive and inflexible,
mostly in the context of startup companies. I don't believe that's any kind of
_threat_ to tape, however, since it's likely these same companies will gladly
over-pay for a cloud version of tape.

------
ed_blackburn
GDPR - please delete me from _all_ your systems.

JFC. Harry get the key to the basement, need to dig out some tapes. San, can
you pop down to the store and get some matches and lighter fluid...

~~~
supertrope
I don't know how GDPR handles backups and what time tables are allowed for
reasonable data retrieval and expunging.

Encryption decreases the volume of data that needs to be most carefully
protected from the entire data set down to the key (and backups of said key).
I imagine that data on mediums that make data retrieval and writing expensive
can be encrypted with compartmentalized keys. So when Jane Doe wants her data
deleted, delete the keys.

~~~
spiralx
> I don't know how GDPR handles backups and what time tables are allowed for
> reasonable data retrieval and expunging.

Keep a log of user ids who have requested deletion and purge their data
whenever doing a restore. And don't keep backups for longer than required,
which I think is generally something like five to seven years barring specific
legal requirements.
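
A minimal Python sketch of that restore-time purge. The record layout (`user_id` field), the in-memory deletion log, and the function name are all hypothetical; a real system would read both from durable storage.

```python
from typing import Dict, Iterable, Iterator, Set


def purge_on_restore(records: Iterable[Dict], deleted_ids: Set[str]) -> Iterator[Dict]:
    """Yield only the backup records whose owner has not requested deletion."""
    for record in records:
        if record["user_id"] not in deleted_ids:
            yield record


# Hypothetical backup contents and deletion log.
backup = [
    {"user_id": "u1", "email": "a@example.com"},
    {"user_id": "u2", "email": "b@example.com"},
    {"user_id": "u3", "email": "c@example.com"},
]
deletion_log = {"u2"}  # user ids that asked to be erased after the backup was taken

restored = list(purge_on_restore(backup, deletion_log))
print([r["user_id"] for r in restored])
```

The tape itself is never rewritten; the erased user's data simply never makes it back into a live system, and it disappears for good when the tape ages out of retention.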

------
CharlesW
> _Mark Lantz is manager of the Advanced Tape Technologies at IBM Research
> Zurich_

Seems a bit odd to see such a blatant P.R. piece in IEEE Spectrum.

Wasn't optical going to eat rust-based cold data storage technologies? What
happened?

~~~
wmf
Optical discs are now up to 300 GB and they have caddies the size of an LTO
tape that hold 12 discs. Seek time is obviously better than tape but I don't
know about the price. The only remaining vendors are/were Sony and Panasonic.

[https://panasonic.net/cns/archiver/](https://panasonic.net/cns/archiver/)

[https://perspectives.mvdirona.com/2016/03/everspan-optical-c...](https://perspectives.mvdirona.com/2016/03/everspan-optical-cold-stroage/)
(Sony Everspan dropped off the Internet; did they go out of business?)

~~~
Jaruzel
From a cheaper end-user/consumer point of view, the only real option for
optical disc storage is BD-XL, which is a 3 or 4 layer Writable BluRay Disc
supporting 100GB/128GB per disc.

In comparison to tape, it's not that much storage, but fairly cheap to
implement at home. £60 for the drives, and about £15 per 100GB disc.

------
raziel2701
I can't seem to find this information with my google-fu, but does anyone know
if magnetic tapes such as in vhs and cassette tapes have perpendicular
magnetic anisotropy?

~~~
reaperducer
I’m not entirely sure what it is that you’re asking. But if you’re asking if
the heads for an audio cassette are angled, the answer is no. They’re
perfectly perpendicular.

For VHS, yes, they are angled. 30 degrees, IIRC.

If that’s, indeed, what you’re asking.

~~~
raziel2701
I was referring to whether the magnetization lies along the normal of the
film/tape or if it's in plane. HDDs were able to increase density by moving to
materials that had a magnetization with perpendicular anisotropy but I can't
seem to find confirmation if cassette tapes and vhs had in plane anisotropy
back then. More so I can't seem to find if today's magnetic tape storage has
moved to perpendicular magnetic anisotropy.

It seems

~~~
gammatrigono
Doesn't seem like it would be possible given the substrate material.

------
nishantvyas
While, Intel Introduces "Ruler" Server SSD Form-Factor.

"Intel on Tuesday introduced its new form-factor for server-class SSDs. The
new "ruler" design is based on the in-development Enterprise & Datacenter
Storage Form Factor (EDSFF), and is intended to enable server makers to
install up to 1 PB of storage into 1U machines while supporting all
enterprise-grade features."

ref: [https://www.anandtech.com/show/11702/intel-introduces-new-ru...](https://www.anandtech.com/show/11702/intel-introduces-new-ruler-ssd-for-servers)

~~~
nishantvyas
Well, my point was not to show flash as superior but to point out the
development in that area... every use case has its reasons, be it flash or
magnetic... the author chose to paint magnetic as a better alternative to
flash (in the capacity dimension), which obviously cannot be true for all use
cases... (same for flash) my comment is to highlight that fact...

~~~
mmt
> author choose to paint Magnetic as better alternative to Flash (in capacity
> dimension)

I'm having trouble finding where in the article that occurred.

To me, it seemed to ignore flash/SSD, as the principal premise was that tape
survives due to its low cost (at scale), so only the highest-density HDDs were
mentioned.

------
Aaron1011
> And tape is very secure, with built-in, on-the-fly encryption

What exactly is this referring to?

~~~
wmf
It probably means the drive does the encryption, but so do hard drives and
SSDs.

~~~
rout39574
yeah, that. KMS involving the library and maybe an external key server.

------
sytelus
TLDR;

Today, a modern tape cartridge can hold 15 terabytes. And a single robotic
tape library can contain up to 278 petabytes of data.

~~~
sliken
Anyone know what exact tape they are talking about? When I google 15TB tape I
only find LTO-7, which is 6TB native and 15TB compressed. Assuming a
compression ratio of 2.5 to 1 seems pretty crazy to me.

~~~
userbinator
They're probably referring to the TS1155 variant of
[https://en.wikipedia.org/wiki/IBM_3592](https://en.wikipedia.org/wiki/IBM_3592)
.

To be honest I never really understood the main reason for adding compression,
especially since you can compress individual files far better using existing
algorithms and then write those to tape. It must be a marketing thing.

~~~
sliken
Ah, thanks, they do list a 15TB tape. I looked around and could find the 10TB
variety for $212 or so, but couldn't track down an actual 15TB tape for sale.

The 10TB tape I found was "IBM 3592 JD Advanced Data Tape Cartridge 10TB/30TB
(2727263)"

~~~
slavak
The cartridge is the same. It can be reformatted from 10 to 15TB using the
newer tape drive.

[https://spectralogic.com/features/ts1155-technology-tape-dri...](https://spectralogic.com/features/ts1155-technology-tape-drives/)

------
ggm
IBM bubble memory

------
Osmanthus
What about scratches

------
coding123
So... who actually uses them in the age of AWS / S3. Is this becoming a much
much much smaller group?

~~~
gwbas1c
[https://aws.amazon.com/glacier/](https://aws.amazon.com/glacier/)

Looks like tape as a service.

~~~
sneak
Glacier’s actual storage backend technology has never been disclosed, although
this is plausible. Other proposed technologies are racks of older (less
efficient) disks that are not always spun up, huge slow flash memory banks,
etc.

~~~
jjeaff
It's expensive enough to be anything but SSDs really. Glacier is more
expensive than B2 from Backblaze, and they use hot-storage regular HDDs.

And that's not taking into account the extra money Amazon makes from the
requests and data transfer that could offset storage costs.

I wouldn't be surprised if the glacier data was mixed right in with s3
storage. Just deprioritized when io is high.

