400 TB Storage Drives in Our Future: Fujifilm (anandtech.com)
205 points by caution 10 days ago | 139 comments





Many people don't appreciate tape, but for long term offline storage nothing else comes close.

Offline storage is important if you care about things like ransomware.

The problem I find with disks for offline storage is that they're very easy to turn on and very easy to zero. Tapes on the other hand are much more difficult to get to if you're in the business of ransomware.


Tape has two really important features that HDDs don't have:

o Linear speed

o Transportability

Assuming you can feed the data to a tape drive, it will run at top speed until the tape is full. When you have a tape robot with 25+ drives, this means you can easily keep up with a 40/100 gig pipe.

LTO tapes really are portable. They have been designed to stand up to dropping. The abuse our tapes would get when being transported by the understandably grumpy runners was considerable. We could always get the data out of them. Portable HDDs, not so much.


My lab acquired a lot of imaging data (10 TB a year) and notoriously would store it long-term by putting the hard drive back in the grey static plastic bag and putting it in a wooden office drawer. Surprise, 80% of the drives were unreadable after a few years. I suggested tape, but it was so hard as a grad student to figure out anything about how to even get it set up! Not to mention academic labs in bio departments are notoriously stingy!

I found a tape drive on Newegg and it took the better part of a year to set it up (learning in half-baked fashion that I needed this card and that cable, etc.). And then I finally set up the tapes to back up.

It's been years since I left the lab, and I heard that no one there knows how to read the data back from the tapes! I wish things were better than this!

My only anecdotal insight was that hard drives don't fail easily if they are continuously running. So I managed to hoard enough drives to keep at least two copies of data I cared about, and used a program called Mirror folder to make backups that were still easily accessible without any software lock-in by any grad student in the future.


You can now get tape drives that run over Thunderbolt 3.

Oh yes, the software that runs tape is appalling.

> Linear speed

360 megabytes per second for LTO-8. You can consistently match/beat that linear speed with two hard drives, each costing a tenth as much as a tape drive.


Tapes have become "enterprise" gear.. that is why they are expensive. HDDs are looking to replicate those markups, if the pricing on WD Reds vs. the USB drives is any indication.

There was a time when a tape drive cost about the same as a floppy/Zip/etc. drive, i.e. a hundred or two bucks.

In the early 1990s, QIC-80 tapes were pretty popular in certain circles as an exchange medium, since they fit a hundred or so floppies' worth of data in a package that was smaller and a little thicker. Later formats like the 3080 bumped the capacity into the GB range for similar pricing.

Back then pretty much every Novell/etc. server in existence had a tape drive in the front being used as a backup mechanism. The sysadmin dropped a tape in on Friday and took it offsite on Monday (or whatever). In Unix land it was possible to completely single-step boot from tape and recover using things like AIX's mksysb. The death of tape wasn't really a death so much as a giant margin increase, where a drive that cost $200 suddenly cost $2,000 and came with a Fibre Channel interface rather than IDE.


Per GB, tapes are not expensive; tape drives, on the other hand, are.

LTO-8 is 12 TB.

A 12 TB hard drive is more expensive than an LTO tape, even at volume.

Two 6 TB drives are still more expensive.

However, that's not the use case, right? HDDs do general storage really well.

A tape robot is the place where your data goes to rest. It's the last line in your tiered storage.

In AWS land it goes RAM -> EBS -> S3 -> Glacier.

Tape sits at the same stage as Glacier.

You have a RAM cache (be it Redis et al. or some fancy block storage); screaming-fast SSDs, but only a few TB; slower, cheaper SSDs, but ten times as many. Then, if you need lots of storage, you'd have a few PB of spinny disks.

After that it goes to tape. It's denser, cooler, and far less expensive to run than the equivalent spinny-disk array.

Now, this only makes sense if you have data in the PB range, and it's more data than you can store online. Anything less and it doesn't make sense. It never did; it's just that the capacity of storage has ballooned so much in the last 15 years that people's perceptions haven't caught up.


You're neglecting that the tapes have to go in something.

LTO-8 drives are vastly more expensive than 12TB hard drives.

A pile of tapes, where the vast majority of the tapes are not in drives, is cheaper than hard drives. But then you're not getting those high speeds anymore.

-

Let's do some simplified math. 12TB hard drives for $26/TB, LTO-8 tapes for $9/TB, and tape drives for $3500.

Since you're saving $17/TB, you need 205TB before tape gets cheaper. So 17 tapes.

Each hard drive takes about 18 hours to fill. So 17 hard drives take 18 hours to fill, at a total speed of 3GB/s.

Each tape takes about 9 hours to fill. So 17 tapes take six and a half days to fill, at a total speed of 360MB/s.

If you have enough tapes per drive to make tape cheaper than hard drives, it's slooowwww.

If you have fewer tapes per drive, it can be reasonably fast, but never as fast as hard drives.

Tape has a lot of advantages. But not speed.
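
For anyone who wants to rerun that comparison with their own prices, here's a minimal sketch of the arithmetic above; the dollar figures and speeds are the assumptions from this comment, not current quotes:

    # Tape-vs-HDD break-even sketch, using the figures assumed in this comment.
    HDD_PER_TB = 26.0        # $/TB for 12 TB hard drives (assumed)
    TAPE_PER_TB = 9.0        # $/TB for LTO-8 media (assumed)
    TAPE_DRIVE = 3500.0      # $ for one LTO-8 tape drive (assumed)
    TAPE_CAP_TB = 12.0       # native LTO-8 capacity
    TAPE_MBS = 360.0         # native LTO-8 write speed, MB/s
    HDD_MBS = 185.0          # ~12 TB in ~18 h, as assumed above

    # Capacity at which one tape drive plus media becomes cheaper than hard drives
    break_even_tb = TAPE_DRIVE / (HDD_PER_TB - TAPE_PER_TB)    # ~206 TB
    tapes = break_even_tb / TAPE_CAP_TB                        # ~17 tapes

    # Time to write that much data: many HDDs in parallel vs. one tape drive serially
    hdd_hours = TAPE_CAP_TB * 1e6 / HDD_MBS / 3600             # ~18 h (parallel)
    tape_days = break_even_tb * 1e6 / TAPE_MBS / 3600 / 24     # ~6.6 days (serial)

    print(f"break-even: {break_even_tb:.0f} TB (~{tapes:.0f} tapes)")
    print(f"fill time: ~{hdd_hours:.0f} h on parallel HDDs vs ~{tape_days:.1f} days on one tape drive")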


Does the $26/TB for hard drives include the cost of the JBOD chassis you'll need to have them all online at once, in order to get those speeds?

No, but I didn't include the cost of the tape robot either. I'm seeing prices like $10k for a 6U/80 tape robot, and $10k per 80 drives is more than enough for a very fast JBOD chassis. You could more than double the budget of a Backblaze pod, and upgrading to an EPYC and 100 gigabit ethernet would only take about a thousand dollars.

Also even in some ridiculous circumstance where your tape robots were free and you have to pay $7.2k per JBOD chassis, that still only bumps the cost to $36/TB and the overall analysis doesn't change much.

$26/TB was an overestimate anyway. Especially when you can get 5400RPM drives for $18/TB, and it only hurts your speed by about 20%.


Maybe I’m just not techy enough, but in my old fashioned mind I feel like creating petabytes of data regularly implies data cruft more than anything else.

I don't know. I have this deep suspicion that as humans we are creating more and more bits that are just building up in the universe, like our plastic waste. At the moment there is obviously no major downside, but a part of me wonders if that will always be the case...


I feel this comparison is a little glib, considering that when the hard drives fill up you need to buy two new ones, while the tape drive only needs a new tape.

Fair, but the upfront price of a tape drive is $3-5k.

The cost benefit comes down to the equilibrium and how much storage you need.


Can I take a drive to a copy shop to convert it to tape for $50? (If not, why not?)

There are a few good reasons why not:

1) An LTO-8 tape (capacity approx 30TB) is around $200 to $300 AUD (depending on brand), so that figure could only work if you're excluding the cost of the tape. LTO-7 (~15TB) costs around $100 to $150 AUD and would probably be more suitable, but it's still a lot of money for what's essentially a consumable item.

Also, I would imagine that if the choice was between buying a tape that requires you to go to a third party to use or invest thousands of dollars in a tape drive, or just buying an external HDD you can use at home and unplugging it when not in use (so cryptolockers etc. can't eat your backups), most people would go for the external HDD.

Unfortunately LTO-8 drives can't read LTO-6 or earlier (previously the rule of thumb was that drives could read/write generation N-1 and read N-2, e.g. an LTO-7 drive could read but not write LTO-5 tapes), and it probably wouldn't make sense to invest in outdated drives capable of writing LTO-6 (~6.25TB) or earlier, even if you could obtain the older LTO-6 or smaller media relatively cheaply. Vendors would not want to support it.

2) Most backup software I'm aware of is designed around recurring jobs (e.g. BackupExec), not one-off jobs, and to store data from a number of backup targets onto a single tape (or set of tapes).

You could work around this, but there's no off-the-shelf software for doing so I'm aware of (I know some versions of windows server backup can write direct to tape, but this would have to cope with a variety of source media and also make sure that jobs from different customers are never co-mingled on the same tape, etc.).

3) The cost of the hardware required to achieve a decent turnaround time on the tapes (depending on target time e.g. some customers might pay a premium for same-day conversion) would probably exceed what you're likely to earn selling this service. I doubt it would be financially viable, simply due to the low number of customers.


"You could work around this, but there's no off-the-shelf software for doing so I'm aware of (I know some versions of windows server backup can write direct to tape,"

I think what you are looking for is LTFS.


Not so much the filesystem; more that, assuming you have a server with a multitude of our customers' HDDs attached as well as one or more tape libraries / drives, you need to:

* Map 1 to N HDDs to one customer (by USB port ID or perhaps SATA port if you shuck them for performance reasons, maybe a mix of SATA and USB drives) - will need to have some means of clearly indicating ports, or have blinky LEDs next to them and have the server indicate where stuff ought to be plugged in. And handle SATA hot-swap if required.

* Allocate a tape or tapes to them (and if they provide their own tape/s make sure to use specific slot/s in the tape library for that customer) and otherwise handle the tape library logistics. This becomes more complex if you have multiple libraries in play.

* Perform scheduling and ETA estimation to see if you'll actually get their data onto the tape within the deadline (you might offer a turnaround of a day, but offer a premium service to have it done within N hours) and see if it will affect other customers. This may even involve deciding that an existing currently running job is early enough that you might stop it and scratch the tapes, then run the priority job instead.

* Handle predictive maintenance, drive cleaning, etc. for the tape library (and how it will affect customer deadlines / capacity planning)

* Actually copying the data onto the tape/s and verifying the result

* Integrate into a monitoring and notification system so customers are alerted when their order is ready to pick up, talk to customer billing/sales systems, etc.

* Probably a bunch of other stuff I haven't thought of yet

The system would generally have to be resilient enough that it can run unattended overnight / for long periods of time, but flexible enough in scheduling that it can handle sudden rush orders, drives being offline for maintenance, and so on. And ideally could be operated by staff with minimal technical background.

Maybe Iron Mountain or someone like that have this sort of system already, but I am not aware of any commercially or generally available software that could do the job.


You don't need any of that. A dedicated device could transfer data between a hard drive or two and a single tape drive. You'd have a few of these on a shelf somewhere, doing a couple jobs per day. Easy to use, easy to schedule.

Pretty sure that there are colocation places out there that have this, but for copy shops I think it wouldn't work because, in general, people have too little data.

Most people are perfectly served by the HDDs/SSDs that ship with their computers, and the minority that's not served often buys one or two external HDDs of the newest generation. That's enough for the largest part of the population.

Very few private individuals have data on the order of dozens of TB, where HDDs become so expensive that you are motivated to use user-unfriendly tapes.

Why are tapes user-unfriendly? For each read/write operation you need to visit the copy shop. This limits the use cases to things like backups and archival. But you can't really use them for archival either: if you don't archive the drive together with your tapes, who guarantees that a copy shop will exist in 10 years that has drives that can read such old tapes? Tape drives have only limited read compatibility with older standards. https://en.wikipedia.org/wiki/Linear_Tape-Open#Compatibility

But maybe there is a niche for it somewhere, who knows.


That's not where tape is useful.

Tape only makes sense when you have a tape robot: a big rack-type thing that has more than 5 drives and thousands of slots.

Unless you are on a very tight power/heat budget and you have a brilliant asset management/storage system, tape isn't the answer.


I don't see how that matters in a discussion of speed.

Pick any budget. For that budget, hard drives are faster than tape.

Tape's advantages are elsewhere.


"Offline storage is important if you care about things like ransomware."

There are other solutions to this problem.

Specifically, if your cloud storage provider performs zfs snapshots for you, those snapshots are immutable and cannot be altered, or destroyed, by the user login:

https://www.rsync.net/products/ransomware.html

I'm sure there are similar mechanisms available in S3 and other cloud storage implementations (albeit, perhaps not using ZFS).


That solution is an order of magnitude more expensive, so isn't really a viable alternative.

For say 200TB of backup over 5 years:

Rsync.net: $0.015/GB/month [0] * 200000GB * 60 months = $180000

LTO-8: $3600 for a drive [1], $5000 for 20x12TB tapes [2] = $8600.

Rsync.net is 21x the price of tape.

The breakeven point is surprisingly low as well. A single tape costs $140 [3], so the minimal entry cost for tape is $3740 for 12TB. For $3740 you only get 249 TB-months at Rsync.net, which is equivalent to 4.2TB for 5 years. So if you're storing more than 4.2TB of data, it's cheaper to buy an entire tape drive than to use Rsync.net.

Rsync.net seems like a great service but that pricing is really crippling. Even the "discounted" Attic pricing just barely beats a single tape, at 12.5TB for $3740.

And before the "but this is an online service" argument comes up, Hetzner offers a 40TB dedicated server for 64€/month, or 0.0016€/GB-month. That's cheaper and comes with an entire dedicated server thrown in.

[0]: https://www.rsync.net/pricing.html

[1]: https://www.bhphotovideo.com/c/product/1454545-REG/mlogic_ml...

[2]: https://www.bhphotovideo.com/c/product/1482276-REG/hp_q2078a...

[3]: https://www.bhphotovideo.com/c/product/1389774-REG/hp_q2078a...

[4]: https://www.rsync.net/products/borg.html

[5]: https://www.hetzner.com/dedicated-rootserver/matrix-sx?count...
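
A quick sketch of where that break-even sits, using the same list prices as above (all figures are the assumptions from this comment, not current quotes):

    # Where does buying a whole LTO-8 setup start beating rsync.net, at the prices above?
    RSYNC_PER_GB_MONTH = 0.015    # $/GB/month (rsync.net list price assumed above)
    MONTHS = 60                   # 5-year horizon
    ENTRY_COST = 3600 + 140       # one LTO-8 drive + one 12 TB tape = $3,740

    # TB-months of rsync.net storage you could buy for the tape entry cost
    tb_months = ENTRY_COST / (RSYNC_PER_GB_MONTH * 1000)
    print(f"{tb_months:.0f} TB-months")                # ~249
    print(f"{tb_months / MONTHS:.1f} TB over 5 years") # ~4.2 TB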


What happens if a customer wants to delete their backup? Do you require a phone call or something? 30day grace period?

If you need to reliably delete any data, encrypt it before uploading and wipe the key.
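
A minimal sketch of that crypto-shredding idea, assuming the third-party Python `cryptography` package and placeholder file names:

    # Encrypt locally, upload only the ciphertext; "deleting" later just means
    # destroying the key. Assumes `pip install cryptography`; paths are placeholders.
    from cryptography.fernet import Fernet

    key = Fernet.generate_key()      # keep this only on trusted local storage
    fernet = Fernet(key)

    with open("backup.tar", "rb") as f:          # hypothetical archive to back up
        ciphertext = fernet.encrypt(f.read())

    with open("backup.tar.enc", "wb") as f:      # this is what goes to the provider
        f.write(ciphertext)

    # To render the remote copy unrecoverable, securely destroy `key`.
    # As the replies note, the ciphertext itself still exists until overwritten.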

That doesn’t delete the data, just makes it unusable until the encryption algorithm is broken. Which will certainly be a while, but not forever.

But [1] will the data be worth anything and [2] will you still exist when/if it is broken?

I suspect there's a not-insignificant number of people out there who are glad they didn't make the encryption on their old data very strong (and likewise, those who regret having used very strong encryption and lost the key.)


> Which will certainly be a while, but not forever.

You can't really prove that. There are physical limits to how much computation can improve.


He can't, but we also lack a proof that any encryption scheme is at all computationally hard to break, so to err on the side of caution you should assume it will eventually be broken.

We know, provably, that (properly implemented) one time pads are perfectly secure and will remain so forever.

TL;DR Encrypting is still better than not encrypting

Normal delete operations on common file systems don't wipe the stored data with zeros or random data either. They just mark the space as unused and make the directory entries invisible. Any tool that can read the disks directly can see the data even after it's deleted (though it may have trouble reconstructing it, depending on the fragmentation of the data before deletion and the reuse of the freed-up space). There are "undelete" tools available for different file systems that can try to recover your data after deletion (some additional conditions apply).

For cloud based backup providers, I doubt if any of those would spend time and energy wiping your data areas clean just because you deleted the data or asked for it to be deleted (since storage is also shared across customers, wiping clean your data would have a negative impact on other customers' data access speed too). They would just let the OS do its job, which would be what's described above.

You can run free tools to wipe a magnetic hard drive with zeros, run multiple passes, etc. With SSDs, that's not reliable because of the additional abstractions used and presented by the controller to the OS.


You can always delete the data in the root of your account - the base of the rsync.net filesystem that you interact with. You don't need us for that.

But yes, if you wanted to also delete snapshots within your .zfs/snapshot then you would need to call or email us. Our support folks would be happy to help with that.


Indeed, and if someone has root at rsync.net, then it is game over.

However, if they were shipping to tape and used an offsite tape storage facility, then it is just a matter of time to get the tapes loaded. I'll settle for that.

ZFS is a really good FS. Really, really good. Archives should be offsite and disconnected, though.


If you use a Mac, you can create an encrypted HFS disk image overlaid on top of another filesystem that Time Machine can then back up to. Depending on the size of the disk and the frequency of the backup, you can keep snapshots that are as much as a year old. You can then take nightly backups of the disk image (which is literally just a folder) and store it on a cloud provider or any remote storage. I've been doing this for the past year or so and it's been flawless.

Both AWS Glacier and S3 Object Lock are immutable.

I had a great conversation with a couple of IT folks once where they objectively covered most of the variables. It came down to capacity, linear write speed, and the relative likelihood of tape adhesion versus bearing stiction. In any given time interval, the balance could shift. Seek time (find beginning of backup N) and cost felt like tie breakers.

To be fair, this could be addressed via a firmware-enforced write-protect switch on the drive... or an electronically settable, ball-point-pen-resettable write breaker.

There are companies now that sell drives with special WORM firmware.

I love tape but the economics of tape really just don't look good today. The patent litigation over LTO8 basically assured that. Whole nearline SATA drives are significantly less expensive per byte than LTO8 tapes, even ignoring LTO drive/library costs.


> Whole nearline SATA drives are significantly less expensive per byte than LTO8 tapes, even ignoring LTO drive/library costs.

In Germany, using non-bulk prices, HDDs seem to start at 19€ per TB while I could find multi-write LTO8 tapes costing 8€ per (compression disabled) TB.


While the media is cheaper, LTO-8 drives are not. A single drive with manual tape changing can be around 4-5K USD.

That is a lot of tapes to recover the cost savings over HDDs.


Yup. This article made me recheck the economics of tape backup. You'll have to have a lot of data before it becomes cheaper than simply duplicating the storage. Worried about ransomware? Have a dedicated backup machine that does nothing else, is as isolated as possible, and just reads the data to be backed up and writes it to the backup drives.

Indeed, it's not economical for most people. Even rental is probably not something that most people would bother with. Most people are served enough by the one or two HDDs they own, maybe a NAS.

Don't forget to average the cost of your drives and robots.

Oh. That is a massive change over the last 6 months. Okay, they're much closer now.

The option has also been there to format LTO-7 tapes in LTO-8 drives and get 9TB each. That cost 7€/TB a couple years ago and 6€/TB today.

There was a huge patent fight that delayed the production of LTO-8 tapes for quite some time. When they were finally released they commanded a huge premium, and they are still inflated in price per GB compared to LTO-7 tapes.

I really really don't understand why devices don't routinely come with a physical (not software) write-protect switch on them. Disk drives should come with them. Routers should come with them (so malware cannot install itself). Etc.

I'm sure you know this, but for the young'uns here:

It's even worse than that. The SD card spec provides for a "write protect notch" and corresponding physical switch. On the card itself.

But, it's optional and it's not implemented by the card. Instead it's up to the host software to honor it.

Morons!

USB thumb drives should also have a physical switch. But they don't.

I'm from the old school of "no ring, no write" on 9 track tape. Saved many people's asses back in the day. But that knowledge has been lost in time ... like tears in rain.


I'm surprised that people who really want security, like the military, do not demand hard switches.

I think military key exchange still used paper/mylar tape as a "data diode" until recently: https://www.cbronline.com/feature/punched-tape-ukkpa

I agree. Hardware switches should also be put on computer cameras, microphones, network cards, etc. There will still be vulnerabilities, but most of us use locks on our house and car doors even though thieves can (and do) break the windows. Hardware switches on computer equipment can remove some vulnerabilities and (sometimes) add a means of recovery.

Doesn't get you much, unless the vulnerability requires a writable filesystem in order to be exploited or you really really really want to persist across reboots (not hard but not usually needed) or firmware updates (much harder to do robustly, expertise required).

Malware can operate entirely in memory, not touching disk. If access is lost, reinfection can take place (has been used by nation state actors). Even the Mirai botnet which was done by amateurs didn't persist and simply reinfected. So it comes down to your potential adversaries. Actors deploying wide, non-targeted attacks Mirai-style get little value from persistence. It's simply an extra layer of unneeded complexity.


> Doesn't get you much

It can do a lot in the right circumstances. E.g. if a USB thumb drive had a physical write-protect, it could be plugged into a potentially hostile computer without concern about losing the data on the drive.

Years ago I saw a few USB thumb drives with that feature. But I don't know if they're out there any more. They're almost as scarce as hens' teeth.


IIRC the Kanguru series of USB flash drives makes bank now that they are one of the few remaining manufacturers of flash drives that have hardware write-protect switches.

> There are company now that sell drives with special WORM firmware.

Like what? I’d love to check this out.


E.g. https://greentec-usa.com/

I have no experience with them; I've just assumed they would be overpriced.

I'd certainly be interested in WORM drives for personal use (e.g. for access logs, and photo archival), but not so interested that I'd pay a big premium over standard drives. :)


Say I have around a 25TB personal data hoard. Does tape make sense yet cost-wise, instead of having it backed up over 2 HDDs?

I bought a Fibre Channel LTO-5 drive and HBA off eBay a few months ago for $211.81, including cables and a power supply. New LTO-5 tapes are $8/TB right now, whereas the cheapest spinning disks are around $15/TB. At those prices, the break-even is about 64 TB, anything more than that and tape is cheaper.

https://diskprices.com/?locale=us&condition=new&disk_types=l...


You also need to account for the duration of storage, i.e. the spinning disks will need a lot more replacement than the tape would over a period of a few years.

IBM's documentation specifies 10 years shelf life for the tapes in non-ideal storage conditions and 30 years if kept in stable temperature/humidity. Even the most reliable hard drives only come with a 5 year warranty, so yeah, you'd probably want to double the relative price.

Do you have a link to the $8/TB cartridges?

Are there used tapes on the market?

Yes, but they are only specified for ~200 full writes. Hard to economize on backups.

Interesting. I guess incremental backup or archive applications would still leave a lot of lightly used media on the aftermarket when those users upgrade to newer tape tech.

Most likely not, since you’ll need to purchase a drive for reading and writing the data. That alone will cost more than a couple hard drives.

Idea for someone: IsTapeBackupWorthIt.com - showing the amount of data you need for tape backups to make sense at any given point in time. Niche audience for sure, but revenue per click is going to be pretty high.

If tapes are worth it for someone -> show referral links to some trusted tape provider partner, based on region

Otherwise, show good deals on hard drives, per region. And recommendations on QNAP/Synology enclosures.


Someone registered the domain just now, fantastic! I hope they also implement something suitable, instead of just putting some crappy google text ads on it.

Tape makes more sense at higher volumes and with more difficult retention needs. Even then, as disk sizes crossed the TB boundary, VTL and S3 made it difficult to justify tape investments for many of these use cases.

Early in my career, I worked at a place that had to retain certain records for up to 100 years. At the time (circa 1999), they had a big tape silo solution that probably cost something like $10M/year (or more) to own and staff. Today, capacity for "cool" data is essentially unlimited and near-free. You could host all of that data on a small device locally (if needed), and retain the data (and outsource the engineering for long term retention) in one or more cloud services for <$100k/year.


The limiting factor of tape is in access times and bandwidth relative to size. If the data is written and read too slowly, and your seek times go too high, you start to lose your ability to usefully test, verify, and recover the archive. And looking at that chart in the article, the time to fill has exploded recently, from under two hours in the 2000s to over nine now. Can you guarantee operation for nine hours without interruption or power loss?

Of course, that doesn't make huge tape storage useless, just less useful as a format for accessing a single giant archive. One answer would have it that not using the whole capacity is OK, and the goal should be to archive a smaller dataset more often. But it's definitely something that demands real engineering work, while a lot of organizations can barely manage to plan for any backup.


Tape works well where storage is partitioned over many units of storage media, access is infrequent relative to total storage, media are well indexed (such that identification, retrieval, and access are trivial), storage is redundant (2+ copies of any given data), and there are multiple read & write heads allowing for simultaneous access.

Tape is a very large tank accessed through a very thin (and expensive) straw. But storage quantities offered are immense and reliable.

Addressing your concerns, increase the available heads if access is constrained. Tape libraries provide for this.


Both S3 and Google offer file storage where you can specify that an object can't be overwritten or deleted until a certain date. So you get the best of offline and online storage.

I imagine it to be more expensive than both running disks or tape backup yourself.

AWS S3 Glacier is dirt cheap as long as you don't need to restore. AWS S3 Object Lock is about $.02/GB, Glacier about $.004/GB.

If you have a decent connection, cloud storage is hard to beat.


Hmm, the marketing for glacier says "starting at $1 per terabyte per month" (at the top of https://aws.amazon.com/glacier/) but the pricing page seems to quote $.004/GB = $4/TB as the cheapest storage in the US and the random selection of other regions I clicked on.

At $4/TB why would I pick this over Backblaze B2, which is $5/TB (basically the same) but works like live normal block storage instead of taking hours to retrieve (or minutes if you pay extra)?

Moreover, the download cost from Glacier to AWS is $0.02/GB plus at least $0.003/GB in retrieval costs (or $0.09/GB to $0.05/GB to the internet), while the download cost from Backblaze to the internet is only $0.01/GB and is free if it's to many of the members of the Bandwidth Alliance (notably including Cloudflare).

I'd have to be looking at many petabytes of data before the minuscule cost savings of Glacier data that I actually don't touch is worth it. Or I guess if the data was already in AWS and the egress to Backblaze was going to be too expensive to be worth it.


S3 Glacier Deep Archive is $0.00099/GB which comes out to $1/TB. AWS also doesn't charge you to transfer from Glacier to any AWS service as long as you're staying in the same Region.

Does Backblaze B2 have multi-site redundancy? We have our S3 buckets automagically replicated to a different AZ. That was a must have since we're highly regulated, and our backups need to persist for over 7 years.


It’s been a while since I looked further into this, but are you accounting for data transfer fees as well? I recall that being a non-trivial cost.

I don't know the costs of data egress for Glacier, but standard S3 is around $.09/GB. Considering that most backups are never used, the cost of data egress is usually pretty minimal in the grand scheme of things.

> AWS S3 Glacier is dirt cheap as long as you don't need to restore.

What does that even mean? If I don't have to restore, /dev/null will be even cheaper!


What are your egress costs?

I have not used tape personally, but I really would like to. The problem is the balance between price and size. The only affordable technology seems to be two generations old, but it is too small for current-generation hard drives.

Can’t you avoid ransomware by just not using Windows?

Linux is not invulnerable to ransomware, it's just easier to secure.

To add to that, Linux is easier to secure because it is found mostly in the datacenter (I'll ignore Android for this one). It's not only more robust, but it's secured according to a whole different standard than desktops, since it doesn't need to be operated by Average Joe.

On the other hand the vast majority of Windows installations are either regular home users (with no knowledge for properly setting up their computer, and ideally with the updates disabled because they read on the internet that updates are bad) or corporate desktops (which might be secured better but still at the mercy of a user clicking on everything).


Linux and BSD are found in a lot of places; I think it has more to do with their design being supportive of all kinds of deployments rather than just popularity.

Windows Server administrators are a lot more likely to be "certified" and "trained" by MS-approved courses than Linux admins, and of course that does not help.

I would say it's more likely to find an average joe (from a devops perspective) running a Linux box than a Windows box.


> average joe ( from a devops perspective)

That's a whole different kind of average joe. A mediocre tech person is still an order of magnitude more qualified to take care of a machine than a mediocre user.


I would argue that typical desktop Linux distros are as difficult to secure as desktop Windows, if not more so.

It's hard to keep track of the current state of things, but issues that come to mind are all of the design issues with X, such as poor isolation of applications and the X server typically running as root. Desktop apps are typically not deployed with SELinux. Compromised apps can generally access all of the user's data.

Something like ChromeOS on the other hand, is pretty secure, but that is a whole different beast.


I knew a guy who knew a guy who decided to get revenge for some stupid perceived offense on a MUD.

This guy had the patience to put a time bomb onto their system and not trigger it until it was on all of the backups. How does someone have the patience to do something like that and yet do something so petty when they have a month to think about it? The older I get the less sense it makes, and it didn’t make much sense even then.

Years later another person did a similar thing to a different victim, hacked the kernel on the machine to clone all conversations. Published them, and then published the new conversations after the admins restored from the tainted backups. Surprise, motherfuckers!

Both sites went offline for months.

It's easier for me to imagine doing something like this on a fishing expedition. You're gonna try with a dozen potential victims and then enjoy the two who get bit the hardest. Vendetta is also possible, but more fanciful, and can often be mitigated by not being an asshole in the first place. Also, motive? Easier to catch a jilted lover than a serial killer.

In part because of these experiences, I had been known from time to time to keep a secret cold backup of the 10% of our assets that would be hardest to replace. But if that were ever stolen? Oof. I’d be fired so fast they’d throw my belongings out the window, so I stopped at some point. Git provides a little bit of that now anyway.


This seems like poor hiring rather than anything tech can or should solve.

On a general note, perhaps infra could be more like cattle than pets? If configuration is applied via version-controlled and reviewed automation (Ansible, Helm, hand-crafted YAML...), perhaps that could have been mitigating? DevOps like this is considerably easier on Linux than on Windows.

However, like I said, tech cannot solve poor employee choices; setting up and following the systems without bypasses depends on the same employees, so there's nothing someone with malicious intent cannot undo.


People are complicated and unpredictable. Nobody can solve poor hires.

Interesting replies.

(1) I said avoid, not eliminate. The comments replying "no" are really saying "yes". Doesn't ransomware typically install an executable on the victim's machine? And won't the vast majority of those payloads run only on Windows?

(2) Desktop Linux actually exists. I use it exclusively, every day.

(3) Security by obscurity: a proverb that is almost exclusively applied incorrectly. I'm not talking about cryptographic security. I'm just asking whether it would help to lock the door and take down the "please rob me" sign.


You know the best way to avoid attacks against your computers?

Don't have any computers.

That's roughly the equivalent of what you're saying. A business cannot just throw out its entire software stack and start over, not to mention the complete lack of equivalents to a lot of Windows only software that's out there.

Even for individuals, it's often not reasonable to expect people to switch to a different platform and drop tools they've been using for a decade.


Those are not equivalent. Some businesses don't need tons of software, and can switch. Other businesses have not even started yet, and could avoid Windows from the start.

An extra backup is cheaper than switching software stacks, sure. But you're stretching the hyperbole pretty far.



Interesting. But, from the article: “So far, researchers have only seen Tycoon targeting Windows in the wild.” That was my point. The article you link to is an example of how you can avoid ransomware by not using Windows.


How would that avoid ransomware? For individuals that might work but it’s not scalable. It’s just security via obscurity.

There is both Mac and Linux ransomware out there.

When I graduated from college I stored all my projects on a magtape. About 10 years later I discovered that there were essentially no drives in existence that could read it, because 1) the drive that created it was hopelessly obsolete and 2) it had drifted so far out of spec that even a machine of the same make/model could not read the tape.

I threw it out, and have been soured on tape for long term storage ever since.

Fortunately, I had also downloaded a few files over the phone to a PDP-11 with an 8" floppy drive. I can no longer read those floppies, but I regularly copy all my files to new media every year or so.


Drive availability is the real downside of tape. You have to rotate tape libraries to the new media every few generations to avoid getting stuck-- LTO drives can only read the two previous generations of tape.

This is easy enough to do for tape libraries where you have robots to manage everything, but really dampens the effectiveness for personal backups.


According to the table in the article, improvement from 2010 to 2020 was 7.5×, from 1.6TB to 12TB. The projected improvements by 2030 are 32×, so I'm a bit skeptical.

It's too bad that cheaper drives can't be made with old technology. Perhaps by making much larger cassettes? Everywhere I've worked the total cost of a backup system built on tape was higher than buying HDDs due to the drive itself.


This doesn't seem like a fair comparison. Based on the table provided, 2010 retail is 1.6 TB, 2020 retail is 24 TB, and 2030 retail is 384 TB. That's a 15x improvement last decade, and a projected 16x improvement next decade.

Whether they're fudging the numbers for the current year to make that math work is up for debate, as it's 2020 and they aren't yet on LTO-9, with 24 TB drives.


7.5× in 10 years, 7.5× in another 10 years. That's 56× overall. If anything, it's a conservative estimate.

Can these LTO drives and tapes be used reasonably for a continuous backup system? What I want is something that just mirrors all disk writes to the tape so the tape ends up as essentially a log of all disk writes since the tape was inserted.

That could involve a lot of starting and stopping on the tape, which might be bad, so it would be acceptable to have some sort of buffering so that the tape only needs to be written when a lot of changes have accumulated.
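
A minimal sketch of that buffered variant: a periodic scan that only touches the tape once enough changes have piled up. Everything here (the /data source, the /dev/nst0 device, and the thresholds) is an assumption for illustration, not a recommendation:

    # Hypothetical buffered "disk writes to tape" sketch: rescan for changed files,
    # and only append to the tape (via tar) once enough changes have accumulated.
    import os, subprocess, time

    SOURCE = "/data"              # directory being mirrored (placeholder)
    TAPE = "/dev/nst0"            # Linux non-rewinding tape device (placeholder)
    FLUSH_BYTES = 50 * 1024**3    # only touch the tape after ~50 GB of changes
    SCAN_SECONDS = 600            # rescan every 10 minutes

    def changed_since(ts):
        for root, _, files in os.walk(SOURCE):
            for name in files:
                path = os.path.join(root, name)
                if os.path.getmtime(path) > ts:
                    yield path

    last_scan = 0.0
    pending, pending_bytes = [], 0
    while True:
        now = time.time()
        for path in changed_since(last_scan):
            pending.append(path)
            pending_bytes += os.path.getsize(path)
        last_scan = now
        if pending_bytes >= FLUSH_BYTES:
            # The non-rewinding device stays positioned after the last archive,
            # so each flush appends another tar file and the tape becomes a log.
            subprocess.run(["tar", "-cf", TAPE] + pending, check=True)
            pending, pending_bytes = [], 0
        time.sleep(SCAN_SECONDS)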


A number of older systems (zseries ztpf/nonstop/others) are used like this (basically a log dump on an interval).

LTO is designed to speed-match to avoid shoe-shining (rewind/rewrite), i.e. the drive changes the speed at which the tape moves past the heads to accommodate the read/write speed of the application. Which is one of those fun computer cases where you can literally hear how fast your application is running. The more recent drives have very large RAM buffers to smooth it all out too.


To some extent. https://superuser.com/questions/1121980/lto-tape-speeds https://www.quantum.com/globalassets/documents/lto-tech-brie... If these are representative, then you get about a factor of 3 to vary the speed, and the minimum is still way too fast to record most live drive use.

FWIW, an option I use at my home/office is a 2U 12-bay Supermicro box running FreeNAS, with extra hard drives, as the backup DST (destination).

Using periodic replication on the source (also FreeNAS/ZFS) and some scripts I wrote using IPMI to start up the DST box:

I have the DST machine downstairs at my house, and once a week it auto powers up, the replication automatically starts, and after about 12 hours it shuts down (10G uplink).

So outside of 12 hours a week the machine is powered off. Not the best solution, and it does have some security holes, but it was something cheap to back up a lot of data.


This is my approach as well. It's probably not great for the drives since they have to spin up every power cycle, but it's a lot more energy efficient than having a server running 24/7!

I honestly hope tape drives drop in price. I'd love to be able to shelf a lot of data at home.

LTO specifications update on a regular schedule, employing forced, regularized innovation: something like P times the capacity every Q years, and R times more effective storage with compression thanks to some new XYZ technology invented since the last update, so that corporate users can easily plan ahead for equipment upgrades.

That's not how innovation works, but they have been doing that for decades.


I like the top comment chain there where there's a conversation like this:

User A: Tapes are OK, but I wish there was new tech. Probably optical.

User B: Impossible to get this density with optical given wavelength requirements.

User C: In 2D yes, but maybe with 3D disc storage. The technology isn't there though.

User B: Consider a tape on its side...


> As reported by Chris Mellor of Blocks and Files, Fujifilm points to using Strontium Ferrite grains in order to enable an areal data density on tape of 224 Gbit-per-square-inch, which would enable 400 TB drives.

What keeps us from applying similar tech to HDD?


Hard drives are much higher density already. A quick search shows estimates of 1-2 Tbit/square inch for recent techniques like SMR or HAMR.

Remember that a tape has a huge amount of surface area compared to the small track area of a few discs. The material science aspects of that flexible tape flying by a helical scanning head are quite different from the circular tracks of a hard disk.


HDDs already have higher data density on their platters, in excess of 1,000 Gbit per square inch.

Has anyone ever tested their claimed storage life of 50 years?

The problem is you can only read it with an LTO drive 1-2 generations newer, so by that point all existing drives would likely be at least 30 years old...

Since LTO hasn't existed for 50 years, the key proposition can't be tested. The grandfathering clause is only incidental, because an older drive, assuming you could cable the SCSI up correctly, would work.

(I recently tried to get a Sun U10's DAT 9mm working; sourcing the SCSI cables was a nightmare. And then the drive refused to rotate under Unix mt <subcmd> patterns and ejected the tape unread.)

To be less flippant, most 50-year claims are tested by simulating a more intense service life, e.g. high heat and many rewind/re-read cycles, to simulate mechanical/thermal ageing. You cannot really test print-through from just "sitting there", except by sitting something unread on a shelf for 50 years.


I kind of wonder with drive tech if they know they could have gotten 400TB out of tape 10 years ago, but they were like.. Meh, we'll slowly scale it up so we can sell a new drive every year. The progression in the technology is just so smooth. Well I'm not one of the 200 or so people on planet earth who actually knows how to manufacture this kind of drive, so I guess I'll never know.

I was excited until I saw it was tape. Drives will probably be thousands or tens of thousands of dollars and then the individual tapes would probably be in the hundreds or more. Since this tech is almost exclusively sold to enterprises and priced accordingly.

Current prices are ~$20 for LTO-6, ~$50 for LTO-7, and under $80 for LTO-8. Try finding a disk drive that holds at least 12 TB (typically more like 25 TB) for 80 bucks.

Tapes are cheap. Tape drives are very expensive. I can buy 200TB of hard drives for about the price of an LTO-8 tape drive.

200 TB of LTO-6: drive ~$1,500 + ~50 to 80 tapes at $20 each = $2,500 to $3,100.

200 TB of disk: 25 × ~$150 8 TB disks = $3,750.

So not really. Tape is generally cheaper overall for any volume over 100 TB. The question is what you're storing and why. For archival and long-term conservation, tape is unbeatable.

I recently opened a case of 32 2 TB drives shelved since 2010 or so. These are Hitachi Ultrastar drives, the more reliable drives by a large margin. However, 3 of them were dead. Non-running disk drives actually fail as much as or even slightly more than running ones (I have hundreds of such drives in machines running 24/7 for the past 10 to 12 years).

OTOH, I've restored 120 LTO-2 tapes from 2002 recently. Only one had problems (and even then, most of the data was readable in the end).

Tape is a great offline storage medium. Disks aren't.


I wish there was a middle ground. A way I could archive around 10 TB for around $500. I'm basically stuck with using hard drives + frequent backups and ZFS scrubs. Which isn't such a bad setup, but I'd like to have something offline.

I've looked at M-Discs, but at ~$0.12 per gigabyte (for something completely non-rewritable), the price still isn't right.


>I recently opened a case of 32 2 TB drives shelved since 2010 or so. These are Hitachi Ultrastar drives, the more reliable drives by a large margin. However, 3 of these were dead.

HDDs can get contaminated by siloxanes or hydrocarbon vapours; the former can come from adhesive tape and turn into abrasive SiO2 particles on the head.

Only anecdata, but I used silicone 'covers' to protect a couple of backup HDDs in storage, and one failed within a year :(

https://www.jstage.jst.go.jp/article/trol/6/1/6_1_96/_pdf/-c...

Wonder if the helium filled drives would fare better in storage?


14TB drives were on sale for $200 recently. I picked up another 5.

15 drives * $200 = $3,000 for 210 TB total.

I also said LTO-8, not LTO-6. I can manage 15 drives. LTO-6 is 2.5 TB, which would require 84 tapes to equal the 210 TB of hard drives. 84 tapes is going to require tons of swapping. What is the price of an autoloading LTO-6 library?

I would consider LTO-8 since it is 12TB per tape but those are going to be even more expensive.


The cheapest LTO-8 drive I can find is around $3,000. I'd pay $80 for tapes if I could get the drive for a couple hundred, but I can't afford a drive that costs $3,000 plus $80 per tape that I need.

That's not even how much this new 400TB tech is going to cost. At least LTO8 is fairly recent apparently, but it's only 12TB, not 400TB.

My whole point is that this is out of reach for consumers, hence my disappointment. Even the older tech is out of reach.



Mentioned in the article. There's at least a decade delay between leading edge and commercialization.

Question: Is Amazon's “S3 Glacier Deep Archive” storage tape storage?

Price: $0.00099 per GB. Speed of retrieval: 48 hours for bulk, 12 hours for standard. First-byte latency: select hours.


> price: $0.00099

This is the advertised price on https://aws.amazon.com/glacier/, but when I click through to regions, every region is 4 to 5 times as much. Do you know if I'm missing something?


Any recommendations for a reliable LTO5 (or newer) autochanger that can be found on the used gear market?

Old Dell PowerVaults?

Be careful with them. If you get one whose firmware is locked, it can be impossible to unlock.

Speaking from experience. Though if it were to happen again these days, I'd probably just rip out its controller and replace it with an Arduino. :)


What causes a firmware lock?

It was a few years ago, but from rough memory the on device display itself asks for a password.

I could be mistaken though, and it might be more a web UI thing along the lines shown here:

https://www.dell.com/community/PowerVault/PowerVault-124T-Us...

In my case, I do remember being able to control the device using the physical controls on it, but being blocked when doing "other stuff".

I don't really remember what the "other stuff" was, though. The default admin/password combo, and the "how to reset the password" advice found online, definitely didn't help.


At least my Node.js projects will have all the space necessary.

I used to run TSM feeding into 6 LTO-6 drives off a GPFS cluster with snapshots; much fun.

Meta:

Should the format of these types of headlines be:

* Some Important Statement: Speaker (Fujifilm here)

Or should it be:

* Speaker: Some Important Statement

Is one more 'proper' than the other?


The colon just seems like a copula to me, and both "the sky is blue" and "blue is the sky" work. I guess putting the smaller thing first (like the quote) comes off as a little poetic?


