Offline storage is important if you care about things like ransomware.
The problem I find with disks for offline storage is that they're very easy to turn on and very easy to zero. Tapes on the other hand are much more difficult to get to if you're in the business of ransomware.
* Linear speed
Assuming you can feed the data to a tape drive, it will run at top speed until the tape is full. When you have a tape robot with 25+ drives, this means you can easily keep up with a 40/100 gig pipe.
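Back-of-the-envelope check of that claim (the per-drive rate is LTO-8's ~360 MB/s native speed; the 25-drive count comes from the comment above):

```python
# Rough check: can a 25-drive tape robot keep up with a 40/100 Gbit pipe?
DRIVES = 25
MB_PER_SEC_PER_DRIVE = 360          # LTO-8 native, approximately

total_mb_s = DRIVES * MB_PER_SEC_PER_DRIVE   # 9000 MB/s aggregate
total_gbit_s = total_mb_s * 8 / 1000         # ~72 Gbit/s

print(f"{total_mb_s} MB/s = {total_gbit_s:.0f} Gbit/s")
```

That lands comfortably above a 40 Gbit pipe and fills most of a 100 Gbit one, assuming you can actually keep every drive streaming.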
LTO tapes really are portable. They have been designed to stand up to dropping. The abuse our tapes would get when being transported by the understandably grumpy runners was considerable. We could always get the data out of them. Portable HDDs, not so much.
I found a tape drive on Newegg and it took the better part of a year to set it up (learning piecemeal that I needed this card and that cable, etc.). And then I finally set up the tapes to back up.
It's been years since I left the lab, and I hear that no one there knows how to read the data back from the tapes! I wish things were better than this!
My only anecdotal insight was that hard drives don't fail easily if they are continuously running. So I managed to hoard enough drives to keep at least two copies of the data I cared about, and used a program called MirrorFolder to make backups that were still easily accessible, without any software lock-in, by any grad student in the future.
360 megabytes per second for LTO-8. You can consistently match/beat that linear speed with two hard drives, each costing a tenth as much as a tape drive.
There was a time when a tape drive cost about the same as a floppy/Zip/etc drive. AKA a hundred to two hundred bucks.
In the early 1990s, QIC-80 tapes were pretty popular in certain circles as an exchange medium, since they fit a hundred or so floppies' worth of data in a package that was smaller and only a little thicker. Later, things like the 3080 and the like bumped the capacity into the GB range for similar pricing.
Back then pretty much every Novell/etc server in existence had a tape drive in the front being used as a backup mechanism. The sysadmin dropped a tape in on Friday and took it offsite on Monday (or whatever). In Unix land it was possible to completely single-step boot from tape and recover using things like AIX's mksysb. The death of tape wasn't really a death so much as a giant margin increase, where a drive that cost $200 suddenly cost $2000 and came with a Fibre Channel interface rather than IDE.
A 12TB drive is more expensive than LTO, even at volume.
2x 6TB drives are still more expensive.
However, that's not the use case, right? HDDs do general storage really well.
A tape robot is the place where your data goes to rest. It's the last line in your tiered storage.
In AWS land it goes RAM -> EBS -> S3 -> Glacier.
Tape is the same stage as Glacier.
You have a RAM cache (be it Redis et al., or some fancy block storage), then screaming-fast SSDs but only a few TB, then slower, cheaper SSDs with ten times the capacity. Then if you need lots of storage you'd have a few PB of spinny disks.
After that it goes to tape. It's denser, cooler, and far less expensive to run than the equivalent spinny-disk array.
Now, this only makes sense if you have data in the PB range, and it's more data than you can store online. Anything less and it doesn't make sense. It never did; it's just that the capacity of storage has ballooned so much in the last 15 years that people's perceptions haven't caught up.
LTO-8 drives are vastly more expensive than 12TB hard drives.
A pile of tapes, where the vast majority of the tapes are not in drives, is cheaper than hard drives. But then you're not getting those high speeds anymore.
Let's do some simplified math. 12TB hard drives for $26/TB, LTO-8 tapes for $9/TB, and tape drives for $3500.
Since you're saving $17/TB, you need 205TB before tape gets cheaper. So 17 tapes.
Each hard drive takes about 18 hours to fill. So 17 hard drives take 18 hours to fill, at a total speed of 3GB/s.
Each tape takes about 9 hours to fill. So 17 tapes take six and a half days to fill, at a total speed of 360MB/s.
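The arithmetic above as a quick sketch (the $26/TB, $9/TB, and $3500 figures are the simplified assumptions stated in this comment, not market prices):

```python
# Simplified tape-vs-HDD break-even, using the figures above.
HDD_PER_TB = 26          # $/TB for 12TB hard drives
TAPE_PER_TB = 9          # $/TB for LTO-8 media
DRIVE_COST = 3500        # $ for one LTO-8 drive
TAPE_TB = 12             # native capacity per tape

savings_per_tb = HDD_PER_TB - TAPE_PER_TB           # $17/TB
breakeven_tb = DRIVE_COST / savings_per_tb          # ~206 TB
tapes_needed = round(breakeven_tb / TAPE_TB)        # ~17 tapes

# Fill times: the HDDs fill in parallel, but the tapes queue
# through a single drive.
HDD_HOURS = 18           # hours to fill one 12TB drive
TAPE_HOURS = 9           # hours to fill one tape at 360 MB/s

hdd_wall_clock = HDD_HOURS                          # 18 h total
tape_wall_clock = tapes_needed * TAPE_HOURS         # ~153 h, over 6 days
```

The key asymmetry is in the last two lines: at break-even quantities, one tape drive serializes everything while the drives run in parallel.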
If you have enough tapes per drive to make tape cheaper than hard drives, it's slooowwww.
If you have fewer tapes per drive, it can be reasonably fast, but never as fast as hard drives.
Tape has a lot of advantages. But not speed.
Also, even in some ridiculous circumstance where your tape robots were free and you had to pay $7.2k per JBOD chassis, that still only bumps the cost to $36/TB, and the overall analysis doesn't change much.
$26/TB was an overestimate anyway, especially when you can get 5400RPM drives for $18/TB, and doing so only hurts your speed by about 20%.
I don’t know. I have this deep suspicion that as humans we are creating more and more bits that are just building up in the universe, like our plastic waste. At the moment there is obviously no major downside, but a part of me wonders if that will always be the case...
The cost benefit comes down to the equilibrium and how much storage you need.
1) An LTO-8 tape (approx 30TB compressed, 12TB native) is around $200 to $300 AUD (depending on brand), so that figure could only work if you're excluding the cost of the tape. LTO-7 (~15TB compressed) costs around $100 to $150 AUD and would probably be more suitable, but it's still a lot of money for what's essentially a consumable item.
Also, I would imagine that if the choice is between buying a tape that requires you to either go to a third party or invest thousands of dollars in a tape drive to use, or just buying an external HDD you can use at home and unplug when not in use (so cryptolockers etc. can't eat your backups), most people would go for the external HDD.
Unfortunately LTO-8 drives can't read LTO-6 or earlier (previously the rule of thumb was that a drive could read/write generation N-1 and read N-2, e.g. an LTO-7 drive could read but not write LTO-5 tapes), and it probably wouldn't make sense to invest in outdated drives capable of writing LTO-6 (~6.25TB compressed) or earlier, even if you could obtain the older LTO-6 or smaller media relatively cheaply. Vendors would not want to support it.
2) Most backup software I'm aware of is designed around recurring jobs (e.g. BackupExec), not one-off jobs, and to store data from a number of backup targets onto a single tape (or set of tapes).
You could work around this, but there's no off-the-shelf software for doing so that I'm aware of (I know some versions of Windows Server Backup can write direct to tape, but this would have to cope with a variety of source media and also make sure that jobs from different customers are never co-mingled on the same tape, etc.).
3) The cost of the hardware required to achieve a decent turnaround time on the tapes (depending on target time e.g. some customers might pay a premium for same-day conversion) would probably exceed what you're likely to earn selling this service. I doubt it would be financially viable, simply due to the low number of customers.
I think what you are looking for is LTFS.
* Map 1 to N HDDs to one customer (by USB port ID or perhaps SATA port if you shuck them for performance reasons, maybe a mix of SATA and USB drives) - will need to have some means of clearly indicating ports, or have blinky LEDs next to them and have the server indicate where stuff ought to be plugged in. And handle SATA hot-swap if required.
* Allocate a tape or tapes to them (and if they provide their own tape/s make sure to use specific slot/s in the tape library for that customer) and otherwise handle the tape library logistics. This becomes more complex if you have multiple libraries in play.
* Perform scheduling and ETA estimation to see if you'll actually get their data onto the tape within the deadline (you might offer a turnaround of a day, but offer a premium service to have it done within N hours) and see if it will affect other customers. This may even involve deciding that an existing currently running job is early enough that you might stop it and scratch the tapes, then run the priority job instead.
* Handle predictive maintenance, drive cleaning, etc. for the tape library (and how it will affect customer deadlines / capacity planning)
* Actually copying the data onto the tape/s and verifying the result
* Integrate into a monitoring and notification system so customers are alerted when their order is ready to pick up, talk to customer billing/sales systems, etc.
* Probably a bunch of other stuff I haven't thought of yet
The system would generally have to be resilient enough that it can run unattended overnight / for long periods of time, but flexible enough in scheduling that it can handle sudden rush orders, drives being offline for maintenance, and so on. And ideally could be operated by staff with minimal technical background.
Maybe Iron Mountain or someone like that have this sort of system already, but I am not aware of any commercially or generally available software that could do the job.
Most people are perfectly served by the HDDs/SSDs that ship with their computers, and the minority that isn't often buys one or two external HDDs of the newest generation. That's enough for the vast majority of the population.
Very few private individuals have data on the order of dozens of TB, the point where HDDs become so expensive that you're motivated to use the user-unfriendly tapes.
Why are tapes user-unfriendly? For each read/write operation you need to visit the copy shop. This limits the use cases to things like backups and archival. But you can't really use them for archival either: if you don't archive a drive together with your tapes, who guarantees that a copy shop will exist in 10 years with drives that can read such old tapes? Tape drives have limited read compatibility with older standards: https://en.wikipedia.org/wiki/Linear_Tape-Open#Compatibility
But maybe there is a niche for it somewhere, who knows.
Tape only makes sense when you have a tape robot: a big rack-type thing that has more than 5 drives and thousands of slots.
Unless you are on a very tight power/heat budget and have a brilliant asset-management/storage system, tape isn't the answer.
Pick any budget. For that budget, hard drives are faster than tape.
Tape's advantages are elsewhere.
There are other solutions to this problem.
Specifically, if your cloud storage provider performs ZFS snapshots for you, those snapshots are immutable and cannot be altered or destroyed by the user login.
I'm sure there are similar mechanisms available in S3 and other cloud storage implementations (albeit, perhaps not using ZFS).
For say 200TB of backup over 5 years:
Rsync.net: $0.015/GB/month * 200,000GB * 60 months = $180,000
LTO-8: $3600 for a drive, $5000 for 20x 12TB tapes = $8600
Rsync.net is 21x the price of tape.
The breakeven point is surprisingly low as well. A single tape costs $140, so the minimal entry cost for tape is $3740 for 12TB. For $3740 you only get 249 TB-months at Rsync.net, which is equivalent to 4.2TB for 5 years. So if you're storing more than 4.2TB of data, it's cheaper to buy an entire tape drive than to use Rsync.net.
Rsync.net seems like a great service but that pricing is really crippling. Even the "discounted" Attic pricing just barely beats a single tape, at 12.5TB for $3740.
And before the "but this is an online service" argument comes up, Hetzner offers a 40TB dedicated server for 64€/month, or 0.0016€/GB-month. That's cheaper and comes with an entire dedicated server thrown in.
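For anyone who wants to rerun the comparison with their own figures, here is the arithmetic above in sketch form (all prices are the ones quoted in this comment and will drift over time):

```python
# Rsync.net vs LTO-8 over 5 years, using the figures quoted above.
RSYNC_PER_GB_MONTH = 0.015
MONTHS = 60

def rsync_cost(tb):
    """Cost of storing `tb` terabytes at Rsync.net for 5 years."""
    return RSYNC_PER_GB_MONTH * tb * 1000 * MONTHS

cloud_200tb = rsync_cost(200)        # $180,000
tape_200tb = 3600 + 5000             # drive + 20x 12TB tapes = $8,600
ratio = cloud_200tb / tape_200tb     # ~21x

# Minimum buy-in for tape: one $3600 drive + one $140 tape.
entry = 3600 + 140                   # $3,740
breakeven_tb = entry / (RSYNC_PER_GB_MONTH * 1000 * MONTHS)   # ~4.2 TB
```

Anything above `breakeven_tb` of stored data and the whole drive pays for itself versus the cloud rate.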
I suspect there's a not-insignificant number of people out there who are glad they didn't make the encryption on their old data very strong (and likewise, those who regret having used very strong encryption and lost the key.)
You can't really prove that. There are physical limits to computation improvements.
Normal delete operations on common file systems don't wipe the stored data with zeros or random data either. They just mark the space as unused and make the directory entries invisible. Any tool that can read the disks directly can see the data even after it's deleted (though it may have trouble reconstructing it, depending on the fragmentation of the data before deletion and the reuse of the freed-up space). There are "undelete" tools available for different file systems that can try to recover your data after deletion (some additional conditions apply).
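A related toy illustration (POSIX-specific, and only an analogy for the on-disk case): unlinking a file removes the directory entry, not the bytes, so an already-open handle can still read everything back.

```python
import os
import tempfile

# Write a file, keep it open, then "delete" it.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "w+b") as f:
    f.write(b"secret payload")
    os.remove(path)                  # directory entry gone...
    assert not os.path.exists(path)
    f.seek(0)
    data = f.read()                  # ...but the bytes are still there

print(data)
```

The real on-disk situation is similar in spirit: freeing blocks makes them invisible to the filesystem, not gone from the platters.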
For cloud-based backup providers, I doubt any of them would spend time and energy wiping your data areas clean just because you deleted the data or asked for it to be deleted (since storage is shared across customers, wiping your data clean would have a negative impact on other customers' data-access speed too). They would just let the OS do its job, as described above.
You can run free tools to wipe a magnetic hard drive with zeros, run multiple passes, etc. With SSDs, that's not reliable because of the additional abstractions used and presented by the controller to the OS.
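A minimal sketch of what such a single zero pass looks like, demonstrated against a throwaway image file rather than a real device (for a real disk you'd open the block device with appropriate privileges and get its size differently, and for SSDs you'd prefer the controller's secure-erase command instead, for the reasons above):

```python
import os

def zero_wipe(path, block_size=1 << 20):
    """Overwrite a file image with zeros, one pass."""
    size = os.path.getsize(path)
    zeros = b"\x00" * block_size
    with open(path, "r+b") as f:
        written = 0
        while written < size:
            chunk = min(block_size, size - written)
            f.write(zeros[:chunk])
            written += chunk
        f.flush()
        os.fsync(f.fileno())   # push past the page cache to stable storage

# Demo on a 9000-byte throwaway file:
with open("disk.img", "wb") as f:
    f.write(b"sensitive" * 1000)
zero_wipe("disk.img")
```

Note this says nothing about data the SSD controller has remapped away from the logical block addresses, which is exactly why overwriting is unreliable there.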
But yes, if you wanted to also delete snapshots within your .zfs/snapshot then you would need to call or email us. Our support folks would be happy to help with that.
However, if they were shipping to tape and used an offsite tape storage facility, then it is just a matter of time to get the tapes loaded. I'll settle for that.
ZFS is a really good FS. Really, really good. Archives should be offsite and disconnected, though.
There are companies now that sell drives with special WORM firmware.
I love tape but the economics of tape really just don't look good today. The patent litigation over LTO8 basically assured that. Whole nearline SATA drives are significantly less expensive per byte than LTO8 tapes, even ignoring LTO drive/library costs.
In Germany, using non-bulk prices, HDDs seem to start at 19€ per TB while I could find multi-write LTO8 tapes costing 8€ per (compression disabled) TB.
That is a lot of tapes to recover the cost savings over HDD.
It's even worse than that. The SD card spec provides for a "write protect notch" and corresponding physical switch. On the card itself.
But, it's optional and it's not implemented by the card. Instead it's up to the host software to honor it.
USB thumb drives should also have a physical switch. But they don't.
I'm from the old school of "no ring, no write" on 9 track tape. Saved many people's asses back in the day. But that knowledge has been lost in time ... like tears in rain.
Malware can operate entirely in memory, not touching disk. If access is lost, reinfection can take place (has been used by nation state actors). Even the Mirai botnet which was done by amateurs didn't persist and simply reinfected. So it comes down to your potential adversaries. Actors deploying wide, non-targeted attacks Mirai-style get little value from persistence. It's simply an extra layer of unneeded complexity.
It can do a lot in the right circumstances. E.g. if a USB thumb drive had a physical write-protect, it could be plugged into a potentially hostile computer without concern about losing the data on the drive.
Years ago I saw a few USB thumb drives with that feature. But I don't know if they're out there any more. They're almost as scarce as hens' teeth.
Like what? I’d love to check this out.
I have no experience with them-- I've just assumed it would be overpriced.
I'd certainly be interested in worm drives for personal use (e.g. for access logs, and photo archival), but not so interested that I'd pay a big premium over standard drives. :)
If tapes are worth it for someone -> show referal links to some trusted tape provider partner based on region
Otherwise, show good deals on hard drives, per region. And recommendations on QNAP/Synology enclosures.
Early in my career, I worked at a place that had to retain certain records for up to 100 years. At the time (circa 1999), they had a big tape silo solution that probably cost something like $10M/year (or more) to own and staff. Today, capacity for "cool" data is essentially unlimited and near-free. You could host all of that data on a small device locally (if needed), and retain the data (and outsource the engineering for long-term retention) in one or more cloud services for <$100k/year.
Of course, that doesn't make huge tape storage useless, just less useful as a format for accessing a single giant archive. One answer would have it that not using the whole capacity is OK, and the goal should be to archive a smaller dataset more often. But it's definitely something that demands real engineering work, while a lot of organizations can barely manage to plan for any backup.
Tape is a very large tank accessed through a very thin (and expensive) straw. But storage quantities offered are immense and reliable.
Addressing your concerns, increase the available heads if access is constrained. Tape libraries provide for this.
If you have a decent connection, cloud storage is hard to beat.
At $4/TB, why would I pick this over Backblaze B2, which is $5/TB (basically the same) but works like normal live storage instead of taking hours to retrieve (or minutes if you pay extra)?
Moreover, the download cost from Glacier to AWS is $0.02/GB plus at least $0.003/GB in retrieval costs (or $0.09/GB down to $0.05/GB to the internet), while the download cost from Backblaze to the internet is only $0.01/GB, and is free to many members of the Bandwidth Alliance (notably including Cloudflare).
I'd have to be looking at many petabytes of data before the minuscule cost savings of Glacier data that I actually don't touch is worth it. Or I guess if the data was already in AWS and the egress to Backblaze was going to be too expensive to be worth it.
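The trade-off above in sketch form (all per-TB prices come from the figures quoted in this comment; the 10TB size and one-restore-per-year rate are made-up illustrative assumptions):

```python
# Glacier vs B2: storage is slightly cheaper on Glacier, but
# egress tips the balance as soon as you actually download.
GLACIER_STORE = 4.0      # $/TB/month
B2_STORE = 5.0           # $/TB/month
GLACIER_EGRESS = 90.0    # $/TB to the internet ($0.09/GB, plus fees)
B2_EGRESS = 10.0         # $/TB to the internet ($0.01/GB)

def yearly_cost(store_rate, egress_rate, tb, restores_per_year):
    """Storage for 12 months plus full-restore egress."""
    return store_rate * tb * 12 + egress_rate * tb * restores_per_year

# Hypothetical 10 TB backup set, one full restore per year:
glacier = yearly_cost(GLACIER_STORE, GLACIER_EGRESS, 10, 1)   # $1380
b2 = yearly_cost(B2_STORE, B2_EGRESS, 10, 1)                  # $700
```

With even one restore a year, the cheaper storage rate is swamped by the egress difference at this scale.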
Does Backblaze B2 have multi-site redundancy? We have our S3 buckets automagically replicated to a different AZ. That was a must have since we're highly regulated, and our backups need to persist for over 7 years.
What does that even mean?
If I don't have to restore, /dev/null will be even cheaper!
On the other hand, the vast majority of Windows installations are either regular home users (with no knowledge of how to properly set up their computer, often with updates disabled because they read on the internet that updates are bad) or corporate desktops (which might be secured better, but are still at the mercy of a user clicking on everything).
Windows Server administrators are a lot more likely to be "certified" and "trained" via MS-approved courses than Linux admins, and of course that doesn't help.
I would say it's more likely to find an average joe (from a devops perspective) running a Linux box than a Windows box.
That's a whole different kind of average joe. A mediocre tech person is still an order of magnitude more qualified to take care of a machine than a mediocre user.
It's hard to keep track of the current state of things, but issues that come to mind are, all of the design issues with X, such as poor isolation of applications, X server typically running as root.
Desktop apps are typically not deployed with SElinux.
Compromised apps can generally access all of the users' data.
Something like ChromeOS on the other hand, is pretty secure, but that is a whole different beast.
This guy had the patience to put a time bomb onto their system and not trigger it until it was on all of the backups. How does someone have the patience to do something like that and yet do something so petty when they have a month to think about it? The older I get the less sense it makes, and it didn’t make much sense even then.
Years later another person did a similar thing to a different victim, hacked the kernel on the machine to clone all conversations. Published them, and then published the new conversations after the admins restored from the tainted backups. Surprise, motherfuckers!
Both sites went offline for months.
It’s easier for me to imagine doing something like this on a fishing expedition. You’re gonna try with a dozen potential victims and then enjoy the two who get bit the hardest. Vendetta is also possible, but more fanciful, and can often be mitigated by not being an asshole in the first place. Also, motive? Easier to catch a jilted lover than a serial killer.
In part because of these experiences, I had been known from time to time to keep a secret cold backup of the 10% of our assets that would be hardest to replace. But if that were ever stolen? Oof. I’d be fired so fast they’d throw my belongings out the window, so I stopped at some point. Git provides a little bit of that now anyway.
On a general note, infra could be more like cattle than pets, perhaps? If configuration is applied via version-controlled and reviewed automation (Ansible, Helm, hand-crafted YAML...), perhaps that could have been a mitigation? DevOps like this is considerably easier on Linux than on Windows.
However, like I said, tech cannot solve poor employee choices; setting up and following the systems without bypasses depends on those same employees, so there's nothing someone with malicious intent cannot undo.
(1) I said avoid, not eliminate. The comments replying “no” are really saying “yes”. Doesn’t ransomware typically install an executable on the victim’s machine? And won’t the vast majority of those payloads run only on Windows?
(2)Desktop Linux actually exists. I use it exclusively, every day.
(3) Security through obscurity: a proverb that is almost exclusively applied incorrectly. I’m not talking about cryptographic security, just asking whether it would help to lock the door and take down the “please rob me” sign.
Don't have any computers.
That's roughly the equivalent of what you're saying. A business cannot just throw out its entire software stack and start over, not to mention the complete lack of equivalents to a lot of Windows only software that's out there.
Even for individuals, it's often not reasonable to expect people to switch to a different platform and drop tools they've been using for a decade.
An extra backup is cheaper than switching software stacks, sure. But you're stretching the hyperbole pretty far.
I threw it out, and have been soured on tape for long term storage ever since.
Fortunately, I had also downloaded a few files over the phone to a PDP-11 with an 8" floppy drive. I can no longer read those floppies, but I regularly copy all my files to new media every year or so.
This is easy enough to do for tape libraries where you have robots to manage everything, but really dampens the effectiveness for personal backups.
It's too bad that cheaper drives can't be made with old technology. Perhaps by making much larger cassettes? Everywhere I've worked, the total cost of a backup system built on tape was higher than buying HDDs, due to the cost of the drive itself.
Whether they're fudging the numbers for the current year to make that math work is up for debate, as it's 2020 and they aren't yet on LTO-9, with its promised 24TB tapes.
That could cause a lot of starting and stopping of the tape, which might be bad, so it would be acceptable to have some sort of buffering so that the tape only needs to be written once a lot of changes have accumulated.
LTO is designed to speed-match to avoid shoe-shining (rewind/rewrite). AKA the drive changes the speed at which the tape moves past the heads to accommodate the read/write speed of the application. It's one of those fun computer cases where you can literally hear how fast your application is running. The more recent drives also have very large RAM buffers to smooth it all out.
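A toy model of why speed matching matters. The repositioning penalty, buffer size, and speed range here are made-up illustrative numbers, not LTO specifications:

```python
def effective_throughput(app_mb_s, min_tape=112, max_tape=360,
                         reposition_s=30, buffer_mb=1024):
    """Crude model: if the drive can slow down enough to match the
    application it streams continuously; otherwise it bursts a buffer
    to tape at its minimum speed, then stops and repositions
    (shoe-shines) while waiting for the buffer to refill."""
    if app_mb_s >= min_tape:
        return min(app_mb_s, max_tape)       # speed-matched streaming
    write_time = buffer_mb / min_tape        # burst the buffer to tape
    refill_time = buffer_mb / app_mb_s       # wait for the application
    cycle = write_time + reposition_s + refill_time
    return buffer_mb / cycle

fast = effective_throughput(200)   # within the speed-match window: 200 MB/s
slow = effective_throughput(20)    # below it: shoe-shining drags it lower
```

The non-obvious result is that feeding the drive too slowly makes the effective rate drop below even the application's own rate, because repositioning time is pure loss.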
Using periodic replication on the source (also FreeNAS/ZFS) and some scripts I wrote using IPMI to start up the DST box:
I have the DST machine downstairs at my house, and once a week it auto powers up, the replication starts, and after about 12 hours it shuts down (10G uplink).
So outside of 12 hours a week the machine is powered off. Not the best solution, and it does have some security holes, but it was something cheap to back up a lot of data.
That’s not how innovation works, but they have been doing it for decades.
User A: Tapes are OK, but I wish there was new tech. Probably optical.
User B: Impossible to get this density with optical given wavelength requirements.
User C: In 2D yes, but maybe with 3D disc storage. The technology isn't there though.
User B: Consider a tape on its side...
What keeps us from applying similar tech to HDD?
Remember that a tape has a huge amount of surface area compared to the small track area of a few platters. The materials science of a flexible tape flying past the head assembly is quite different from the circular tracks of a hard disk.
(I recently tried to get a Sun U10's DAT drive working; sourcing the SCSI cables was a nightmare. And then the drive refused to respond to UNIX mt <subcmd> commands and ejected the tape unread.)
To be less flippant, most 50-year claims are tested with an accelerated service life, e.g. high heat and many rewind/re-read cycles, to simulate mechanical/thermal ageing. You cannot really test print-through from just "sitting there" except by leaving something unread on a shelf for 50 years.
200 TB of disk: 25 * ~$150 8TB disks = $3750
So not really. Tape is generally cheaper overall for any volume over 100 TB. The question is what you're storing and why. For archival and long-term conservation, tape is unbeatable.
I recently opened a case of 32 2TB drives shelved since 2010 or so. These are Hitachi Ultrastar drives, among the more reliable drives by a large margin. However, 3 of them were dead. Non-running disk drives actually fail as much as, or even slightly more than, running ones (I have hundreds of such drives in machines running 24/7 for the past 10 to 12 years).
OTOH I've recently restored 120 LTO-2 tapes from 2002; only one had problems (and most of its data was readable in the end).
Tape is a great offline storage medium. Disks aren't.
I've looked at M-Discs, but at ~$0.12 per gigabyte (for something completely non-rewritable), the price still isn't right.
HDDs can get contaminated by siloxanes or hydrocarbon vapours, the former can come from adhesive tape and turn into abrasive SiO2 particles on the head.
Only anecdata, but I used silicone 'covers' to protect a couple of backup HDDs in storage, and one failed within a year :(
Wonder if the helium filled drives would fare better in storage?
15 drives * $200 = $3000, for 210TB total.
I also said LTO-8, not LTO-6. I can manage 15 drives. LTO-6 is 2.5TB, which would require 84 tapes to equal the 210TB of hard drives; 84 tapes is going to require tons of swapping. What is the price of an auto-switching LTO-6 library?
I would consider LTO-8 since it is 12TB per tape but those are going to be even more expensive.
That's not even how much this new 400TB tech is going to cost. At least LTO8 is fairly recent apparently, but it's only 12TB, not 400TB.
My whole point is that this is out of reach for consumers, hence my disappointment. Even the older tech is out of reach.
Price: $0.00099/GB/month. Speed of retrieval: 48 hours for bulk, 12 hours for standard. First-byte latency: select hours.
This is the advertised price on https://aws.amazon.com/glacier/, but when I click through to regions, every region is 4 to 5 times as much. Do you know if I'm missing something?
Saying this from experience. Though if it were to happen again these days, I'd probably just rip out its controller and replace it with an Arduino. :)
I could be mistaken though, and it might be more a web UI thing along the lines shown here:
In my case, I do remember being able to control the device using the physical controls on it, but being blocked when doing "other stuff".
Not really remembering what the "other stuff" was though. The default admin/password combo, and "how to reset the password" stuff found online definitely didn't help.
Should the format of these types of headlines be:
* Some Important Statement: Speaker (Fujifilm here)
Or should it be:
* Speaker: Some Important Statement
Is one more 'proper' than the other?