
Backblaze lost over 13TB of my data - ValentineC
https://www.reddit.com/r/DataHoarder/comments/6j36o2/well_backblaze_lost_over_13tb_of_my_data/
======
thaumaturgy
So much facepalm here. A Backblaze rep is in that thread confirming that they
do have a failure case where they can lose backup data -- or at least, it'll
become inconsistent. Ok, that's bad. I don't think they're idiots, but it'd be
interesting to know what's so technically challenging about the index problem
that they haven't been able to solve it.

But ... 17 TB total? For _online_ backups? That's just not what they're
designed for, unless you've got one hell of an internet connection. The
complainant says it's going to take over a year to re-upload the 13TB. That's
insane.

I think you have to give up on online backups for that use case and settle for
a pretty beefy home rig in a fire-resistant box. Moving that kind of data up
and down an internet connection just doesn't work real well yet.

~~~
atYevP
Yev from Backblaze here - as far as the technical aspects, it's tough for me
to say b/c I don't know what happened in this case, but we do have use-cases
where we need to "Freeze" backups (more info:
[https://help.backblaze.com/hc/en-us/articles/217666178-Safet...](https://help.backblaze.com/hc/en-us/articles/217666178-Safety-Freeze-Your-Backup-is-Safety-Frozen-)). I would
venture to guess that since this occurred on their C: drive, something similar
to that happened here, and they needed to re-upload the data (in most cases,
previously uploaded data is still available for restore - but with 13TB of
data to re-upload and not a lot of bandwidth, it would have taken them way
longer than our 30-day history to do that).

------
chx
If you want to store data as cheap as possible online, you are looking at 4
EUR / TB / month of capacity.
[https://www.reddit.com/r/seedboxes/comments/6d1h34/oneprovid...](https://www.reddit.com/r/seedboxes/comments/6d1h34/oneprovider_advertises_2x2tb_hw_raid_unltd_1gbps/di44nzg/)

To lower that with a significant up front cost, you could buy the drives
yourself, perhaps shucking external ones and then send them over to Vapornode
[https://vapornode.com/storage-vps](https://vapornode.com/storage-vps) so they
can host it at 10 USD a month per disk. If you have a Fry's close by, they
have 8TB units for 200 USD; Best Buy recently had them for 180, but I think that
ended. So that's 200 up front + 10 / month vs 4.5 * 8 = 36 USD / month (taking
4 EUR as roughly 4.5 USD): you save 26 USD a month, so it takes eight months for
this to become cheaper. I would say it's not worth it because replacing dead
disks is on you, but it's up to you. Also, getting many smaller disks makes for
better RAID.
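
A rough sketch of that break-even arithmetic, using only the figures quoted above (200 USD per disk, 10 USD a month hosting, 36 USD a month rented). The function name and structure are illustrative assumptions, and drive failures are ignored:

```python
import math

def break_even_months(upfront_usd, own_monthly_usd, rented_monthly_usd):
    """First whole month at which buying + hosting is cheaper than renting.

    Returns None if owning never becomes cheaper.
    """
    monthly_saving = rented_monthly_usd - own_monthly_usd
    if monthly_saving <= 0:
        return None
    return math.ceil(upfront_usd / monthly_saving)

rented_per_month = 4.5 * 8  # 8TB rented at 4.5 USD / TB / month
print(break_even_months(200, 10, rented_per_month))
```

A cheaper disk (180 instead of 200) shortens the payback slightly, but a single dead disk inside that window pushes it back by the price of a replacement.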

Finally, lowendtalk regularly has discussions on bring your own disk
providers. You need to know what you are buying but it's doable.

~~~
Veratyr
You can get lower than that even, if you need a lot of data.

Buy a used SC846, which costs ~$900 inclusive of rails and a SAS2
backplane upgrade. Then buy drives. You can get 8TB drives for ~$230 (Seagate
Archive or shucked Reds) or $0.027/GB. Add cheap colocation for say $70/mo.

Assuming the hardware lives about 5 years, total cost is $900 + (70x12x5 =
$4200) + num_gb x 0.027.

If you fill all the bays you're looking at 192TB for 5 years for $10284, or
$171.4/month or $0.90/TB/mo.
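
The totals above can be re-derived with a short sketch. The 24-bay count is an assumption inferred from 192TB / 8TB per drive, and the parameter names are made up for illustration; the prices are the ones quoted in the comment:

```python
def colo_cost(bays=24, tb_per_drive=8, chassis_usd=900,
              colo_usd_per_month=70, usd_per_gb=0.027, years=5):
    """Total and per-unit cost of a self-owned, colocated storage box."""
    months = years * 12
    capacity_tb = bays * tb_per_drive
    drives_usd = capacity_tb * 1000 * usd_per_gb   # decimal GB per TB
    total_usd = chassis_usd + colo_usd_per_month * months + drives_usd
    return {
        "capacity_tb": capacity_tb,
        "total_usd": round(total_usd, 2),
        "usd_per_month": round(total_usd / months, 2),
        "usd_per_tb_month": round(total_usd / months / capacity_tb, 3),
    }

print(colo_cost())
```

The per-TB figure only holds when every bay is full; with half the bays populated the fixed chassis and colocation costs are spread over half the capacity.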

~~~
chx
Well, yes, but that's just one machine and you are responsible for keeping it
running.

------
user5994461
>>> It took 8 months or so for my initial backup and will definitely take over
a year with the added data now [upload 13 TB again] so I just cancelled and
they gave me a refund for the last payment.

>>> So, all in all just a waste of time and money.

Dude wanted to back up 18 TB of data on a home connection until he realized
it wasn't going well.

Messaged the customer support to complain => he got a timely reply, then
decided to leave the service and got a refund.

That looks like good service to me.

P.S. I feel sorry for companies who have to deal with that kind of customer.

~~~
arsoon
Also from what I can tell, he didn't lose any data - he just lost the ability
to continue his backup, and had to start it again.

I mean it's not an ideal circumstance, but not the end of the world. He really
just needs a faster connection to support that much data.

------
capnrefsmmat
I wonder what the overhead is like on 13TB of data.

I have about 80GB with Backblaze right now -- just documents, photos and music
on my laptop -- and the metadata it stores in /Library/Backblaze.bzpkg takes
up 46GB, or about half the size of the actual backup. Most of this is in
uncompressed plain-text log files showing every file which has been backed up,
going back years. On a 256GB SSD that overhead hurts.

I asked their support, but their only solution was to delete the backup,
reinstall the client, and start from scratch, which seems... less than
optimal.

(On the other hand I do understand the impulse to squirrel away terabytes of
data... I now make a point of archiving all of my mail accounts and PDFs of
papers I read, knowing that with good search, they'll be valuable for years to
come. But I haven't quite read 13TB worth of papers yet.)

~~~
bluedino
Have other BackBlaze users reported similar amounts of metadata? That's
enormous.

------
bane
I'm really surprised by how many people think 17TB is a lot of data. It's
what...<$400 in storage? [1]

Crap, just between the main drives on computers in my house I easily have
20-30TB of raw disk space.

It might be a lot to transfer...yeah. At 50Mbps that's what...a month of
sustained transfer? But how valuable is your data?

1 -
[https://www.newegg.com/Product/Product.aspx?Item=N82E1682217...](https://www.newegg.com/Product/Product.aspx?Item=N82E16822178951&cm_re=external_hard_drive-_-22-178-951-_-Product)
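
For anyone checking the transfer estimate, here is a minimal sketch. It assumes decimal units (1 TB = 10^12 bytes), a fully saturated link, and no protocol overhead; the function names are illustrative:

```python
SECONDS_PER_DAY = 86_400

def transfer_days(size_tb, link_mbps):
    """Days of sustained transfer needed to move size_tb over link_mbps."""
    bits = size_tb * 1e12 * 8
    return bits / (link_mbps * 1e6) / SECONDS_PER_DAY

def required_mbps(size_tb, days):
    """Sustained link speed needed to move size_tb in the given days."""
    bits = size_tb * 1e12 * 8
    return bits / (days * SECONDS_PER_DAY) / 1e6

print(round(transfer_days(17, 50), 1))   # 17TB at 50Mbps
print(round(required_mbps(13, 365), 2))  # 13TB spread over a year
```

The second call gives the sustained upload rate implied by the "over a year for 13TB" estimate in the original Reddit post.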

~~~
jbob2000
What the heck are you storing that you've used 20-30TB? I don't really get it.

UHD Movies? Why would you save those in the age of streaming? Family
pictures/videos? At 20TB, you must have hundreds of thousands of pictures.
There's not enough time in your life to look at all those! The only thing I
can think of is that you shoot photo/video professionally, or maybe do some
kind of 3D/video rendering?

But I agree with you - when you have this much data, why the hell are you
transferring it to the cloud? Buy a couple of hard drives, load them up, and stick
them in a safe.

~~~
bane
Who said I'm storing anything? A handful of machines with the cheap default
spinning rust HDs in them and you get to 30TB very quickly. A typical family
of three probably has double-digit TB just in their Costco-purchased laptops.

Heck, my _parents_ have >10TB of raw storage between their home and work
computers. Even though they aren't data hoarders, between software, important
documents and email and videos and pictures of grandkids and friends they
probably easily sit at 4 or 5 TB of total storage that deserves backup.

------
cmurf
Seed drive service. ~20 hours to copy 10TB of local data to a few drives, a few
days to ship, ~20 hours to copy the data to the cloud. But I guess some people
think that cost is greater than 8 months of uploading? Whatever...not my data.

------
therobot24
I've stated this in a previous thread, but I had used Backblaze on a few
computers and found it was taking up almost 8GB of space on my laptop for the
backup list of files. This seemed very odd to me, so I reached out to
Backblaze, and the response was to make a new account and re-upload
everything. Given the time it takes to re-upload everything (I didn't quite
have 13 TB, but I had enough), I just terminated my account.

------
JohnJamesRambo
Has anyone tried Sia? From what I've read it is an order of magnitude cheaper
form of cloud storage but I don't know anyone that has tried it.

~~~
imaginenore
Maybe a binary order of magnitude.

Sia = $2/TB/month.

Amazon Glacier = $4/TB/month.

Time4VPS Storage = $4/TB/month.

Backblaze B2 = $5/TB/month.

~~~
prirun
I just checked out Time4VPS. Their storage VPSs have a 200 IO/s limit. At the
typical 4K per IO, that's 800K bytes/sec (about 6.4Mbit/s), or 2.1TB per month. It's a
little misleading that they advertise a 10TB bandwidth limit per month on a
100Mbit/s port, when the IO limit kicks in way before that.
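
A quick sketch of that ceiling, assuming 4,000 bytes per IO (which is what yields the 800K bytes/sec figure), a 30-day month, and every IO being a full-size transfer. The function name is illustrative:

```python
def iops_ceiling(iops, bytes_per_io=4000, days=30):
    """Throughput implied by an IO/s cap, ignoring caching and merging."""
    bytes_per_sec = iops * bytes_per_io
    return {
        "kb_per_sec": bytes_per_sec / 1e3,
        "mbit_per_sec": bytes_per_sec * 8 / 1e6,
        "tb_per_month": bytes_per_sec * 86_400 * days / 1e12,
    }

limit = iops_ceiling(200)
print(limit)
# Fraction of the advertised 10TB monthly allowance that is reachable:
print(round(limit["tb_per_month"] / 10, 2))
```

Larger IO sizes raise the ceiling proportionally, so whether the cap actually bites depends on how the provider counts an IO.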


------
madebysquares
If you have 17TB of data, where is the best place to back up that much data,
other than redundant backup drives? Is there a suitable cloud solution for
this? If not, is there enough demand to build one? I don't own that much data. I
can't imagine having to keep track of that much data; I hate having more than 1
TB of data to keep track of.

~~~
atYevP
Yev from Backblaze here -> Cloud Storage services are better suited for that
much data. I wrote a post recently where I went into the differences between
Sync, Backup, and Storage as it relates to the cloud ->
[https://www.backblaze.com/blog/sync-vs-backup-vs-storage/](https://www.backblaze.com/blog/sync-vs-backup-vs-storage/).

I'm not sure what happened in this user's case - but my guess is that
something happened on their machine where our backup client decided that data
on the machine didn't match with what is in the DC and restarted backing up
the data so that we'd have a clean copy, resulting in 13TB of data getting re-
scheduled for backup.

~~~
mistermann
For a backup service, you seem strangely unconcerned about what went on here.

~~~
atYevP
I wouldn't say that, it's just that I don't have a lot of data about this
particular case to go on. In some cases if you overwrite our index files, you
can get something that's known as a Safety Freeze. Again - I have no idea what
happened with this customer, but I would venture to guess that since this
occurred on their C: drive, something similar to what causes a Safety Freeze
([https://help.backblaze.com/hc/en-us/articles/217666178-Safet...](https://help.backblaze.com/hc/en-us/articles/217666178-Safety-Freeze-Your-Backup-is-Safety-Frozen-)) happened
here, and they needed to re-upload the data (in most cases, previously
uploaded data is still available for restore - but with 13TB of data to re-
upload and not a lot of bandwidth, it would have taken them way longer than
our 30-day history to do that).

------
JohnJamesRambo
I just wonder what someone could have 17TB of at home. In 2010 the Library of
Congress text database was 20TB. I know videos can take up some room but do
people back up their video libraries to the cloud?

~~~
quadrature
A small blu-ray ripped movie collection could do it. Also the subreddit is
called "DataHoarder".

~~~
rys
Coincidentally I just ripped my Blu-ray collection. 82 films, 542GB (x264 with
a high profile encode). 17TB for me would mean close to 3000 films.

~~~
bluedino
In that case why back up what you can just re-download?

------
fleitz
I loaded 5 tonnes of sand into my 4Runner and now it's ruined. Toyota claims a
4Runner is not a 5 tonne truck and won't give me my money back or honor the
warranty. I swear the mechanics were laughing at me.

~~~
lancefisher
Backblaze should be able to handle 17TB of data.

~~~
blunte
BB handles many PBs of data. Actually I think it's ZBs.

However, I'm not sure how many of their clients have single drives with 17TB
of data. Perhaps this is an edge case that they didn't adequately test for.

And since a vast majority of their customers will not have more than 3TB on
any single drive, I can see why this might not be well tested.

~~~
dastbe
I think you meant exabytes, not zettabytes. As they state on their website,
"350 million GB stored ... (and counting)", which means they're at about 1/3 of
an exabyte now.

~~~
blunte
Probably so. I can't count past a million anyway.

------
blunte
Let's just be real. It's assholes like this that take advantage of the buffet
line, thereby costing all of the rest of us. And then they complain when they
can't go back for the 45th plate of food.

People like this are why Amazon's unlimited cloud storage just became limited.
The people out there abusing the spirit of the offer know who they are. The
finance world is FULL of those people, and it's a good part of why the
economic world is really fucked up.

Be a team player. Be thoughtful of your peers and neighbors. We'll all end up
better off, even you. /rant

~~~
vortico
This is why I hate services that offer "unlimited" of some resource, like
Dreamhost and AWS, so I will never use them. Everyone is worse off. People who
use less than a ridiculous amount are paying for a few outliers using 100x
more. People who use a ridiculous amount can no longer be sure whether their
service will be active the next day because an employee deems their use
"ridiculous".

And I don't think they're at fault like you're suggesting. Suppose I have a
legitimate use case for storing 17TB of data and need to do it on a budget.
During my shopping, I discover ten $1000/yr services for 20TB storage and
three $100/yr services for unlimited storage. To me, the $100 services seem
like scams or that they have no idea what they're doing. Amazon, Backblaze,
and Dreamhost are literally giving off these vibes! Regardless, I'm on a
budget so I purchase Backblaze and upload my 17TB. But in the back of my mind
I'm nervous that they will shut off my service, despite the contract I entered
with them to keep an "unlimited" amount of data for the price I paid. Or, if
the contract says they have the right to terminate my service if they deem the
amount to be too much, that would make me nervous even as a normal user of the
service, since the definition of "ridiculous" might be decided by some judge
in a court if the service decides to cancel the contract and my company
decides to sue. Would you really want a judge to decide that, or would you
prefer a hard limit to be placed on the website and the contract in the first
place? I truly don't understand the appeal of "unlimited" plans compared to
simple hard limited, tiered plans.

~~~
blunte
> Suppose I have a legitimate use case for storing 17TB of data and need to do
> it on a budget.

Can you give me a theoretical example? Seems like saying, "Suppose I have a
legitimate case for needing 45 trips to the buffet line."

I get your points, but I think outliers are outliers, and correspondingly they
should be thinking as such. So if I had, for whatever reason, a need to store
17TB of data, I would need to be as clever about how I stored it as I
obviously was clever enough to even possess 17TB.

~~~
vortico
There could be many reasons. In order to run simulations at my work, I request
databases of input data and results of previous experiments. Each experiment
is 1-100GB of xray images and dumps from other instruments. Mostly the
experimenters dump the data they don't need, so often my team becomes the only
possessor of the raw data. It would be convenient to have an off-site backup
in case someone rm -rf's the drives.

When I was younger I worked at a recording studio. Each song consumed about
10GB, and it was convenient to return to the sources at a future date once the
mix was delivered to the artist. Each artist recorded about 10 songs, and in a
year we could do 50 artists.

I can't imagine what animation and video production houses go through per day.
Having a backup is completely necessary for these folks. I could go on, but I
feel 17TB is pretty common for businesses, so I'm not sure why you're asking
the question. These days you can get 17TB for only $700.

~~~
blunte
If you are betting your business on an "unlimited" $5/mo service...

So either the guy has a legitimate business reason (in which case he's
definitely chosen the wrong service (level)), or he's not using it for
business.

~~~
vortico
Backblaze is $59/month, right?

