1 Petabyte (and they have multiple)
S3 - $30,000 a month, $360,000 a year
S3 - reduced redundancy - $24,000 a month, $288,000 a year
S3 - infrequent access - $13,100 a month, $157,000 a year
Glacier - $7,340 a month, $88,000 a year
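As a quick sanity check on the figures above, here is a minimal sketch that turns the quoted monthly prices into yearly totals. The prices are simply the ones listed in this comment, not current AWS pricing, and the names `MONTHLY_USD_PER_PB` and `annual_cost` are made up for illustration.

```python
# Monthly prices per petabyte as quoted in the comment above (USD).
# These are the thread's numbers, not a current AWS price list.
MONTHLY_USD_PER_PB = {
    "S3": 30_000,
    "S3 reduced redundancy": 24_000,
    "S3 infrequent access": 13_100,
    "Glacier": 7_340,
}


def annual_cost(monthly_usd: int, petabytes: int = 1) -> int:
    """Yearly cost in USD for the given number of petabytes."""
    return monthly_usd * 12 * petabytes


if __name__ == "__main__":
    for tier, monthly in MONTHLY_USD_PER_PB.items():
        print(f"{tier}: ${annual_cost(monthly):,} a year per PB")
```

The yearly figures in the list are rounded to the nearest thousand; passing `petabytes=2` or more covers the "they have multiple" case.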
They could also always use tapes, for something as critical as the data that is the lifeblood of their business.
Imagine if Facebook lost everyone's contact lists. How bad would that be for their business? Backups are cheap insurance.
Same problems with buying things like antivirus software or even IT management utilities; when they're doing their job, there's no perceivable difference. It's only when shit goes sideways that the value is demonstrated.
Hell, you could take this a step further for IT as a whole: if IT is doing their job well, they're invisible. Then management cans the entire department, outsources to offsite support, and the business starts hemorrhaging employees and revenue because nobody can get anything done.
Yeah, but what exactly IS the benefit? The business doesn't die if something really bad happens? Is that really important though?
Consider the two alternatives:
1) The business spends $x00k/year on backups. IF something happens, they're saved, and business continues as normal. However, this money comes out of their bottom line, making them less profitable.
2) The business doesn't bother with backups, and has more profit. The management can get bigger bonuses. IF something bad happens, the company goes under, but then what happens to the managers who made these decisions? They just go on to another job at another company, right?
I'm not sure I see the benefit of backups here.
I mean the way management gets on me when we have outages, you'd think that was a significant priority?
Tumblr is apparently fragile and tech-debt laden on the engineering side, stagnant on users, and unprofitable. At a certain point, it's a coherent decision to just say "a few days of downtime would seal our fate, the business can only be saved if everything goes right", and not spend any money on mitigation.
So far as Myspace (or Tumblr apparently) is concerned, it is "somebody else's computer of uncertainty".
1PB is nothing today.
1 petabyte = 125 drives = $17,500 (one-time cost).
It will probably cost more to connect all these drives to some sort of server. Though 125 is within the realm of what a single USB controller should be able to handle (127 devices per controller).
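The arithmetic behind those numbers can be sketched as below. The 8 TB drive size and $140 price are not stated in the comment; they are assumptions inferred from 1 PB / 125 drives and $17,500 / 125 drives. The function names are hypothetical, and the estimate covers raw capacity only, with no redundancy, enclosures, or servers.

```python
import math

PETABYTE_TB = 1_000      # 1 PB in decimal terabytes
DRIVE_TB = 8             # assumption: inferred from 1 PB / 125 drives
DRIVE_PRICE_USD = 140    # assumption: inferred from $17,500 / 125 drives
USB_MAX_DEVICES = 127    # per-controller device limit cited in the comment


def drives_needed(total_tb: float, drive_tb: float = DRIVE_TB) -> int:
    """Number of whole drives required to hold total_tb of raw data."""
    return math.ceil(total_tb / drive_tb)


def hardware_cost(total_tb: float,
                  drive_tb: float = DRIVE_TB,
                  price_usd: int = DRIVE_PRICE_USD) -> int:
    """One-time drive cost in USD, raw capacity only."""
    return drives_needed(total_tb, drive_tb) * price_usd


if __name__ == "__main__":
    n = drives_needed(PETABYTE_TB)
    print(f"{n} drives, ${hardware_cost(PETABYTE_TB):,} one-time")
    print("fits on one USB controller:", n <= USB_MAX_DEVICES)
```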
Getting petabytes of storage isn't the problem; transferring the data back and forth is.
Taking a full month to recover a downed social media platform isn't really acceptable, but it's still better than being literally unable to recover it at all. Spending a small fortune to ship hardware to an AWS datacenter and convincing/paying them to load it directly would probably also be worthwhile, when we're talking about simply losing a $500M company. If the claim here about "no backup" is true, it's so profoundly stupid that everything I know about best practices sort of goes out the window. Approaches that any sensible person would consider unacceptably slow and unreliable are still a step up from a completely blank playbook.
(I guess the theory might be that Tumblr is such a trashfire it can't be restored, or would lose so much value in days/weeks of downtime that there's no point in even planning for that. Again, I don't really know how you run cost-benefit analyses when it's not entirely clear the project has benefits.)
That's not even talking about availability: at that point it becomes questionable whether even Amazon has enough backhaul capacity at those locations for you to actually max out 50+ 10Gb connections simultaneously.
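To put the transfer-time concern in rough numbers, here is a back-of-the-envelope sketch. It assumes decimal units (1 PB = 10^15 bytes), perfectly sustained line rate, and no protocol overhead, so real transfers would be slower. `transfer_days` is a hypothetical helper, not part of any tool mentioned in the thread.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(total_bytes: float,
                  link_bits_per_s: float,
                  links: int = 1) -> float:
    """Ideal time in days to move total_bytes over `links` parallel links.

    Assumes every link runs at full line rate the whole time.
    """
    if links <= 0 or link_bits_per_s <= 0:
        raise ValueError("links and link_bits_per_s must be positive")
    total_bits = total_bytes * 8
    seconds = total_bits / (link_bits_per_s * links)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    one_pb = 1e15      # bytes
    ten_gbit = 10e9    # bits per second
    print(f"1 link:   {transfer_days(one_pb, ten_gbit):.1f} days")
    print(f"50 links: {transfer_days(one_pb, ten_gbit, links=50) * 24:.1f} hours")
```

Under these idealized assumptions, one petabyte takes a little over nine days on a single 10Gb link, which is why the number of links you can actually saturate matters so much once there are multiple petabytes.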
Remember when Microsoft lost all of the data for their Sidekick users? Basically they were upgrading their SAN and things went badly.
Why is your name green?
(Don't ask for a rigorous definition of "one click away", though.)
but the availability numbers speak for themselves :/
- The mobile and desktop sites are completely separate products with vastly different behavior. Some privacy features (relevant to both) can only be accessed on one, some on the other. Tags are rendered in all-lowercase on mobile, but as written on desktop. Block quotes on desktop render as enlarged-font cursive on mobile, for some awful reason.
- Tumblr support(s/ed) font coloring, with no documentation of that fact. You enable it by using the HTML editor and picking among color tags with Friends-themed names like "Monica Pizazz Orange". Oh, and the preview feature won't honor the tags, but actually posting will.
- NSFW content is flagged even in drafts, but if that content is reviewed and approved, it's automatically posted publicly, not returned to drafts where it started.
- Tumblr's desktop sign-up page use(s/d) semi-random images from the site as backgrounds. Yes, they did serve cartoon porn to people trying to make accounts.
- Certain posts were impossible to view. Tumblr accounts can have their own themed pages, or simply be popup sidebars over the main news feed. Tumblr "read more" content hiders took users from the news feed to the poster's account - if that account was in popup format, a readmore opened from the wrong location would simply force a circular redirect.
- All Tumblr links are actually pushed through a site-specific forwarding system to track users. As a result, Twitter and many other sites are inaccessible because they view all link clicks as bot traffic from a "single source".
It looks like I was simply wrong on #2, thank you; I remembered it as something that had been around for ages but was noticed, then publicized. If it was found before a planned announcement, that's different.
#3 was fixed within a few days, but frankly I think "posting people's drafts with no warning" is a "damage done" thing, the same as an email client sending drafts to all listed recipients. There are reasons, like the "private post" option, that you would draft something and never openly publish it. Beyond that, it's a reason to write anything you might not want published as-is offline rather than in the site's draft feature.
#6 is complained about by plenty of other people, and happens to me perhaps 90% of the time. I realize I missed one thing: it's mobile-only. Opening a Twitter link on mobile produces a "you're rate-limited" blocking page which sticks around even if you try again later, but choosing "open in Chrome" to escape the Tumblr app immediately solves the problem. I haven't seen comparable behavior in any other app where I've followed Twitter links. Mobile-specific implies it's not purely the link tracking, granted, but it's very much a real Tumblr-specific issue.
aws s3 rm s3://bucket --recursive
It won’t let you just delete the bucket from the console, or delete the stack that created it, if the bucket isn’t empty.
aws s3 sync --delete ./ s3://your-bucket/
Do you think it would be good to extend said argument to, say, scp / ftp clients?
From the S3 management console user guide:
> You can delete an empty bucket, and when you're using the AWS Management Console, you can delete a bucket that contains objects. If you delete a bucket that contains objects, all the objects in the bucket are permanently deleted.
On the other side, the Yahoo services were so heavily integrated that it was hard to carve out any piece of them. The few times we tried, it was a slow and painful process, because Yahoo’s piece was glitchy and unreliable outside of its home turf and the Tumblr engineers were defensive and argumentative about everything and not willing to help.
Having worked at Yahoo, I understand this stance.
Dell used to offer an online backup service. It wasn't even running on Dell equipment!
Basically they acquired a company that offered the service, and while it would be "nice" if a Dell company ran on Dell gear, a lot of the time it's simply impractical/expensive to overhaul things.