The cloud is a murky, ambiguity-laden concept, though. Both Netflix and my 92-year-old grandmother on Facebook 'use the cloud', but the former is much more sophisticated in its network and data management practices. My grandmother just wants to see fun pictures of her family and great-grandkids.
No one realizes that the Cloud is running on the same crap we've always had and is vulnerable to the same issues as everything else. MAYBE the company is better at data management, MAYBE the employees take pride in their job and do it properly, but that's all MAYBE MAYBE MAYBE, and could just as well be "no" and you're entrusting your data to people who really have nothing to lose if it goes into the garbage tomorrow.
I think it is sort of like a delivery system. If USPS or FedEx or UPS started losing a massive number of packages (I know that they do lose some), they would get abandoned, just like "Cloud" companies have an incentive to maintain a baseline level of quality. The alternative is that every business would have to create its own shipping service. I think it makes sense to assume that in most cases, unless the business is already massive enough to warrant it, it is cheaper and more reliable to use the aggregate, dedicated ones for hire. The "Cloud" will be cheaper than individual implementations, and it won't be nearly as susceptible to individual implementation errors, because the identical system will have been proven by many other customers (otherwise it would be abandoned).
If Amazon loses more than 3 datacenters for more than 45 minutes in a month, you get 10% of what you pay as a voucher for future EC2 usage. (An outage only counts as total loss of external connectivity for all of your instances in an entire availability zone, or total loss of hard disk access; again, it only counts if all your instances completely lose hard disk/EBS access.) If they lose them for more than 7 hours, you get 30%.
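The tiered credit described here can be sketched as a few lines of code. The 45-minute/10% and 7-hour/30% figures are the ones from this comment, not a claim about Amazon's current SLA terms:

```python
# Sketch of the tiered SLA credit described above (figures from the
# comment, not Amazon's actual current terms): a service credit is a
# percentage of the monthly bill, granted only past certain outage durations.
def sla_credit(outage_minutes: float, monthly_bill: float) -> float:
    """Return the credit (in money) for a given total outage in a month."""
    if outage_minutes > 7 * 60:      # more than 7 hours down
        return 0.30 * monthly_bill
    if outage_minutes > 45:          # more than 45 minutes down
        return 0.10 * monthly_bill
    return 0.0                       # within the SLA: no credit

print(sla_credit(30, 1000))    # 0.0   -- within the SLA
print(sla_credit(90, 1000))    # 100.0 -- 10% tier
print(sla_credit(600, 1000))   # 300.0 -- 30% tier
```

Note how asymmetric this is: a ten-hour outage on a $1000/month bill yields a $300 voucher, regardless of what the outage cost your business.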
So no, Amazon, or at least their legal department, does not trust their own competency. Or at least, they're not willing to risk any revenue on that, but they're willing to give you a small future discount to encourage you to restart using the service. Oh and you only get that if you explicitly ask for it.
If they lose your data on EBS/S3/Dynamo/..., you get nothing. So having any data exclusively on any Amazon service should be cause for getting fired, and this of course also means that using Dynamo for storing anything non-trivial is a big no-no from a disaster recovery standpoint.
So I have to say, I would suggest you do not trust Amazon either with your data or with keeping your site online. Yes, historically their performance has been better than this, but ...
This reads worse than the SLAs on internet connectivity from places like level3 and cogent (pay 10% less if they fuck up completely for more than 2 days).
Second, you've completely ignored the actual key points. For one thing, "Cloud" companies make their business by providing a stable service. You have the "guarantee" based on thousands of other businesses using the exact same infrastructure without serious service failures. That is a huge amount of statistical reliability. Compared to hiring your own IT department and cobbling together your own system, that is actually a really good indicator. For another, the cost difference is potentially massive. Again, it is for similar reasons that shipping via UPS is a much better deal than shipping via your own private distribution network. You might still have to pay some people to handle your own inventory from its source (just as you'd need some people to work on your system in the cloud), but you'd be taking advantage of a much larger, more efficient system instead of having to build and maintain your own.
So if it's all the same, I'd rather have a decent SLA. Furthermore this sounds a lot like Amazon's not in fact giving me anything.
Your point is that they'll do the right thing because otherwise their customers would leave. These are the same customers you said, in the previous paragraph, get "the most conservative amount they can get away with", and they are "getting away" with it just fine, so why up it?
Sounds like they really care about customers, doesn't it?
> You have the "guarantee" based on thousands of other businesses using the exact same infrastructure without serious service failures.
I can get that guarantee at 1000 datacenters and colo providers, at least. Some of which have a decent SLA. But even among cloud IaaS, both Azure and Google provide Amazon's guarantee and better SLAs.
I used Amazon as an example. If you actually read what I'm saying, about how Cloud businesses in general depend on meeting their guarantees and not screwing over businesses, how their superior quality is because of scale and specialization, how they are reliable because one failure would doom them and they haven't failed yet, you could see that this has nothing to do with Amazon at all.
You keep arguing that Amazon is a bad provider. So what? I was never interested in that at all. I'm not comparing them to Azure or Google or the supposed "1000 datacenters and colo providers" you seem to know of. I don't care who is better or worse, I was talking about using cloud services in general.
Pay attention to the topic, pay attention to what my arguments were. Amazon's SLA is utterly irrelevant to anything, what are you even trying to convince me of? None of anything you've said is remotely relevant to my point. It's like arguing about whether Ford or Toyota makes better hybrids in a discussion about whether electric cars are a good idea, I just don't care.
Amazon has superior quality? They have at best average quality as a VPS provider, unless you accept their products that cause lock-in. At which point you're at their mercy, and they have even less reason to treat you well. Amazon doesn't match, say, DigitalOcean (especially not in the transparency-in-billing department. WTF). There are other reasons to pick Amazon, of course, but quality is not one of them. Price? Not one of them. Service? Not one of them. Stability? Not one of them. Geographical reach? At the moment Amazon does better (not that it matters unless you're in Asia).
One failure would doom them? Just from memory I know of two big Amazon cloud failures that you could not protect against with availability zones; the ones confined to a single datacenter they don't even publish.
The fact that they refuse to publish single cluster failures is probably another aspect of that superior quality you mentioned.
Also, you can get fucked on an ongoing basis just by getting scheduled on a bad machine. I guess that's part of their superior quality (a lot of VPS providers have this problem, of course; others are better at it).
The Netflix/Amazon relationship reminds me of the "you owe the bank $100, the bank is your problem; you owe the bank $1bil, you are the bank's problem" sentiment. Netflix is probably such a big customer that they are mutually dependent.
On the other hand, Amazon seriously screwing a small business would be like a bank failing a normal customer's withdrawal from their deposit. The second that information went public, the bank would essentially be dead.
AWS is the new Microsoft / IBM, nobody ever got fired for picking AWS.
In a discussion about trusting the Cloud with your data and services, there is a difference between it going down briefly on occasion (somewhat acceptable, within very narrow limits) and actually losing data or long-term traffic because of a service failure. Seriously breaching the SLA brings compensation as well as a big loss of reputation and business; going down for a couple of hours once a year is hardly the type of instability that would terrify most online businesses, nor is it something that individual companies are able to avoid themselves.
The key piece that's missing here is the idea that risk is something you have to compare, and can combine in interesting ways, then trade off against costs.
There are a bunch of ways in which you can do compute, storage, and networking. You get to pick zero or more of these ways. One of them is "buy a bunch of iron and make a pile of it in your bedroom". Major risk factors here are your house burning down, you getting evicted, or there being a power cut. Another is "rent those services from an infrastructure provider". Risk factors here are much harder for you to visualise, but include things like "governments ban that company from operating in your country".
You can look at the risk of any of these options, and quantify it with an SLO, like "we intend for this compute resource to be available 99.99% of the time in a given quarter". You can then have an SLA that defines what will happen if that objective is not met, and measure how often this is complied with over time. There are lots of ways to analyse this information, but let's suppose that you can reduce it to a single number measuring how safe the resource is for your use case.
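To make an SLO like "available 99.99% of the time in a given quarter" tangible, you can convert it into a downtime budget; a minimal sketch, assuming a 90-day quarter:

```python
# Convert an SLO percentage into a concrete downtime budget,
# to make "99.99% in a given quarter" tangible. Assumes a 90-day quarter.
def downtime_budget_minutes(slo_percent: float, days: int = 90) -> float:
    total_minutes = days * 24 * 60
    return total_minutes * (1 - slo_percent / 100)

print(round(downtime_budget_minutes(99.99), 1))   # 13.0 minutes/quarter
print(round(downtime_budget_minutes(99.9), 1))    # 129.6 minutes/quarter
```

One extra nine is a tenfold tighter budget, which is why each additional nine costs so much more to deliver.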
If you only look at a single option, and say "this has a safety of X", then the only thing you can get out of this effort is anxiety. This only becomes interesting when you start looking at differences between alternatives, like "the safety of servers in my bedroom is X, but the safety of buying resources on GCE is Y, so I can get this much of an improvement by spending that amount of money", or "by doing both of these things I improve my safety to Z, and I am willing to pay the additional cost of doing so". Or perhaps your position would be "this option is less safe but much cheaper and I'm willing to accept the extra risk".
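The "doing both of these things improves my safety to Z" step is just probability arithmetic, under the (big) assumption that the two options fail independently; the numbers below are made up for illustration:

```python
# If "safety" is an availability-style probability and the two options
# fail independently (a strong assumption in practice), data kept in
# both places is lost only when both fail at once.
def combined_safety(x: float, y: float) -> float:
    return 1 - (1 - x) * (1 - y)

bedroom = 0.99    # hypothetical: pile of iron in the bedroom
cloud = 0.9999    # hypothetical: cloud provider
print(round(combined_safety(bedroom, cloud), 6))  # 0.999999 -- better than either alone
```

The independence assumption is exactly what breaks when both copies share a failure mode (same credentials, same admin, same legal jurisdiction), which is why later comments in this thread harp on keeping copies truly separate.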
The problem I have with the "fuck the cloud" article is that it doesn't do any of this. All it says is "the safety of this option is only X, you should experience anxiety". Is X higher or lower than that pile of iron in your bedroom? You still don't know.
(Realistically, unless you have the ability to build a system in your bedroom that has continental diversity for storage, N+2 of everything for hardware failure, etc, your bedroom is likely to be far less safe than the major cloud services - unless you live in a country which regularly bans American companies from doing business with you, which a sixth of the world's population does.)
There exists just about every price point and combination of services you can imagine. It's awesome, too, because I can grow a business and can enter into the market with a whole "rack" of servers for almost no cost at all.
I have a complete history of email archives since I started using Gmail (i.e. the cloud), but have lost years of archives from the years I managed email myself.
It turns out that companies focused on data storage and retrieval do a better job than me. And that's fine. I pay someone to do my taxes and I don't build my own furniture. Specialization is a good thing.
The accountant is specialized in doing your taxes (aka serving up your data/files), but you still need to keep a copy of your tax records, receipts, and so on.
I don't care about the "distributed computation" promises of Diaspora or Sandstorm.io; I think they're wrongheaded. Anyone can do compute. But I really do hope that one day my Facebook account can be canonically "stored on" a database instance I'm paying for, that Facebook's app servers will reach out to, connect to, and treat as the canonical source for "me", treating their own records as a secondary cache. This kind of setup would make all sorts of things simpler, clearer, and more secure; there would be a definite boundary where "my data" stops (the DB I own) and "Facebook's data" starts (the DBs they own).
And, to be clear, I'm not talking about everybody running their own infrastructure, or even everybody knowing what an IaaS provider is. Ideally, PaaS providers could get into the "consumer instances" game the way Dropbox is in the "consumer files" game. My "Facebook account database" above could be transparently launched into my own cute little private cloud by my PaaS provider when Facebook requests it through some OAuth-like pairing API. I wouldn't need to think of myself, as a user, as "owning cloud database instances." From my perspective, I'd get an abstract "Facebook account" (which is actually an app instance and attached DBs) sitting in my Heroku-for-consumers account. The important bit is that I'd be paying for the resources that "account" object consumes, that I'd have an SLA on those resources, and that the PaaS company would have every incentive to make it easy for other third-party services to interact with my "Facebook database" in a way Facebook themselves aren't. I, as a user, have no need to "manage" a cloud of my own; I just need to be considered to own it.
What happens to Facebook when your data provider goes down, or just gets slow? What if they mess up permissions or change their API?
Maybe you're thinking "that's fine, if my provider isn't reliable, my Facebook account becomes unavailable and it's up to me to choose a better provider." But what about all the people who are sharing your feed (or whatever it's called these days, I don't really use Facebook)? Do they query your stuff, and then timeout when it doesn't respond in time? Now other people's stuff is slow to load.
Just seems like an engineering nightmare, to me.
Now, note that the "canonical" storage that gets read from doesn't have to be the same storage that the indexes get written to. The first can be owned by the user, while the second can be owned by Facebook.
Presuming an architecture like this, the latency and availability of the user's "Facebook account" database is relatively immaterial. While writes would have to be synchronous (so, like you said, Facebook would have to give the user a "sorry, your account is unavailable" error), reads could be asynchronous. Think of an online RSS feed reader service: the "primary sources" are the third-party sites with their RSS feeds. Sometimes those sites go down. When they do, the reader-service can't retrieve the feed–so that feed just goes stale.
Things like Facebook's graph, meanwhile, are fundamentally indexes. The base-level "documents" in the graph are relationship assertions—a copy of "B accepted A's friend request" stored in B's database. The graph is a computed value built on a pile of those. When Facebook can't reach someone's database, things like these relationship assertions just go stale.
The crucial idea, here, is that for Facebook to do its job, it probably has to cache a majority of the stuff in the user's database in one form or another—just like an RSS reader caches RSS feeds. But this is purely a cache, in a fundamental sense. Users who don't "check in" with Facebook could be cache-evicted from its database. Other users would still have relationships with them and be able to post on their wall and such (they'd be putting those documents in their own outbox); Facebook would just no longer bother computing anything that's personal to the user, like their news feed. There would be every incentive to set up the architecture such that user data that wasn't needed would be "garbage collected" off of Facebook's servers, because it could always get put back on, the moment that user's account-instance woke up again and said hello.
This would also mean that Facebook wouldn't need to store any of their user-data in anything resembling a relational normal form. Every table would be a "view" table. The canonical database, owned by the user, could be relational and full of nice constraints and triggers (and the user could even add these themselves); but since Facebook can just query out of that to get any data it's missing, it wouldn't need anything like a "users" table. (Fascinatingly, if Facebook was built as a microservice architecture, this means that each microservice would probably separately query the canonical data from the database in order to generate its own indexes; the Search service would know one "face" of you while the Photos service would know quite another. These could even—in theory—be separately ACLed within your own DB instance, giving the user true, actual control over what Facebook can do with their data, component by component.)
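As a toy illustration of the cache-versus-canon split described above (all class and field names here are hypothetical, not anything Facebook actually exposes): the user's store is canonical, and the service holds only a read-through cache that can be evicted and rebuilt at any time.

```python
# Illustrative sketch of the "service as a pure cache" idea above.
# All names are hypothetical; the user's store is canonical, the
# service's copy is a read-through cache that is safe to evict.
class UserStore:
    """Stands in for the user-owned canonical database."""
    def __init__(self, data):
        self.data = dict(data)
        self.online = True

    def fetch(self, key):
        if not self.online:
            raise ConnectionError("user's account-instance is unreachable")
        return self.data[key]

class ServiceCache:
    """Stands in for the service's secondary copy (pure cache)."""
    def __init__(self, store):
        self.store = store
        self.cache = {}

    def get(self, key):
        if key not in self.cache:          # miss: read through to canon
            self.cache[key] = self.store.fetch(key)
        return self.cache[key]             # hit: possibly stale copy

    def evict(self, key):
        self.cache.pop(key, None)          # safe: canon lives elsewhere

alice = UserStore({"name": "Alice", "friends": ["bob"]})
fb = ServiceCache(alice)
print(fb.get("name"))     # "Alice" -- read through from the canonical store
alice.online = False
print(fb.get("name"))     # "Alice" -- served stale from the cache
fb.evict("name")          # after eviction, reads fail until the user's
                          # instance comes back online and says hello
```

This captures the key property argued for above: eviction never destroys data, it only makes the service forget until the canonical source reappears.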
I wonder if there's a relationship between collecting and being an introvert. I have no evidence but feel like there is. I think most people just don't value "things" the way our types do. What most people value is their social life. They post pictures to Facebook not to store the pictures but to get a reaction from their friends about the picture. That's what they value. The social interaction, not the thing.
Me neither; I hope that's not what it sounded like.
> Before facebook people would keep large collections of photo books so that when family and friends came over they could easily share them. It doesn't mean they valued those pictures any less because they were extroverted vs. introverted. People still keep hard copies of the pictures they cherish the most, like a wedding album.
Yeah, absolutely, and I think people do still keep the photos they care most about today. I just think we overestimate how many of those photos exist. I think the overwhelming use of taking photos is for social conversation.
But I remember what it was like before "the cloud"... You had to provision machines one by one, often via email or phone. It took hours or days. Billing was usually done by the day. It sucked.
Now you can just send a POST request to a machine and in a few seconds you have access to a new instance. You can send 1000 requests and get access to 1000 instances. And then you can send some more requests and only get charged for a few minutes of time. When this transition happened, we called it "the cloud" because it's a big undifferentiated mass of computers. We could've called it "the soup" but we didn't.
That's what it has always meant to me. I don't understand why people want to eradicate the word just because it's misused. People misuse the word "internet" all the time, but I don't think we should strike it from our vocabulary.
And that's my biggest problem with the word, and why I go to some lengths to avoid using it when talking to clients, despite a number of my employer's (and their partners') products including 'cloud' in their actual name.
No two people actually agree on what the c-word actually means. For you it's elastic compute. For others it's cheap storage, or geographical diversity, or managed RAID, or an ersatz CDN, or their hosted email or wiki or some other application, or an accounting convenience (dipping into the opex, rather than the capex, bucket), or, of course, some combination of those and others.
The biggest problems I've dealt with in IT over the years stem from people having even slightly different ideas of what words mean -- so a word that involves wildly different understandings is not a recipe for tranquillity.
Though, I don't think the author is upset about the word. He's just warning against services that fall within his definition of "cloud" - which is pretty fair. Losing your stuff is no good, and there's always cruddy services out there.
I guess the only 'new' thing that I saw was the scaling capabilities based on capacity, but we've had time-sharing (albeit, slightly different) for many years.
Is such a statement much different from saying PCs are just tiny mainframes with a screen attached?
Computing at the whim of others aka timesharing.
Smile and nod but your mother has seen a few cycles of the industry.
Remember that a timeshare was very expensive and that those costs were accounted for carefully, not only for the purposes of internal billing but also for cost-benefit analysis.
Use such services for backups, sure. But don't rely on them too much, you don't know what might be going on behind the scenes at the other end of the line. You don't know their financials (usually), whether they're under investigation for something, whether an intelligence agency is spying on their servers, whether their security is up to par in every possible sense...
And if they're offering it for free... well, they don't have much invested in keeping you as a customer when things get tough. Became unpopular recently, for saying something controversial or 'stupid' on social media? Made enemies in the political world? Then a lot of companies will be quite happy to shut down your account to avoid bad PR. By using these services, you provide a nice target for the social media mob the minute you do something that a lot of people don't like...
That's false. Files in Dropbox are also stored locally on all your Dropbox-enabled computers. They would simply stop syncing, and you'd plug in another syncing service. If Dropbox disappears, your Dropbox folder just becomes a regular folder.
And other times I don't forget, I just leave it and continue with a smile :).
I'd love it if Facebook and Twitter had a rolling deletion period option - everything more than six months old is shredded forever, as far as the service is concerned. While people can obviously store shared photos and this wouldn't actually destroy them, I'd like that new contacts wouldn't be able to go back and look through someone's entire history. It's like a more social and longer-lived Snapchat.
"This person" is Jason Scott, who works as the Internet Archive and also heads up Archive Team (the loose band of internet folks who race to archive sites about to go dark).
He's a digital historian; his job is to save everything he can.
Remember that next time you're red with rage because a cloud provider lost your data or a service you depended on is now bought/closed/gone.
To constructively disagree, though: I keep backups. I have a Synology NAS device that I really like. But for your average person, I have to wonder - are their digital photos really safer on their laptop than they are on Facebook? Facebook is a fairly stable company. Laptops are lost, stolen, and damaged all the time. Files are accidentally deleted. People get viruses. People forget the password to their full drive encryption. When Facebook does bite the dust, it's hard to imagine that data just disappearing - it's incredibly valuable, and in the worst case, someone would buy it just to sell it back to people. Not ideal, but it isn't lost, and you had free storage for several years anyway.
Tangentially related, this is why I shoot RAW. Not because it might give me better pictures today, but because it WILL give me better pictures in 5-10-20 years. You can take a RAW today that was shot 5 years ago and pull detail that was impossible to pull when it was shot, and that ability will only improve.
By making a technically inept strawman, it's reduced to an argument of which is more idiot proof.
The problem with idiot proof is that there's always a better idiot out there.
The argument here by the 'anti cloud' side is 'why not both?'. To argue for one side or the other exclusively is pointless.
In other words, in real world usage, the cloud is more dependable and flexible than internal IT infrastructure providers, for most businesses.
That said, there is a good philosophical argument against the cloud.
I agree, but it's fundamentally simple: always be in control of your own data, because no one else is going to be looking out for it.
I really like this idea. The biggest benefit is that it stops people expecting these services to act like an archive. With the current system it's easy to think "Oh I can just look this photo up again on Facebook if I want to". Instead we should be treating these services as publishing platforms while we maintain separate archival versions of our data.
When will data become so important/politicized that there are regulations about data retention?
(Makes me wonder how all those fly-by-night cloud startups are handling that.)
Of course, how you structure your regulatory framework can have adverse consequences, as well (i.e. implicit guarantees by GSEs on mortgage-backed securities). It is further arguable whether or not fractional reserve exacerbates these effects.
Then, I read an article in Google Reader discussing how we don't even have to worry about arbitrary and capricious market behavior from our cloud partners, since we all have long term contracts and are protected against upward price swings from our cloud partners or material changes to the services they deliver.
The point isn't that these services are bad. Just that you need to be strategic about how you use them.
Besides, banks are subject to rather a lot of external control, through regulation (oh noes!). Here in NL the government actually guarantees your savings (but not investments) should a bank go belly-up. I believe the US did something similar, although I didn't bother to follow the specifics about who bailed out who for whom.
In absolute terms, they can, but no one I know of has a solid backup plan for life savings. Life savings come from saving over a lifetime. You can't really replenish them without replenishing someone's life and ability to save.
How to back up your savings!(.io)
Now to make it work...
Subject to some fine print so it can take some work to get your accounts set up in such a way that you would qualify.
Bank mergers are a risk to be aware of here.
PS: A home loan might seem like the bank owns your house, but they can't say no when you sell it.
They can, unless you satisfy the loan by paying it off as part of the process, at which point they no longer have an ownership-like interest.
In the unusual cases where you try to sell a house without doing that, the bank absolutely can -- and often will -- say no.
Alternatively, you can generally walk away and sell it to them for the value of the loan.
If you actually pay them off, then they don't get to say no, because paying them off is, essentially, buying out their interest in the property, under the terms of an existing contract. That doesn't negate the fact that they have legally-enforceable rights in the property until and unless you do that.
> Alternatively, you can generally walk away and sell it to them for the value of the loan.
Only if the mortgage is governed by the law of a jurisdiction where pursuit of mortgage deficiency isn't allowed (either in general or for mortgages in the specific conditions yours has.)
http://blog.personalcapital.com/wp-content/uploads/2014/03/4... Note that's only for people with open 401(k) accounts. Also it's pretax, so subtract ~25-35+% from those numbers.
For comparison the average house is worth ~180k.
I don't understand this logic. Amazon's S3 offers service level agreements with failure rates that at one point implied the statistical likelihood of losing an object was once in "thousands of years". When dealing with any sort of stable storage, this is simply something I cannot offer. I couldn't produce a setup locally, with the resources I have, that makes guarantees on the decade level, let alone millennia.
With that said, I keep personal copies, but the authoritative copy is what's in the cloud, because it's a hell of a lot more stable.
TL;DR I hear this argument all the time. The cloud isn't perfect, but it's a hell of a lot closer to perfect than anything I could achieve. "Not invented here" syndrome won't save your data.
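For what it's worth, the "thousands of years" claim can be sanity-checked from a stated durability figure; the eleven-nines number below is the one commonly cited for S3, treated here as an assumption rather than a guarantee:

```python
# Sanity-check a "once in thousands of years" durability claim from a
# stated annual durability. 99.999999999% ("eleven nines") is the figure
# commonly cited for S3; it is an assumption here, not a guarantee.
def years_per_expected_loss(durability_percent: float, n_objects: int) -> float:
    annual_loss_prob = 1 - durability_percent / 100   # per object, per year
    return 1 / (annual_loss_prob * n_objects)

# With a million objects stored: one expected loss roughly every 100,000 years.
print(years_per_expected_loss(99.999999999, 1_000_000))
```

Of course, this models only silent hardware-level loss; it says nothing about account compromise, billing mistakes, or provider shutdown, which is what the rest of the thread is about.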
It has nothing to do with 'not invented here'; it has everything to do with the fact that you cannot outsource your responsibilities.
The degree to which people rely on others to take care of their stuff is a huge blind spot. Jason's advice is spot on in this respect: no matter what the up-time guarantees of the cloud solution you are using (and no matter what the redundancies), if you store all your data in the cloud without an off-line copy, your company is 3 mouse clicks away from being history.
1 copy off-site
The cloud is a great place for that 1 off-site copy.
The vast majority of the world would be much better served by (to appropriate your jargon) a 2-1-1 rule, because it's a straightforward treatment for a straightforward problem that is easily implemented. Yes, "format skew" and double-failure of backup solutions does indeed happen, but at a much lower incidence than "oh crap I deleted it!".
This protects against data corruption that is not detected immediately.
Another common way of doing this (and one that I prefer) is to rotate out a backup medium at ever larger intervals. So one gets set aside per week, then one gets set aside per month, and so on. That gives you a series of snapshots in time that will allow you to pinpoint with some accuracy when an event happened. Longer ago you'll have less accuracy, but this can help a lot in trying to triangulate who or what messed up. Just being able to answer whether 'x' happened before 'y' was hired or after can help in narrowing down the number of suspects in case of a breach or other nastiness. It also guards against a backup that, for whatever reason, refuses to be reloaded (and you should guard against that by test-loading your backup immediately after you make it; even so, the medium might fail the next time you try a read).
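The rotation described here (one backup set aside per week, then per month, and so on) can be sketched as a simple retention rule; the tier boundaries below are illustrative, not prescriptive:

```python
# Sketch of the rotation scheme described above: keep every recent
# backup, then one per week, then one per month, so older history is
# retained at coarser granularity. Tier sizes are illustrative.
def rotation_keep(days_ago: int) -> bool:
    """Decide whether the backup made `days_ago` days ago is retained."""
    if days_ago < 7:
        return True                  # keep every daily backup for a week
    if days_ago < 60:
        return days_ago % 7 == 0     # then one per week for two months
    return days_ago % 30 == 0        # then roughly one per month

kept = [d for d in range(120) if rotation_keep(d)]
print(kept)   # [0, 1, 2, 3, 4, 5, 6, 7, 14, 21, 28, 35, 42, 49, 56, 60, 90]
```

Seventeen media cover four months of history, with day-level resolution where it matters most: the recent past.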
Better still if there is a streaming log of everything but only very few companies can afford that sort of solution for all their data. Those can be hard to restore from (by replaying) so there too a snapshot system can help.
But if you're going to condense it to a "rule" that will help people not well-versed in the field, that rule can only be "MAKE BACKUPS!", because at least 90% of the data loss scenarios in the real world happen because simple backups weren't made.
Don't make it more complicated than it is, because someone will put off doing it "right" and then lose data because they didn't just make a copy on a USB stick.
3 backup services
1 local copy
I guess that could get expensive depending on the services.
For each business this is a delicate affair and you should spend some time on figuring out what you really absolutely have to have and what you could lose without losing the company or getting in trouble with the law.
There's risks and then there's risks. As always, you can spend some time and money to mitigate some risk, and decide some risk is low enough that you don't care to mitigate it. I wouldn't just assume one risk vector is higher than another without some concrete numbers.
What are the risks of having a bad health issue happen to you? Are you insured?
I'll bet that you don't know the answers to any of those questions, and yet, you are probably insured against all of them.
As for your question in more detail: it happens often enough that, if you operate your business 'in the cloud' and you don't have a backup of your data outside the cloud to guard against catastrophic data loss through malice or accident, I'll happily fail you. Think 'sysadmin was depressive, wiped out company' (or maybe you let them go and they took revenge, or any one of a number of other scenarios that would instantly terminate your online existence if you had not guarded against it).
Risk management is consequences * incidence versus cost to mitigate. If the cost to mitigate is negligible, the risk is measurable and the consequences are terminal, then it is a no-brainer to protect yourself. At least one of my customers wished they had set up a backup facility for their critical data (too bad, end of story for them), and there is one published case that we all know about. That's two that you can count on, and most likely other troubleshooters have similar stories.
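The consequences * incidence versus cost-to-mitigate rule is just expected-value arithmetic; all numbers in this sketch are made up for illustration:

```python
# The rule of thumb above as arithmetic: mitigate when the expected
# annual loss (consequence x annual incidence) exceeds the annual cost
# of mitigation. All numbers below are made up for illustration.
def should_mitigate(consequence: float, annual_incidence: float,
                    annual_mitigation_cost: float) -> bool:
    expected_annual_loss = consequence * annual_incidence
    return expected_annual_loss > annual_mitigation_cost

# Company-ending data loss: $2M consequence, 1-in-1000 yearly chance,
# versus ~$500/year for an off-line backup. Clear yes.
print(should_mitigate(2_000_000, 0.001, 500))   # True
```

The hard part in practice is not the arithmetic but honestly estimating the incidence of rare, terminal events, which people tend to round down to zero.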
I see catastrophic cloud data loss as having a much higher chance of happening than many other items on the list of stuff that I verify before greenlighting an investment.
Though if we unbox that a bit... How do we balance the risk of loss due to your cloud being owned vs. loss due to the risk of your users' PII being violated if someone attacks your in-house (ostensibly in-house reliability and security-managed) copies? Better make sure you're spending the money on security best-practices on all your copies, not just the "live" ones.
Side-note: What are your thoughts on two separate cloud providers as providing sufficient insurance? Say, having your data live in Google Cloud and at rest in Amazon S3?
Having two cloud providers would work assuming they don't share any critical infrastructure and assuming that it is not the same set of employees that have access to both systems. It also helps if you have a totally separate set of credentials for the back-up system and if that back-up system can only be unlocked by very senior people (preferably execs) after a catastrophe of suitable magnitude hits.
Whether you store your primary in the cloud or your back-up in the cloud the story is the same: have a copy somewhere else.
Finally: the only real back-up is one that is off-line. So from that (ok, ultra-paranoid) viewpoint it would be best if you actually went to an off-line medium to store your data in such a way that nobody can wipe it all out without physically destroying all copies.
All this of course after suitably weighing the importance of the data.
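That off-line, ultra-paranoid option can be surprisingly simple. A minimal sketch (all paths are hypothetical stand-ins for a real mounted external medium):

```shell
# Sketch of an offline back-up: a dated tarball plus a checksum manifest,
# written to a mount you later disconnect. All paths are hypothetical.
set -e
SRC=./critical-data
DEST=./offline-media            # stand-in for a mounted external medium

mkdir -p "$SRC" "$DEST"
echo "irreplaceable" > "$SRC/ledger.txt"    # sample data for the sketch

STAMP=$(date +%Y%m%d)
tar -czf "$DEST/backup-$STAMP.tar.gz" "$SRC"

# Record and immediately verify a checksum, so corruption is caught
# before the day you actually need a restore.
(cd "$DEST" && sha256sum "backup-$STAMP.tar.gz" > "backup-$STAMP.sha256")
(cd "$DEST" && sha256sum -c "backup-$STAMP.sha256")
```

Once the medium is physically disconnected, nobody with stolen credentials can wipe it, which is exactly the property the mattress-vs-bank arguments below keep circling around.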
"Code Spaces will not be able to operate beyond this point, the cost of resolving this issue to date and the expected cost of refunding customers who have been left without the service they paid for will put Code Spaces in an irreversible position both financially and in terms of ongoing credibility. As such at this point in time we have no alternative but to cease trading and concentrate on supporting our affected customers in exporting any remaining data they have left with us."
They promised a follow-up once they got to the bottom of what happened, but they never did; I'm still curious what the whole story was.
I can make this even easier. It's almost certainly going to boil down to this question: have you ever lost even a single byte of data? Because if you have, you aren't even in the same ballpark.
Have you ever been killed in a car accident? No? Then I guess you don't need a seatbelt, right? Has your house been burnt down by a lightning strike? No? Then no lightning rods and proper grounding for you, right?
You don't wait until a disaster occurs before you put a disaster recovery plan in place.
So, getting back to risk management: we're about to make a QB selection for the big game. Now that we understand that the home-grown implementations of folks like jacques here are the 12-year-old pee-wee standout vs. the cloud provider's seasoned NFL quarterback, which one do you pick to safeguard your business?
This is just alarmism. If you really wanted to demonstrate your point you would show me some data. The data would demonstrate that, over the millions and millions of users on a number of cloud platforms, their rates of data loss are significantly higher than your "home spun" storage. Then you would take out the outliers and show an honest distribution for the most stable vendors (since I'd expect the majority of data loss across all vendors to have some amount of locality within a specific vendor). The result of this will be the rate of data loss on the most reliable cloud platforms. You can then compare this against your own success rates.
If you have ever (even once) lost even a single byte of data (for any reason), I do believe you will find yourself hopelessly outmatched.
You seem to be missing the point.
People are not saying "don't use cloud storage at all"
They are saying "don't use cloud storage exclusively"
You might trust that your cloud providers will never be hacked, that your individual account on them will never be hacked, that they will never suffer a catastrophic failure, that they will never go out of business, that they will never be fully or intermittently down at the moment you need to retrieve some data, and so on and so forth, based on their claims and documentation, or trust that you won't cock up at some point and cause data loss in your account. But I prefer to have my own copy (or copies) as well as the ones that are "in the cloud". For truly essential data, at least one of those copies is both offline and offsite.
For most values of "I have x hours and y dollars to safely store z gigabytes", a pure-cloud solution (possibly involving multiple independent cloud providers) has a lower chance of failure than one involving local storage.
But the fact remains that wherever you choose to put it "offline and offsite", the odds of it being lost are orders of magnitude greater than with persistent, redundant storage at a reputable cloud provider. Even if you put it on the most stable storage you can find and lock it in an underground safe, you can't guarantee its integrity a handful of years from now, let alone centuries.
To put it in other words: you are advocating storing your money in a mattress because you don't "trust the banks".
It is interesting that even today there are still people arguing against having back-ups because 'the cloud'. You'd think that we ITers learn from our mistakes, but for some this has to be personal experience before the lessons are learned. Good luck if and when it happens to you; please re-read your comments here at that time. There isn't an IT person that I know that has not been saved by a back-up at some point in their career; the fact that you have been lucky so far is not a reason to think you are exempt.
And no, nobody is advocating storing your money in a mattress because you 'don't trust the banks', the point is that you can keep a copy of your critical data for very little money to safeguard against an eventuality that seems to hit the IT world with alarming regularity, even if they use 'the cloud'. Fuck-ups, hacks, disgruntled employees and failures all happen, every day.
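Checking that such a cheap extra copy is still intact is also cheap. A minimal sketch (paths and filenames are made up for illustration) that hashes a backup directory and compares it against a previously stored manifest, so silent corruption is caught while a good copy still exists:

```python
# Hash every file under a backup directory and compare against a stored
# manifest, so silent corruption is caught while a good copy still exists.
# Paths and filenames are hypothetical.
import hashlib
from pathlib import Path

def manifest(root: Path) -> dict:
    """Map each file's relative path to its SHA-256 hex digest."""
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*")) if p.is_file()
    }

def damaged(root: Path, stored: dict) -> list:
    """Files that are missing from the copy or whose contents changed."""
    current = manifest(root)
    return [f for f, digest in stored.items() if current.get(f) != digest]

# Snapshot once, then re-verify on a schedule.
root = Path("backup-copy")
root.mkdir(exist_ok=True)
(root / "data.txt").write_text("important records")
stored = manifest(root)

(root / "data.txt").write_text("bitrot!")    # simulate silent corruption
print(damaged(root, stored))                 # → ['data.txt']
```

Run the verify step on a schedule against every copy, cloud or local; a copy that is never checked protects you no better than one you never made.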
The failure probabilities are relative, and everyone here is equivocating, as if the likelihood of failure were evenly dispersed. You indicate that storing customer data in the cloud is "playing fast and loose", but I'd argue the opposite: not having a cloud backup is what is "fast and loose".
Imagine you are an independent bank. You need to move your customers' deposits. Do you load sacks of cash into your corporate minivan or hire an armored car service? Well, let's look closer. With the latter you are giving up control, right? You have no control over the quality measures an armored car service might take. Yet somehow not contracting them seems like foolishness. The reason is obvious: you know in this case that transporting money isn't your expertise. It's not something you can focus adequate time and resources on perfecting. You also can't spread the risk of failure across a lot of customers, absorb that failure, and make your service better for the next go. The exact same logic applies to long-term storage of data. If that isn't your only function, it's extremely hard to get right.
If your bank fails, you will be made whole with money from the FDIC. Money is fungible. Any money will do.
Data is not similarly fungible. There's no IT FDIC to replace your lost data with other data that's identical.
When was the last time the FDIC failed versus the last time an online storage vendor folded?
Personally I think the risk that AWS fails is remote, but if I were to store a bunch of data with them I'd most definitely make sure that I would not be dependent on them. It would be a convenience at best, but never a dependency.
> You indicate that storing customer data in the cloud is "playing fast and loose" but I'd argue the opposite. Not having a cloud backup is what is "fast and loose"
Basic reading comprehension failure, that is not what GP is arguing.
He's arguing that if your live data lives in the cloud your backup copy should not be in the cloud and vice versa.
Insured by who? Not governments. Governments collapse all the time. The only safe place to keep it is in your own custom bank at home that you built yourself.
Money and data are not directly comparable here, as one "bit" of money is the same as any other in most respects, but I do always carry a minimum amount of cash in case my cards stop working or are lost/stolen, have a pot of "emergency cash" at home for similar reasons, and recommend others do the same. Most of my money is in the bank, so I'm not protecting against massive institutional failure here, but I am protecting myself against a variety of possible temporary inconveniences and possible mistakes on my part.
I'd rather have two backups, of which one is more likely to fail than the other, than just one, or three rather than two, and mixing types of backups means not all of them are subject to exactly the same failure modes (my local backups, online or off, are unaffected by connectivity issues, for instance). For data that is important, a little paranoia is healthy IMO.
In this case, money can't be compared to digital data. You can make multiple copies of digital data on your own, but not of your money tucked away in a mattress; that would be illegal.
I'm also not advocating against the cloud, just that a single copy in S3 is not a backup solution.
There is no single perfect solution, so like security you layer your disaster recovery.
This is more or less an exact copy of how I would advise companies to set this up.
(Jason Scott is speaking from experience here, as one of the people called in to do emergency archive response when these cloud businesses shut down.)
Edit: jacquesm points out the sudden death of Code Spaces, which I'd forgotten about: http://www.infoworld.com/article/2608076/data-center/murder-...
A single computer security problem (which happen on a daily basis, although not usually at this severity) could enable your cloud to be deleted.
The "counterparty risk" is small but ineradicable. The finance industry re-learnt this with Bear Stearns recently.
That's absurdly unlikely to happen with S3.
I'm not going to go too deeply into speculative scenarios, but all kinds of business-interrupting calamities are possible with very low probability. We just had the story on the frontpage about the Juniper backdoor; do you think S3 isn't being targeted by multiple state intelligence agencies?
Basically I agree with https://news.ycombinator.com/item?id=10772321 - by all means keep the working copy online, but have a local offline backup.
How absurd was it that money market funds would fall below $1 during the recent unpleasantness? It was so unthinkable that major parts of our entire global economy were predicated upon it never happening.
Money markets are even more "serious business" than Amazon, or even all of tech is, and had even more big brains attesting to their inability to fail. They failed.
If Amazon went out of business there is no chance at this point that the loss would be total and catastrophic. It's essentially a bank at this point. Its shutdown would be orderly.
For any one system to fail is perfectly possible and in fact should be expected to happen at some point and that's why you design against that.
You also make sure that it is not the same people that have access to both systems so that if your sysadmin walks onto the floor with a bad hairday your back-ups will still be there. And this is also why you test your back-ups to make sure they actually work.
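Testing a back-up doesn't have to be elaborate. A minimal restore drill (all paths are hypothetical stand-ins) is to extract the back-up into a scratch directory and diff it against the live data:

```shell
# Minimal restore drill: a back-up you have never restored is untested.
# All paths are hypothetical stand-ins.
set -e
mkdir -p live
echo "customer db" > live/db.txt            # sample live data

tar -czf backup.tar.gz live                 # the "back-up"

mkdir -p restore-test
tar -xzf backup.tar.gz -C restore-test      # the drill: restore elsewhere
diff -r live restore-test/live && echo "restore verified"
```

The same drill works whatever the back-up tool is; what matters is that the restore path is exercised regularly by the people who will have to run it after a catastrophe.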
The cost of getting stuff out of Amazon is very dear. While the statistical likelihood that Amazon will lose your stuff is low, the likelihood of Amazon or a competitor doing something to cause you to rethink your use of AWS (or force you to move) is much higher. Wanting to reduce your AWS costs in the long term is a near certainty.
As vendors move away from perpetually licensed software this gets more important. What happens when a vendor (say Microsoft for strawman purposes) decides that some function that is important to you must interface with Azure, and Azure only? What if Amazon decides upstream network transfers are no longer free? Those sorts of changes can break your business, and vendors like Amazon/Microsoft/Google are fully enabled to change those terms.
I think there is an easy rebuttal that should be considered...
First, the "nines" rating of any service or resiliency is just gibberish. Go find the statistical likelihood of money market funds "breaking the buck" or of CDS blowing up - both in 2007/2008. Those had a lot of nines too, and a lot of very smart, well qualified people attesting to those nines (in venues even more serious than IT).
A highly complex system becomes incomprehensible, even to the people that built it. Those nines mean nothing.
Second, you absolutely can build something more stable and predictable than Amazon precisely because you're the one that built it - which means that it is more comprehensible and fails more predictably and gracefully.
I don't care who does the calculation and how many nines they come up with - if you load FreeBSD on two bare metal servers and put them in two different datacenters and run them with any kind of conservative and cautious sysadminning you'll have a better solution. Yes, it will be more expensive.
The standard closure to a comment like this is to refer to Taleb's Black Swan and Antifragile books ... which you certainly should read ... but even more important is "Normal Accidents" by Charles Perrow, which I hope will convince you to stop looking for complex things that never fail, and instead look for simple things that fail gracefully.
... but we have a HN-Readers discount - just ask! You know who we are.
Nonsense. It's very easy to have that kind of setup fail - remove one from the load balancer, take it down for maintenance, whoops we removed the wrong one from the load balancer. I've been in similar-sized companies using dedicated servers or AWS, so I've seen both sides. It's like how people feel safer when driving themselves than when being driven by a professional, even though a professional is overwhelmingly likely to be a better driver - everyone thinks "oh, there's no way I'd make that kind of mistake".
It is very easy to overengineer "high availability" systems - I certainly think things like STONITH and dedicated control planes are more trouble than they're worth, and running a system that you don't understand is a recipe for failure. But I'll take FreeBSD on a basic AWS setup - four EC2 nodes (multiple AZs), and an ELB - over bare metal any day. Physical server maintenance is not where I have a competitive advantage, and would mean more complexity to understand, not less.
I'm sorry - you're already missing the point.
There is no load balancer. There's no firewall. There's no services running except for sshd.
I guess I should qualify what I mean by "better", though. What I mean is, "I know exactly how this will fail and it won't be interesting or surprising. Or take any thought or time to fix."
It will be comprehensible. "FreeBSD on a basic AWS setup - four EC2 nodes (multiple AZs), and an ELB" is not. You have no idea (nor do I, nor do probably most folks at Amazon) the different and fascinating ways that will fail.
(disclosure: we use EC2 instances for backup DNS. We are not anti cloud or anti amazon)
This is from you, and this is from the Wikipedia article on NIH (not invented here) syndrome ...
>Not invented here (NIH) is the philosophical principle of not using third party solutions to a problem because of their external origins. False pride often drives an enterprise to use less-than-perfect invention in order to save face by ignoring, boycotting, or otherwise refusing to use or incorporate obviously superior solutions by others.
I am not saying don't have a copy; I am saying that if you have several copies, the safe one is the one in the cloud.
These polarized "requirements" are usually rationalized away by each group. As a data privacy advocate, I believe that I can provide a reliable storage solution to my company without using a centralized cloud service, which then guarantees my privacy because "I'm in control". As an advocate of centralized cloud services, such as Amazon, I present that their team is better at security and reliability than any other team on the planet and that I can encrypt something and trust that my key management is secure. Both of these arguments have fallacies and assumptions.
The solution is to challenge ourselves to build better solutions. At what point in time did technology advancement ever slow down? At what point will we ever stop and say "Y'all, this here compute system is good enough and should be centralized/decentralized!"? Never, I say.
> and if you’re a person they are giving it to you without you signing anything accompanied by cash or payment that says “and I mean it“.
For plenty of use cases I actually agree with that conclusion. With the caveat that I would add: accompanied by a suitable non-cloud back-up of critical data, code and configuration data held on a medium that is not accessible by the same people that administer the cloud stuff.
This is where you are simply wrong.
> "the authoritative copy goes into the cloud"
This we agree on.
So, where you are wrong is this: you are only considering technical failure modes but there are many others besides and so any comparison between 'the cloud' and 'your get-out-jail-free-card-backup' would have to be in terms taking into account all modes of failure, not just the technical ones and then on top of that you'd have to consider the likelihood of the backup failing to restore at the same time that the primary system goes down for whatever reason (including technical ones).
That's why I call it a blind spot: even after being told in 20 different ways that the uptime of Amazon does not come into play here, you are still clinging to that. It is simply not relevant because that's not the scenario a back-up will most likely protect against. It will also protect against that scenario, on top of the ones where someone incapacitated by drugs, anger, depression or any one of another set of circumstances decides to take it out on your company. Or maybe someone makes an honest mistake (I've been called out twice in my career to restore data for companies that had been wiped out because of simple mistakes that should have never ever happened).
So back-ups are not a luxury, they are a necessity and the degree to which Amazon can outperform your back-up in terms of availability is not a factor in the whole discussion.
Ask Bernie Sanders about the cloud and the implications of losing access to it at the whim of the cloud provider.
Now go put your stuff up on S3 but please keep a backup of the things you upload there.
>Do you control Amazon? No? Then you're not the captain
This is textbook NIH syndrome. Rather than looking at the relative probabilities you look at things from an ideological perspective. Your argument isn't "it's safer with me", it's "I want to be in control of it". This attitude is generally harmful. Think about your money. Are you in control of the bank? No, right? But you still keep your money there instead of in a shoe box under your bed. Why? Because the bank is better at protecting your money than you are. They have big heavy metal doors and men with guns to move it from place to place. Amazon is like a bank for your data.
I'm not saying you shouldn't use S3. You should just make sure that it's not the only place where the only copy of your data is hosted.
>Rather than looking at the relative probabilities you look at something from an ideological perspective.
No, I'm looking here solely at the risk of putting all your eggs into a basket you don't control.
That is what happened (once again) to John Meriwether and Long-Term Capital Management... A "once every 10,000 years event" which took place... today.
Were things like that around in 2009, when this article was written?
Curious if Upton Sinclair would have anything to do with it.
Let's put it this way: I know enough not to store data in the cloud(s) exclusively.
Don't move to the cloud; copy to the cloud.
Also applies to other storage media.
P.S. Can't delete Glacier Vaults for now as AWS enforces a cooling period.
That's a good thing. And in the case of Glaciers an excellent pun.
I have a bit more of a nuanced view on this than Jason, but I totally understand where he's coming from and when the whole cloud gravy train started rolling our perspectives overlapped much more than they do today (and quite probably since then Jason's perspective has changed as well as perspectives do with the passing of time).
There are use-cases where the cloud is absolutely and utterly the wrong way to go. When you're running a bank, a government institution (even a lower-level one) or something else that is mission critical, where total control of the data and maintaining end-user privacy are paramount, then the cloud is probably not the right solution.
There are also use-cases where the cloud is the right solution in principle but the wrong solution in practice because of cost. Above a certain scale bandwidth and storage costs of cloud operators will always command a premium over those you get from dedicated hosting providers.
As for 'not owning the machines', plenty of companies lease their servers, so technically they don't own them anyway.
The big problem with 'the cloud' as I see it is that companies tend to rely utterly on it and do not have a 'what if the cloud fails' line in their disaster recovery plans. Lose the cloud data and the company goes up in a puff of water vapor, which is what clouds are made of after all.
So if your use case does match the cloud solutions well then make sure that whatever else you do, have at least a copy of your critical data, code and your configuration information outside of the cloud provider. And while you're at it, make sure that this is done in such a way that there is a separation of duties with respect to those that can administer the cloud portion and those that can access those just-in-case-the-shit-hits-the-fan backups.
Just so you don't end up like codespaces did.
Finally, the cloud is not so much an end-station as it is a step on a much wider scale from absolute control with certain administrative duties on one end and much less control but great convenience on the other. Where on that scale constraints indicated by your comfort level, your application and your fiduciary duties allow you to pick your solution is something that is likely different for every company (and likely for every person).
Customers of companies would do well to research their service providers when it comes to how they are architected, just in case something goes drastically wrong so they don't end up holding the bag.
I read both that one and the one posted here like an hour before this one got posted on HN and they have convinced me to give running my own server a try.
In the meantime, I have just downloaded a backup of all of my data from Twitter and Facebook (Facebook's archive was like 15x as big as Twitter's, even though I use Twitter way more) that I am going to save on my server, I have switched to POP instead of IMAP on my current email service, and I am testing out ownCloud in a Docker container.
Nothing about the cloud is that different from what we had before. With shared hosting providers, you and 50 other users would fill up your disk quota on one or two hard drives on some dinky 1U server running Apache and ProFTPD. If the drives died, along with it went your data. Which is why you kept a copy on your own computer. Back then, nobody expected anyone to keep their data for them, so they just kept their own backups. The same was true for managed services and colo with the exception that you had to do more of the work yourself.
Because the industry has gotten better about preventing data loss, we get complacent and stop saving our stuff as much. But why piss and moan over more reliable, more massive services for cheap or free? Because it isn't perfect, or innovative, or more transparent?
The status quo of the industry is to reinvent the wheel, so it's hard to get mad at people for re-packaging the same solution in a different container. The obsession of holding onto all your old stuff just makes this look even more unnecessary.
As someone who has generated a pretty hefty sandbag of verbiage over my decades online, it's always amusing to see what the Grand Eye of internet arbitration decides is an incredibly important and pertinent subject to discuss in my back catalog. Whether it's my work in guiding volunteers for in-browser emulation (http://archive.org/details/softwarelibrary), my delightful coterie of 1980s BBS textfiles (http://www.textfiles.com) or perhaps my documentaries on BBS culture (http://www.bbsdocumentary.com) Text Adventures (http://www.getlamp.com) or the DEFCON Hacker Conference (https://www.youtube.com/watch?v=rVwaIe6CiHw) ... or, as it is today, one of my many long-form written-down thoughts on all manner of this silly medium many of us have chosen to live our lives.
Oh yes, also that my cat is on twitter and has a million followers. (http://www.twitter.com/sockington) - Lots of people are loaded with knapsacks of opinion about that one as well.
I have found that Hacker News (which is, be clear, an unexpectedly lively extension of Y Combinator) is composed of several diverse groups, all with variant approaches to a linked subject. A linked subject which, as some have pointed out, I wrote 6 years ago, deep in the mists of time.
One group is literally in it for the Money, the gain, the ROI, the endless quest for the "Unicorn", and all their commentary is pungent with the bias and filter of either finding the precious gold coin at the bottom of the shitpile, or are rife with attempts to promote or play up subjects and links of great interest to their financial agenda. Be assured that I could not care less about the current status of the beating of your heart.
Another group seems to be happy to drill down as deep as they can into the mathematics, algorithms, and code of a situation, thinking that if they napkin-blart out enough "facts", they will win some sort of day. I find these people tend to be unhappy about flowery language or effusive phrasing, simply because they've left-brain-dominated themselves into deep pits of nut-sorting and bolt-counting. They use "TL;DR" a lot, as well as, I assume, Adderall. Their heartbeat status is of greater interest to me, if only because I think they are coming from a good place, even if that place smells of Cheetos and sweat.
And, of course, there are Opinion Tourists, my favorite, who might as well be equated with a loud and cantankerous pit of waving hands, waiting for the newly linked (if not newly written) event/opinion/image so they can rise in a mighty roar with a hastily cooked "hot take" on the item. Some of them even optimize the process to not even click on the provided link before the horn honking ensues.
So, "Fuck the Cloud" was written in the deep miasma of when everyone used the term "Cloud" interchangeably with "Magic"; that it was an approach and glory that would lead the experience of computing to a new shangrila. Like any old-timers rife with memories of how we got into that world (and of the echoes of cloud-dom going back 50 years), I decided to write out some of my own thoughts, especially on this attempt to dumb down the populace and separate them from not just responsibility, but control and agency with their data. I have been entirely correct in the general theme - there is a divide within the technical community, of people with admin access and the ability to control any aspect of their work, and then a very large, almost overwhelming set of users who are, essentially, meat stock. And in the same way that meat stock has no particular seat at the table when negotiations of an agricultural nature are conducted, so in the same way are the "users" left out in the cold as a whole range of abilities and ersatz "rights" are stripped away, under the guise of "ease of use" and "leave it to us".
All of this was written without the revelations of the deep, intense surveillance apparatus that is now in place, ensuring that any of this data you control or thought was within your own private space is actually destined to meet you again in an investigation, a courtroom, a warrantless intrusion or a physical SWAT attack. That wasn't even the point.
The point was that user data, treated as something to abuse, monetize, and ultimately discard as a whim, was a complete betrayal of the early promises and experimentation of the Internet. To counteract this trend, I co-founded Archive Team (http://www.archiveteam.org) and our delightful success in many areas would warrant a completely different essay itself - and it has, along with myriad speeches and presentations in the years hence.
I'm sure it might be delightful entertainment for Hacker News to find this or that out on the net and go off, endlessly, in the loop of "This Needs Me" and "Fuck You For Thinking That", but ultimately, these are ridiculous showboat-dances of "what if" and "why not", and I've discovered in the years hence that truly, actions and achievements speak louder, ever so louder, than words.
Enjoy your day.
And fuck "The Cloud".
That would be good, yes. We should all try to be like that. However, HN has guidelines as well, and I'm pretty sure that textfile's post violated some of them. And the way HN maintains its status as the kind of place where we care about the message is partly by discouraging statements that needlessly distract with an offensive tone.
> Be civil. Don't say things you wouldn't say in a face-to-face conversation. Avoid gratuitous negativity.
I guess this is the point where we'd be arguing about taste, but it seems to me that while textfile's comment was clearly negative, it was well constructed and well thought out. Not to mention it was in the _exact_ same tone as the article he'd written years ago that the conversation was about. In this case we should be very careful not to immediately jump on people being upset or negative. It's a powerful tool that, in this case, is being used wisely. I think a bruised sensibility here and there is a worthwhile risk in the name of maintaining a network like HN that accepts dissent and spirited disagreement.
As a slight tangent: in reading the guidelines I found that, more than an attempt to tone-police, they are an attempt to improve the signal-to-noise ratio. The reason I feel that this discussion is important is that it's the reply, not textfile's response, that adds noise to the discussion, even if you feel targeted by the author's response.
The local data center people will threaten and lord over you with their hardware powers unless you have the cloud alternative.
And you're not entirely wrong about the cloud.
So don't trust either.
A modern business should have at least two external cloud providers and a local option.
Go buy an external hard drive. Start saving important things locally, and also automate backups to the external drive. You now have two copies of everything you can't afford to lose.
I bet that wouldn't happen if he'd host his blog on a scaling cloud provider with a proven track record. I could think of a few that might be good candidates... ;)
Mere data driven conclusions get nowhere without a point of view and an end goal to accompany them.
Place smells more and more like Slashdot every month.
Also, it is possible to have civil debate around a profanity riddled article. That's what this site is trying to achieve.
Personal attacks are not allowed here, regardless of how wrong you think someone is. This and other comments you've posted break the HN guidelines egregiously. We ban accounts that do this, so please stop doing this.
We detached this subthread from https://news.ycombinator.com/item?id=10772561 and marked it off-topic.
I'm responding to you the way I do because you run (or apparently, ran) an AWS hosted service.
So I'm a piece of shit because you disagree with me? It just so happens that you've been all over this thread with a continuous stream of (hopefully) purposeful misunderstandings and/or downright trolling. If you want to do that from an anonymous account, go ahead, but don't do it from the same one that you use for 'Who is hiring' posts.
Just like, say, if cperciva writes about online storage, I interpret that in the context of his Tarsnap business.
People here are pretty open about the projects they're involved with, and so are you. If you don't want that link to be made, then I suggest you don't post about your company on HN, or you create a separate account to share your 'pearls of wisdom' like you do in this thread.
Oh, and your anecdote of how many years you were up and running holds zero water by your own standards.
Attacking people and their professional reputations can get your own professional reputation called into question, get used to it.
> I'm a piece of shit because you disagree with me, but it just so happens that you've been all over this thread with a continuous stream of purposeful misunderstandings and/or downright trolling.
This is just not true. I called you that because you tried to do just that to me. A quick re-read of all my comments and I'm sure you'll see I never made any personal reference to you. Now, if you consider me telling you that you are incorrect an attack, then I'm sorry; this is something you'll need to get used to, in this case because you are wrong.
I've offered you ample opportunity to proffer a statistical argument, but all you can give me is "if I didn't build it, it's not safe." Sorry, friend, I just don't trust you. I've been doing this a long time and I don't trust myself.
I never told anyone to do something "cause I said so," but then again you never address the meat of my position: "Cloud providers have much lower failure rates than you have historically; I trust them more than I do you when building my understanding of the risks." It's trivial to undermine my position: show you have lower failure rates than a prime-time storage provider (bonus points if it's Amazon).
Let's just clear up a few non sequiturs that you can't seem to get your head wrapped around. I never said don't have a backup. I said the backup in the cloud is the most durable one.
I never said Amazon couldn't fail. I just said the chances that they fail vs. you are orders of magnitude lower.
I never said you were bad at your job. I said you seem to have a shaky grasp of statistics and possibly a fair amount of NIH syndrome.
You are correct that my anecdote doesn't prove my point, but it provides anecdotal evidence to corroborate my statistical position.
I didn't say "don't be open or transparent." I said pulling a person's personal details into an internet argument crosses a very clear line.
Get over yourself and remember that not everyone who disagrees with you is trying to undermine your career. But you step over a line when you get personal and start pulling personal details into an internet conversation.
I find this especially hilarious. "I just insulted you! I never insulted you!"
(I preemptively decline to get in an extended semantic argument about whether insulting someone is an "attack".)
Gain more experience in IT? I mean, I've been at it professionally for 18 years, so I guess it depends on what you consider "a lot of experience."
I get what you are saying: "A disgruntled Amazon employee could just delete the world's data," right? Except that's not the case. It's all stored in triplicate (at a minimum) across a vast number of independently available data centers. The durability guarantees are hard to fathom. Honestly, I bet they'd have a hard time deleting data permanently even if they wanted to.
But once again, I'm not advocating that you only store your data in one place. I'm just saying, dollars to donuts: if you stored at least one copy of your data on S3, catastrophe has struck, and only one copy of your data is left, where do you think it is?
I know where I am placing my bets.
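Both halves of this argument can be captured with a bit of arithmetic. Replication is extremely effective against independent hardware failures, and completely ineffective against an authorized delete made with valid credentials. A sketch, where the 1% per-replica annual loss rate is a made-up illustrative assumption (the real figure is not public):

```python
# Illustrative only: the 1% annual per-replica loss rate is an assumed
# number, not a published AWS figure. The point is what replication
# changes about the math -- and what it doesn't.

p_replica_loss = 0.01   # assumed yearly chance of losing one independent copy
replicas = 3

# Hardware/durability failures: every replica must fail independently.
p_all_replicas_lost = p_replica_loss ** replicas
print(f"P(all {replicas} replicas lost): {p_all_replicas_lost:.0e}")  # 1e-06

# An authorized DELETE with valid credentials removes every replica at
# once, so replication does nothing against that failure mode.
p_data_gone_after_delete = 1.0
print(f"P(data gone after an authorized delete): {p_data_gone_after_delete}")
```

Under independence, three replicas turn a 1-in-100 risk into a 1-in-a-million one, which is why the durability numbers look so good, and why they say nothing about the deletion scenario raised below.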
Experience is not measured in years but in what you learned in those years.
> I get what you are saying: "A disgruntled Amazon employee could just delete the world's data," right?
No, that's not what I was saying. It is absolutely incredible, but you again manage to misinterpret what I wrote. Really, how hard can this be? Let me try once again:
An employee of a customer of Amazon could wipe the company data.
So if you ('x') found a company ('y') that employs someone ('z'), then 'z' can, if given the right credentials, do a ton of harm to your company. Note that Amazon employees are not even in this equation (they probably should be, but they are a lesser threat, assuming Amazon is set up properly).
> Except that's not the case. It's all stored in triplicate (at a minimum) across a vast number of independently available data centers.
You really should read up on Code Spaces.
> The durability guarantees are hard to fathom.
Yes, except when your data is gone. Then the durability guarantees matter not one single bit. Amazon cannot protect against you or one of your employees wiping the data purposefully.
Directly from the Amazon docs:
"When an object is deleted from Amazon S3, removal of the mapping from the public name to the object starts immediately, and is generally processed across the distributed system within several seconds. Once the mapping is removed, there is no external access to the deleted object. That storage area is then made available only for write operations and the data is overwritten by newly stored data."
> Honestly I bet they'd have a hard time deleting data permanently even if they wanted to.
Apparently, you're wrong about that, and it is quite logical from Amazon's perspective that you are wrong about that: otherwise, how would their billing ever function? If you delete something, there may be a very short period during which Amazon might be able to recover it if you asked nicely, but I wouldn't count on it, and depending on how important the data is, you're playing Russian roulette here. And if you're so sure the data can be recovered, how come you directly contradict Amazon's documentation on that very subject?
Glacier is another matter, by the way; at least there you'll be writing some code to delete a vault.
> But once again, I'm not advocating that you only store your data in one place. I'm just saying, dollars to donuts: if you stored at least one copy of your data on S3, catastrophe has struck, and only one copy of your data is left, where do you think it is?
Yes, well, several instances that prove you wrong exist. Your next hire might prove you personally wrong.
Good to see you at least have a copy elsewhere, and nice to see you consider at least a possibility besides the technical ones.
> I know where I am placing my bets.
You're welcome to place your own bets any way you want. But the company you work for and the companies you found had better have an "oh shit, we lost all our Amazon data" recovery plan in the vault, and it had better be one that holds water when tested. Otherwise you too may one day start looking to outsource your troubles.
Mind you, I don't actually have a problem with people who act in this way; they are more than happy to pay me my exorbitant fees when it comes to saving their hospital or company or whatever institution it is this week that manages to intersect paths with something they considered absolutely impossible, right up until the moment it happened.
But you at least will never ever be able to say you weren't warned.
Ah, OK, that's what caused it. Well, that's fine, but there are probably several hundred years' worth of aggregate experience in this thread in violent agreement that you don't really understand the matter under consideration.
> An attempt to call attention to my potential customers in a forum because you disagree with me is why you are a piece of shit.
If you present yourself as a representative or founder of a company here, then by extension you represent the views of that company unless otherwise noted. If you don't want that, you are free to make another account that is not in any way associated with your professional HN profile; it wasn't me who made the choice to join the two, you did (as do many other people here). But most of us are careful to speak in such a way that we do not bring our partners or employers trouble, either by drawing attention to our lack of knowledge or ability, or by making statements that would, when viewed in the light of our professional engagements, show our employers in a bad light.
So, whether or not you agree with me is beside the point. If you feel like totally misinterpreting Jason's writings and on top of that you call me 'alarmist,' you are actively attacking our professional reputations, and what goes around comes around.
I didn't call you an "alarmist." I said worrying about whether Amazon will stay in business, "get hacked," or whether S3 is going to disappear without warning is "alarmism." They were responsible for 39% of all commercial internet transactions last year and store 2,000,000,000,000 objects. If they go under, we are all in a heap of trouble. A NAS device in some remote part of the Netherlands isn't going to save us.
Then maybe you should write a bit more clearly. The opening line of this whole thread is you quoting a single line out of context, stating you disagree with it, and then providing evidence that you probably should agree with it and that, in fact, you act contrary to your stated position. To me that makes no sense at all.
> whether I stand behind what I say is totally separate concept from your motivations and how they reflect on you as a person.
That's another thing I can't make sense of. Probably my fault.
> The quality of my technical reputation is not in question here and my employer is welcome to read this thread.
> I didn't call you an "alarmist." I said worrying about whether Amazon will stay in business, "get hacked," or whether S3 is going to disappear without warning is "alarmism."
I never suggested Amazon would go out of business, you made that up all by your lonesome. What I wrote is that your control panel could get hacked which is an entirely different thing.
I never suggested S3 would disappear.
I also never suggested that Amazon (the company) would get hacked.
Even so, there is a remote possibility that all of the above (which you just came up with) will become true at some point in the future. But I purposefully did not allude to any of those, because the chances of them happening are remote enough that for me they don't count as reasons to have a backup.
> They were responsible for 39% of all commercial internet transactions last year
> and store 2,000,000,000,000 objects.
Doesn't enter into the equation at all.
> If they go under, we are all in a heap of trouble.
Well, you probably will be.
> A NAS device in some remote part of the netherlands isn't going to save us.
NL is small enough that we don't really have remote parts. Besides, none of those situations are the ones I wrote about in my original comment. You really have a hard time in the understanding department: first with the original posting, subsequently with my comment on yours, and further on with several other people in this thread.
When you feel everybody is acting weird or seems unable to understand what you are saying, consider that the problem is at least partially on your own end.