
Fuck the Cloud (2009) - colinprince
http://ascii.textfiles.com/archives/1717
======
rietta
While Jason Scott raised interesting points six years ago, the principles of
data management remain the same as the days of dedicated servers with on
premises systems. If you have only one copy of data and that machine goes
down, then the data goes with it. In my experience in a business context 'the
cloud' discussion is about the business' desire to expand capacity without the
need for the capital expenditure needed to buy equipment and server admin
salaries. By the numbers it is just better to rent instead of buy at the
infrastructure level.

The cloud is a murky, ambiguity-laden concept though. Both Netflix and my 92
year old grandmother on Facebook 'use the cloud', but the former is much more
sophisticated in their network and data management practices. My grandmother
just wants to see fun pictures of her family and great grandkids.

~~~
FussyZeus
That's what the author is getting at, "The Cloud" is sold as this awesome
thing that will never die or break, and worse yet it's sold to people who
often don't know any better.

No one realizes that the Cloud is running on the same crap we've always had
and is vulnerable to the same issues as everything else. MAYBE the company is
better at data management, MAYBE the employees take pride in their job and do
it properly, but that's all MAYBE MAYBE MAYBE, and could just as well be "no"
and you're entrusting your data to people who really have nothing to lose if
it goes into the garbage tomorrow.

~~~
BookmarkSaver
Eh... I think that the maybes you are using are a little bit misleading. When
a company sells space in their "cloud", like maybe Microsoft or Amazon, there
is a business guarantee that goes along with it. If Amazon were to randomly
lose a big chunk of Netflix data, AWS's business would tank immediately. AWS
is a giant system that uses its scale and number of customers to efficiently
provide a more stable system at a lower cost than all of those individual
customers could achieve by building and maintaining their own IT.

I think it is sort of like a delivery system. If USPS or FedEx or UPS started
losing a massive number of packages (I know that they do lose some) then they
would get abandoned, just like "Cloud" companies have an incentive to maintain
a baseline level of quality. The alternative is that every business would have
to create their own shipping services. I think it makes sense to assume that
in most cases, unless the business is already massive enough to warrant it,
that it is cheaper and more reliable to use the aggregate, dedicated ones for
hire. The "Cloud" will be cheaper than individual implementations, and it
won't be nearly as susceptible to individual implementation errors because the
identical system will have been proven by many other customers (otherwise it
would be abandoned).

~~~
iofj
That must be why Amazon's SLA is defined as follows[1] :

If amazon loses more than 3 datacenters (only total loss of external
connectivity for all of your instances in an entire availability zone, or
total loss of hard disk access, again only counts if all your instances
completely lose hard disk/EBS access) for more than 45 minutes in a month you
get 10% of what you pay as a voucher for future ec2 usage. If they lose it for
more than 7 hours you get 30%.
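
As a rough illustration of how small that compensation is, here is a minimal
sketch of the credit tiers exactly as described above. The thresholds (more
than 45 minutes for 10%, more than 7 hours for 30%) are the figures quoted in
this comment and have not been checked against the SLA text; the function
names and the example bill are hypothetical.

```python
def service_credit_percent(outage_minutes: float) -> int:
    """Credit tier for a month's total qualifying outage.

    Thresholds are the ones quoted in the comment (assumptions, not taken
    from the SLA itself): > 45 minutes -> 10%, > 7 hours -> 30%.
    """
    if outage_minutes > 7 * 60:
        return 30
    if outage_minutes > 45:
        return 10
    return 0


def credit_amount(monthly_bill: float, outage_minutes: float) -> float:
    """Voucher value for future usage, as a share of that month's bill."""
    return monthly_bill * service_credit_percent(outage_minutes) / 100


if __name__ == "__main__":
    # Hypothetical $2,000 monthly bill and a 3-hour qualifying outage.
    print(credit_amount(2000.0, 180))
```

Whatever revenue was lost during the outage never enters the calculation;
only the bill does.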

So no, Amazon, or at least their legal department, does not trust their own
competency. Or at least, they're not willing to risk any revenue on that, but
they're willing to give you a small future discount to encourage you to
restart using the service. Oh and you only get that if you explicitly ask for
it.

If they lose your data on EBS/S3/Dynamo/..., you get nothing. So having any
data exclusively on any Amazon service should be cause for getting fired, and
this of course also means that using Dynamo for storing anything non-trivial
is a big no-no from a disaster recovery standpoint.

So I have to say, I would suggest you do not trust Amazon with either your
data, nor with keeping your site online. Yes, historically their performance
has been better than this, but ...

This reads worse than the SLAs on internet connectivity from places like
level3 and cogent (pay 10% less if they fuck up completely for more than 2
days).

[1] [https://aws.amazon.com/ec2/sla/](https://aws.amazon.com/ec2/sla/)

~~~
BookmarkSaver
You completely tunneled on the wrong portions of what I was saying. For one
thing, Amazon's "real" liability extends far beyond their SLA. Yes, their
immediate financial compensation is small. But there are two things wrong with
your conclusion here. First, that is not indicative of how much they "trust"
the service. They are always going to take the most conservative amount they
can get away with, and they are "getting away" with it just fine so why up it?
Second, if a company on AWS was severely impacted by a genuine Amazon screw-
up, the compensation SLA is the least of Amazon's concerns. It would be like
if UPS lost 20% of Amazon deliveries for one day. They wouldn't be nearly as
concerned with the explicit liability of compensating Amazon for those
deliveries, however much they guarantee for them contractually, they would be
far more concerned with everyone immediately switching to another shipping
company because they could no longer trust UPS. That is the motivation.

Second, you've completely ignored the actual key points. For one thing,
"Cloud" companies make their business by providing a stable service. You have
the "guarantee" based on thousands of other businesses using the exact same
infrastructure without serious service failures. That is a huge amount of
statistical reliability. Compared to hiring your own IT department and
cobbling together your own system, that is actually a really good indicator.
Second, the cost difference is potentially massive. Again, it is for similar
reasons that shipping via UPS is a much better deal than shipping via your own
private distribution network. You might have to still pay some people to
handle your own inventory from its source (like you'd have to have some people
to work on your system in the cloud) but you'd be taking advantage of a much
larger, more efficient system instead of having to build and maintain your
own.

~~~
iofj
I think the original point I made stands. Hosting on Amazon's platform
essentially means I risk my revenue on Amazon's uptime. Actually using their
infrastructure like S3 means I don't just risk my revenue but actually get
locked in. Amazon is not willing to do the same, according to their total crap
SLA. That tells me a lot. And to top it all off, Amazon is famous for "eating"
businesses of their customers : use their infrastructure to see how one of
their customers do business, then take it over.

So if it's all the same, I'd rather have a decent SLA. Furthermore this sounds
a lot like Amazon's not in fact giving me anything.

Your point is that they'll do the right thing because otherwise their
customers would leave. Customers you said in the previous paragraph they give
"the most conservative amount they can get away with, and they are "getting
away" with it just fine so why up it?".

Sounds like they really care about customers doesn't it ?

> You have the "guarantee" based on thousands of other businesses using the
> exact same infrastructure without serious service failures.

I can get that guarantee at 1000 datacenters and colo providers, at least.
Some of which have a decent SLA. But even among cloud IAAS, both Azure and
Google provide both Amazon's guarantee, and better SLAs.

~~~
BookmarkSaver
Again, you are tunneling...

I used Amazon as an example. If you actually read what I'm saying, about how
Cloud businesses _in general_ depend on meeting their guarantees and not
screwing over businesses, how their superior quality is because of scale and
specialization, how they are reliable because one failure would doom them and
they haven't failed yet, you could see that this has nothing to do with Amazon
at all.

You keep arguing that Amazon is a bad provider. So what? I was never
interested in that at all. I'm not comparing them to Azure or Google or the
supposed "1000 datacenters and colo providers" you seem to know of. I don't
care who is better or worse, I was talking about using cloud services in
general.

Pay attention to the topic, pay attention to what my arguments were. Amazon's
SLA is utterly irrelevant to anything, what are you even trying to convince me
of? None of anything you've said is remotely relevant to my point. It's like
arguing about whether Ford or Toyota makes better hybrids in a discussion
about whether electric cars are a good idea, I just don't care.

~~~
iofj
It's funny how you put just the opposite argument of common sense forward and
present it as an axiom. The more customers a company has, the worse they treat
them. Called comcast recently ? The less choice customers have the worse
they're treated. How's your electricity company ? But when you can easily
switch ... surely that's better right ? Hmm companies with lots of customers
that can easily switch. Have you called Bank Of America recently ? And
frankly, they're one of the better ones.

Amazon has superior quality ? They have at best average quality as a vps
provider, unless you accept their products that cause lock-in. At which point
you're at their mercy, and they have even less reason to treat you well.
Amazon doesn't match, say, digital ocean (especially not in the transparency
in billing department. WTF). There are other reasons to pick amazon of course,
but quality, not one of them. Price ... not one of them. Service ? Not one of
them. Stability ? Not one of them. Geographical reach ? At the moment Amazon
does better (not that it matters unless you're in Asia).

One failure would doom them ? Just from memory I know two big amazon cloud
failures that you could not protect from with availability zones, the ones in
a single datacenter, they don't even publish.

The fact that they refuse to publish single cluster failures is probably
another aspect of that superior quality you mentioned.

Also, you can get fucked on an ongoing basis just by getting scheduled on a
machine. I guess that's part of their superior quality (a lot of VPS providers
of course have this problem, others are better at it).

------
swampthinker
Jason's follow up post to this:
[http://ascii.textfiles.com/archives/4352](http://ascii.textfiles.com/archives/4352)

~~~
wanderingstan
tl;dr Renting remote computational power is cool (AWS), but renting remote
storage is not.

I have a complete history of email archives since I started using gmail (ie
cloud), but have lost years of archives from the years I managed email myself.

It turns out that companies focused on data storage and retrieval do a better
job than me. And that's fine. I pay someone to do my taxes and I don't build
my own furniture. Specialization is a good thing.

~~~
bingofuel
The tax accountant analogy goes pretty well with this:

The accountant is specialized in doing your taxes (aka serving up your
data/files) but you still need to keep a copy of your tax records, receipts,
and etc.

~~~
derefr
And so, ideally, I want to be able to tell my accountant to store my archival
tax records in _my_ safe-deposit box, not in his office. Compute
infrastructure is a commodity—doesn't matter who's paying for it—but all the
services you depend on _should_ rely on storage infrastructure _you_ have an
SLA agreement with.

I don't care about the "distributed computation" promises of Diaspora or
Sandstorm.io; I think they're wrongheaded. Anyone can do compute. But I really
_do_ hope that one day that my Facebook account can be canonically "stored on"
a database instance I'm paying for, that Facebook's app servers will reach out
and connect to and treat as the canonical source for "me", treating their own
records as a secondary cache. This kind of setup would make all _sorts_ of
things simpler and clearer and more secure; there would be a definite boundary
where "my data" stops (the DB I own) and "Facebook's data" starts (the DBs
they own.)

And, to be clear, I'm not talking about everybody running their own
infrastructure, or even everybody knowing what an IaaS provider is. Ideally,
PaaS providers could get into the "consumer instances" game the way Dropbox is
in the "consumer files" game. My "Facebook account database" above could be
transparently launched into my own cute little private cloud by my PaaS
provider when Facebook requests it through some OAuth-like pairing API. I
wouldn't need to think of myself, as a user, as "owning cloud database
instances." From my perspective, I'd get an abstract "Facebook account" (which
is actually an app instance and attached DBs) sitting in my Heroku-for-
consumers account. The _important_ bit is that I'd be _paying_ for the
resources that "account" object consumes, that I'd have an SLA on those
resources, and that the PaaS company would have every incentive to make it
easy for other third-party services to interact with my "Facebook database" in
a way Facebook themselves aren't. I, as a user, have no need to "manage" a
cloud of my own; I just need to be considered to _own_ it.

~~~
jimbokun
If I'm a Facebook engineer, I would never agree to this because there is no
possible way to optimize performance in this scenario.

What happens to Facebook when your data provider goes down, or just gets slow?
What if they mess up permissions or change their API?

Maybe you're thinking "that's fine, if my provider isn't reliable, my Facebook
account becomes unavailable and it's up to me to choose a better provider."
But what about all the people who are sharing your feed (or whatever it's
called these days, I don't really use Facebook)? Do they query your stuff, and
then timeout when it doesn't respond in time? Now other people's stuff is slow
to load.

Just seems like an engineering nightmare, to me.

~~~
derefr
Think of the Datomic architecture[1]: some nodes are "storage", while other
nodes are "transactors." The transactor nodes pull "chunks" of
rows/objects/documents _from_ storage nodes, compute relational indexes
locally, answer queries from those computed indexes, and finally persist
computed index "chunks" back _to_ the storage nodes.

Now, note that the "canonical" storage that gets read from doesn't have to be
the same storage that the indexes get written to. The first can be owned by
the user, while the second can be owned by Facebook.

Presuming an architecture like this, the latency and availability of the
user's "Facebook account" database is relatively immaterial. While writes
would have to be synchronous (so, like you said, Facebook would have to give
the user a "sorry, your account is unavailable" error), _reads_ could be
asynchronous. Think of an online RSS feed reader service: the "primary
sources" are the third-party sites with their RSS feeds. Sometimes those sites
go down. When they do, the reader-service can't retrieve the feed–so that feed
just goes stale.

Things like Facebook's graph, meanwhile, are fundamentally indexes. The base-
level "documents" in the graph are relationship assertions—a copy of "B
accepted A's friend request" stored in B's database. The graph is a computed
value built on a pile of those. When Facebook can't reach someone's database,
things like these relationship assertions just go stale.
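
The read/write asymmetry described above can be sketched roughly as follows.
This is a simplified illustration, not Datomic's or Facebook's actual design:
`CachedView`, `InMemoryStore`, and `StoreUnavailable` are hypothetical names,
and reads here go through to the store and fall back to the cache rather than
being truly asynchronous.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class StoreUnavailable(Exception):
    """Raised when the user-owned canonical store cannot be reached."""


class CachedView:
    """Service-side cache in front of a user-owned canonical store."""

    def __init__(self, fetch: Callable[[str], str],
                 store: Callable[[str, str], None]) -> None:
        self._fetch = fetch    # reads from the canonical store
        self._store = store    # writes to the canonical store
        self._cache: Dict[str, Tuple[str, float]] = {}

    def read(self, key: str) -> Tuple[Optional[str], bool]:
        """Return (value, is_stale); serve the cached copy if the store is down."""
        try:
            value = self._fetch(key)
        except StoreUnavailable:
            cached = self._cache.get(key)
            return (cached[0], True) if cached else (None, True)
        self._cache[key] = (value, time.time())
        return value, False

    def write(self, key: str, value: str) -> None:
        """Synchronous write: an unreachable store surfaces as an error."""
        self._store(key, value)  # may raise StoreUnavailable
        self._cache[key] = (value, time.time())


class InMemoryStore:
    """Stand-in for the user's database, with a switch to simulate an outage."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}
        self.up = True

    def fetch(self, key: str) -> str:
        if not self.up:
            raise StoreUnavailable(key)
        return self.data[key]  # KeyError if the key was never written

    def store(self, key: str, value: str) -> None:
        if not self.up:
            raise StoreUnavailable(key)
        self.data[key] = value
```

With the store marked down, `read` keeps returning the last known value
flagged as stale, while `write` raises, which is the "sorry, your account is
unavailable" case.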

The crucial idea, here, is that for Facebook to do its job, it probably has to
cache a majority of the stuff in the user's database in one form or
another—just like an RSS reader caches RSS feeds. But this _is_ purely a
cache, in a fundamental sense. Users who don't "check in" with Facebook could
be cache-evicted from its database. Other users would still have relationships
with them and be able to post on their wall and such (they'd be putting those
documents in their own outbox); Facebook would just no longer bother computing
anything that's personal to the user, like their news feed. There would be
every incentive to set up the architecture such that user data that wasn't
needed would be "garbage collected" off of Facebook's servers, because it
could always get put back on, the moment that user's account-instance woke up
again and said hello.

This would also mean that Facebook wouldn't need to store any of their user-
data in anything resembling a relational normal form. Every table would be a
"view" table. The _canonical_ database, owned by the user, could be relational
and full of nice constraints and triggers (and the user could even add these
themselves); but since Facebook can just query out of that to get any data
it's missing, it wouldn't need anything like a "users" table. (Fascinatingly,
if Facebook was built as a microservice architecture, this means that each
microservice would probably _separately_ query the canonical data from the
database in order to generate _its own_ indexes; the Search service would know
one "face" of you while the Photos service would know quite another. These
could even—in theory—be separately ACLed within your own DB instance, giving
the user true, actual control over what Facebook can do with their data,
component by component.)

[1]
[http://docs.datomic.com/architecture.html](http://docs.datomic.com/architecture.html)

------
Touche
I think this gets to the heart of "what people value" and Jason is looking at
it from a perspective of a collector. I'm also a collector so I feel much like
him.

I wonder if there's a relationship between collecting and being an introvert.
I have no evidence but feel like there is. I think most people just don't
value "things" the way our types do. What most people value is their social
life. They post pictures to Facebook not to store the pictures but to get a
reaction from their friends about the picture. That's what they value. The
social interaction, not the thing.

~~~
TheGRS
I really don't like to marginalize others by what type of personality they
are. This is yet another way of putting people into buckets so that you can
more easily go about your day without thinking about it much. Before facebook
people would keep large collections of photo books so that when family and
friends came over they could easily share them. It doesn't mean they valued
those pictures any less because they were extroverted vs. introverted. People
still keep hard copies of the pictures they cherish the most, like a wedding
album. People value all sorts of things, but maybe they see cloud storage as
perfectly secure, for the moment. Maybe personality plays into that, but I
think it's more about your approximate knowledge of the underlying technology
that keeps people up at night.

~~~
Touche
> I really don't like to marginalize others by what type of personality they
> are.

Me either, I hope that's not what it sounded like.

> Before facebook people would keep large collections of photo books so that
> when family and friends came over they could easily share them. It doesn't
> mean they valued those pictures any less because they were extroverted vs.
> introverted. People still keep hard copies of the pictures they cherish the
> most, like a wedding album.

Yeah, absolutely, and I think people do still keep the photos they care most
about today. I just think we overestimate how many of those photos exist. I
think the overwhelming use of taking photos is for social conversation.

------
erikpukinskis
I understand why people are upset about the word. It's misused.

But I remember what it was like before "the cloud"... You had to provision
machines one by one, often via email or phone. It took hours or days. Billing
was usually done by the day. It sucked.

Now you can just send a POST request to a machine and in a few seconds you
have access to a new instance. You can send 1000 requests and get access to
1000 instances. And then you can send some more requests and only get charged
for a few minutes of time. When this transition happened, we called it "the
cloud" because it's a big undifferentiated mass of computers. We could've
called it "the soup" but we didn't.

That's what it has always meant to me. I don't understand why people want to
eradicate the word just because it's misused. People misuse the word
"internet" all the time, but I don't think we should strike it from our
vocabulary.

~~~
whichfawkes
Agreed. When I read the title, I was thinking "Oh, so you'd prefer to pay for
dedicated servers? What a pain!".

Though, I don't think the author is upset about the _word_. He's just warning
against services that fall within his definition of "cloud" - which is pretty
fair. Losing your stuff is no good, and there's always cruddy services out
there.

~~~
erikpukinskis
Yeah, I guess you're right. Weirdly he's not talking about the cloud at all,
he's just talking about storing things on someone else's servers. So I guess
he's one of those people misusing the word cloud.

------
tibbon
When everything was "moving to the cloud" a few years ago (around when Jason
wrote this), I started to have similar feelings. It all felt like something
marketers were over-hyping. "Your computer in the cloud" (ever had a shell
account that was your main system? This isn't new), "your games in the cloud!"
(ever played Nethack using that shell account?).

I guess the only 'new' thing that I saw was the scaling capabilities based on
capacity, but we've had time-sharing (albeit, slightly different) for many
years.

~~~
83457
My mom was a mainframe programmer and says the cloud is just like what they
had 50 years ago with timesharing but a different name. I just smile and nod.

Is such a statement much different than saying PCs are just tiny mainframes
with screens attached?

~~~
notalaser
You aren't timesharing your tiny mainframe, so it is quite different, yes. The
cloud is _precisely_ what your mom had on a mainframe 50 years ago, including
virtualization, on-the-fly scaling of resources (maybe a little more difficult
because of the larger equipment), and even renting time on computers you
didn't have in your offices, since not every institution could afford their
own computer.

------
CM30
The points in this article also apply really nicely to websites and other
services, many of which seem to make the mistake of outsourcing everything to
other people's platforms. Oh sure, you may be using 'the cloud' to host your
webmail, or your forums, or your chat, or anything else... but what if that
goes missing? You're in an even worse situation than the individuals using
these services to host their personal files. Remember what happened to
IPBFree? They were shut down for some weird, unexplained reason (rumour has it
as a raid for illegal activity), and literally everything on their servers was
wiped out. Now imagine if the same sort of thing happened to one of these
services, like Dropbox or the likes... Millions of people would lose most of
their files overnight.

Use such services for backups, sure. But don't rely on them too much, you
don't know what might be going on behind the scenes at the other end of the
line. You don't know their financials (usually), whether they're under
investigation for something, whether an intelligence agency is spying on their
servers, whether their security is up to par in every possible sense...

And if they're offering it for free... well, they don't have much invested in
keeping you as a customer when things get tough. Became unpopular recently,
for saying something controversial or 'stupid' on social media? Made enemies
in the political world? Then a lot of companies will be quite happy to shut
down your account to avoid bad PR. By using these services, you provide a nice
target for the social media mob the minute you do something that a lot of
people don't like...

~~~
superuser2
> like Dropbox or the likes... Millions of people would lose most of their
> files overnight.

That's false. Files in Dropbox are also stored locally on all your Dropbox-
enabled computers. They would simply stop syncing, and you'd plug in another
syncing service. If Dropbox disappears, your Dropbox folder just becomes a
regular folder.

~~~
brbsix
I think it is an accurate assessment that many (potentially millions) people
would lose their files in such an event. Many people use cloud storage (e.g.
Dropbox, Google Drive) without a native client. Even on mobile, it is common
to delete photos from the device after they have been synced/uploaded.

------
bradleyankrom
This article is a lot of fun with the Cloud To Butt Chrome plugin installed.

~~~
LinuxBender
That is by far my favorite plugin. Just don't forget to turn it off before
presenting in a meeting...

~~~
dingo_bat
Why would you be presenting Chrome in a meeting?

~~~
j_jochem
Because you've built your slides with Reveal.js
([http://lab.hakim.se/reveal-js/#/](http://lab.hakim.se/reveal-js/#/)), for
example.

------
cballard
This seems unnecessarily angry. I suppose that this person wants to keep his
data forever, but many people just don't care that much about their social
media photos and the like.

I'd love it if Facebook and Twitter had a rolling deletion period option -
everything more than six months old is shredded forever, as far as the service
is concerned. While people can obviously store shared photos and this wouldn't
actually destroy them, I'd like that new contacts wouldn't be able to go back
and look through someone's entire history. It's like a more social and longer-
lived Snapchat.

~~~
toomuchtodo
> I suppose that this person wants to keep his data forever, but many people
> just don't care that much about their social media photos and the like.

"This person" is Jason Scott, who works at the Internet Archive and also heads
up Archive Team (the loose band of internet folks who race to archive sites
about to go dark).

He's a digital historian; his _job_ is to save everything he can.

[https://twitter.com/textfiles](https://twitter.com/textfiles)

[https://archive.org/about/bios.php](https://archive.org/about/bios.php)

[http://archiveteam.org/index.php?title=Main_Page](http://archiveteam.org/index.php?title=Main_Page)

~~~
skwirl
This article doesn't really stand on its own; it comes across as a rambling
diatribe by someone who is out of touch. If you have to know who the author of
an article is to be persuaded by the argument given then it isn't very good.
There are much better articles expressing this point of view.

~~~
coldtea
> _This article doesn't really stand on its own; it comes across as a
> rambling diatribe by someone who is out of touch._

Remember that next time you're red with rage because a cloud provider lost
your data or a service you depended on is now bought/closed/gone.

~~~
skwirl
My point was more on the quality of the article. The article's point is a
valid one, if you can find it.

To constructively disagree, though: I keep backups. I have a Synology NAS
device that I really like. But for your average person, I have to wonder - are
their digital photos really safer on their laptop than they are on Facebook?
Facebook is a fairly stable company. Laptops are lost, stolen, and damaged all
the time. Files are accidentally deleted. People get viruses. People forget
the password to their full drive encryption. When Facebook does bite the dust,
it's hard to imagine that data just disappearing - it's incredibly valuable,
and in the worst case, someone would buy it just to sell it back to people.
Not ideal, but it isn't lost, and you had free storage for several years
anyway.

~~~
Domenic_S
Think about the future too though. Facebook strips exif and may have
resized/recompressed your images. 20 years from now when we can do more super
cool things with that data, the data won't exist in your FB photos.

Tangentially related, this is why I shoot RAW. Not because it might give me
better pictures today, but because it WILL give me better pictures in 5-10-20
years. You can take a RAW today that was shot 5 years ago and pull detail that
was impossible to pull when it was shot, and that ability will only improve.

~~~
matthewmcg
Ideally, you'd want to keep RAW + JPEG. I'd be worried about reading some of
the more obscure camera RAW formats in the far future. JPEG seems like a good
backup (at the cost of the data loss you mentioned).

~~~
Domenic_S
Yep, sometimes I convert to DNG for that reason, but that seems like pushing
the problem out...

------
goodcall
Don't we keep our entire life's savings in companies aka banks that we don’t
run, don’t control, don’t buy, don’t administrate, and don’t really
understand?

~~~
Retric
I think most people have more of their savings in stuff than banks. (Clothes,
PC, car, House etc.) At scale cash is a proxy for wealth not actual wealth.

PS: A home loan might seem like the bank owns your house, but they can't say
no when you sell it.

~~~
dragonwriter
> PS: A home loan might seem like the bank owns your house, but they can't say
> no when you sell it.

They can, unless you satisfy the loan by paying it off as part of the process,
at which point they no longer have an ownership-like interest.

In the unusual cases where you try to sell a house _without_ doing that, the
bank _absolutely_ can -- and often will -- say no.

~~~
Retric
If you have the cash to pay them off, then they don't get to say no even if
it's a short sale.

Alternatively, you can generally walk away and sell it to them for the value
of the loan.

~~~
dragonwriter
> If you have the cash to pay them off, then they don't get to say no even if
> it's a short sale.

If you _actually_ pay them off, then they don't get to say no, because paying
them off is, essentially, buying out their interest in the property, under the
terms of an existing contract. That doesn't negate the fact that they have
legally-enforceable rights in the property until and unless you do that.

> Alternatively, you can generally walk away and sell it to them for the value
> of the loan.

Only if the mortgage is governed by the law of a jurisdiction where pursuit of
mortgage deficiency isn't allowed (either in general or for mortgages in the
specific conditions yours has.)

------
nsfyn55
>Don’t blow anything into the Cloud that you don’t have a personal copy of.

I don't understand this logic. Amazon's S3 offers service level agreements
with failure rates that at one point implied the statistical likelihood of
losing an object to be once in "thousands of years". When dealing with any
sort of stable storage this is simply something I cannot offer. I couldn't
produce a set up locally with the resources I have for making guarantees on
the decade level let alone millennia.
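
To make the "thousands of years" figure concrete, here is a back-of-the-
envelope sketch. It assumes an annual per-object loss probability of 1e-11
(the "eleven nines" design durability Amazon has quoted for S3; treat the
number as an assumption here) and that object losses are independent. The
function name and the object count are illustrative.

```python
def expected_years_between_losses(num_objects: int,
                                  annual_loss_probability: float) -> float:
    """Expected years until one object is lost, assuming independent losses."""
    expected_losses_per_year = num_objects * annual_loss_probability
    return 1.0 / expected_losses_per_year


if __name__ == "__main__":
    # Assumed: 10 million stored objects, 1e-11 annual loss probability each.
    years = expected_years_between_losses(10_000_000, 1e-11)
    print(f"about one lost object every {years:,.0f} years")
```

Note that this only models loss through storage failure. It says nothing
about deleted accounts, compromised credentials, or billing lapses.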

With that said I keep personal copies, but the authoritative copy is what's in
the cloud, because it's a hell of a lot more stable.

TL;DR I hear this argument all the time. The cloud isn't perfect but it's a
hell of a lot closer than anything I could achieve. "not invented here"
syndrome won't save your data.

~~~
jacquesm
Until your control panel gets hacked and you lose all your data. See:
codespaces.com (assuming it wasn't an inside job or a dumb mistake, but even
then the same rules apply). So no, DON'T BLOW ANYTHING INTO THE CLOUD THAT YOU
DON'T HAVE A COPY OF.

It has nothing to do with 'not invented here', it has everything to do with
your inability to outsource your responsibilities.

The degree to which people rely on others to take care of their stuff is a
huge blind-spot. Jason's advice is spot on in this respect, no matter what the
up-time guarantees of the cloud solution you are using (and no matter what the
redundancies), if you store all your data in the cloud without an off-line
copy your company is 3 mouse clicks away from being history.

~~~
Domenic_S
The 3-2-1 rule is a rule for a reason.

3 copies

2 formats

1 copy off-site

The cloud is a great place for that 1 off-site copy.
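
The rule is simple enough to check mechanically. A minimal sketch, assuming
each copy is described by where it lives, what format or medium it is in, and
whether it is off-site; the `Copy` type and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Copy:
    location: str  # e.g. "laptop", "nas", "cloud"
    fmt: str       # format/medium, e.g. "ssd", "hdd", "object-storage"
    offsite: bool


def satisfies_3_2_1(copies: Iterable[Copy]) -> bool:
    """True if there are >= 3 copies, >= 2 distinct formats, >= 1 off-site."""
    copies = list(copies)
    return (
        len(copies) >= 3
        and len({c.fmt for c in copies}) >= 2
        and any(c.offsite for c in copies)
    )


if __name__ == "__main__":
    plan = [
        Copy("laptop", "ssd", offsite=False),
        Copy("nas", "hdd", offsite=False),
        Copy("cloud", "object-storage", offsite=True),
    ]
    print(satisfies_3_2_1(plan))
```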

~~~
ajross
No argument at all about the last sentence, but you should have skipped the
first four lines as wildly premature optimization.

The vast majority of the world would be much better served by (to appropriate
your jargon) a 2-1-1 rule, because it's a straightforward treatment for a
straightforward problem that is easily implemented. Yes, "format skew" and
double-failure of backup solutions do indeed happen, but at a much lower
incidence than "oh crap I deleted it!".

~~~
jacquesm
I think the 'three copies' rule refers to cyclic backups, not necessarily
three copies of the one piece of data.

This protects against data corruption that is not detected immediately.

Another common way of doing this (and one that I prefer) is one where you
rotate out a backup medium with ever larger intervals. So one gets set aside
per week, then one gets set aside per month and so on. That gives you a series
of snapshots in time that will allow you to pinpoint with some accuracy when
an event happened. Longer ago you'll have less accuracy but this can help a
lot in trying to triangulate who or what messed up. Just being able to answer
the question of whether or not 'x' happened before 'y' was hired or after can
help in narrowing down the number of suspects in case of a breach or other
nastiness. It also protects against back-ups for whatever reason not wanting
to be reloaded (and you should guard against that by loading your back-up
immediately after you make it, even so, the medium might fail the next time
you try a read).
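
A minimal sketch of that kind of thinning rotation, assuming snapshots are
identified by their date and that "ever larger intervals" means: keep every
snapshot from the last week, then one per week for a few weeks, then one per
month. The function name and the default windows are assumptions for
illustration, not a standard.

```python
from datetime import date, timedelta


def snapshots_to_keep(snapshots, today, daily_days=7, weekly_weeks=4):
    """Return the sorted snapshot dates to retain under a thinning schedule."""
    keep = set()
    seen_weeks = set()
    seen_months = set()
    weekly_limit = daily_days + 7 * weekly_weeks

    # Walk newest to oldest so the newest snapshot in each bucket is kept.
    for snap in sorted(snapshots, reverse=True):
        age = (today - snap).days
        if age < 0:
            continue  # ignore snapshots dated in the future
        if age < daily_days:
            keep.add(snap)  # recent: keep everything
        elif age < weekly_limit:
            week = tuple(snap.isocalendar()[:2])  # (ISO year, ISO week)
            if week not in seen_weeks:
                seen_weeks.add(week)
                keep.add(snap)
        else:
            month = (snap.year, snap.month)
            if month not in seen_months:
                seen_months.add(month)
                keep.add(snap)
    return sorted(keep)


today = date(2015, 12, 1)
daily = [today - timedelta(days=i) for i in range(100)]
for kept in snapshots_to_keep(daily, today):
    print(kept.isoformat())
```

Older history gets coarser, which matches the trade-off described above: less
accuracy about when something happened, but far fewer media to store.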

Better still if there is a streaming log of everything but only very few
companies can afford that sort of solution for all their data. Those can be
hard to restore from (by replaying) so there too a snapshot system can help.

~~~
ajross
Oh sure, there's lots to say about the design and effective use of a backup
regime. I agree with all that stuff.

But if you're going to condense it to a "rule" that will help people not well-
versed in the field, that rule can only be "MAKE BACKUPS!", because at least
90% of the data loss scenarios in the real world happen because simple backups
weren't made.

Don't make it more complicated than it is, because someone will stop to do it
"right" and then lose data because they didn't just make a copy on a USB
stick.

------
nine_k
Well, it's seemingly trivially easy.

Don't _move_ to the cloud; _copy_ to the cloud.

Also applies to other storage media.

------
rodionos
AWS customer since 2007. Just pulled the plug on our last EC2 instance this
week having migrated our stuff to a provider offering root servers for the
cost of m3.medium. Our requirements are simple and we have no need for high-
load/high-end layers. We used to have 50-100 VMs depending on the time of the
day, now less than 20 with the rest of the workloads migrated to Docker
containers.

P.S. Can't delete Glacier Vaults for now as AWS enforces a cooling period.

~~~
jacquesm
> P.S. Can't delete Glacier Vaults for now as AWS enforces a cooling period.

That's a good thing. And in the case of Glaciers an excellent pun.

------
jacquesm
Well, that's one way to break the ice :)

I have a bit more of a nuanced view on this than Jason, but I totally
understand where he's coming from and when the whole cloud gravy train started
rolling our perspectives overlapped much more than they do today (and quite
probably since then Jason's perspective has changed as well as perspectives do
with the passing of time).

There are use-cases where the cloud is absolutely and utterly the wrong way to
go about it. When you're running a bank, a government institution (even a
lower government one) or something else that is mission critical and where
total control of the data and maintaining end-user privacy is paramount then
the cloud is _probably_ not the right solution.

There are also use-cases where the cloud is the right solution in principle
but the wrong solution in practice because of cost. Above a certain scale
bandwidth and storage costs of cloud operators will always command a premium
over those you get from dedicated hosting providers.

As for 'not owning the machines', plenty of companies lease their servers, so
technically they don't own them anyway.

The big problem with 'the cloud' as I see it is that companies tend to rely
utterly on it and do not have a 'what if the cloud fails' line in their
disaster recovery plans. Lose the cloud data and the company goes up in a puff
of water vapor, which is what clouds are made of after all.

So if your use case _does_ match the cloud solutions well then make sure that
whatever else you do, have at least a copy of your critical data, code and
your configuration information _outside_ of the cloud provider. And while
you're at it, make sure that this is done in such a way that there is a
separation of duties with respect to those that can administer the cloud
portion and those that can access those just-in-case-the-shit-hits-the-fan
backups.

Just so you don't end up like codespaces did.

Finally, the cloud is not so much an end-station as it is a step on a much
wider scale from absolute control with certain administrative duties on one
end and much less control but great convenience on the other. Where on that
scale constraints indicated by your comfort level, your application and your
fiduciary duties allow you to pick your solution is something that is likely
different for every company (and likely for every person).

Customers of companies would do well to research their service providers when
it comes to how they are architected, just in case something goes drastically
wrong so they don't end up holding the bag.

------
r3bl
I'm definitely going to start renting my own server somewhere in Europe
starting from January. I absolutely agree with everything he said and I really
want to claim my own data again (run my own email server and things).

~~~
nekopa
Do you have a write up on running your own email server?

~~~
r3bl
This was the article that convinced me to try:
[http://www.27months.com/2013/10/its-always-sunny-in-
iceland-...](http://www.27months.com/2013/10/its-always-sunny-in-iceland-or-
how-i-nsa-proofed-my-email/)

I read both that one and the one posted here like an hour before this one got
posted on HN and they have convinced me to give running my own server a try.

In the meantime, I have just downloaded a backup of all of my data from
Twitter and Facebook (Facebook's archive was like 15x as big as the Twitter's
archive even though I'm using Twitter way more) that I am going to save on my
server, I have switched to POP instead of IMAP on my current email service and
I am testing out ownCloud in a Docker container.

------
peterwwillis
If you use Snapchat a lot, you may notice how often you get updates from
people or see their public story change. Do you ever stop to think about the
old snaps, or miss them? No, because you have a constant stream of new ones.
You can always make more memories.

Nothing about the cloud is that different from what we had before. With shared
hosting providers, you and 50 other users would fill up your disk quota on one
or two hard drives on some dinky 1U server running Apache and ProFTPD. If the
drives died, along with it went your data. Which is why you kept a copy on
your own computer. Back then, nobody expected anyone to keep their data for
them, so they just kept their own backups. The same was true for managed
services and colo with the exception that you had to do more of the work
yourself.

Because the industry has gotten better about preventing data loss, we get
complacent and stop saving our stuff as much. But why piss and moan over more
reliable, more massive services for cheap or free? Because it isn't perfect,
or innovative, or more transparent?

The status quo of the industry is to reinvent the wheel, so it's hard to get
mad at people for re-packaging the same solution in a different container. The
obsession of holding onto all your old stuff just makes this look even more
unnecessary.

------
textfiles
Salutations, ass-end of the Tech Elite.

As someone who has generated a pretty hefty sandbag of verbiage over my
decades online, it's always amusing to see what the Grand Eye of internet
arbitration decides is an incredibly important and pertinent subject to
discuss in my back catalog. Whether it's my work in guiding volunteers for in-
browser emulation
([http://archive.org/details/softwarelibrary](http://archive.org/details/softwarelibrary)),
my delightful coterie of 1980s BBS textfiles
([http://www.textfiles.com](http://www.textfiles.com)) or perhaps my
documentaries on BBS culture
([http://www.bbsdocumentary.com](http://www.bbsdocumentary.com)), Text
Adventures ([http://www.getlamp.com](http://www.getlamp.com)) or the DEFCON
Hacker Conference
([https://www.youtube.com/watch?v=rVwaIe6CiHw](https://www.youtube.com/watch?v=rVwaIe6CiHw))
... or, as it is today, one of my many long-form written-down thoughts on all
manner of this silly medium many of us have chosen to live our lives.

Oh yes, also that my cat is on twitter and has a million followers.
([http://www.twitter.com/sockington](http://www.twitter.com/sockington)) -
Lots of people are loaded with knapsacks of opinion about that one as well.

I have found that Hacker News (which is, be clear, an unexpectedly lively
extension of Y Combinator) is composed of several diverse groups, all with
variant approaches to a linked subject. A linked subject which, as some have
pointed out, I wrote 6 years ago, deep in the mists of time.

One group is literally in it for the Money, the gain, the ROI, the endless
quest for the "Unicorn", and all their commentary is pungent with the bias and
filter of either finding the precious gold coin at the bottom of the shitpile,
or are rife with attempts to promote or play up subjects and links of great
interest to their financial agenda. Be assured that I could not care less
about the current status of the beating of your heart.

Another group seems to be happy to drill down as deep as they can into the
mathematics, algorithms, and code of a situation, thinking that if they
napkin-blart out enough "facts", they will win some sort of day. I find these
people tend to be unhappy about flowery language or effusive phrasing, simply
because they've left-brain-dominated themselves into deep pits of nut-sorting
and bolt-counting. They use "TL;DR" a lot, as well as, I assume, Adderall.
Their heartbeat status is of greater interest to me, if only because I think
they are coming from a good place, even if that place smells of Cheetos and
sweat.

And, of course, there are Opinion Tourists, my favorite, who might as well be
equated with a loud and cantankerous pit of waving hands, waiting for the
newly linked (if not newly written) event/opinion/image for them to rise in
a mighty roar with a hastily cooked "hot take" on the item. Some of them even
optimize the process to not even click on the provided link before the horn
honking ensues.

So, "Fuck the Cloud" was written in the deep miasma of when everyone used the
term "Cloud" interchangeably with "Magic"; that it was an approach and glory
that would lead the experience of computing to a new shangrila. Like any old-
timers rife with memories of how we got into that world (and of the echoes of
cloud-dom going back 50 years), I decided to write out some of my own
thoughts, especially on this attempt to dumb down the populace and separate
them from not just responsibility, but control and agency with their data. I
have been entirely correct in the general theme - there is a divide within the
technical community, of people with admin access and the ability to control
any aspect of their work, and then a very large, almost overwhelming set of
users who are, essentially, meat stock. And in the same way that meat stock
has no particular seat at the table when negotiations of an agricultural
nature are conducted, so in the same way are the "users" left out in the cold
as a whole range of abilities and ersatz "rights" are stripped away, under the
guise of "ease of use" and "leave it to us".

All of this was written without the revelations of the deep, intense
surveillance apparatus that is now in place, ensuring that any of this data
you control or thought was within your own private space is actually destined
to meet you again in an investigation, a courtroom, a warrantless intrusion or
a physical SWAT attack. That wasn't even the point.

The point was that user data, treated as something to abuse, monetize, and
ultimately discard as a whim, was a complete betrayal of the early promises
and experimentation of the Internet. To counteract this trend, I co-founded
Archive Team ([http://www.archiveteam.org](http://www.archiveteam.org)) and
our delightful success in many areas would warrant a completely different
essay itself - and it has, along with myriad speeches and presentations in the
years hence.

I'm sure it might be delightful entertainment for Hacker News to find this or
that out on the net and go off, endlessly, in the loop of "This Needs Me" and
"Fuck You For Thinking That", but ultimately, these are ridiculous showboat-
dances of "what if" and "why not", and I've discovered in the years hence that
truly, actions and achievements speak louder, ever so louder, than words.

Enjoy your day.

And fuck "The Cloud".

~~~
oldmanjay
I could have done without your insulting tone, no matter how proud of your
opinions you are.

~~~
ixtli
You are an adult. If you miss the message for the messenger you've got no one
to blame but yourself.

~~~
AnimalMuppet
So is textfiles an adult. If he/she is a jerk, he/she has nobody else to
blame.

~~~
ixtli
I'm not sure how this is relevant to what I said. While it is certainly the
case that one can catch more flies with honey, I think we'd all rather be
people who digest the message instead of disregarding it because we don't like
how it was delivered to us. This isn't about textfile's delivery, but that
it's useless bordering on childish to take a response from an author and
respond simply with "I don't like your tone."

~~~
AnimalMuppet
> I think we'd all rather be people who digest the message instead of
> disregarding it because we don't like how it was delivered to us.

That would be good, yes. We should all try to be like that. However, HN has
guidelines as well, and I'm pretty sure that textfile's post violated some of
them. And the way HN maintains its status as the kind of place where we care
about the message is partly by discouraging statements that needlessly
distract with an offensive tone.

~~~
ixtli
Please correct me if I'm wrong, but the guidelines only seem to state:

> Be civil. Don't say things you wouldn't say in a face-to-face conversation.
> Avoid gratuitous negativity.

I guess this is the point where we'd be arguing about taste, but it seems to
me that while textfile's comment was clearly negative, it was well constructed
and well thought out. Not to mention it was in the _exact_ same tone as the
article he'd written years ago that the conversation was about. In this case
we should be very careful not to immediately jump on people being upset or
negative. It's a powerful tool that, in this case, is being used wisely. I
think a bruised sensibility here and there is a worthwhile risk in the name of
maintaining a network like HN that accepts dissent and spirited disagreement.

As a slight tangent, in reading the guidelines I found that more than an
attempt to tone-police, they are an attempt to lower the noise:signal ratio.
The reason I feel that this discussion is important is because it's the reply,
not textfile's response, that adds noise to the discussion even if you feel
targeted by the author's response.

------
ninjakeyboard
There is no alternative approach presented here. I can only assume that the
author has never had to scale a piece of software to serve hundreds of
thousands of concurrent users.

~~~
etjossem
The implied alternative is "something you run, control, buy, administrate, and
understand." Don't overthink this. You need it to scale to one (1) concurrent
user.

Go buy an external hard drive. Start saving important things locally, and also
automate backups to the external drive. You now have two copies of everything
you can't afford to lose.
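
A minimal sketch of such an automated backup, assuming the external drive is
mounted at some path and that the job is scheduled separately (cron, Task
Scheduler, or similar). The function names and the example paths are
hypothetical. It copies rather than moves, then re-reads each copy and
compares checksums so a bad write is caught right away.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_tree(source, destination):
    """Copy every file under source into destination and verify each copy."""
    source = Path(source)
    destination = Path(destination)
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        dst = destination / src.relative_to(source)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy, never move: the original stays put
        if sha256_of(src) != sha256_of(dst):
            raise IOError(f"verification failed for {src}")
        copied.append(dst)
    return copied


if __name__ == "__main__":
    # Hypothetical paths; adjust to wherever the external drive is mounted.
    backup_tree(Path.home() / "Documents", Path("/mnt/external/Documents"))
```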

------
vvpan
I like the guy's writing style and use of English.

------
sbov
In many ways the cloud is one step forward, two steps back. I look forward to
the day when I can use my phone as my "cloud".

------
adityar
Getting resource limit reached error

~~~
fixermark
[http://mxtoolbox.com/SuperTool.aspx?action=ptr%3a204.109.60....](http://mxtoolbox.com/SuperTool.aspx?action=ptr%3a204.109.60.27&run=toolpage)

I bet that wouldn't happen if he'd host his blog on a scaling cloud provider
with a proven track record. I could think of a few that might be good
candidates... ;)

------
bhz
This is an old article and is almost entirely op-ed. No idea how this got to
the top of page on HN. :/

~~~
coldtea
Maybe because opinions are also valid intellectually?

Mere data driven conclusions get nowhere without a point of view and an end
goal to accompany them.

~~~
fixermark
These statements are true, but may not be reflective of why this story gets
top billing on Hacker News. See techdragon's extrapolation comment that's a
sibling to this one. ;)

Place smells more and more like Slashdot every month.

------
ksk
It's rather interesting that HN mods allow posts with this kind of language
but the moment you are even mildly critical of a comment or commenter, you get
a warning about language. I guess the thinking is we look the other way as
long as it's not hosted on the ycombinator domain.

~~~
nekopa
I think it's more he (the article) is saying fuck to an idea, not a flesh and
blood person (as when responding to a comment).

Also, it is possible to have civil debate around a profanity riddled article.
That's what this site is trying to achieve.

------
cowardlydragon
In enterprises, you either trust the cloud, or your local data center people.

The local data center people will threaten and lord over you with their
hardware powers unless you have the cloud alternative.

And you're not entirely wrong about the cloud.

So don't trust either.

A modern business should have at least two external cloud providers and a
local option.

------
voynich61
OldManYellsAtCloud.jpg

------
Karunamon
Oh, right, this is the guy that wrote the "facebook is the worst thing ever"
screed.

