
Our service is down because Msft Azure is down. This is how we chose to react. - roee
http://modern-products.tumblr.com/post/39144653241/microsoft-windows-azure-down-what-we-did
======
johns
This is a nice gesture, but I prefer the services I use to take responsibility
for their provider and technology choices. Your customers don't care that
Azure is down, they only care that they can't do whatever they were trying to
accomplish. This isn't Azure's fault. It's yours. Don't pass the buck.

~~~
mhurron
And if all they said was they were experiencing difficulties you'd see people
complaining that they aren't giving any indication as to what the problem is.

The site is down because their provider is down. This is, as someone else
stated, a fact. They told you that and that is really all they can do. They
have to wait just the same as you do, for the same thing.

> This isn't Azure's fault.

Yes it is. The alternative is to pay more (a lot more) for projects to host
all their own servers and storage and backups. That would be the end of a
great many small to medium companies.

~~~
PommeDeTerre
It's a myth that dedicated hosting in several geographically distant data
centers really costs that much more than cloud computing. With a little bit of
care, it's quite easy to get very good value from such hosting, with far more
control over it. Outages like this could be easily avoided, or at the very
least dealt with much quicker.

It would not be the "end of a great many small to medium companies" were they
to use dedicated hosting. Hell, this is how it was commonly done up until the
past 2 to 3 years, and small and medium companies thrived quite well.

In fact, after the cloud outages of the past week, and a few other recent
outages, more traditional dedicated hosting keeps on looking better and
better.

~~~
roee
We actually pay for georeplication on Azure. Here's the official response from
them:

"Why didn’t we just fail over? We do have geo-replication for Windows Azure
Blobs and Tables, where the data from US South is geo-replicated to keep
another replica set of the data in US North. We have chosen at this time not
to failover, since we believe we can bring back the primary storage stamp in
US South in place. One of the advantages of recovering in place is it avoids
losing the Windows Azure Queue data in that stamp, since Windows Azure Queues
is not being geo-replicated at this time (we are working towards turning that
on)."

------
nodesocket
I tried Microsoft Azure for a month because I received free credits via their
BizSpark program. I can confidently say their uptime is the worst I have ever
seen. Nearly daily, servers would lose access to disks and switch to read-only
mode. Also, they lack any ability to snapshot backup servers, which is a basic
requirement of a cloud platform.

I know it's early, and they are just starting, but I just don't have confidence
in Azure. Stick with AWS, Linode, Softlayer.

~~~
roee
When was it? We've actually had a great experience and great uptime with Azure.
This outage is a very unusual case for us.

~~~
nodesocket
Are you running Linux or Windows? This was June/July.

~~~
roee
We're running on their table storage from web roles and worker roles (their
platform-as-a-service)

------
eddieroger
This made me think of the Netflix outage the other day. When people like my
non-technical parents say things like, "did you hear Netflix went down," they
don't care that it was Amazon that really went down. I don't use Soluto, so I
don't know their customers, but I doubt they care that Microsoft has dropped
the ball.

It's a chance we take when we deploy to the cloud, and services like Heroku
only compound it because that's an additional point of failure. I'm not sure a
giant "it's their fault" finger point is the best way to handle a problem, but
I assume they feel pretty powerless, so finger pointing is an option.

Good luck resuming operations. Hopefully you won't lose too many customers.

~~~
Yrlec
A classic case of: "you can't outsource responsibility".

------
niggler
When can we move beyond blaming a cloud service and just owning up to the
decision? Soluto chose to use Azure and knew they were taking a risk. It's
their fault for choosing to use Azure, and unless they signed for a
100.000000000% uptime guarantee (which I'm sure they didn't, given no one
would give such a guarantee) they have to own up to any faults.

This extends to those services that blame AWS as well. It's not a good habit,
and customers (like we saw with netflix) don't care -- all they know is that
their service is down.

~~~
Avalaxy
Azure offers an SLA with a 99.95% uptime guarantee for computing instances and
a 99.9% uptime SLA for storage. If they do not meet this SLA you will get your
money back.

You can read more about it at www.windowsazure.com/en-us/support/legal/sla/
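
To put those percentages in perspective, here is a rough sketch of how much downtime each one permits. The measurement window is an assumption on my part (a 30-day month and a 365-day year are both shown); check the SLA text for how the period is actually defined. The function name is just illustrative.

```python
HOURS_PER_DAY = 24


def allowed_downtime_hours(uptime_percent: float, period_days: int) -> float:
    """Hours of downtime that an uptime percentage permits over period_days."""
    return (1 - uptime_percent / 100) * period_days * HOURS_PER_DAY


# The two figures quoted above; the 30- and 365-day windows are assumptions.
for label, pct in [("compute", 99.95), ("storage", 99.9)]:
    per_month = allowed_downtime_hours(pct, 30)
    per_year = allowed_downtime_hours(pct, 365)
    print(f"{label}: {pct}% allows about {per_month:.2f} h per 30 days, "
          f"{per_year:.2f} h per 365 days")
```

So even measured over a whole year, the storage figure leaves room for less than nine hours of downtime, and the compute figure for less than five.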

~~~
taylodl
They've already blown that SLA - for the year. Now how much of my money do I
get back?

~~~
Avalaxy
All of it I think?

------
pakeha
What is the extent of this outage? There isn't much info on Twitter or
elsewhere. The Azure service status dashboard shows an issue impacting
"Storage [South Central US]". Is that service like S3 or EBS or something
entirely different?

~~~
biot
Azure Storage is for blob storage (like S3), non-relational table storage
(like SimpleDB), and queues (like SQS).

------
BaconJuice
I didn't know what Soluto was, so I went on your main page to find out what it
was while your service was down. All I saw was "Service Disruption :( We are
experiencing problems with our cloud infrastructure More information..."
Nothing else to click on but that one link for support. So when I clicked on
it, still didn't know what it was and so I scrolled all the way down and
clicked "About" then yet again I was greeted by the same alert. I understand
your service is down, but does it have to take your whole site with
information on your product with it just to show your service is down?

Anyways my 2 cents.

Regards.

~~~
roee
That's right. It's double-bad when things like that happen during the weekend
so time-to-reaction is slower. All will be better tomorrow.

~~~
PommeDeTerre
Are you trying to tell us that proper automated failover systems take the
weekends off?

~~~
roee
I'm trying to say we were not visually prepared for this, and that getting
designers and coders to work on a weekend is not something fun. I see your
sarcasm and get it, but really it's out of place.

------
sologoub
Maybe a noobish question, but in the future, is there anything, other than the
need for preparation, that prevents you from having a backup spun up on, say,
AWS or another service provider?

It seems like a prudent step to take.

~~~
roee
It's the entire service, not a bunch of files. When Netflix went down due to
AWS outage, could you imagine them just "restoring a backup" on rackspace and
running just like that?

~~~
sologoub
The core service, sure. The problem is that Netflix accounts for something
like a 3rd of US bandwidth consumption. Not many services can sustain that.
Netflix also has to manipulate a giant library of assets. Doesn't sound like
this is part of their service. A normal company should be able to fail-over.

I have been experimenting a lot with Google App Engine. Failing over from a
PaaS like that is not easy because your app typically relies on a bunch of
proprietary APIs. However, it is possible, provided you keep the right replication
and architect the application accordingly.

You also don't need to fail over 100% of your app. I don't have a very clear
understanding of what Soluto actually does (sounds like a GoToAssist
competitor), but the ability to fail over the part of the system that
facilitates the remote desktop experience would be more important than
administration or creation of new accounts. (This is a crude example.)
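
A minimal sketch of what "architect the application accordingly" could look like: the app talks to one small storage interface, writes go to every reachable backend, and reads come from the first backend that answers. Everything here is hypothetical (`InMemoryStore` stands in for a real provider client, and `FailoverStore` and `StoreUnavailable` are made-up names), so treat it as an illustration of the idea rather than anyone's actual setup.

```python
class StoreUnavailable(Exception):
    """Raised by a backend that cannot serve requests right now."""


class InMemoryStore:
    """Hypothetical stand-in for one provider's storage client."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.available = True  # flip to False to simulate an outage

    def put(self, key, value):
        if not self.available:
            raise StoreUnavailable(self.name)
        self.data[key] = value

    def get(self, key):
        if not self.available:
            raise StoreUnavailable(self.name)
        return self.data[key]


class FailoverStore:
    """Writes to every reachable backend; reads from the first that answers."""

    def __init__(self, backends):
        self.backends = list(backends)

    def put(self, key, value):
        written = 0
        for backend in self.backends:
            try:
                backend.put(key, value)
                written += 1
            except StoreUnavailable:
                continue  # skip the backend that is down
        if written == 0:
            raise StoreUnavailable("no backend accepted the write")

    def get(self, key):
        for backend in self.backends:
            try:
                return backend.get(key)
            except StoreUnavailable:
                continue  # fall through to the next backend
        raise StoreUnavailable("no backend could serve the read")


primary = InMemoryStore("primary")
secondary = InMemoryStore("secondary")
store = FailoverStore([primary, secondary])

store.put("session:42", "active")
primary.available = False          # simulate the primary provider going down
print(store.get("session:42"))     # still served, from the secondary
```

The sketch ignores the hard part: a backend that was down misses writes and comes back stale, so a real system needs some way to reconcile the copies afterwards.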

------
runesoerensen
I can recommend trying AppHarbor - we're striving to deliver a better .NET and
Windows cloud platform. Feel free to shoot me an email (rs@appharbor.com) if
there's anything I can help with.

~~~
hhudolet
It's just too expensive; you can't host your custom domain name on the free
instance. And there's nothing between $0 and $50. I'm waiting for the prices of
Azure web sites; hopefully they will compare to shared hosting prices.

~~~
runesoerensen
You can add custom hostnames to the free instance for $10/month

~~~
PommeDeTerre
Can you explain to us why such basic functionality is so expensive?

Is there a legitimate technical reason for the cost, for example? Or is it
subsidizing some other service? Or is it that high just because some people
are foolish enough to pay that much for it?

~~~
biot
The paid plans already include hosting on a custom domain. I find it
surprising that they let you pay a la carte for custom domain support on the
free plan as well.

------
jgn
I love this, it's refreshing. Every time something goes down, I see a flood of
hate and a torrent of comments suggesting everyone move somewhere else. That,
coupled with HN's hate of MSFT, makes this one of the most refreshing posts
I've seen in some time. And sensible.

Respect to you, my friend :).

------
arnorhs
For those who are curious

Cached copy of the front page:
[http://webcache.googleusercontent.com/search?q=cache:0vf1uUF...](http://webcache.googleusercontent.com/search?q=cache:0vf1uUFgPnAJ:https://www.soluto.com/+&cd=1&hl=en&ct=clnk&gl=us)

Their support site: <https://support.soluto.com/home> (not down)

Wikipedia page: <http://en.wikipedia.org/wiki/Soluto>

Soluto alternatives: <http://alternativeto.net/software/soluto/>

~~~
brandonfish
Wikipedia says that one of their remote features is: "Defrag the hard drives"

Someone pushed the wrong button there? ;)

Seriously, that just reminded me of running Windows back in the old days. You
had to make sure to have the latest AV, do system upgrades all the time and of
course defrag the drives... yay

------
PommeDeTerre
One of the FAQ answers on the page linked to by the "More information..." link
on their outage page states, "Cloud services like ourselves someone ( _sic_ )
experience such outages, it's part of the modern web."

I can't believe that it actually says that. It seems extremely unprofessional
to me to make a statement like that.

Downtime like this is not acceptable, regardless of how the service is hosted
and who is providing the hosting.

We didn't put up with this kind of outage with the "old fashioned" Web, and we
shouldn't put up with it from the so-called "modern" Web, either.

------
AlexDanger
Perhaps this is part of your current problem with Azure, but is it possible to
retain some part of your frontpage during an outage like this? Or failover to
a static page with a basic description of the service? Your support site seems
to look ok.

I only make the suggestion because I was not aware of Soluto. I went to your
homepage and see the outage notice and a link to support. I still have no idea
what your service is about.

Perhaps not the best time to be spruiking your site and service, but
any-publicity-is-good-publicity etc.
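
The static-page idea could be sketched roughly like this: try the normal page, and if the backend call fails, return the outage notice together with a pre-built description of the product. The function and parameter names are made up for illustration; how the real site renders its pages is not something I know.

```python
def render_home(fetch_dynamic_page, static_page, outage_notice):
    """Return the live page, or the notice plus a static description on failure.

    fetch_dynamic_page is a hypothetical callable that hits the backend and
    raises an exception when the backend is unavailable.
    """
    try:
        return fetch_dynamic_page()
    except Exception:
        # Backend is down: still tell visitors what the product is.
        return outage_notice + "\n" + static_page


def backend_down():
    # Simulates the storage outage for this example.
    raise ConnectionError("storage unavailable")


page = render_home(
    backend_down,
    static_page="<p>A short description of what the service does.</p>",
    outage_notice="<p>Service disruption: we are working on it.</p>",
)
print(page)
```

The static description has to be served from somewhere that does not depend on the failing storage, otherwise the fallback goes down with everything else.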

~~~
BaconJuice
my point exactly.

------
paulsutter
Looks like Azure is having problems in the US south-central region [1], but
fine everywhere else. The fact that they don't mention their region suggests
they may not understand the implications.

If the status page is accurate, their downtime is just poor planning on their
part. Same goes for Amazon. If you're stranded in one region, you're asking
for downtime.

[1] <http://www.windowsazure.com/en-us/support/service-dashboard/>

------
ChuckMcM
Maybe the world is ready for RACSP (Redundant Array of Cloud Service
Providers)

~~~
roee
That's too close to RackSpace. Choose another name :)

------
GoodIntentions
My impression of that page was not very good TBH. The words are kind, but for
me at least, the message boiled down to:

"We're down." "It isn't our fault." "We'll be back up when someone else gets
it fixed."

You must be at least _thinking_ about how to avoid/limit this impact in the
future. Why not give the user more information along these lines, and thereby
position yourself as an active participant in the process at the same time?
Polite helplessness is probably not what you want to project as a business,
but that is what I took away from it.

------
jongalloway2
The only reported outage I'm seeing is Storage / South Central US -
[https://www.windowsazure.com/en-us/support/service-dashboard...](https://www.windowsazure.com/en-us/support/service-dashboard/)

------
lucb1e
I expected to see them stop using Azure or something, but this is a good idea
that's entirely opposite from what I was expecting!

