

Why SLAs are redundant, but Zencoder offers one anyway. - jon_dahl
http://zencoder.com/encoder-blog/2010/08/23/why-slas-are-redundant-but-zencoder-offers-one-anyway/

======
ora600
SLAs seem to have the effect of dramatically changing the culture of the team
that is now measured by the SLA.

I remember two times when my managers announced an SLA. First time was "Every
priority 1 ticket has to be answered within 24 hours". Before the SLA we were
a cheerful and overworked IT team, doing our best to handle all tickets within
reasonable times and with our best quality responses. Afterwards, we started
negotiating priorities, replying "investigating" within 24 hours to make sure
management doesn't get on our case, and we had lots of not-so-funny jokes
about announcing "love every 38 hours" SLAs to our spouses.

The second time they actually went as far as saying "This SLA is between
marketing, sales and the customer. You don't need to worry about it. Just keep
on doing what you always did."

They had to say that, because with our current equipment and processes,
offering 99.9% availability to our customers was impossible. Our real
availability was an average of 99.5% with some services as low as 97%. Not
because we were horrible IT - but because of trade-off decisions we reached in
the past together with our management.

It didn't help much, because we didn't trust management not to start holding
us to those SLAs at some random moment. Which they did.

Again, from a team that did their best to keep servers up, we became expert
negotiators of which downtimes "count" and which don't. Also, we became very
defensive: "We can't risk upgrading this month, we were already down 13
minutes."
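
To put the 99.9% figure and the "13 minutes" remark in context, here is a
rough sketch of the downtime budget each availability level allows. The
30-day month and the function name are my own assumptions for illustration,
not anything from the SLA itself.

```python
# Illustrative only: the 30-day month is an assumption, not from the thread.
MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200 minutes


def downtime_budget_minutes(availability_pct, period_minutes=MINUTES_PER_MONTH):
    """Minutes of downtime allowed in the period at a given availability."""
    return period_minutes * (100.0 - availability_pct) / 100.0


# The three availability figures mentioned above.
for pct in (99.9, 99.5, 97.0):
    print(f"{pct}% -> {downtime_budget_minutes(pct):.1f} min/month")
```

Under that assumption, 99.9% leaves roughly 43 minutes a month, so 13 minutes
of downtime already uses up close to a third of the budget.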

I understand the need to sell to corporates, but I would be very, very careful
about communicating SLAs to my team.

~~~
mdisraeli
The problem there is not with SLAs at all, but with how they are set. It is
sadly common for SLAs to be dictated by marketing/bidding/sales, with no
regard for reality.

Similarly, if you can't reply to every priority 1 ticket within 24 hours, that
suggests that not every one is priority 1, rather than simply an issue with
SLAs as a concept.

------
DifE-Q
While many of the points in this article are true, I would like to point out
that an SLA provides a means to punish the service provider, in hopes of
better behavior.

I run an IT department with a massive MPLS network from ATT. When a circuit
goes down, it is true that I don't regain MY costs by enforcing the terms of
the SLA and collecting the remuneration that is due. However, I do often
collect a significant amount of credits, and that causes regional managers at
ATT to take notice and start making infrastructure improvements to improve
actual uptime, which is what I want. That in turn allows me to actually pay
them for their circuit, which is what they want.

~~~
jon_dahl
From my perspective as a service provider, I feel major pain whenever our
service hiccups. If something goes down, the fear of unhappy customers (or
losing customers) looms much larger than the fear of having to issue a service
credit. For us at least, the latter is a tiny motivator compared to the
former.

But a small startup is probably motivated by different things than a large,
vaguely monopolistic company (like an ISP).

------
mjs
The reason for refunds in the hosting business is not to compensate the buyer
for any loss of business they might suffer if their site goes down (this will
probably be many times what the service provider would make from the account);
it's to tie the failure of the buyer's systems to something the service
provider would self-evidently prefer to avoid.

Service providers could also signal their reliability by offering "loss of
business" insurance at low premiums. If a service provider will pay you
$10,000/day in the event that your site is unavailable, and the premium on
this insurance is $5/month, it's pretty likely that they're not expecting to
have to pay this out very often.
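
As a back-of-the-envelope check on those numbers, the sketch below computes
the break-even downtime for such a policy. It assumes the payout is prorated
by downtime and ignores overhead and profit margin; both assumptions are mine,
as is the function name.

```python
def break_even_outage_days(premium_per_month, payout_per_day):
    """Outage days per month at which expected payout equals the premium.

    Assumes (hypothetically) that the payout is prorated by downtime.
    """
    return premium_per_month / payout_per_day


# Figures from the example above: $5/month premium, $10,000/day payout.
days = break_even_outage_days(5.0, 10_000.0)
seconds = days * 24 * 60 * 60
print(f"break-even: {days} days/month, about {seconds:.1f} seconds/month")
```

On those assumptions the provider only breaks even if it expects well under a
minute of downtime per month, which is the signal being described.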

(Royal Mail offers fairly cheap loss/damage insurance, which is a good sign
their service is reliable. Not sure about DHL/Fedex.)

------
colonelxc
"Emergency Maintenance" seems to be a pretty broad get-out-of-jail-free card
to me. I trust the Zencoder guys aren't going to try and weasel out of their
commitments (and this SLA is probably better than most), but if you're having
downtime for almost any reason, won't you be doing "Emergency Maintenance" to
bring the site back up?

~~~
jon_dahl
Fair point. We'll reconsider the wording of that one. The expected use case is
something like a critical security patch (we might have to bring a service
down for a security update), but the current wording is pretty broad.

