

PagerDuty (YC S10) wakes the right person up for your tech emergencies - alexsolo
http://www.readwriteweb.com/hack/2010/10/pagerduty-wakes-the-right-pers.php

======
ccheever
We (Quora) have been using PagerDuty for a few weeks and so far, it is working
pretty well.

~~~
brianr
Any gripes? (Wondering why you qualified "well".)

~~~
146
It's been completely reliable when there were actual problems. The web
interface was pretty intuitive to us as well. I really like the login flow.

Our PagerDuty is integrated right now with Scout, Pingdom, and our own custom
alerting system.

So far, most of the complaints we had during the first week of using the
service were our own fault. For example, our monitoring was too sensitive,
which we fixed by using the regexp filters and by eliminating spurious errors
from reporting on our side. One thing PagerDuty did was basically force us to
fix these reporting issues so that we weren't woken up at 5 AM unless it was a
real emergency.
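
To illustrate the kind of regexp filtering described above, here is a minimal
sketch in Python. The patterns and the `should_page` helper are made up for
illustration; this is not PagerDuty's actual filter syntax, only one way such
a filter could be expressed.

```python
import re

# Hypothetical noise patterns; real ones depend on what your monitoring sends.
IGNORE_PATTERNS = [
    re.compile(r"\bWARNING\b", re.IGNORECASE),
    re.compile(r"load average .* recovered", re.IGNORECASE),
]


def should_page(subject: str) -> bool:
    """Return True only when no ignore pattern matches the alert subject."""
    return not any(p.search(subject) for p in IGNORE_PATTERNS)


if __name__ == "__main__":
    for subject in ["quora.com is DOWN", "WARNING: disk at 71%"]:
        print(subject, "->", "page" if should_page(subject) else "drop")
```

Anything that matches an ignore pattern is dropped before it can wake anyone
up; everything else still pages.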

The SMS interface got a little confusing when we had two errors at once. For
example, a frequent case is getting two pages at once, "Service X is DOWN" and
"quora.com is DOWN". I think what I tried doing was:

1. Receive the first report (site).

2. Receive the second report (service).

3. ACK both reports using the second report's code.

4. Fix the service.

5. Attempt to resolve the second report, receive a "code already used" error.

Resolving things via SMS is a little bit clumsy (it's what I usually default
to). A link to the PagerDuty login would be cool, but I don't know if it would
fit in the 160 character limit.

~~~
alexsolo
Thanks for the feedback! Very good point, we should allow you to send multiple
replies per SMS alert to ack and then resolve (currently, there's a limit of
one reply per alert).

We'll look at fixing this soon.
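
As a rough sketch of what "ack, then resolve" from the same alert could look
like, the Python below models an incident that accepts more than one reply as
long as the state transition makes sense. The class, the action names, and the
states are assumptions for illustration, not a description of how PagerDuty
is actually implemented.

```python
from enum import Enum


class State(Enum):
    TRIGGERED = "triggered"
    ACKNOWLEDGED = "acknowledged"
    RESOLVED = "resolved"


# Allowed transitions per reply action (hypothetical action names).
TRANSITIONS = {
    "ack": {State.TRIGGERED: State.ACKNOWLEDGED},
    "resolve": {
        State.TRIGGERED: State.RESOLVED,
        State.ACKNOWLEDGED: State.RESOLVED,
    },
}


class Incident:
    def __init__(self, code: str):
        self.code = code  # the code the responder replies with
        self.state = State.TRIGGERED

    def reply(self, action: str) -> bool:
        """Apply a reply; return False if it is not valid in the current state."""
        next_state = TRANSITIONS.get(action, {}).get(self.state)
        if next_state is None:
            return False
        self.state = next_state
        return True
```

With this shape, the same code can be used once to ack and once more to
resolve, and only a reply that makes no sense (such as acking a resolved
incident) is rejected.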

------
duck
This looks like a great tool for corporate IT departments too. From their
website (<http://www.pagerduty.com>), I like how it works with any monitoring
system _as long as the tool can send email_. That would make it easy to
integrate.

~~~
mikedanko
Large-scale engineering dept guy here. The problem is that most companies
like that won't accept a SaaS solution.

Don't get me wrong, it's absolutely brilliant. I think it's the first time
I've ever given a thumbs up to a third level metasolution to a problem.

PagerDuty needs to push some use cases on their site. It might break the SaaS
reluctance of steaks-and-strippers type corporate managers. "It can eat my
ridiculously complicated Jasper report that I send straight to the trash bin
on arrival so I don't have to read it and figure out what buttons on the phone
I have to push with all the reluctance of a four year old kid with a plate of
Brussels sprouts and broccoli in front of them? Sign me up!"

------
synack
PagerDuty has been acting as my alarm clock for the last few weeks and has
performed admirably. Good for them, bad for me and my servers.

~~~
snissn
It would be funny to set up a server that specifically crashes when you want
to wake up, similar to those alarm clocks that require you to solve a puzzle
to stop blaring.

------
patio11
Considering the kind of people who have engineers on call, I wonder if all
prices aren't low by an order of magnitude or so.

------
DougBarth
We integrated PagerDuty with scoutapp.com and pingdom.com. Works great.

------
ashleyreddy
The PagerDuty guys are great. I had the pleasure of meeting them at the
reception last week.

------
HappySushiCo
Beyond their features, these guys are great - they're determined and helpful.
And for a service like theirs these things are paramount. Congratulations
PagerDuty!

------
paul9290
Do you have an API? I'd like to integrate your service with ours
(<http://sleep.fm>)

~~~
alexsolo
We have a simple integration API and we can also integrate with any system
that can send email. Regarding the integration, just shoot us an email:
support@pagerduty.com.

~~~
paul9290
OK, great, will do. I was on my iPhone when I wrote the question, but now,
looking on an old-fashioned computer, I'm not easily finding it on your site.

~~~
terrym
Was looking for it too, found it:
<http://www.pagerduty.com/docs/api/api-documentation>. It took a while,
though. In my opinion, a link to the API documentation is important enough to
put in the footer, maybe next to or under "integration guides".

------
frankwiles
We've been using it as well. Works great!

------
natch
You reinvented Nagios?

~~~
MiguelHudnandez
Not really. It doesn't do any monitoring, it just turns an e-mail into a phone
call, and decides who to call based on a schedule you maintain in the web app.
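
A minimal sketch of the "decides who to call based on a schedule" part,
assuming a simple fixed-length round-robin rotation. The function name, the
names in the rotation, and the one-week shift are all hypothetical; the real
web app presumably supports richer schedules.

```python
from datetime import datetime, timedelta


def on_call(rotation, start, shift, now):
    """Return whoever is on call at `now` in a round-robin rotation.

    rotation: ordered list of people
    start:    datetime when the first person's shift began
    shift:    timedelta length of each shift
    """
    elapsed_shifts = (now - start) // shift  # timedelta // timedelta gives an int
    return rotation[elapsed_shifts % len(rotation)]


if __name__ == "__main__":
    team = ["alice", "bob", "carol"]  # hypothetical names
    print(on_call(team, datetime(2010, 10, 4), timedelta(weeks=1), datetime.now()))
```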

The fact that it doesn't do monitoring is pretty much my only major gripe with
PagerDuty. I've been a customer for a few months.

------
cashmo777
And this is better than widely adopted, open source Nagios how?

~~~
AngryParsley
It is nothing like Nagios. PagerDuty handles alerting, not monitoring. Once
your monitoring system discovers something is wrong, it tells PagerDuty.
PagerDuty then gets ahold of the right person. It lets you set up on-call
rotations. It does automatic escalation if the primary contact doesn't
acknowledge the incident. If your monitoring system supports it, it will even
automatically resolve the incident once you fix your servers.
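
To make the escalation idea concrete, here is a small Python sketch, assuming
a fixed timeout per level. The 15-minute default, the function name, and the
contact labels are assumptions for illustration, not PagerDuty's actual
settings or API.

```python
def who_to_notify(chain, minutes_unacked, timeout_minutes=15):
    """Return the contact for the current escalation level.

    chain:           ordered list of contacts, primary first
    minutes_unacked: how long the incident has gone unacknowledged
    timeout_minutes: assumed wait before escalating to the next contact
    """
    # Move one level up per elapsed timeout, but never past the last contact.
    level = min(minutes_unacked // timeout_minutes, len(chain) - 1)
    return chain[level]
```

For example, with a chain of primary, secondary, and manager, an incident
left unacknowledged for 20 minutes would go to the secondary contact.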

~~~
cashmo777
In point of fact, Nagios (which I use extensively) handles alerting,
notification, auto escalations, acknowledgements, on-call rotations, and
resets itself post incident. <http://library.nagios.com/> Sounds like
reinventing the wheel for a well-solved problem.

