

Ask HN: How to be a responsible single founder? - matth

I'm on the cusp of launching my own startup in the near future. I've never been so excited about anything I've ever worked on and I think it shows in the near-final product (damn thing looks and works great).

I also happen to be in the single founder camp. This was just some small side-project that I realized had the potential to provide a nice little side income.

But I have more than a few worries. Don't get me wrong, I have good people around me to lean on for legal, business, and financial matters - but not too many technically-minded folks.

I'd love to hear from you guys about what specific tools and processes you've put in place to be sure your site(s) are always online, or that you can easily put up a "down for maintenance" notice on your frontpage if your server starts throwing lots of 404's, etc.

I imagine you guys have some pretty cool mobile apps specific to your company. But anything you can share with me about being a responsible business owner would help put me at ease.

I'm afraid of scenarios in which I go out with friends on a Friday night to see a movie, the site goes down, I have no idea until the next morning - and then come Monday half my userbase has evaporated.
======
pedoh
You've got two questions here, I think, and they both deserve careful
consideration.

1) How do you monitor your site to alert you of problems?

2) How do you prevent your site from going down in the first place?

In my opinion, #1 is more straightforward than #2, and the more uptime you
need for #2, the more you're going to pay.

a) Write a script to send you an email / SMS when your site is down. Run it on
a separate server, or run it on something like Google App Engine.

I'm sure there are some simple free tools out there to achieve this for you. A
quick search provided:

<http://www.siteuptime.com/>

They offer one check every 30 minutes for free. Can't beat free, but this is
not an endorsement; I've never used them.

b) Use something with more features. Run it on a separate server or use
someone who provides it SaaS style.

Nagios, ZenOSS, Zabbix, Groundwork, OpenNMS. I can personally vouch for
Nagios. It's pretty easy to get a simple configuration going, and then it can
get very complicated (you might want to monitor the monitor, right?).
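
A minimal sketch of the script from option a), in Python using only the standard library. The function names, the 10-second timeout, and the SMTP host are illustrative assumptions, not anything from a particular tool. It treats any non-2xx/3xx answer or connection error as "down" and sends a plain email; some carriers forward email to SMS, which would cover the SMS case.

```python
import smtplib
import urllib.error
import urllib.request
from email.message import EmailMessage


def site_is_up(url, timeout=10, opener=urllib.request.urlopen):
    """Return True if the URL answers with a 2xx/3xx status within the timeout."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Covers DNS failures, refused connections, timeouts, and HTTP 4xx/5xx
        # (urlopen raises HTTPError, a URLError subclass, for those).
        return False


def build_alert(url, sender, recipient):
    """Build the alert email for a failed check."""
    message = EmailMessage()
    message["Subject"] = f"DOWN: {url}"
    message["From"] = sender
    message["To"] = recipient
    message.set_content(f"The uptime check for {url} failed.")
    return message


def check_and_alert(url, sender, recipient, smtp_host="localhost"):
    """Run one check; email an alert and return False if the site looks down."""
    if site_is_up(url):
        return True
    # Assumes an SMTP server is reachable at smtp_host (hypothetical default).
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(build_alert(url, sender, recipient))
    return False
```

Run `check_and_alert(...)` from cron every few minutes on a machine that is not the one being monitored, otherwise the check goes down together with the site.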

2) If you've got two servers that you can connect to a load balancer, you may
be able to run in active-active mode so that if one fails, you simply lose 50%
capacity. For your data store, options include a database solution like MySQL
in master-master form, or a relational database that has data redundancy as a
feature. In my opinion you don't choose your data backend solely because of
its redundancy and failover capabilities, but it could be a factor.

If you can't do active-active mode right now for some reason, then active-
passive can keep you up enough to deliver a "down for maintenance" message
until you restore your services. If you can get an "extra" IP address, you can even do
this without a load balancer involved. Take a look at keepalived.org to see
how you can float an IP address between multiple servers. CDNs such as Akamai
also provide site failover features that might be worth investigating.
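
To make the active-passive idea concrete, here is a small Python sketch of the decision only. It is not how keepalived works internally (keepalived moves the IP address using VRRP); `choose_active` and the server names are hypothetical and just show the priority rule: use the primary while it is healthy, otherwise the first healthy standby.

```python
def choose_active(servers, is_healthy):
    """Return the first healthy server in priority order, or None if all are down."""
    for server in servers:
        if is_healthy(server):
            return server
    return None


# Hypothetical example: the primary is down, the standby is up.
health = {"primary": False, "standby": True}
active = choose_active(["primary", "standby"], lambda name: health[name])
```

If `choose_active` returns `None`, nothing is left to serve traffic, which is the case where a static maintenance page hosted elsewhere earns its keep.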

There are so many ways to skin the cat on this one, and I'm just scratching
the surface. If your site is more than informational (e.g. if you're building
a web based service or application), then monitoring and failover / redundancy
are critical to your success. If you just want your informational site to be
up, it's still pretty important. The fact that you're thinking about this now
and not after your first 24 hour window of downtime is a good sign!

If you provide more information about what your startup is doing (at least
from a tech architecture perspective) and what sorts of resources you're
willing to spend to improve your uptime and failover capabilities, you might
get more specific suggestions.

Best of luck, and congrats on the startup!

Pete

------
gte910h
Automation, Automation, Automation

Hire part time or temporary first, then full time.

Get a phone with push notifications. Set up a second system to constantly
check for failure in the primary system. Get the second system to notify you in
case of failures. Better yet, if you can, have the second system be a failover
clone and have the notification dashboard be a third system. Base all these
systems on different vendors' systems if you can.
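
One way the "second system" could decide when to notify you, sketched in Python. The `Watchdog` class and the three-failure threshold are my own assumptions, added so that a single dropped request does not wake you up; `notify` is any callable you supply (push notification, email, SMS).

```python
class Watchdog:
    """Track check results and notify once per outage and once on recovery."""

    def __init__(self, notify, failures_before_alert=3):
        self.notify = notify
        self.failures_before_alert = failures_before_alert
        self.consecutive_failures = 0
        self.alerted = False

    def record(self, check_passed):
        """Feed in the result of one health check of the primary system."""
        if check_passed:
            if self.alerted:
                self.notify("recovered")
            self.consecutive_failures = 0
            self.alerted = False
            return
        self.consecutive_failures += 1
        # Alert only when the threshold is reached, and only once per outage.
        if self.consecutive_failures >= self.failures_before_alert and not self.alerted:
            self.notify("down")
            self.alerted = True
```

The monitoring loop would call `record()` after every check; the alert fires on the third failed check in a row and stays quiet until the primary recovers.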

Buy a 3G/laptop/other mobile SSH solution and use this to fix things when
you're more than 30 minutes from your office/home.

And lastly, don't worry so much. Money isn't everything, and you honestly
sound like you're possibly going to worry so much about this, it will make
your life a net negative.

------
jhancock
"damn thing looks and works great"

If this is the case, don't worry so much. If your app uses a well-understood
(by you), proven (by others) software stack and you've been testing well
enough, your worries may be for naught.

Some folks around here recommend tools like pingdom. Get something like that
if it helps you enjoy a break with your friends.

Good luck!

~~~
matth
Something like pingdom is exactly what I was looking for. I'm loosely familiar
with such services, but have never given one a shot myself - so I really don't know
which ones are considered the best.

~~~
mattew
Pingdom is great. We use it to monitor all our client sites and it is awesome.

------
patio11
Oooh, this would be a _great_ topic to go in depth on for a blog post. I'll
get something done for you this weekend because this is a topic near and dear
to my heart.

~~~
matth
It would be awesome to get even more in-depth perspective. I'll keep a lookout
for your post.

------
shadowsun7
Hi matth,

Not in a single-person startup, but there were some useful HN-linked articles
in the past. In particular, I found patio11's article about running a software
startup as a single-founder (and on 5 hours a week) particularly useful:
<http://news.ycombinator.com/item?id=1206649>

I think he's made some really good points about doing a lot of time-as-asset
stuff (go read the article, it's really good). Maybe you should get in contact
with him?

I wish you the best of luck.

~~~
JacobAldridge
Not being technically experienced, I don't know how this works, but Patrick's
downtime alert is my favourite 'here's how amazing technology is' story to
people with less knowledge than me.

Not sure exactly how he set it up (possibly through Twilio as well -
<http://www.kalzumeus.com/2009/12/29/twilio-phone-call-web-api-is-crazy-fun/>)
but his phone calls him playing Ride of the Valkyries anytime the site
goes down.

~~~
patio11
Really simple actually: monitoring service set up to mail my Japanese cell
phone, cell phone has custom ringtone for emails from their address.

------
Tawheed
I'm a single founder as well, and here is what I've done:

1) Look into chef to automate 99% of your sys-admin tasks (setup, tear up,
tear down, deployment)

2) Set up pingdom for monitoring your servers externally

3) Set up hoptoad (<http://www.hoptoadapp.com>) to monitor for exceptions and
errors internally with e-mail notifications

4) Set up a laptop with a MiFi for remote access - you're going to keep this
in your car wherever you go (if you're in a city or something, set up VNC on
an iPad or something and have access to a workstation where you can look into
emergencies)

The above will a) notify you of critical things happening and b) give you a
way to act upon them. That's pretty much all that you can do at this point --
and I'd look into Rackspace managed services to see if you can leverage them
for basic system troubleshooting (this is on my ToDo list).
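
For point 3, the general shape of internal exception reporting can be sketched in a few lines of Python. This is not Hoptoad's API; `report_exceptions` is a hypothetical decorator, and `notify` stands in for whatever sends the email.

```python
import functools
import traceback


def report_exceptions(notify):
    """Decorator factory: report any unhandled exception, then re-raise it."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception:
                # Send the full traceback so the report is actionable.
                notify(f"Unhandled error in {func.__name__}:\n{traceback.format_exc()}")
                raise
        return wrapper
    return decorator
```

Wrapping request handlers or background jobs this way means an error reaches you even when the site itself still answers pings, which is the gap external monitoring leaves open.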

By the way, we've got a single founder community going. It's new and we're
still getting to know each other, but it is great for moral support - let me
know if you want to join.

------
marcamillion
Pingdom is pretty cool. So is <http://aremysitesup.com/>. Here are some others
too:
<http://sixrevisions.com/tools/12-excellent-free-tools-for-monitoring-your-sites-uptime/>

------
imp
I use Binary Canary to alert me for down-time, which is about the same thing
that Pingdom does.

Just do whatever it takes for you to fall asleep at night. I recently bought a
smartphone (N900) that allows me to ssh into my server from anywhere, so if a
small fix is needed, I can do that quickly. Before I had that, I actually took
my laptop camping with me during a busy period last year and drove to a nearby
coffee shop periodically to check for any important issues.

Other than that, I'd say to scale up your architecture as your site gets more
popular. If you later need some dual-DB setup with a heartbeat monitoring
system then you can add that when the time comes.

Have good backups too. That might be more important than just uptime.

------
marcamillion
Also (no affiliation with these guys), I have been watching a company that has
an interesting offering. If you run Rails, or can run a Ruby script on your
server, it looks very interesting: <http://scoutapp.com/>

------
andrewljohnson
We use site canary for www.trailbehind.com.

It ran smoothly for months, and then we had a bit of a problem, but site
canary alerted us right away.

Also, site canary was recently made free, so set it up! It will email you when
you have an issue. You should definitely have some sort of uptime tool like
pingdom or site canary.

Disclaimer: I have nothing to do with site canary, other than I use their
software.

------
petervandijck
<http://pingdom.com> is great for monitoring; they can send you an SMS message
when your site is down too.

As for automating restart, are you running on AWS? Then you can automate that.
For that part of your question, you have to give some technical background on
your setup etc...

------
TotlolRon
I've been doing this for a while now. The problem is more mental than
technical.

The first thing to accept is that this is by definition an irresponsible
thing to do...;)

Then you need to flow with it. The main thing is not aiming at fixing
everything as fast as possible but rather in a way that will be as reliable as
possible. Or in other words, your primary consideration should not be an early
alert on Friday night, but rather a willingness to work the weekend making sure
whatever happened will not happen again.

The tech details really depend on the specific application.

