
I made Server Check.in 3.5 years ago (HN announcement thread: https://news.ycombinator.com/item?id=4901350 ), and it's been earning around $2k/year with almost no maintenance. Just a few updates to the Drupal front-end/UI, and the Node.js backend every year (maybe 20 hours total).

Costs are incredibly low, as I have ~15 low end box-type servers running as check servers interacting with Drupal via a private API, and one DigitalOcean droplet running prod, with a hot backup droplet. Everything was automated via Ansible early on, and I don't have to touch anything except for patching/updating from time to time.

I have also run Hosted Apache Solr for almost twice as long, and it actually earns a decent secondary income. It's up to about 25 DigitalOcean droplets now, also all managed via Ansible/Jenkins, and it has a few hundred clients (a couple who have been stable clients for over 5 years, and a few very large names that made me realize even a side project can be stable/good enough for 'big companies' to trust).

I haven't advertised either except for mentions here and there and having them in some of my social media profiles, but I've learned so much from running both—and even turned some of that knowledge into a book that gives decent passive income on top!

I'm happy for your success, but I'm not sure I'd consider running a hosted distributed database a side project. You're on call 100% of the time for a single point of failure. Also, a bad/inexperienced tenant can cause serious performance issues.

I learned early on, it's all about proactive monitoring, 'self healing' (e.g. restart on fail) services, and simplicity in everything.
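As a rough illustration of the 'restart on fail' idea, here is a minimal Python sketch. It is not the setup described in this thread (the tooling used for self-healing isn't named here); the function name `run_with_restart` and its parameters are hypothetical, and in practice a process supervisor would usually do this job.

```python
import subprocess
import time
from typing import Sequence


def run_with_restart(cmd: Sequence[str], max_restarts: int = 3, delay: float = 1.0) -> int:
    """Run cmd and restart it whenever it exits with a non-zero status.

    Returns the number of restarts that were needed before a clean exit.
    Raises RuntimeError once max_restarts is exhausted, which is the point
    where a human (or an alert) should take over.
    """
    restarts = 0
    while True:
        result = subprocess.run(list(cmd))
        if result.returncode == 0:
            return restarts
        if restarts >= max_restarts:
            raise RuntimeError(
                f"{cmd[0]} still failing after {max_restarts} restarts "
                f"(last exit code {result.returncode})"
            )
        restarts += 1
        time.sleep(delay)  # brief pause so a crash loop doesn't spin the CPU
```

Capping the number of restarts matters: unlimited restarts can hide a real fault, while a cap turns a persistent failure into something monitoring can flag.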

I've been running between 300-500 Apache Solr search cores continuously for almost 5 years, starting with 1.4, migrating up to now 4.10.4 (5 and 6 are in the works), all with around 99.98% uptime (averaged on all the servers over time).
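For a sense of scale, the uptime figure can be turned into a downtime budget. This is plain arithmetic on the 99.98% number above, assuming a 365-day year; the helper name `downtime_hours` is made up for the example.

```python
def downtime_hours(uptime_percent: float, period_hours: float) -> float:
    """Hours of downtime implied by an uptime percentage over a period."""
    return (1 - uptime_percent / 100) * period_hours


HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years

per_year = downtime_hours(99.98, HOURS_PER_YEAR)
print(f"{per_year:.2f} hours/year, about {per_year * 60:.0f} minutes")
```

That comes to roughly 1.75 hours (about 105 minutes) of downtime per server per year on average.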

I have detailed logs and per-search-core stat tracking (queries, index size, query time) for the past 3 months and archives back further, and I have only had to remove noisy neighbors a few times (and did so quickly, by first isolating them on their own VM, then helping them move off to dedicated resources if needed).

Also, I explicitly state I offer no SLA in the support docs—some people are okay with that.

But yes, I'm always on call, technically. I've only had to fix 'emergency' scenarios about 3 times in the past 5 years though. Even security updates are automated via Ansible/Jenkins. I just need to log in and click a button and things are updated, or a new server is built.

I highly recommend Google's new book on SRE; whether you're a one-man shop or a multi-billion-dollar corp, the lessons are exactly the same!

Would that be this book? https://landing.google.com/sre/book.html

That's the one!

> I'm always on call, technically

Do you have a partner who can do support when you need time away?

I run a small game website. A couple of years ago it experienced an outage while I was away from home on a 2-day climbing trip. I didn't learn about it until I checked email the first evening. It was a terrible feeling to realize I'd been out having fun, and the site had been down for most of the day. I had to drive home to get the site up again. It was miserable.

All the primary automation and functionality can be controlled from my smartphone, and if worst comes to worst, I can still log in to all my servers via Prompt. I've only had to do it once, but it's good to know I can control all aspects of the system from any device, as long as I have my password manager and the right SSH keys.

That goes along with the 'keep it simple' philosophy; since the service has a very small public API and surface area, it's easy enough to diagnose issues quickly with limited analysis of log data and simple monitoring.

But no, I don't currently have a partner. I do have a contingency/succession plan in case anything happens to me (for the sake of my customers).

Maybe you can try something like ChatOps, e.g. Hubot.

Very interesting case study! And kudos for being able to pull this off! Curious to learn about some $ numbers
