Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I think it's fascinating that a single engineer, who they had on staff, was able to write a patch in one night that improved performance by their messaging system 5x...

... and hadn't already done it.

This isn't meant as a slight against Heroku at all. They've got an incredible team of engineers. But imagine if Ricardo had said "hey, I could write a patch today that would speed up our messaging system 5x, should I do it?" the rest of the team would've said "OF COURSE!"

It reminds me of what happens to your brain when you launch a site. Even before you get feedback, somehow the knowledge that people can use it drastically changes your motivation system. Things that before seemed important are obviously not. Other things that were invisible before become the singular focus of your resources.

Maybe we should do fire drills...

* Your requests are suddently taking 100x as long to complete. Go!

* Your "runway" disappears due to an accounting error and you have 7 days to turn a profit. Go!

* 50% of people visiting the site have no idea how to use it. Go!

How could we achieve the focus and clarity that a crisis brings on, without having the crisis?



The thing about performance tuning is you don't often know what your bottlenecks will be until the system is under load, what to patch until you dig deeper, and what the magnitude of improvement will actually be.


I understand the gist of your post, but I don't quite understand where you get 1 night from. From what I can tell the 1 night thing was to fix a bug, not improve performance.

Regarding the 5x increase:

> One of our operations engineers, Ricardo Chimal, Jr., has been working for some time on improving the way we target messages between components of our platform. We completed internal testing of these changes yesterday and they were deployed to our production cloud last night at 19:00 PDT (02:00 UTC).


Ah, you're right. I was reading too quickly. :) The general point stands, but yeah. Not a good example.


I've improved our system here likewise, and in short order. It comes down to working on the most pressing task.

I could spend all my time working on optimizations to the system, or I could work on things that will bring us more money. The company would rather the latter, I would rather the former.

But here's the key: Until there's a problem, you don't always notice the small things you could do to give a huge performance increase.

I also don't think mental drills are the answer. When I find a bottleneck, I don't just sit there and think. I'm out reviewing logs and watching the actual performance of the system. Without any details as to what's happening, how could I possibly find the real bottleneck?

Storytime: Once upon a time, we had a customer who abused our system. He submitted more data than all the rest of our customers combined. I loved him, because he showed us all kinds of bottlenecks. Management didn't love him because his stress on the system would frequently cause issues for other clients. Even better, the stress he put on the system really was just like regular clients, and not some abnormal stress. By the time they let him go, I had found so many bottlenecks that we didn't have another system issue for -years- after he was gone.





Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: