1. Developing a fix without understanding root cause (try-something development)
2. Sufficient testing, including load testing, prior to initial deployment
3. Better change control after initial deployment
4. Sufficient testing for changes after initial deployment
5. Rollback ability (Why wasn't that an option?)
6. Crisis management (What was the plan if they didn't miraculously find the bad line of code? When would they pull the plug on the site? Was there a contingency plan?)
7. Perfect being the enemy of good enough
Looks like they were bailed out of the cost, but what if that hadn't happened?
In the companies I've worked for, these guys would be written up and likely put on a performance improvement plan, if not flatly fired.
I don't understand how you can build a complex application like that without doing basic performance checks: are we hitting the file system or database too often, are the image assets correctly sized, etc.
I'm not a software engineer, however.
A money clock on the table isn't fun, and if you replace the app with a landing page and a newsletter form, it's completely acceptable for the visitors.