Multiple engines are down: incident report for OpenAI (status.openai.com)
29 points by Pneumaticat on Jan 27, 2023 | 12 comments



This was pretty annoying. I had a lot of homework and a science fair due today!


That incident report is about an issue from two days ago; the title of this post (which matches the page) is misleading.

They have, however, also been having issues today, at least with ChatGPT, which looked to me like a problem with the web server configuration for their Next.js ChatGPT front end. It seemed to be returning the app's root HTML file for all requests for JS. All seems fine now though.
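For anyone wanting to spot that kind of symptom themselves, here is a rough sketch of a check, not anything taken from OpenAI's setup. The function name `looks_like_html_fallback` and the heuristics are my own assumptions: it takes the `Content-Type` header and body of a response to a `.js` request and guesses whether the server handed back an HTML page instead of a script.

```python
def looks_like_html_fallback(content_type: str, body: str) -> bool:
    """Guess whether a response to a .js request is actually an HTML page.

    Hypothetical helper for illustration; the heuristics are assumptions.
    """
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "text/html":
        return True

    # Some servers mislabel the type, so also peek at the start of the body.
    head = body.lstrip()[:100].lower()
    return head.startswith("<!doctype html") or head.startswith("<html")
```

You would feed it the header and body from whatever HTTP client you use; a `True` result for a script URL suggests a catch-all route is swallowing asset requests.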


When ChatGPT is down, does it dream?


If it is down so that they can apply some more recent training data, then maybe the answer could be yes.


And does it dream of pets or cattle?


or electric sheep


Looks like OpenAI's infra is either Kubernetes-based or Nomad-based. Would love to learn how the infra is actually set up.


> Other issues are due to misbehaving bad hardware that need to be identified and removed from operation.

> We are actively working on addressing those limitations this quarter.

This always boggles the mind, but I've seen similar several times in the past on different HPC clusters. Hardware bugs that you just cannot seem to shake down, that are triggered just often enough to be a problem but seldom enough to be "impossible" to debug.


Maybe someone was Screaming in the Datacenter again, disturbing nearby spinning Disk Drives...


That sounds like almost every dodgy disk drive I've encountered, and those clusters can have hundreds of them.


Azure strikes again


Welcome to the shitshow that is Azure!



