

Ask PG: And we're back...What happened? - aston

The outage, I mean.
======
pg
There's a problem in the server software. When the load gets high, it fails
catastrophically instead of gradually. Robert and Patrick Collison are
investigating, but they're still not sure what the problem is. My guess from
the external evidence is that it's related to garbage collection.

Killing the server process fixes the problem, at least for a day or two.

~~~
ericb
Interesting. Are they investigating by load testing a dev environment, or some
other method?

~~~
pg
Patrick managed to reproduce the problem by writing something to replay server
logs, but he's not sure yet what's causing it.
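
For readers curious what a log replayer might look like, here is a minimal
sketch in Python. It is not Patrick's tool. The log format (Common Log Format),
the function names, and the `send` callback are all assumptions made for
illustration, since the thread does not say how the real one works.

```python
import re
import time
from datetime import datetime

# Assumption: logs are in Common Log Format. The real server's format is unknown.
LOG_RE = re.compile(
    r'\S+ \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+)[^"]*"'
)

def parse_line(line):
    """Return (timestamp, method, path), or None if the line does not match."""
    m = LOG_RE.match(line)
    if m is None:
        return None
    ts = datetime.strptime(m.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
    return ts, m.group("method"), m.group("path")

def schedule(lines, speedup=1.0):
    """Return (offset_seconds, method, path) tuples, offsets measured from
    the first request and divided by `speedup`."""
    entries = [e for e in map(parse_line, lines) if e is not None]
    if not entries:
        return []
    entries.sort(key=lambda e: e[0])
    start = entries[0][0]
    return [
        ((ts - start).total_seconds() / speedup, method, path)
        for ts, method, path in entries
    ]

def replay(lines, send, speedup=1.0, sleep=time.sleep):
    """Call send(method, path) for each logged request, sleeping between
    calls to keep the original spacing (scaled by `speedup`).
    `send` is a hypothetical callback that issues the actual request."""
    previous = 0.0
    for offset, method, path in schedule(lines, speedup):
        sleep(max(0.0, offset - previous))
        previous = offset
        send(method, path)
```

Raising `speedup` compresses the gaps between requests, which is one way to
push the load up until the failure appears. The sketch paces requests by the
logged offsets only and does not model concurrent connections.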

~~~
wfarr
Interesting.

I'm assuming it's traffic-related, because I've had ASV running on my
server for weeks on end without any real issue (I don't get much in the way
of traffic).

