
This is very interesting. Looking at the data, though, it is labeled "Errors per server". It isn't disclosed which variables go into this figure, but I'd imagine that adding more information than simple error counts would improve the separability of the data?

It would certainly help separability to observe the data in a higher-dimensional space. However, when you're taking automated action on the results, it's sometimes pertinent to know which specific metric is causing the server to be an outlier.

For example, if network TCP retransmits are throwing it off, we probably just want the system to kill the server and let the autoscaling group bring up another one. If it's memory usage, we probably want to page someone.
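
To make that concrete, here is a rough sketch of per-metric outlier detection with a different action per metric. Everything in it is an assumption for illustration: the metric names, the `ACTIONS` mapping, the sample fleet, and the choice of a median/MAD modified z-score with the commonly used 3.5 cutoff. It is not the method from the article, just one way to keep the "which metric?" answer attached to each outlier.

```python
from statistics import median

# Hypothetical metric -> action mapping, following the two examples above.
ACTIONS = {
    "tcp_retransmits": "terminate",  # let the autoscaling group replace it
    "memory_usage": "page",          # a human should take a look
}


def modified_z_scores(values):
    """Modified z-scores using the median and the median absolute deviation."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # No spread at all: treat nothing as an outlier.
        return [0.0 for _ in values]
    return [0.6745 * (v - med) / mad for v in values]


def outlier_actions(fleet, threshold=3.5):
    """fleet is {server: {metric: value}}.

    Each metric is scored on its own, so every flagged server comes back
    with the metric that flagged it and the action tied to that metric.
    """
    servers = sorted(fleet)
    result = {}
    for metric, action in ACTIONS.items():
        scores = modified_z_scores([fleet[s][metric] for s in servers])
        for server, score in zip(servers, scores):
            if abs(score) > threshold:
                result.setdefault(server, []).append((metric, action))
    return result


# Made-up sample data, purely for illustration.
fleet = {
    "web-1": {"tcp_retransmits": 12, "memory_usage": 0.61},
    "web-2": {"tcp_retransmits": 15, "memory_usage": 0.58},
    "web-3": {"tcp_retransmits": 11, "memory_usage": 0.63},
    "web-4": {"tcp_retransmits": 240, "memory_usage": 0.60},
    "web-5": {"tcp_retransmits": 14, "memory_usage": 0.97},
}

for server, hits in outlier_actions(fleet).items():
    print(server, hits)
```

The trade-off is the one already mentioned: scoring each metric separately gives up whatever separability you would gain from looking at all metrics jointly, but the result stays easy to act on automatically.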

Yes, and that is one reason non-linear clustering techniques can be a challenge when an investigation has to follow outlier detection.
