
The data looks like it could lend itself to an approach where you model the error rate based on the prior data (at simplest, get a mean and variance out of it) and then use a chi-square critical-range check to see whether the last n measurements (n being the degrees of freedom in the check) are likely to have come from the modelled distribution. Is that something you've considered?

Hey there,

We have considered modeling the distribution from which the data is typically drawn and then calculating the likelihood of newly observed data. Some of the approaches we use to detect anomalies in stream starts per second (SPS) now depend on these services. Same software package, slightly different solution.
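The "model the typical distribution, then score new observations by likelihood" idea can be sketched like this, assuming a Normal fit; the function name is illustrative, not Netflix's actual implementation:

```python
import math
import statistics

def log_likelihood(point, prior):
    """Log-density of `point` under a Normal distribution fitted to
    `prior`. Unusually low values suggest the new observation is
    unlikely to have come from the typical distribution."""
    mu = statistics.fmean(prior)
    var = statistics.variance(prior)
    return -0.5 * math.log(2 * math.pi * var) - (point - mu) ** 2 / (2 * var)
```

In practice you would threshold this score (or the equivalent tail probability) to decide when to alert.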

A colleague of mine (Chris) implemented a data tagger that lets users annotate the data that is typically fed into this system. We plan to have the backend automatically swap out the algorithm based on its performance against that tagged data.

We've written about SPS here: http://techblog.netflix.com/2015/02/sps-pulse-of-netflix-str...
