
A Benchmark for Streaming Anomaly Detection - gk1
https://blog.dominodatalab.com/numenta-anomaly-benchmark/?r=1
======
acobster
I highly recommend Jeff Hawkins' (the founder of Numenta) book _On
Intelligence_ to anyone interested in anomaly detection and AI in general.
Although HTMs haven't been as successful as neural networks, his concept of
what intelligence _is_ is very lucid and well-reasoned. Even if he turns out
to be wrong about the technology side of things, I'm happy to have learned a
thing or two about the neocortex.

------
graycat
Okay, I used to work in _anomaly detection_ , right, for more important server
farms and networks, in particular for real-time, _zero day_ detection, that
is, detecting problems, _anomalies_ , never seen before.

The OP has 6 "ideal characteristics of a real-time anomaly detection
algorithm". My work seems to do relatively well on the criteria.

A challenging criterion is the 6th one:

"The algorithm should minimize false positives and false negatives."

So, no false positives and no false negatives would be a _perfect_ detector,
and in practice that is rarely possible.

A more realistic formulation of the criterion is that for a given rate of
false positives (false alarms) achieve the lowest rate possible for false
negatives (missed detections of actual anomalies).

How to do this is in the now classic Neyman-Pearson result. Right, in some
cases the computations amount to a knapsack problem, which is, IIRC,
NP-complete.
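
For reference, the textbook statement of the Neyman-Pearson result, for a
simple null versus a simple alternative (the f_0, f_1, and k here are just the
usual notation, not anything specific to my work):

```latex
% Neyman-Pearson lemma: for a simple null density f_0 versus a simple
% alternative f_1, the most powerful test at false alarm rate alpha rejects
% the null when the likelihood ratio exceeds a threshold chosen to hit alpha.
\[
  \text{reject } H_0
  \iff
  \Lambda(x) \;=\; \frac{f_1(x)}{f_0(x)} \;>\; k,
  \qquad
  \Pr_{f_0}\!\bigl(\Lambda(X) > k\bigr) \;=\; \alpha .
\]
```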

Really, another important criterion is to have the false alarm rate
adjustable, that is, to be able to set the false alarm rate in advance and get
that rate exactly in practice.

Yup, that's possible.

Why? Really, anomaly detection is nearly necessarily some continually applied
statistical hypothesis test where the _null hypothesis_
is that the system being monitored, that is, the source of the data, is
_healthy_ and the data _normal_ and not an anomaly or a symptom of a problem.

So, we are into statistical hypothesis testing, and that is the context of the
classic Neyman-Pearson result.

The literature on statistical hypothesis testing goes back to at least the
Neyman-Pearson result before 1950; the field is old with a lot known.

In that field, it is usual to be able to set the rate of false alarms ( _type
I error_ ) in advance and get that rate exactly in practice.
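
As a quick sketch of one standard way to do that (just an illustration in
Python, not the tests in my work): take a sample of the test statistic
collected while the system is known to be healthy, and use an order statistic
of that sample as the detection threshold. Under exchangeability, the chance
that a new healthy point exceeds the k-th largest of n healthy points is
exactly k/(n+1), so the false alarm rate can be dialed in up front.

```python
import numpy as np

def threshold_for_alpha(healthy_stats, alpha):
    """Threshold so that P(new statistic > threshold | healthy) is about alpha."""
    s = np.sort(np.asarray(healthy_stats))
    n = len(s)
    # The k-th largest healthy value gives a false alarm probability of
    # k / (n + 1) under exchangeability; choose k to match alpha.
    k = max(1, int(round(alpha * (n + 1))))
    return s[n - k]

rng = np.random.default_rng(0)
healthy = rng.exponential(size=2000)       # stand-in for statistics from a healthy system
thr = threshold_for_alpha(healthy, alpha=0.01)

# Check the false alarm rate on fresh healthy data: should be close to 0.01.
fresh = rng.exponential(size=100_000)
print("false alarm rate:", np.mean(fresh > thr))
```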

For monitoring server farms and networks, there are two additional criteria
that are just crucial -- right, in particular for getting a low rate of missed
detections (false negatives, _type II errors_ ) for a given rate of false
positives:

(1) Distribution-free.

The easy way to do the calculations for a hypothesis test is to assume that we
know the probability distribution of the input data when the null hypothesis
holds. There, sure, the most common distribution assumed is the Gaussian.

But there are also statistical hypothesis tests that are _distribution-free_
(also _non-parametric_ , e.g., not using the _parameters_ of the Gaussian
distribution) that is, that make no assumptions about the probability
distribution of the data when the null hypothesis holds.

For monitoring server farms and networks, that distribution is too often
unknown, so we should be using distribution-free tests.
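
E.g., a common off-the-shelf distribution-free test is a rank test such as
Mann-Whitney; here is a small sketch (again just an illustration, not my
tests), comparing a recent window against healthy history without assuming
anything about the shape of the healthy distribution:

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(1)
healthy = rng.lognormal(mean=0.0, sigma=1.0, size=500)   # heavy-tailed, far from Gaussian
recent  = rng.lognormal(mean=0.4, sigma=1.0, size=100)   # shifted upward: a problem

# Rank-based test: uses only the ranks of the observations, no distributional
# assumption about the healthy data.
stat, p = mannwhitneyu(recent, healthy, alternative="greater")
print(f"U = {stat:.0f}, p = {p:.4g}")   # small p => raise an alarm
```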

(2) Multi-dimensional

The data readily available from monitoring server farms and networks is
horrendously large, like flowing oceans of data.

So, it's easy to get data on each of dozens of variables at rates from one
_point_ every few seconds up to hundreds of points a second.

Treating such _jointly_ multi-dimensional data one variable at a time must
typically throw away a huge fraction of the relevant information. So, the data
should be handled _jointly_. So, we need statistical hypothesis tests that are
multi-dimensional, and those are not so easy to find in the literature.

So, the work I did for detecting zero-day problems in server farms and
networks was to create a large collection of statistical hypothesis tests that
are both distribution-free and multi-dimensional.

E.g., if the data consists of a time series of points on a checkerboard and
_normal_ data is on the red squares, then my work will detect, with known and
adjustable false alarm rate achieved exactly in practice, points on black
squares. In this case, treating the data on each of two perpendicular edges
(axes) of the checkerboard separately yields exactly nothing. So, this example
illustrates the extra _power_ of being multi-dimensional. Also the
distribution of the data when the null hypothesis holds is just the red
squares, that is, far from anything common in statistics.
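
Here is a toy version of that checkerboard example in Python (only to make the
point concrete, not my actual tests): each coordinate alone is uniform for
both the red-square and black-square points, so one-dimensional tests see
nothing, while a joint statistic separates them perfectly.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(2)

def sample_squares(n, color):
    """Points uniform over the squares of one color on an 8x8 board."""
    pts = []
    while len(pts) < n:
        x, y = rng.uniform(0, 8, size=2)
        on_red = (int(x) + int(y)) % 2 == 0
        if on_red == (color == "red"):
            pts.append((x, y))
    return np.array(pts)

normal  = sample_squares(2000, "red")    # healthy data lives on red squares
anomaly = sample_squares(2000, "black")  # anomalous data lives on black squares

# One variable at a time: each marginal is uniform on [0, 8) in both samples,
# so a one-dimensional test sees nothing (large p-values).
print("x marginal p =", ks_2samp(normal[:, 0], anomaly[:, 0]).pvalue)
print("y marginal p =", ks_2samp(normal[:, 1], anomaly[:, 1]).pvalue)

# Jointly: the color of the square a point lands on separates them perfectly.
def on_black(pts):
    return (pts[:, 0].astype(int) + pts[:, 1].astype(int)) % 2 == 1

print("healthy points flagged  :", np.mean(on_black(normal)))
print("anomalous points flagged:", np.mean(on_black(anomaly)))
```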

Moreover, one can argue that there are some good _qualitative_ reasons to
believe that, really, a good detector for zero-day problems in server farms
and networks should be able to _know_ when it is in a fractal such as the
Mandelbrot set and when not. Yes, my work does that.

But, I have to say, I did my work -- at IBM's Watson lab to improve on our
approaches via AI -- and published it decades ago, and since then I've seen
essentially no interest at all in the work.

For more, for years I jerked the chains of a huge fraction of all the venture
capital firms in the US with essentially no interest.

My conclusion is that good work in anomaly detection and a dime won't cover a
ten cent cup of coffee.

But, if by now anyone really is interested, then the remarks here should help.

~~~
d215
I would be really interested to read the publication you did on this. We are
trying to do anomaly detection from a large stream of traceroutes coming in
from approx. 10,000 locations continuously over the internet.

~~~
whatwasmypwd
I would also like to read this.

~~~
graycat
As in my

[https://news.ycombinator.com/user?id=graycat](https://news.ycombinator.com/user?id=graycat)

here, somehow let me have an e-mail address.

~~~
d215
I left my email address in the `about` field of my hn profile.

~~~
graycat
Now you should have both the reference to the paper and a PDF of the paper.

Thanks for your interest. Hope the paper is useful.

My posts in this thread should help in reading the paper.

