
MacroBase: Prioritizing Attention in Fast Data - wackspurt
http://macrobase.stanford.edu
======
wackspurt
MacroBase's pipeline is broken up into the following operators: Transform,
Classify, Explain.

I find the Explain operator very valuable and haven't seen anything like it
in other monitoring/anomaly detection work (correct me if I'm wrong). It
finds attribute value combinations that are disproportionately concentrated in
the outlier datapoints. The classifier decides which points are outliers by
looking at the metric value(s). It can be based on static thresholds,
percentiles, unsupervised/supervised learning algorithms, etc.
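
To make the idea concrete, here is a minimal sketch of the classify-then-explain step. It is not MacroBase's code: the function names (`classify_static`, `explain`) are hypothetical, the classifier is just a static threshold, and scoring combinations with a relative risk ratio is my assumption about one reasonable way to measure "disproportionately concentrated".

```python
from collections import Counter
from itertools import combinations


def classify_static(rows, metric, threshold):
    """Hypothetical classifier: flag rows whose metric exceeds a static threshold."""
    return [row[metric] > threshold for row in rows]


def explain(rows, is_outlier, attributes, max_order=2):
    """Rank attribute-value combinations by relative risk ratio (assumed metric).

    ratio = P(outlier | combination present) / P(outlier | combination absent)
    """
    total_out = sum(is_outlier)
    total_in = len(rows) - total_out
    out_counts, in_counts = Counter(), Counter()

    # Count every combination of up to max_order attribute values,
    # separately for outliers and inliers.
    for row, flagged in zip(rows, is_outlier):
        items = [(attr, row[attr]) for attr in attributes]
        for k in range(1, max_order + 1):
            for combo in combinations(items, k):
                (out_counts if flagged else in_counts)[combo] += 1

    results = []
    for combo, with_out in out_counts.items():
        with_in = in_counts[combo]
        without_out = total_out - with_out
        without_total = without_out + (total_in - with_in)
        rate_with = with_out / (with_out + with_in)
        rate_without = without_out / without_total if without_total else 0.0
        # If no outliers exist outside the combination, the ratio is unbounded.
        ratio = rate_with / rate_without if rate_without else float("inf")
        results.append((combo, ratio))

    results.sort(key=lambda pair: pair[1], reverse=True)
    return results
```

Fed rows with `device`, `version` and `power` fields, this puts the device/version pair that holds most of the high-power readings at the top of the list. A real system would also need a minimum-support cutoff and a streaming-friendly way to count, both of which this sketch skips.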

Example: "A mobile application manufacturer issues a MacroBase query to
monitor power drain readings (i.e., metrics) across devices and application
versions (i.e., attributes). MacroBase's default operator pipeline reports
that devices of type B264 running application version 2.26.3 are sixty times
more likely to experience abnormally high power drain than the rest of the
stream, indicating a potential problem with the interaction between devices of
type B264 and application version 2.26.3".

I think this sort of engine/system is valuable for detecting and isolating
anomalies in telemetry data.

------
fouc
> [a solution for the] relative scarcity of human attention and overabundance
> of data: return fewer results, prioritize iterative analysis, and filter
> fast to compute less.

> By combining streaming operators for feature transformation, classification,
> and data summarization, MacroBase provides users with interpretable
> explanations of key behaviors, acting as a search engine for fast data.

------
nycdatasci
Is there anything to this platform beyond the two new SQL operators mentioned
in the docs?
https://macrobase.stanford.edu/docs/sql/docs/

