Show HN: Memory-Based Anomaly Detection in Multi-Aspect Streams (github.com/stream-ad)
36 points by siddharthb_ 10 days ago | 3 comments

MemStream detects anomalies in a multi-aspect data stream and outputs an anomaly score for each record. It is a memory-augmented feature extractor that allows for quick retraining, gives a theoretical bound on the memory size needed for effective drift handling, is robust to memory poisoning, and outperforms 11 streaming anomaly detection baselines. A preprint of the paper is at https://arxiv.org/pdf/2106.03837.pdf

Very interesting. We were curious about the data prep for KDDCUP99. Since one might expect "normal" traffic to be more heterogeneous than attack traffic (which must actually implement an attack), switching the classes like this might overstate accuracy. Further, one could imagine a network where the majority of traffic is "anomalous", such as during a DDoS, where only time-series data would be sufficient to detect the anomaly. Since KDDCUP99 does not contain time-series data, could you explain how you would learn concept drift on this dataset?

Thanks! Great questions. Existing papers switched the classes for KDDCUP99, so we followed them for a fair comparison. We also use UNSW and DoS, which are time-series-based datasets. Once the encoder has been trained on a few normal samples and used to initialise the memory, MemStream will be able to handle anomalous samples even if they are in the majority. Retraining can also be done at a later point if required.
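
For readers who want a concrete picture of the "train on a few normal samples, initialise the memory, then score the stream" idea, here is a rough sketch. It is not the MemStream code. The class name `MemoryScorer`, the parameter `beta`, the standardising stand-in for the encoder, the nearest-neighbour L1 score, and the thresholded FIFO update are all assumptions made for illustration.

```python
import numpy as np


class MemoryScorer:
    """Toy memory-based streaming anomaly scorer (illustrative only)."""

    def __init__(self, warmup, beta):
        warmup = np.asarray(warmup, dtype=float)
        # Stand-in "encoder": standardise using the normal warm-up samples.
        # (Assumption: a real system would train a learned encoder here.)
        self.mean = warmup.mean(axis=0)
        self.std = warmup.std(axis=0) + 1e-8
        # The memory starts out as the encoded warm-up samples.
        self.memory = self.encode(warmup)
        self.beta = beta      # hypothetical update threshold
        self.next_slot = 0    # FIFO position to overwrite next

    def encode(self, x):
        return (np.asarray(x, dtype=float) - self.mean) / self.std

    def score(self, record):
        z = self.encode(record)
        # Anomaly score: L1 distance to the nearest memory entry.
        dists = np.abs(self.memory - z).sum(axis=1)
        s = float(dists.min())
        # Only low-scoring records overwrite memory, so a burst of
        # anomalous records cannot replace the stored normal behaviour.
        if s < self.beta:
            self.memory[self.next_slot] = z
            self.next_slot = (self.next_slot + 1) % len(self.memory)
        return s


rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, size=(64, 3))   # a few "normal" records
scorer = MemoryScorer(normal, beta=1.5)

low = scorer.score([0.1, -0.2, 0.0])    # close to the warm-up data
high = scorer.score([8.0, 8.0, 8.0])    # far from everything in memory
print(low, high)
```

In this sketch the memory only absorbs records that already look normal, which is one simple way to keep slow drift trackable while a majority-anomalous stream leaves the memory unchanged.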
