
Show HN: Log File Anomaly Detector - stochastimus
https://www.zebrium.com/anom-detector
======
stochastimus
Hi! I'm Larry, founder and CTO at Zebrium. I used to build bespoke machine
data analytics platforms for software and appliance vendors. These systems
were really useful, but it took a lot of work to make and maintain them. So I
took a couple of years to myself to learn, apply, and develop the ML chops I
needed to build such systems automatically. That's what Zebrium is all about:
applying a range of techniques at different scales to achieve near-100%
structured log data coverage, with no supervision or pre-structuring, from as
little as 100K of real-world log data.

Our software builds a catalog of event types and parameters for each of a set
of log types (for us, a "stack" corresponds to a set of log types). With this
context, it finds "anomalies" in the logs, using a model built on a set
of features. These features include simple things like severity and first/rare
occurrence, and complex things like change in rate/periodicity, cross-event-
type and cross-stream correlation, NLP topic, and timeseries features. We've
trained our model on data from a few dozen stacks, including dozens of real
and interesting "anomalies" the operators would have liked to see uncovered.
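
As a rough illustration of the simplest feature named above (first/rare
occurrence of an event type), here is a minimal Python sketch. It is not
Zebrium's model: the `event_type` helper is a hypothetical stand-in for a
learned event catalog that just masks numbers and hex values, and the
`max_count` threshold and sample log lines are made up for the example.

```python
import re
from collections import Counter


def event_type(line: str) -> str:
    """Hypothetical stand-in for an event catalog: mask variable-looking tokens."""
    line = re.sub(r"0x[0-9a-fA-F]+", "<HEX>", line)  # mask hex values first
    line = re.sub(r"\d+", "<NUM>", line)             # then mask plain numbers
    return line.strip()


def rare_events(lines, max_count=1):
    """Return (line_number, line) pairs whose event type occurs at most max_count times."""
    types = [event_type(line) for line in lines]
    counts = Counter(types)
    return [
        (number, line)
        for number, (line, kind) in enumerate(zip(lines, types), start=1)
        if counts[kind] <= max_count
    ]


# Invented sample data: four routine events and one that appears only once.
sample = [
    "INFO worker 12 finished job 881 in 35 ms",
    "INFO worker 7 finished job 882 in 41 ms",
    "INFO worker 12 finished job 883 in 29 ms",
    "ERROR worker 7 lost connection to db at 0x7f3a",
    "INFO worker 3 finished job 884 in 33 ms",
]

flagged = rare_events(sample)
for number, line in flagged:
    print(f"line {number}: {line}")
```

On this sample only the ERROR line is flagged, because the four INFO lines
collapse to one event type. A real system would combine a signal like this
with the other features listed (rate changes, correlation, and so on).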

Our software seems to spit out "good stuff" most of the time, and tends not to
want to "ring the pager" when nothing breaks. I think it's become quite useful
- but I'm biased. :) We want to continue to make it better, so we want to get
feedback on what happens when people submit log files with interesting
problems that could have been spotted within the logs. Let us know what we get
right and wrong.

Our SaaS service is in private beta, but anyone can try our log anomaly
detection. You can upload up to 5 logs at once (more data and related files
improve accuracy) and get a report listing your anomalies, the reasons for the
anomalies, and a visualization of the event patterns within your logs. The
report is sent to you by email (this is why we ask for your email address).
The service is free and you can use it as many times as you want (limit of
500MB of logs per use). Please try it (www.zebrium.com/anom-detector) and let us
know what you think.

------
bradknowles
Interesting concept, but I’m not going to trust a tool from an unknown
provider with my potentially sensitive data.

Give me a tool that I can run locally, and I might be interested.

~~~
Ajs1
Co-founder here. I understand your concern and perspective, Brad. In the near
term this is a SaaS offering, so the best we can do to address these concerns
is to offer strong security controls (a short list on our site, longer list on
request) backed up by external certification (underway), plus the credibility
and track record of the team behind it.

~~~
eps
I’m going to echo the GP - this does look like interesting tech, but it
must work in a privacy-preserving fashion. Logs contain loads of sensitive
info, _especially_ if they are of a verbose/debug type, which is what will
probably be used to capture failures. But in SaaS form, where it wants raw
logs, it’s a no-go even for eval purposes.

~~~
gdcohen
Gavin from Zebrium here. Again, we understand this concern. All data is
encrypted in flight and at rest, and we have a lot of security controls in
place (see our website). We also have the option of a dedicated VPC assigned
to a single customer. Beyond this, we have a unique capability: since our
machine learning structures and types everything, we have a feature that lets
us hash (or delete) any sensitive information in logs (and we can match on it
to find other places this info occurs that you might not even be aware of).
Not dodging the fact that we're SaaS, just pointing out what we do.
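
To make the hashing idea concrete, here is a minimal Python sketch of one way
it could work. This is an assumption-laden illustration, not Zebrium's
feature: the regex patterns, the `pseudonymize` function, and the demo key are
all hypothetical. It replaces each sensitive value with a keyed hash (HMAC), so
the same value always maps to the same token and can still be matched across
lines without exposing the original.

```python
import hashlib
import hmac
import re

# Hypothetical patterns for two kinds of sensitive values; real logs need more.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def pseudonymize(line: str, key: bytes) -> str:
    """Replace sensitive values with stable keyed-hash tokens."""

    def make_replacer(kind: str):
        def replace(match: re.Match) -> str:
            digest = hmac.new(
                key, match.group(0).encode("utf-8"), hashlib.sha256
            ).hexdigest()
            # Truncated for readability; this shortens the token, not the key.
            return f"<{kind}:{digest[:12]}>"

        return replace

    for kind, pattern in PATTERNS.items():
        line = pattern.sub(make_replacer(kind), line)
    return line


key = b"demo-key-not-for-production"  # assumption: a secret held by the log owner
first = pseudonymize("login ok user=alice@example.com ip=10.0.0.5", key)
second = pseudonymize("password reset for alice@example.com", key)
print(first)
print(second)
```

Both output lines carry the same `<email:...>` token, which is what makes it
possible to find other places a value occurs after it has been hashed. Using a
keyed hash rather than a bare SHA-256 makes it harder to recover values by
hashing guesses, as long as the key stays secret.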

