
Ask HN: What is a simple tool for parsing and analyzing log files? - thojest
Hi there, 
I have a small app (frontend, backend, database) with <100 users. I would like to use my log files to analyze basic metrics, for example
 - number of requests
 - number of comments created
 - number of shared links
per unit of time.

All of this runs on one single server, no big infrastructure, no big data. I am unable to find a SIMPLE tool to collect my logs in some way, parse them, and visualize them, in histograms for example.

I know the ELK stack, Splunk, Graylog and many others exist, but all these solutions are much too complex. In particular, I do not want to spend weeks setting them up correctly. Furthermore, most of these solutions need an extra server for aggregating the log data in some time-series DB.

I would be very happy if you know of any open-source tool which can do this job.
======
baccredited
I've had luck with [https://goaccess.io/](https://goaccess.io/)

example:

    goaccess logfile.log -o report.html --log-format=COMBINED

~~~
thojest
Now this looks really interesting and could be exactly what I was looking for.
Many thx. Will check it out now.

------
tekronis
Maybe check out LNAV: [http://lnav.org/](http://lnav.org/)

There's also angle-grinder, which has fewer features but is also pretty useful:
[https://github.com/rcoh/angle-grinder](https://github.com/rcoh/angle-grinder)

------
gmuslera
Grafana's Loki may be lighter weight than the other examples you gave above.

For some kinds of logs there are tools for summarization and reports (like
awstats for web or pflogsumm for mail servers).

And, of course, for particular queries on existing logs the standard text
tools in a linux box let you generate a lot of info.
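For the per-unit-of-time request counts the OP asks about, a short pipeline is often all it takes. A minimal sketch, assuming an nginx/Apache combined-format access log — the file name and sample lines below are made up for illustration:

```shell
# Fabricate a few combined-format access-log lines for illustration.
printf '%s\n' \
  '1.2.3.4 - - [10/Oct/2023:13:55:36 +0000] "POST /comments HTTP/1.1" 201 512' \
  '1.2.3.4 - - [10/Oct/2023:13:59:01 +0000] "GET / HTTP/1.1" 200 128' \
  '5.6.7.8 - - [10/Oct/2023:14:00:02 +0000] "GET / HTTP/1.1" 200 128' > sample.log

# Field 4 is "[day/Mon/year:HH:MM:SS"; keep day and hour (dropping the
# leading "["), then count identical buckets: requests per hour.
awk '{ split($4, t, ":"); print substr(t[1], 2) ":" t[2] }' sample.log | sort | uniq -c
```

The same idea with a `grep 'POST /comments'` stage in front would give "comments created per hour" instead of total requests.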

~~~
thojest
Grafana Loki was the one I really tried out. To be honest, I had the
impression that it is still a bit alpha. You can work around a few
shortcomings by reading from Loki as a Prometheus endpoint, but I saw a few
strange things in the data which I could not confirm when double-checking the
raw logs.

While it is true that the server footprint of Promtail for collecting and
pushing logs is much smaller, you still have to set up your Loki server for
data aggregation. I spent nearly a week on the setup of Promtail, Loki and
Grafana and wasn't quite satisfied with the stability or the end result. Of
course this could be due to my limited experience with the log query
language, time-series DBs, Prometheus, ... But overall I had the impression
that what they aim for is quite similar to an optimized ELK stack.

------
runjake
For web logs, I still use Webalizer. For everything else, as long as we are
not talking tens of gigs, I’ll be using some mix of Perl, Python, Shell, Awk,
etc.
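Since the OP also wants histograms: once any of these languages has produced "count bucket" pairs (the shape `uniq -c` emits), a rough terminal bar chart is one awk one-liner away. A sketch with made-up counts:

```shell
# "count label" pairs on stdin -> label left-aligned, one '#' per count.
printf '%s\n' '12 10/Oct/2023:13' '4 10/Oct/2023:14' '7 10/Oct/2023:15' |
awk '{ bar = ""; for (i = 0; i < $1; i++) bar = bar "#"; printf "%-16s %s\n", $2, bar }'
```

Crude, but for <100 users it answers "when are my request spikes?" without standing up any server.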

------
kjs3
Syslog + something to massage the data (sh/awk/sed, perl, python)? Like we've
been doing for simple apps for decades?

~~~
wdroz
It's even simpler today with systemd; you can write stuff like:

    journalctl -u nginx --since yesterday

~~~
kjs3
That's not simpler than 'cat /var/log/messages' or the equivalent, if for no
other reason than you still need something to digest/summarize the output.
It's just different, mostly for the sake of being different, but I'll not bait
the systemd fanbois any further.

~~~
wdroz
It's simpler in the sense that journalctl already provides some filter options
that you no longer need to build yourself.

A lot of people use an OS with systemd, but only a small fraction know about
journalctl. I took this opportunity to spread the word.
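A couple of concrete examples of those built-in filters (the unit names are just examples, and this assumes the services log to the systemd journal):

```shell
journalctl -u nginx --since "2023-10-10" --until "2023-10-11"  # one unit, one day
journalctl -u myapp -p err                                     # priority err and worse
journalctl -u nginx -o json --since yesterday                  # JSON output for scripting
```

`-o json` is handy when you want to pipe entries into jq or a small script for counting instead of eyeballing the output.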

~~~
kjs3
The bastard in me wants to say "once again systemd and its ecosystem reinvent
something that the Unix Way already solved decades ago, just in a different
way".

But in reality, thanks for the tip. I used your example today to look into an
issue where I'd otherwise have been cussing "where are the damn logs!".

------
PaulHoule
pandas?

