
Show HN: Top, but for Nginx - squiguy7
https://github.com/gsquire/topngx
======
maxmalysh
Monitoring capabilities are missing from Nginx on purpose. They are not and
will never be available for free because there is "NGINX Plus".

This is why I recommend switching to HAProxy.

~~~
heipei
I'd love to just "switch to X", but there is no X which provides all of the
above in one great package: Static file serving, load-balanced proxying
(TCP/HTTP), fine-grained caching, automatic Let's Encrypt update, API-based
configuration (for dynamic upstreams etc), monitoring. Maybe there shouldn't
be such a tool. For all other use-cases I go with nginx since it at least
provides decent proxying, caching and static file serving.

~~~
brightball
Correct me if I’m wrong, but doesn’t Caddy 2 do almost all of that?

~~~
411111111111111
You mean the software which wouldn't start when Let's Encrypt's ACME server was
offline, and whose developers said this was working as intended?

I mean, I'd definitely encourage people to use it for hobby projects, but if
that's how the developers see their software, I would never trust them with
anything serious.

~~~
mholt
Someone's a little out of the loop.

~~~
411111111111111
I know it was "fixed" after thousands of people chimed in.

Nonetheless, I still wouldn't be able to trust developers who think that's
reasonable.

If it had been an error and unintentional, I wouldn't have been worried.
Mistakes happen to everyone. But it was an actual design decision. Without
serious code review I'd be too worried the developers had other bright
ideas.

~~~
likeclockwork
You're responding to Caddy's author.

------
gregoriol
Maybe not as lightweight, but GoAccess
([https://github.com/allinurl/goaccess](https://github.com/allinurl/goaccess))
does an awesome job at parsing the logs and displaying statistics, works for
nginx and other webservers too

~~~
guanzo
GoAccess used to work perfectly, but recently whenever I try to run the real
time HTML command, it exits without any error messages after ~2 million
records. Maybe it's out of memory... any ideas?

~~~
joshyi
Doubt very much that ~2M will be a memory issue (unless you've got less than
~130MB).
[https://goaccess.io/faq#performance](https://goaccess.io/faq#performance)

We're running v1.4 in production and it has been working pretty nicely for us.

------
jrumbut
This is the tool I've wanted (and half written 3-4 times) my whole career.
From reading the github it looks lightweight, not a big infrastructure
addition, and that it helps you figure out wtf is going on with the web
server.

Regarding the branding, for me top is a real-time tool rather than a logging
tool. I was picturing something that may have been more useful for older style
Apache httpd installs where you have several virtual hosts on a server and
you'd want to know who is hogging the resources or causing the problems.

~~~
jluxenberg
Pro tip: you can make any command into a real-time one with `watch`:

    watch bash -c "topngx < /path/to/access.log"

Will run `topngx` against access.log every two seconds and display the output.

~~~
gregoriol
That's not "real time"! And definitely won't behave well if the processing
takes more than 2 seconds (imagine log files of many millions of rows).

~~~
cyberpunk
It'll just wait two seconds after the command returns, not spawn one every two
seconds.
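
To make the difference concrete, here is a minimal Python sketch of the "run, then sleep" loop described above. It is not `watch`'s actual source. The function name `watch`, its parameters, and the injectable `run` and `sleep` hooks are all made up for illustration, and it assumes `watch` in its default (non-precise) mode.

```python
import subprocess
import time


def watch(command, interval=2.0, iterations=3, run=subprocess.run, sleep=time.sleep):
    """Run `command`, then sleep `interval` seconds, `iterations` times.

    Hypothetical helper: the sleep only starts after the command has
    returned, so a slow command delays the next run instead of overlapping it.
    """
    events = []
    for _ in range(iterations):
        run(command, shell=True)   # blocks until the command finishes
        events.append("run")
        sleep(interval)            # the interval is counted from here
        events.append("sleep")
    return events
```

With this shape, a command that takes 10 seconds produces one refresh roughly every 12 seconds, never several runs piling up.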

------
randomstring
My last company had something like that and included response time percentiles
(50th, 90th, 95th, 99th) and we had these values graphed and displayed on a
big screen in our office. Along with a ton of other performance stats: queries
per second, various measures of system load, etc.

Averages can lie, especially when something like an empty query can take close
to zero time compared to a non-trivial transaction. If some robot or other
artifact of your site is generating some number of null queries, that will
make your average response time look better than it actually is. Percentiles,
particularly on the tail of 90th or above, tell a better story of how well and
consistently you're responding to traffic under load.
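
A small Python sketch of that effect, using made-up response times (the numbers and variable names are purely illustrative, not data from any real system). It uses the nearest-rank percentile definition and assumes a non-empty sample.

```python
import statistics


def percentile(values, pct):
    """Nearest-rank percentile for an integer pct in 1..100 (non-empty input)."""
    ordered = sorted(values)
    # ceil(pct * n / 100) in integer arithmetic, to avoid float rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Hypothetical response times in milliseconds.
real_requests = [200] * 70 + [900] * 10   # non-trivial transactions
null_queries = [1] * 20                    # near-instant empty queries from a robot
all_requests = real_requests + null_queries

for label, sample in (("real only", real_requests),
                      ("with null queries", all_requests)):
    print(label,
          "mean:", statistics.mean(sample),
          "p95:", percentile(sample, 95),
          "p99:", percentile(sample, 99))
```

With these made-up numbers the mean falls from 287.5 ms to 230.2 ms once the null queries are included, while the 95th and 99th percentiles stay at 900 ms.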

~~~
harpratap
How "recent" are your percentiles? I have found that calculating percentiles
is a pretty CPU-heavy task. And if you have a giant Grafana querying every
30s it can stress out your Prometheus/Graphite/whatever. But if you take a small
data size, like the 95th percentile of latencies in the last 2 minutes, it's not
really a very accurate representation either.

And of course there is another problem of correctly storing all your latencies
accurately, which becomes pretty hard if you are using something like
Prometheus.
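
One reason for the accuracy problem is that histogram-based systems keep only per-bucket counts, so a percentile has to be estimated by interpolating inside a bucket. The Python sketch below shows the general idea. It is similar in spirit to what Prometheus does for histograms but is not its implementation. The bucket bounds, function names, and the use of per-bucket (rather than cumulative) counts are assumptions made for illustration.

```python
import bisect

# Hypothetical bucket upper bounds in seconds; a final overflow bucket is implied.
BOUNDS = [0.05, 0.1, 0.25, 0.5, 1.0]


def observe(counts, bounds, value):
    """Count `value` in the first bucket whose upper bound is >= value."""
    counts[bisect.bisect_left(bounds, value)] += 1


def estimate_quantile(counts, bounds, q):
    """Estimate quantile q (0..1) by linear interpolation within a bucket."""
    total = sum(counts)
    if total == 0:
        raise ValueError("no observations")
    target = q * total
    cumulative = 0
    for i, count in enumerate(counts):
        if count > 0 and cumulative + count >= target:
            if i == len(bounds):
                # Overflow bucket: the best we can report is the last finite bound.
                return bounds[-1]
            lower = bounds[i - 1] if i > 0 else 0.0
            return lower + (bounds[i] - lower) * (target - cumulative) / count
        cumulative += count
    return bounds[-1]


counts = [0] * (len(BOUNDS) + 1)
for _ in range(100):
    observe(counts, BOUNDS, 0.3)   # every request really took 0.3 s

estimate = estimate_quantile(counts, BOUNDS, 0.95)
```

Here every request took exactly 0.3 s, yet the estimated 95th percentile comes out near 0.49 s, because all that was stored is "100 requests landed between 0.25 s and 0.5 s".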

------
eterps
I wouldn't mind a screenshot before installing it.

~~~
fb03
I thought the same thing. I was expecting to see some kind of screenshot so I
could have a glimpse of the software.

EDIT:

1# I'm gonna compile it and provide a screenshot via a pull request.

2# Compilation failed because it needed sqlite3 headers and this is not
reported in the app. I'm gonna edit that in the readme too :)

This year (especially with the whole covid thingy) I set a goal to contribute
more to open source. I'm trying to find every little issue I can find and
contribute to :P

~~~
squiguy7
Thanks for submitting a patch! I have been throwing around the idea of using
the "bundled" feature flag listed here:
[https://github.com/rusqlite/rusqlite#optional-features](https://github.com/rusqlite/rusqlite#optional-features).
I was initially hesitant because it would force users to have a specific version
of SQLite.

~~~
fb03
[https://github.com/gsquire/topngx/pull/1](https://github.com/gsquire/topngx/pull/1)

o/

------
jimjag
Hmmm... looks like nothing more than a weblog analyzer. Someone correct me if
I'm wrong. It's not "real time" since it can only report on what the web
server has _done_ not what it is _doing_. AFAIK, nginx has nothing like Apache
httpd's mod_status... at least, nothing open source.
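
For readers wondering what that kind of after-the-fact log analysis looks like, here is a minimal Python sketch that summarizes lines in nginx's default "combined" access log format. It is not how topngx or ngxtop are implemented. The regular expression, the `summarize` function, and the sample lines are assumptions made up for illustration.

```python
import re
from collections import Counter

# Matches nginx's default "combined" log format.
LINE_RE = re.compile(
    r'(?P<remote_addr>\S+) - (?P<remote_user>\S+) \[(?P<time_local>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<body_bytes_sent>\d+) '
    r'"(?P<http_referer>[^"]*)" "(?P<http_user_agent>[^"]*)"'
)


def summarize(lines):
    """Return (requests per status, requests per path, total bytes sent)."""
    by_status = Counter()
    by_path = Counter()
    total_bytes = 0
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        by_status[match["status"]] += 1
        parts = match["request"].split()
        # The request is normally "METHOD PATH PROTOCOL"; fall back to the raw string.
        path = parts[1] if len(parts) >= 2 else match["request"]
        by_path[path] += 1
        total_bytes += int(match["body_bytes_sent"])
    return by_status, by_path, total_bytes


# Made-up sample lines for illustration.
sample = [
    '203.0.113.5 - - [10/Oct/2020:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 612 "-" "curl/7.68.0"',
    '203.0.113.9 - - [10/Oct/2020:13:55:37 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/7.68.0"',
    'not a log line',
]
by_status, by_path, total_bytes = summarize(sample)
```

Everything it reports comes from requests that have already completed and been logged, which is the limitation described above.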

~~~
dijital
If you're in a Kubernetes environment, the NGINX Ingress Controller has a
pretty decent set of realtime metrics:
[https://kubernetes.github.io/ingress-nginx/user-guide/monito...](https://kubernetes.github.io/ingress-nginx/user-guide/monitoring/)

Which presumably means those metrics are available in the OSS edition
somehow...

------
hk__2
> This tool is a rewrite of ngxtop to make it more easily installed and
> hopefully quicker.

Why make a whole new tool with limitations instead of improving the existing
one?

~~~
ccmcarey
Original maintainers of a project probably wouldn't just accept a PR of a
complete rewrite.

------
dheera
Interesting, but I would have thought "top" for nginx would be a tool that
shows you all the connections, paths, and resource usage live, like the "top"
command. Is there a tool that does that?

------
deft
How does this compare to GoAccess? Similar tool that I've used briefly. One
issue I had was how complicated it was; I'm assuming that since this is
nginx-specific, it's simpler.

------
madsmtm
I made something similar in Python [0], but for parsing the error_log
directive. Just for the odd time you need to parse that.

[0] [https://github.com/madsmtm/nginx-error-log](https://github.com/madsmtm/nginx-error-log)

------
superkuh
>a rewrite of ngxtop to make it more easily installed and hopefully quicker.

What world does this guy live in that a program in Rust is easier to get
running on any random machine than a Python script?

~~~
squiguy7
I made a new GitHub release with binaries for Mac and Linux pretty recently.
In this sense, you can just download the binary and get up and running.

