Linux server monitoring tools (aarvik.dk)
248 points by adionditsak on Jan 26, 2014 | 78 comments

These tools aren't limited to Linux; if you're running FreeBSD:

* http://www.freshports.org/sysutils/htop/

* http://www.freshports.org/sysutils/py-glances/

* http://www.freshports.org/sysutils/apachetop/

Some others mentioned in the comments:

* http://www.freshports.org/net-mgmt/iftop/

* http://www.freshports.org/net-mgmt/bwm-ng/

* http://www.freshports.org/sysutils/xsysstats/

* http://www.freshports.org/sysutils/atop/

If you're running OS X / FreeBSD / Solaris, there are many useful DTrace scripts for system monitoring and profiling:


I recently had to test compilation of a project for work on FreeBSD and feel like my eyes were opened.

I went into the task thinking "the BSDs are OLD" yet there must be something great about this operating system. It's beloved in certain communities and used at popular successful startups such as Netflix. There must be something I've been missing that those who cherish it have come to understand.

What I found was the perfect blend of modern and old school. It's not old feeling at all!

It had all of the newest GNU software (and otherwise) that I've come to know and love on other operating systems but also a sense of stability about the core that you don't get with Linux. There was a sense of underlying structure that had actually been planned out instead of discovered over time and it made the whole process of learning about how it worked a pleasure.

In addition, the FreeBSD manual was actually helpful and gave me a sense of completeness rather than "the text in this wiki is just scratching the surface of a complex wrapper for x that used to be y".

It's simple yet powerful, up to date but solid and I'd highly recommend FreeBSD as a result of the experience.

I'm curious how you got all that from what sounds like a small task: getting something to compile on FreeBSD.

You have to install some things, poke around the system, do some general management, etc... I'm not claiming to be an expert but I really liked what I saw especially in the documentation (it looks outdated but the content is absolutely on point)

>it looks outdated but on point

Which is it? It would be difficult for documentation to be both complete and incomplete.

I think he/she meant that the FreeBSD documentation is complete, although the styling of the site is somewhat dated.

In addition to (or even instead of) htop, I strongly recommend atop [0]. This tool has been invaluable to me during a lot of diagnostic sessions.

It can collect a detailed memory usage profile of processes, and when combined with some smart scripting it gives you nice leak detection functionality [1]. Very useful when you run out of memory and want to find which daemon has used all of it.

[0] http://www.atoptool.nl/

[1] http://www.atoptool.nl/download/case_leakage.pdf

edit: formatting

An ncurses disk usage analyzer: http://dev.yorhel.nl/ncdu I find this an extremely handy alternative to du; it's somewhat similar to TreeView on Windows.

That's great, thanks! I've been doing the du -sh * tree crawl for far too long

I have this aliased as duf:

  alias duf='du -sk * | sort -n | perl -ne '\''($s,$f)=split(m(\t));for (qw(K M G)) {if($s<1024) {printf("%.1f",$s);print "$_\t$f";last};$s=$s/1024}'\'''

I also find this very useful :) It shows you where your space has got to in a simple and easy to use manner.

This is awesome. Thanks for sharing it.

Very neat, thanks for the tip.

I'd like to recommend this book to get a current overview of monitoring tools: http://www.amazon.com/Systems-Performance-Enterprise-Brendan...

While most sysadmin books are years out of date, this one covers all the hot recent stuff like DTrace and its equivalents on Linux, pidstat, etc. Solid coverage, and the author (from Joyent) knows his stuff. Available on Safari too.

If you have multiple nodes, I recommend New Relic. It's a bit pricey, but if an issue arises in your stack, New Relic can help you immediately pinpoint where and what the issue is.

PS: I can view New Relic on my phone, so if I get a PagerDuty alert, I can still see what's up if I'm at the beach.

New Relic is great for application monitoring, but the systems monitoring is kind of meh.

And I've been getting way too many false positives on the systems alerts.

If you only want systems monitoring and not deep application performance insight, New Relic is way too pricey and not really that good.

Sounds like Munin would suffice for you. Also, it's free.


> not really that good

New Relic has such rich functionality that it is easy to overlook some of its utility. It took us a while to get it tuned to our needs, but now that we have it configured, I couldn't imagine running a high-availability web service with anything else. Suppose I get a PagerDuty alert for high memory usage on a server. I would then go look at that server in New Relic, see what processes are using the memory, see what the memory usage for that process has been like for the last 6 months, perhaps notice a slow steady increase in memory consumption, realize there's a memory leak, etc.

If you have multiple nodes, including a load balancer and other SOA services, and want to link them all to view a request as one transaction (e.g. a request comes in through a load balancer, gets processed by app-server-1, which in turn calls service-2, which queries DB-1 or memcache), Appneta Traceview can connect them all.

You can also monitor your web app with Appview Web for synthetic monitoring, and your network with Pathview.

I'll also support New Relic.

If you're just starting out, the free tier is pretty good.

What the monitoring tools lack in specificity (depending on your stack, they may provide varying levels of awesome; server monitoring is weak but improving), they make up for with zero-configuration installation.

Just sign up, instrument, and start monitoring.

If you find bits lacking, there are almost always local tools you can use to supplement.

Yes, New Relic is pretty awesome. There's a lot of information you can monitor there, with a very easy installation. You can get a free account as well, just to test it. I just made one, and I really like the simple UI: http://i.imgur.com/oGGfTrp.png

New Relic is really great. Especially if you use Ruby, as it has instrumentation deep into the application run level. Unmatched for finding bottlenecks really.

Appneta does this for Python/ruby/java/.net/php.

We recently open sourced our Ruby instrumentation:


Yup, all of application level instrumentation and performance metrics/exception reporting is also available for python developers with https://appenlight.com.

No server monitoring as of today though.

Isn't it the same with PHP? And probably other languages too.

I think it is the same :-) As far as I can see these are definitely supported: Ruby, PHP, Java, .NET, Python, Node.js.

I've run a lot of Python apps on New Relic and definitely some great detailed monitoring there too.

Is there a tool that would allow collecting historical data on memory and CPU usage patterns of individual processes? In troubleshooting you frequently deal with a process that is "exploding" in memory and/or CPU usage when either you are not there at that moment to run htop, or you can't even easily log in to the server to check.

"atop" can do it to some degree. When you run atop interactively, it uses the process accounting facility to find out not only what is running at the moment it takes a snapshot of the system, but also which processes started and exited since the last refresh interval.

Installing atop will also (depending on your distro etc.) set it up to snapshot the system state every 600 seconds. If you run "atop -r" you can review the recorded data from today or an earlier day, and step between the 10-minute snapshots with t and T.

Personally I like "sar" for a quick text-only overview (sysstat package). Once enabled, you get a snapshot of a huge number of performance metrics every 10 minutes (e.g. sar -r for memory, sar -b for disk). Of course, it's even better if you use something to collect them centrally (I signed up for DataDog, which takes very little effort to integrate compared to rolling your own stuff).

pidstat from the sysstat package seems to be excellent for what I was looking to do. Thanks a lot, both tools are keepers!

I was really surprised not to see atop in his list. Very handy interactive monitor, and the history mode, as you note, puts it well ahead.

Scalyr [1] can do this. (Disclaimer: I am the founder of Scalyr, and it's a commercial product.) We aim to be a one-stop-shopping monitoring tool: collect everything you might want to collect, and let you analyze it in any way you want. To your question, we can collect CPU, memory, I/O, and other stats for specified processes [2], and give you graphs, rolled-up dashboards, and alerts on that data.

We're always looking for feedback, and we're happy to give out discounted or free accounts to startups. Drop me a line -- steve@[company domain] -- if you're interested.

[1]: https://www.scalyr.com

[2]: https://www.scalyr.com/appDashboard

You need more images on your site...

I agree, I want to see screenshots.

Thanks for the feedback. We're hearing this often so we're including plenty of screenshots in our new design, which is currently in the works.

An answer somewhat stolen from Stack Overflow says "ps -o rss $(pgrep executablename)", but I guess that assumes you only have one process running. Maybe it would be easier to put it in a script and use "ps -o rss $!"
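
For the case where several processes share the name, a rough sketch is to sum the RSS column over every matching PID. This assumes procps-style `pgrep -d,` (comma-delimited PID list) and a `ps -p` that accepts that list; `sum_rss` is a helper name made up for this example, and `executablename` is the same placeholder as above.

```shell
# sum_rss: hypothetical helper. Adds up RSS values (in KiB, one per line)
# read from stdin and prints the total.
sum_rss() {
    awk '{ total += $1 } END { printf "%d\n", total }'
}

# Example use (assumes procps pgrep/ps; "executablename" is a placeholder):
#   ps -o rss= -p "$(pgrep -d, executablename)" | sum_rss
```

The `rss=` form suppresses the header line, so only numbers reach the helper.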


Real simple way: Cron job to dump "top" to a file. It will tell you all processes and their memory/CPU usage every x minutes. Once you need data on a specific process, you can just grep its pid.
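
A minimal sketch of that cron approach, using ps rather than top so the output is easy to sort and grep. The function name `snapshot`, the script path, and the log path are all made up for the example, and the crontab line assumes a 10-minute interval.

```shell
# snapshot: hypothetical helper. Reads "pid rss command" lines on stdin and
# prints the N largest by RSS (default 5), each prefixed with a timestamp.
snapshot() {
    n=${1:-5}
    ts=$(date '+%Y-%m-%dT%H:%M:%S')
    sort -k2,2nr | head -n "$n" | awk -v ts="$ts" '{ print ts, $0 }'
}

# Example crontab entry (paths are placeholders), assuming the function is
# saved in a script that ends by calling: snapshot 10
#   */10 * * * * ps -e -o pid= -o rss= -o comm= | /usr/local/bin/snapshot.sh >> /var/log/proc-snapshot.log
```

Once you need data on a specific process, grep the log for its pid or command name.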

  $ man sar

Another good one is dstat[1]

1: http://dag.wiee.rs/home-made/dstat/

This makes me curiouser and curiouser every day - are you guys typing these citation brackets by hand, or is there something I'm missing?

It's a fairly old convention for adding links or references in plaintext. Used on mailing lists also. The alternative is to move to hypertext (not supported in some contexts, not preferred in other contexts), or to add the link or other citation information inline in parentheses, which can make text look cluttered.

Typing them by hand. :)

This is a bit off topic but I created a tool a little while back for my specific (very basic) needs: https://github.com/afaqurk/linux-dash

Demo here: http://afaq.dreamhosters.com/linux-dash/

Easily extensible if anyone wants to use it.

Really nice! PS: you have a typo in the title, dashboad instead of dashboard :)

looks awesome... will try this out

My personal list of tools that I use all the time:

    ...I'll add more as I remember them, currently gotta sleep.

If you want to monitor all log files for security purposes, I can recommend OSSEC: http://www.ossec.net/

And OSSEC Web User Interface (ossec wui): https://scottlinux.com/wp-content/gallery/site/ossec_web.png

The sysstat suite [1] is quite handy for single nodes. Includes utilities to monitor system performance and activity over time.

[1]: http://sebastien.godard.pagesperso-orange.fr/documentation.h...

iftop is also a very good one, gives nice graphs of the network utilisation sorted by host or port: http://www.ex-parrot.com/pdw/iftop/

Check out bwm-ng

I am a contributor to glances, very pleased to see it mentioned. Glances can also run as a server, which can then allow glances clients to connect, or even the android app Android Glances.

I don't think PowerTop's been mentioned. Maybe because it's more useful for a laptop than a server. Once calibrated, it can output an HTML report on a machine's consumption, which includes a handy list of tunable power saving options. It was written by Intel so it may, or may not, work that well on other processors. https://01.org/powertop

Enable sysstat and install one of the sysstat graphing tools (or roll your own with your preferred scripting tool).

Noting when things go titsup.com can be particularly useful.

Anybody here know of an ApacheTop-like tool for nginx?

The default value for "log_format" is "combined" -- identical to the Apache "combined" log format -- so apachetop can read nginx log files without any needed changes.
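
Since the format is the same, plain awk also works for a quick look (not a live view like apachetop, just a one-off count). A rough sketch, assuming the default "combined" format, where whitespace splitting puts the request path in field 7; `top_paths` is a made-up helper name and the log path in the usage line is only a common default.

```shell
# top_paths: hypothetical helper. Reads combined-format access log lines on
# stdin and prints the N most requested paths (default 10) with hit counts.
top_paths() {
    n=${1:-10}
    awk '{ hits[$7]++ } END { for (p in hits) print hits[p], p }' |
        sort -k1,1nr -k2,2 | head -n "$n"
}

# Example use (log path is an assumption; adjust for your setup):
#   top_paths 20 < /var/log/nginx/access.log
```

If you have changed log_format, the field number will differ.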

GoAccess is awesome: free, open source, and console-based. It can also output an HTML, JSON, or CSV report.


apachetop works fine with nginx logs too.

That is nice mgz. Not tried this yet. Should give it a try soon then :-)

Strange definition of a "monitoring" tool. It essentially requires a human being to run the tool, look at the graph, and make a decision on the data. There isn't even a baseline to compare the data to. This isn't really monitoring; it's equivalent to typing df and looking at how many bytes on disk are being used, and doing this every 5 minutes.
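
For contrast, a minimal sketch of an unattended version of that df check: compare each filesystem against a threshold and print only the ones over it, so cron or an alerting script has something to act on. `over_threshold` is a made-up name, 90 is an arbitrary default, and the sketch assumes `df -P` (POSIX output format) and mount points without spaces. It is still a fixed limit rather than a real baseline.

```shell
# over_threshold: hypothetical helper. Reads `df -P` output on stdin and
# prints "mountpoint capacity" for filesystems at or above LIMIT percent.
over_threshold() {
    limit=${1:-90}
    awk -v limit="$limit" 'NR > 1 {
        use = $5                  # capacity column, e.g. "95%"
        sub(/%/, "", use)         # strip the percent sign
        if (use + 0 >= limit) print $6, $5
    }'
}

# Example use:
#   df -P | over_threshold 90
```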

I could never get apachetop to work on Debian or Ubuntu. It displays the data all right, but doesn't respond to most of my keypresses, and it segfaults. Maybe related to its not having been updated since 2005 or so? It's too bad, because it promises to do exactly what I want.

I think all the great suggestions everyone here has contributed to this post are amazing. Much appreciated - thank you :-) I will add them to the post later, as a list with a URL to each website.

https://news.ycombinator.com/item?id=7180300 - The follow up post is made :-)

One of 7 Buzzfeed-headline-style pages for system administrators you won't believe!

Seriously, glances is nice, but I generally use tmux with a bunch of panels, showing stuff like `watch -c df -h +nr`


A very lightweight tool I've found useful is nmon.

Actually, bwm-ng is a nicer alternative to iftop. More information, and more nicely presented (I think)! Try it :)

bwm-ng really has a neat and to-the-point interface, but it doesn't give a breakdown of individual established network sessions. Sometimes the individual details are needed too. That's where iftop comes in handy.

Yep. bwm-ng > iftop for most uses. If there's a performance problem, iperf is the old standby and even works with 10GbE links.

Could anyone please explain why top shows one instance of Chrome while htop shows numerous instances of it?

sysstat (sar), slabtop, free

Everyone who deals with *nix in any way has known of all these applications by default.

Couldn't come up with anything else to promote your useless blog?

Yes, most people know of them :-) I just felt like presenting them to people who might not know of them and their strengths. Why not present tools, even though many know of them already? I think it is important to talk freely about any software, as there is always someone who might not know it. I appreciate this myself, and I am actually also new to *nix systems. If everyone had the same approach to information as you, I would not be a part of it.

It's probably worth adding to this article as you find new utilities or new ways to use them - look at all the great suggestions in this comment page!

All the suggestions here are amazing. I will take a look at this later, and at least make an edit with a list of the suggested software. Did not see this coming :-) Thank you everyone.

I didn't know about apachetop and glances. I was actually trying to install an app like zabbix before I stumbled on this thread. If these tools can help me monitor the servers, that's even better! Thanks @adionditsak for sharing.

I for one did not know about glances and apachetop, thanks OP

Comment sent to bottom, that says it all, haha.
