A Unix Utility to Know About: lsof (2009) (catonmat.net)
149 points by kercker on Aug 26, 2016 | 53 comments

For finding what uses a file, I'm an 'fuser' kid but lsof is fine too. Using posh right now, so had to make my own:

    function fuser($relativeFile){
      $file = Resolve-Path $relativeFile
      foreach ( $Process in (Get-Process)) {
        foreach ( $Module in $Process.Modules) {
          if ( $Module.FileName -like "$file*" ) {
            $Process | select id, path
          }
        }
      }
    }

In use:

    > fuser .\node_modules\

      Id Path
      -- ----
    2660 C:\Program Files\nodejs\node.exe

Since it's object pipelining, you can:

    > fuser .\node_modules\ | kill

I find myself using lsof for its ability to show me what TCP sockets are in use more than what actual files are open.

    lsof | grep TCP

For that I often use:

    netstat --ip -a -p

Those particular options mean to show just IP ports (UDP/TCP), in all statuses (active, listening) and to display the local process which is involved.

`lsof` and `ps` have the most dense man pages I have ever tried to plow through, and that's causing me to use these tools only at a minimum.

I am guilty of using `ps|aux` for the sole purpose of answering this question: Is it (still) running ?

edit: `ps aux|grep`, indeed. Silly me in the early hours of a rest day :).

You mean `ps aux | grep` which I am also guilty of using every time ? You might be interested in pgrep (http://linux.die.net/man/1/pgrep) as well.

For interactive use, I strongly recommend htop if you're still doing stuff like "ps aux | grep" and then "kill 84728". Htop is top on steroids: ncurses-based and very fast, has colors, threaded or flat process view, sorting by different criteria like cpu or ram use, searching for substring match in process name, interface to killing/signalling to selected process, etc.


I am more a `killall <processname>` guy.

> I am more a `killall <processname>` guy.

I've made a note to name the executable for my next daemon "humans".

Don't do that on Solaris ;-)

Been there, done that :(

See also the 'reboot' (or is it the 'poweroff' ?) command, which is much more direct than the Linux equivalent...

Sometimes, I want something a little more murderous, so I tend to write this script on most systems I use:

    ps -ef | grep $1 | awk '{print "kill -9 " $2}' | sh

Usually stored as /usr/local/bin/kll

I've found it's very useful for killing specific Java programs without having to killall -9 java.

(and, yeah, I should probably be using ps -efww instead of ps -ef, but old habits die hard... I've already got ps aliased to ps -ww for interactive use, so I should probably change this script...)
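For anyone nervous about piping generated commands straight into sh, here is a minimal dry-run sketch of the same grep-and-awk idea. It runs against canned `ps -ef`-style lines rather than a live process table; the `sample_ps` helper, the PIDs, and the jar names are all made up for illustration.

```shell
# Canned `ps -ef`-style lines (UID PID PPID C STIME TTY TIME CMD).
# Every value here is invented for the demo.
sample_ps() {
    cat <<'EOF'
app       4101     1  0 09:00 ?        00:00:12 java -jar billing.jar
app       4102     1  0 09:00 ?        00:00:09 java -jar search.jar
root       812     1  0 08:55 ?        00:00:00 /usr/sbin/sshd -D
EOF
}

# Same pipeline shape as the one-liner above, but the generated commands
# are captured and printed instead of being piped to sh.
commands=$(sample_ps | grep 'billing' | awk '{print "kill -9 " $2}')
echo "$commands"
```

Only the billing process is selected, so the other java process is left alone. Swapping `sample_ps` for a real `ps -ef` and appending `| sh` gets back to the original behavior.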

You should try jps; it's similar to ps but only lists Java processes with their real names instead of just "java". It's included in pretty much every JDK.

'pkill' can do a bit more, it's the better alternative in my opinion.

oh god i thought I was the only one

For just viewing system information, I usually prefer glances to htop, although it can't signal. It has a curses interface and also includes network traffic, docker containers, disk IO, and it's nicely pluggable for both output formats (it can dump to CSV at interval as a daemon or while you're running it interactively (or something like graphite/influxdb if you have those)) as well as collection mechanisms (you write some python).

I discovered htop when regular top changed its default behavior to be nigh unusable. htop was recommended as an alternative, and while I've since learned how to configure top to go back to its old behavior [0], I've found that I just prefer htop now.

[0] https://bbs.archlinux.org/viewtopic.php?id=189757

Agreed, htop is pretty awesome!

I have this little bash function which I like to use to find out if a program is running, and to optionally kill it. Originally my goal was to keep it fitting inside a tweet, but adding helpful messaging just pushed it over.

    pgk () {
        [ -z "$*" ] && echo 'Usage: pgk <pattern>' && return 1;
        pgrep -fl $*;
        [ "$?" == "1" ] && echo 'No processes match' && return 1;
        echo 'Hit [Enter] to pkill, [Ctrl+C] to abort';
        read && pkill -f $*
    }

>> I am guilty of using `ps|aux`

What's wrong with doing that?

I've been doing that for so long I don't know why I do it

Because there's so many process selectors and formatters in ps from procps?

  ps -C procname
  ps -p pid
  ps -u username
  ps -t ttyname
  ps -f [-w [-w]]
  ps -o output-format

Modern Linuxes have pgrep to help avoid the inevitable "lol, the grep process I fired will match" issue that results in having to filter that out as well with yet another program.

Yeah, I always felt having to type the extra bit was strange, like maybe grep could have a switch to say "don't find grep" but I guess having a switch for such a specific use case, grepping processes, is overkill.

Anyway, it's just a reflex now to always:

  ps aux | grep <program> | grep -v grep

You can also surround each character in the program's name with square brackets.

    ps -ef | grep [p][r][o][g][r][a][m]

Yes, I never knew that. But I saw it in another post and as the sibling post mentions it's enough to surround just the first character with the brackets...

This is why I love HN, I learn stuff I never knew I wanted to know!

You only need surround one character.
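A small sketch of why the bracket trick works, using simulated output so it runs without touching the real process table. The `fake_ps` helper and its two lines are hypothetical stand-ins for what `ps` would show while the grep is running.

```shell
# Simulated `ps` output: one real process plus the grep that is searching
# for it. The argument is the pattern as it would appear on grep's own
# command line.
fake_ps() {
    pattern=$1
    printf '%s\n' \
        "alice  101  /usr/bin/java -jar app.jar" \
        "alice  202  grep $pattern"
}

# Plain pattern: grep's own command line contains "java", so it matches too.
plain_count=$(fake_ps 'java' | grep -c 'java')

# Bracketed pattern: the regex [j]ava still matches the text "java", but the
# literal text "[j]ava" on grep's command line does not contain "java".
bracket_count=$(fake_ps '[j]ava' | grep -c '[j]ava')

echo "plain=$plain_count bracket=$bracket_count"
```

The plain pattern counts two lines and the bracketed one counts one, which is the whole point: the grep process stops matching itself.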

One problem with pgrep is it has no option to ignore case.

Yes, I do sometimes have processes with camel-cased names which I may or may not remember exactly. So you:

  ps auwx | grep -i {lc-version-of-camel-cased-name}

Good to know, I always just `ps aux | grep [j]ava`

Sometimes powerful tools require effort to understand. Lsof is absolutely something you should spend some real time grokking.

In regard to ps you can memorize a few commands until something truly weird comes up since you will probably use other more specialized snapshotting and tracing tools for anything somewhat difficult to troubleshoot.

> Sometimes powerful tools require effort to understand. Lsof is absolutely something you should spend some real time grokking.

Yup, this is true. And I did. But sometimes you just need less dense manual pages.

I find lsof and ps too hard to remember, so I usually just use regular linux utilities like cat/grep on the /proc file system.

  - grep bash /proc/*/status
  - readlink /proc/*/fd/* | grep /run

That's what I've been trying to tell people about Git, too. It's a powerful tool and it takes some effort to wrap your mind around it.

> `lsof` and `ps` have the most dense man pages I have ever tried to plow through

That's why I was so happy to stumble across pstree the other day. Even the man page is excellent! Every now and again, I'll come across a util like this that I can put into immediate action w/o lots of deciphering. pstree -a is awesome for a relative newbie like me.

lsof -i -n -P is a regular "reflex" command whenever I'm trying to figure out why a daemon I've just set up isn't responding to requests (immediately answers the questions: is it running? and, if yes, under what privs? and on what interface?).

That together with a dump of active iptables rules normally results in an immediate fix for 90% of "why can't I connect to X" problems :)

had to try this lsof to see it's essentially `netstat -atpu`

Most days, HN threads make me feel really dumb. On rare days, they make me feel really smart.

Although a bit rare in my usage, I have found basic use of lsof quite useful when needed. I haven't even tried many command line options. The options it provides really seem comprehensive. I have also found similar tools on Windows (like WhoLockMe or Process Explorer) quite useful.

I'm not exposed to windows machines as much as i used to be, but yeah process explorer was pretty cool.

What kernel functions does this command invoke to retrieve this information? How could I write my own version of lsof?

The stat() family functions for sure, BSD Sockets, user account, covering everything from "process environment" to "filesystem" functions.

"FAQ about lsof"[1] is very instructive.

[1] ftp://lsof.itap.purdue.edu/pub/tools/unix/lsof/FAQ

On Linux, it probably uses files under /proc. For other systems, I don't know.
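To make the /proc idea concrete, here is a very rough sketch of a one-process "lsof" built only on procfs symlinks. It assumes Linux with /proc mounted; `mini_lsof` is a hypothetical name, and this is nowhere near what the real lsof does (no sockets decoding, no memory-mapped files, no other users' processes without privileges).

```shell
# List the open file descriptors of one process by reading /proc (Linux only).
# mini_lsof is a made-up helper name, not a real utility.
mini_lsof() {
    pid=$1
    for fd in /proc/"$pid"/fd/*; do
        # Each entry is a symlink to the open file, socket, or pipe.
        target=$(readlink "$fd" 2>/dev/null) || continue
        printf '%s\t%s\t%s\n' "$pid" "${fd##*/}" "$target"
    done
}

# Demo: hold a temp file open on fd 3, then look for it in our own fd table.
tmpfile=$(mktemp)
exec 3>"$tmpfile"
listing=$(mini_lsof $$)
exec 3>&-
rm -f "$tmpfile"
printf '%s\n' "$listing"
```

The output has one line per descriptor (PID, fd number, target), and the temp file should show up against fd 3.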

strace it and see. As the other poster notes probably open() call to procfs or stat()

If this sort of thing interests you, I cannot recommend https://www.nostarch.com/tlpi enough. The hardcover is worth every penny.


The following can be adapted to provide other information: whatever procfs provides. This is a rough equivalent of "pgrep -fl .|less". Work-in-progress. Don't know if Linux grep has "-a" option.

  #! /bin/sh
  # Almquist clone, not Bash
  case $# in
  0)
  exec grep -a . proc/[0-9]*/cmdline \
  |exec tr '\000' '\040' \
  |exec sed '
             /grep -a .* proc/d;
             #parent: '"$$"';
             s/proc./ /;
             s/\/cmdline:/ /;
 ' \
  |exec less
  ;;
  *)
  exec grep -a . proc/[0-9]*/cmdline \
  |exec tr '\000' '\040' \
  |exec grep $@ \
  |exec sed '
             /grep '"$@"'/d;
             s/proc./ /;
             s/\/cmdline:/ /;
 ' \
  |exec less
  ;;
  esac

You don't need any of those `exec`s. Yes, GNU grep has `-a`.

If you don't like the backslash-newline-pipe sequence, if you put the pipe at the end of the previous line, you don't need the backslash; but it's less obvious that the next line is operating on the output of the previous.

Multi-line arguments to sed can be a pain (good luck getting your editor to auto-indent them). Instead, you can use -e to specify multiple sed commands.
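A quick sketch of both suggestions together (trailing pipe instead of backslash-newline, and one -e per sed command), run on a single made-up line shaped like the `grep -a . proc/[0-9]*/cmdline` output after the NULs have been turned into spaces. The PID and command are invented.

```shell
# One sample line in the "path:contents" shape grep prints; values are made up.
cmdline_line='proc/1234/cmdline:/usr/sbin/sshd -D'

# Ending the line with | lets the pipeline continue without a backslash,
# and each sed command gets its own -e instead of a multi-line script.
pretty=$(printf '%s\n' "$cmdline_line" |
    sed -e 's/proc./ /' -e 's/\/cmdline:/ /')

echo "$pretty"
```

The result is the PID followed by the command line, the same cleanup the multi-line sed script performs.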

There's nothing in there that would make it not work in bash. That should work in any Bourne-family shell.

It only handles 0 or 1 arguments correctly, not > 1.

Thanks for taking the time to comment.

Multiple arguments could be added if you want that. I personally do not need it as I search the cmdline patterns I need without using spaces. I use dots instead. Quick and dirty.

I write 100's of these small scripts for my own use only so I have my own style, peculiar as it may be. I never need indentation because I always keep scripts short; I only use it occasionally and randomly.

I do not use -e with sed, unless I'm using branches or loops.

The execs seem superfluous but actually make a difference, at least on the UNIX I use. Try it with and without and see if you notice.

All my scripts are portable to Bash, but they're also portable to the most basic of Bourne-compatible shells too. I do not use Bash.

What shell are you using? I could see a naive shell forking twice without exec, but I don't know of a shell that does that. Exec just says "don't fork(3) before calling exec(3)", but inside of a pipeline like that, it shouldn't fork again anyway. I have tried it with and without.

In theory it should not make a difference but my scripts seem to run faster when I use exec after pipe.

Normally I would only add exec to the last command in the script, as djb does. But then I started experimenting with using it after pipes.

If you or anyone can explain why this could make scripts "seem" to execute faster, I would be grateful.

Here's the shell source: ftp://ftp.netbsd.org/pub/NetBSD/NetBSD-release-7/src/bin/sh

Incidentally, have you ever tried execlineb? I use that sometimes too.

lsof is my go-to command for "why won't this drive unmount?". That alone makes it incredibly useful.

Also, back in the bad old days, 'lsof | grep snd' helped track down what the hell was hogging my sound card (setting up proper mixing has made that a distant memory, though).

lsof | grep Trash is super useful on os x when Finder refuses to empty the trash because some process is holding onto one of the files.

You can save quite some time waiting for reverse dns lookups if you use "lsof -n"

I frequently use lsof to get the location of a log file created by a process.

something like,

    lsof | grep <pid> | grep log

fstat(1) is similar on FreeBSD, lsof is in the ports.
