Progress – A Tool to Monitor Progress for Commands in Linux (2016) (tecmint.com)
263 points by mpweiher on May 13, 2018 | 38 comments



I know what I am about to say is an absolutely ugly hack, but...

If you cannot install non-standard software (example: customer-owned, company-operated RHEL boxes) you can get an idea of the current progress of cp/tar and similar programs by inspecting the file-descriptor files in /proc.

Assuming that you have a process (like cp) with pid 1234, then in /proc/1234/fd you will find entries named by number (the file descriptor numbers in the context of the process). Each is a symbolic link to the actual file, or to a socket, pipe, or other kernel object. Let's assume that the source file has fd 3 (you get this information via ls -l /proc/1234/fd).

Then you can cat /proc/1234/fdinfo/3 and look at the pos value: that's the position of the cursor in the file. Comparing it with the file's size tells you roughly how far along the process is.

It's ugly and impractical, but it should work on every Linux system that has the basic tools.
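
The steps above can be sketched in Python as well. This is a minimal illustration, not a polished tool: the function names `parse_pos` and `fd_progress` are made up for this example, and it assumes a Linux /proc where the fdinfo file contains a `pos:` line and the descriptor points at a regular file that still exists on disk.

```python
import os


def parse_pos(fdinfo_text):
    """Return the integer after 'pos:' in the text of /proc/PID/fdinfo/FD."""
    for line in fdinfo_text.splitlines():
        if line.startswith("pos:"):
            return int(line.split(":", 1)[1])
    raise ValueError("no 'pos:' line found")


def fd_progress(pid, fd):
    """Return (pos, size, percent) for one open regular file of a process.

    Hypothetical helper: raises OSError if the descriptor is a socket,
    a pipe, or a file that has been deleted.
    """
    # Same information that `ls -l /proc/PID/fd` shows.
    path = os.readlink(f"/proc/{pid}/fd/{fd}")
    size = os.stat(path).st_size
    with open(f"/proc/{pid}/fdinfo/{fd}") as f:
        pos = parse_pos(f.read())
    percent = 100.0 * pos / size if size else 0.0
    return pos, size, percent
```

With the example from the comment, `fd_progress(1234, 3)` would report how far fd 3 of pid 1234 has advanced through its file. Reading another user's /proc entries generally needs the same permissions as the `cat` approach.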


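  #!/bin/sh
  # Usage: pass one or more PIDs as arguments.
  # Needs GNU du (-b) and bc.
  
  for pid in "$@"; do
  	cd /proc/"$pid"/fd || continue
  	
  	for fd in *; do
  		# Each entry is a symlink to the open file (or socket:[...], pipe:[...], ...).
  		file=$(readlink "$fd")
  		if [ -n "$file" ]; then
  			# Apparent size in bytes; 0 when the target is not a real path.
  			size=$( (du -b "$file" 2>/dev/null || echo 0) | awk '{print $1}')
  			if [ "$size" = "0" ]; then
  				continue
  			fi
  			
  			# Current offset of this descriptor; skip it if fdinfo is unreadable.
  			pos=$(awk '/^pos:/ {print $2}' ../fdinfo/"$fd" 2>/dev/null)
  			[ -n "$pos" ] || continue
  			
  			percent=$(echo "100*${pos}/${size}" | bc -l | xargs printf "%2.2f")
  			echo "${file}: ${percent}% (${pos}/${size})"
  		fi
  	done
  done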


I ended up writing very similar code to do the same thing: https://gist.github.com/jhlb/83999422a8d7e8a6e28e

EDIT: as mentioned elsewhere in the thread, the pv program has a way to watch a PID as well that makes this program unnecessary.


This is what I call a "22/7" solution: close enough to be useful, but you'd never use it formally. This is really great to have in my back pocket the next time a long-running copy process seems hung (often via Ansible) and I just want a sense of, "okay, are we minutes, hours, or days away from being done?"


This is just fantastic. I had all the bits of information in my head to be able to do this, but never connected the dots.

Not ugly at all!


Thanks mate :)


That's exactly what the tool in the link does.


This is pretty intense, but applicable to many enterprise setups!


I often use the basic:

    $ watch -n.2 "du -sb src dest"

I suppose you could add awk to get a percentage.
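
As a rough sketch of the percentage idea, here is the same comparison in Python rather than awk. The helper names `tree_bytes` and `copy_percent` are invented for this example. Note that it only sums regular files, so its totals can differ slightly from `du -sb`, which also counts the directory entries themselves.

```python
import os


def tree_bytes(root):
    """Sum the apparent sizes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and special files; count only real file data.
            if os.path.isfile(path) and not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def copy_percent(src, dest):
    """Percentage of src's bytes that are already present under dest."""
    src_bytes = tree_bytes(src)
    return 100.0 * tree_bytes(dest) / src_bytes if src_bytes else 100.0
```

Calling `copy_percent("src", "dest")` in a loop gives roughly what the watch command shows, as a single number. Like the du approach, it is only meaningful when dest started out empty.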


Thanks for the explanation, TIL :)


On macOS you can monitor the progress of many commands with Ctrl-T. With cp for example:

    $ cp Test Test2
    ^T
    load: 3.42  cmd: cp 86526 running 0.01u 4.36s
    Test -> Test2   6%



Yes, but Ctrl T goes back to at least TOPS-20


SIGINFO is very useful. I make it a point to implement support for it whenever it makes sense. Wish Linux would get it.


Some GNU utils use USR1 instead. It's not ideal, but it gets the job done.

I'd also like to map ^T to `kill -USR1 <pid>` on my machines, but didn't research it much.

> Sending an ‘INFO’ signal (or ‘USR1’ signal where that is unavailable) to a running dd process makes it print I/O statistics to standard error and then resume copying. [0]

[0] https://www.gnu.org/software/coreutils/manual/html_node/dd-i...
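
For anyone who wants to add this to their own long-running scripts, here is a minimal Python sketch of the same idea: print a status line when the process receives INFO, or USR1 where INFO is unavailable. All names here (`progress`, `format_status`, `install_status_handler`, `copy_with_status`) are made up for the example, and it assumes a Unix-like system.

```python
import os
import signal
import sys

# SIGINFO exists on BSD/macOS; fall back to SIGUSR1 elsewhere, as dd does.
STATUS_SIGNAL = getattr(signal, "SIGINFO", signal.SIGUSR1)

# Shared state that the signal handler reads.
progress = {"done": 0, "total": 0}


def format_status(done, total):
    """Build a one-line progress report."""
    pct = 100.0 * done / total if total else 0.0
    return f"{done}/{total} bytes ({pct:.1f}%)"


def report_status(signum, frame):
    """Signal handler: print the current progress to standard error."""
    print(format_status(progress["done"], progress["total"]), file=sys.stderr)


def install_status_handler():
    """Register the handler; must be called from the main thread."""
    signal.signal(STATUS_SIGNAL, report_status)


def copy_with_status(src, dst, chunk_size=1 << 20):
    """Copy src to dst in chunks, updating the shared progress counters."""
    progress["done"] = 0
    progress["total"] = os.path.getsize(src)
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while chunk := fin.read(chunk_size):
            fout.write(chunk)
            progress["done"] += len(chunk)
```

After calling `install_status_handler()` and starting `copy_with_status(...)`, running `kill -USR1 <pid>` from another terminal on Linux should print one status line without interrupting the copy.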


On FreeBSD, in addition to the ^T, procstat -af is often useful.


Usually I have used pv.

https://linux.die.net/man/1/pv

cp

    pv source > target

tar

    tar czf - image1 image2 image3 | pv > images.tar.gz

gzip

    pv source | gzip > target.gz
    # or
    gzip < source | pv | target.gz


you can give it a pid too

           -d PID[:FD], --watchfd PID[:FD]
              Instead of transferring data, watch file descriptor FD  of  process  PID,
              and  show  its progress.  The pv process will exit when FD either changes
              to a different file, changes read/write mode, or is  closed;  other  data
              transfer  modifiers  -  and  remote  control  - may not be used with this
              option.

              If only a PID is specified, then that process will be  watched,  and  all
              regular  files  and  block devices it opens will be shown with a progress
              bar.  The pv process will exit when process PID exits.


Cool! I love pv and it's one of my favourite utilities but I never knew about this option. Thanks :)


Wow, did not know about this. TIL!


I've just recently learned about the usefulness of PV, here is an excellent quick demonstration:

https://youtu.be/ui4DFIfqH_U


I always use pv when I want to monitor a long running operation, mainly because it shows the rate of progress as well as the ETA for the operation (of course, this is an estimate, but it's still useful).

'man pv' is a short read and is something anyone who wants to monitor such things should read.
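
To make the rate/ETA arithmetic concrete, here is a small Python sketch based on the average rate so far. This is only an illustration of the idea, not necessarily how pv computes its own estimate, and `rate_and_eta` is a made-up name.

```python
def rate_and_eta(done_bytes, total_bytes, elapsed_seconds):
    """Return (bytes_per_second, eta_seconds) from the average rate so far."""
    if elapsed_seconds <= 0 or done_bytes <= 0:
        return 0.0, None  # nothing transferred yet, so no estimate
    rate = done_bytes / elapsed_seconds
    # Time left = remaining bytes divided by the observed rate.
    eta = (total_bytes - done_bytes) / rate
    return rate, eta
```

For example, 250 of 1000 bytes after 10 seconds gives a rate of 25 bytes/s and an ETA of 30 seconds. The estimate swings a lot early on, which is why it is best read as a rough guide.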


Comes with rate limiting too. "-L 25k" keeps the network guys off my back when I have to copy logs across continents.
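
The general idea behind this kind of rate limiting can be sketched in a few lines of Python: after each chunk, sleep long enough that the average rate stays at or below the limit. This is an assumption-laden illustration, not pv's actual implementation. The names `throttle_delay` and `copy_limited` are invented, and the limit is taken as plain bytes per second.

```python
import time


def throttle_delay(bytes_sent, elapsed_seconds, limit_bytes_per_second):
    """Seconds to sleep so the average rate stays at or below the limit."""
    # How long sending this much data should have taken at the limit.
    earliest = bytes_sent / limit_bytes_per_second
    return max(0.0, earliest - elapsed_seconds)


def copy_limited(fin, fout, limit_bytes_per_second, chunk_size=4096):
    """Copy fin to fout, sleeping between chunks to respect the limit."""
    start = time.monotonic()
    sent = 0
    while chunk := fin.read(chunk_size):
        fout.write(chunk)
        sent += len(chunk)
        elapsed = time.monotonic() - start
        time.sleep(throttle_delay(sent, elapsed, limit_bytes_per_second))
    return sent
```

With a limit of 25,000 bytes/s, having sent 50,000 bytes after one second means sleeping for another second before the next chunk.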


You mean:

gzip < source | pv > target.gz


You are correct, thanks


This is my super-simple progress indicator, called dots. I wrote it at least 12 years ago when I was on a slow terminal and wanted to know whether unpacking a tarball was still progressing (the verbose output would have taken too much bandwidth!). Now I copy it onto every new system I use.

It prints one period (.) for every 1000 lines of text piped into it. It just shows that there is still activity, not progress, but often this is enough.

A typical usage is:

  tar xvfz some_huge_tarball.tar.gz | dots

It defaults to one dot per 1000 lines, but the first argument can be a number to specify another interval.

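  #!/usr/bin/perl
  use strict;
  use warnings;

  $| = 1;  # unbuffered output, so each dot shows up immediately

  # Lines per dot: the first argument, or 1000 by default.
  my $number = @ARGV ? shift(@ARGV) : 1000;
  my $i = 0;

  while (<STDIN>) {
    print '.' unless ($i % $number);
    $i++;
  }

  print "\n";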


"pv" is a favourite of mine for single command progress indicators


Thanks, I didn't know that existed! I might have to cut a corner off my Unix guru card :)


With respect to cp and mv, I've found that rsync is often a good alternative for anything more than trivial moves and copies.


rsync's ability to resume is especially handy.


It can also avoid IO using hashed blocks.


I recently wrote a Python program (called watch.py) somewhat like the GNU watch [1] command (which user kjeetgill mentioned in another comment here).

[1] http://man7.org/linux/man-pages/man1/watch.1.html

A Python version of the Linux watch command:

https://jugad2.blogspot.in/2018/05/a-python-version-of-linux...


zooming out, progress in the cloud is something that should be easier

plenty of our headless jobs know exactly where they are and how big the job is. There should be a standard dash for getting this.

even things like hadoop & luigi that have dashboards built in aren't great at progress & ETA

we should surface & store stats from any long-running loop. Further benefit that you can use prev runtime to properly schedule the next run (or run more sophisticated feedback & alerting).


Peteris Krumins has a post about pv (Pipe Viewer, which others here have mentioned) on his blog:

A Unix Utility You Should Know About: Pipe Viewer:

http://www.catonmat.net/blog/unix-utilities-pipe-viewer/


Needs gcc-4.9 to build, which has been deprecated in Ubuntu repositories.


Ubuntu itself packages the tool as the `progress` package. Try an `apt install progress`.


Which tool? Progress or pv? I suspect you could just bump the version.


Built fine using gcc 6.3.0



