
GNU Parallel – The command line power tool - vsbuffalo
http://www.slideshare.net/fscons/gnu-parallel-ole-tange
======
ealloc
I use parallel all the time for embarrassingly parallel scientific computations
on a cluster. It is very easy to use and elegant, and it's one of the programs
I'm most grateful for.

Recently the developers fixed a major bug for me: child jobs on other nodes
were not killed when parallel was killed. This was the only thing stopping me
from recommending it to my labmates; now there's no reason not to use it!

~~~
ole_tange
Remember to show them 'parallel --bibtex'

------
oofabz
I use GNU Parallel. I like it because its interface is simple: you pipe
filenames to it, just like xargs, and the output is nicely collated on the
screen.

I used to use ppss, which does the core task just as well, but the interface
is more complex.

I mostly use these tools to optimize large numbers of PNGs before deployment,
using optipng, pngout, and/or my own lossypng. These programs take a while to
run so using all my cores gets the job done a lot quicker.

------
felixr
The documentation of GNU parallel
([https://www.gnu.org/software/parallel/man.html](https://www.gnu.org/software/parallel/man.html))
also contains a lot of nice examples on how to use parallel.

------
rcthompson
I use GNU parallel exclusively in place of xargs simply because it has
--dry-run.

~~~
_ZeD_
Wouldn't an "echo" just do the same?

    
    
        $ mkdir a
        $ cd a
        $ touch b c d e
        $ find -type f | xargs echo rm
        rm ./b ./d ./e ./c
        $ ls
        b  c  d  e
        $ find -type f | xargs rm
        $ ls
        $

~~~
jasomill
One advantage of "parallel --dry-run" over "xargs echo" is that the former
quotes its output:

    
    
        $ touch 'Ham
        Jam
        Spam'
        $ touch 'J.R. "Bob" Dobbs'
        $ find . -type f -print0 | parallel -n1 -0 --dry-run echo
        echo ./Ham'
        'Jam'
        'Spam
        echo ./J.R.\ \"Bob\"\ Dobbs
        $ find . -type f -print0 | xargs -n1 -0 echo echo
        echo ./Ham
        Jam
        Spam
        echo ./J.R. "Bob" Dobbs
    

For my own "dry runs", though, I've always preferred passing the command line
to

    
    
        #include <stdio.h>
        
        int main(int argc, char* argv[]) {
            char** argp;
            int i;
            printf("argc = %d\n", argc);
            for (i = 0, argp = argv; *argp != 0; ++argp, ++i) {
                printf("argv[%d] = %s\n", i, *argp);
            }
            return 0;
        }
    

to remove all reasonable doubt.
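
A quoted dry run can also be approximated with plain xargs by letting bash's `printf %q` builtin print each argument in shell-escaped form. This is only a sketch under assumptions, not how parallel implements --dry-run: `quoted_dry_run` is a hypothetical helper name, and it relies on bash being installed and on find/xargs supporting -print0 and -0.

```shell
# Hypothetical helper: print "echo <shell-quoted name>" for every regular
# file under the directory given as $1, without running anything.
quoted_dry_run() {
    find "$1" -type f -print0 |
        xargs -0 -n1 bash -c 'printf "echo %q\n" "$1"' _
}

# Demo in a scratch directory so no real files are touched.
demo_dir=$(mktemp -d)
touch "$demo_dir/J.R. \"Bob\" Dobbs"
(cd "$demo_dir" && quoted_dry_run .)
rm -rf "$demo_dir"
```

For the file with spaces and double quotes this prints `echo ./J.R.\ \"Bob\"\ Dobbs`, the same form parallel showed above. Names containing newlines may be quoted in a different style than parallel uses, so treat the output as something to read, not as byte-for-byte identical.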

~~~
peterwwillis
Why does the parallel example look weird? It put the single quotes in all the
wrong places, and completely around Jam. It should have printed

    
    
      echo './Ham
      Jam
      Spam'
    

but your output looks different. As an example of how it should look, try
this:

    
    
      $ find . -type f -print0 | xargs -n1 -0 perl -e'print "\"$_\" " for (@ARGV)'
      "./Ham
      Jam
      Spam"

~~~
davvolun
Having never used parallel, I still believe parallel was correct.

./Ham' 'Jam' 'Spam

would be identical to ./Ham\nJam\nSpam (if \n were the correct translation to
the newline in this case) or './Ham Jam Spam'

This would be identical to what you wrote, but only punts to quotes when it
doesn't have a canonical method of representing the character otherwise. The
fact that you don't need to explicitly concatenate two strings in the shell
may be what's throwing you off?

Interestingly enough, 'Ham\n\nJam\nSpam' becomes

./Ham' '' 'Jam' 'Spam

So parallel is just literally outputting all newlines using quotes. I believe
this would be identical, if you analyzed it and saw that two newlines are next
to each other:

./Ham'

'Spam' 'Jam
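
The equivalence is easy to check in a POSIX shell. A minimal sketch, with the variable names `spliced` and `wrapped` chosen here purely for illustration: the first assignment splices quoted newlines into an otherwise unquoted word, the way parallel printed the name, and the second wraps the whole name in one pair of single quotes.

```shell
# Quoted newlines spliced into an unquoted word (the style parallel printed).
spliced=./Ham'
'Jam'
'Spam

# The whole name inside a single pair of quotes.
wrapped='./Ham
Jam
Spam'

# Adjacent quoted and unquoted pieces are concatenated into one word,
# so both variables hold the same three-line string.
if [ "$spliced" = "$wrapped" ]; then
    echo "same string"
else
    echo "different strings"
fi
```

This prints `same string`: the shell joins adjacent quoted and unquoted pieces without any explicit concatenation operator.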

------
zurn
Anybody have a link to a version viewable without proprietary plugins? "Flash
Player 9 (or above) is needed to view presentations"

~~~
dreen
This looks very similar but is actually a year newer than the slides on
slideshare:

[http://www.luga.de/Angebote/Vortraege/GNU_Parallel_LIT_2011/...](http://www.luga.de/Angebote/Vortraege/GNU_Parallel_LIT_2011/GNU_Parallel_LIT_2011.pdf)

------
shrike
I use GNU Parallel with s3cmd to move big data sets in and out of S3. I can
easily saturate any network connection. I was able to GET ~2TB from S3 onto a
Gluster cluster in a little more than an hour by using GNU Parallel to spread
the GETs across 8 instances. Incredibly powerful, easy to use tool.

------
adrianN
Wow, I must have reinvented this particular wheel at least five times.

~~~
mineo
This is exactly what I feel like every time I see an article/presentation
about (maybe even really small) tools that just get the job done but I didn't
know about and didn't even think about looking for although I can think of so
many cases where they would've been incredibly useful.

------
gnoe
Is parallel buggy or is it just me? For example, if I have a list of IP
addresses:

    
    
      $ cat ips.txt | sort | uniq -c | sort -rn
    
       3 127.0.0.1
       2 192.168.1.1
       1 192.168.1.2 
    

Now I want to reformat the output of uniq -c so that the count is in the last
column:

    
    
      $ cat ips.txt | sort | uniq -c | sort -rn | parallel --colsep ' ' echo {2} {1}
    

But this gives empty output. What gives? It only works if I double-pipe it
through parallel like this:

    
    
      $ cat ips.txt | sort | uniq -c | sort -rn | \
          parallel --trim lr echo | parallel --colsep ' ' echo {2} {1}
    
      127.0.0.1 3
      192.168.1.1 2
      192.168.1.2 1

~~~
ole_tange
You have more than 1 space from uniq. 2 options:

    
    
      parallel --colsep ' +' echo {2} {3}
    

or:

    
    
      parallel --colsep ' ' echo {7} {8}

~~~
gnoe
Thanks! But how come the whitespace is not trimmed by --trim lr? The man page
says it trims whitespace left and right if --colsep is used.
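
For the reformatting itself, awk sidesteps the leading-space problem, because its default field splitting ignores leading blanks and treats a run of blanks as one separator. A minimal sketch that swaps awk in for parallel; `count_last` is a hypothetical helper name and the addresses are the sample ones from the listing above.

```shell
# Hypothetical helper: read "uniq -c" style lines ("      3 127.0.0.1")
# on stdin and print the address first and the count last.
count_last() {
    awk '{ print $2, $1 }'
}

# Rebuild the sample data, count duplicates, sort by count, then reorder.
printf '%s\n' 127.0.0.1 192.168.1.1 127.0.0.1 192.168.1.2 192.168.1.1 127.0.0.1 |
    sort | uniq -c | sort -rn | count_last
```

This prints `127.0.0.1 3`, `192.168.1.1 2` and `192.168.1.2 1` on separate lines, however many spaces uniq puts in front of the count.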

------
jftuga
I wrote a similar program for Windows.

[https://github.com/jftuga/Windows/tree/master/mp](https://github.com/jftuga/Windows/tree/master/mp)

The only file you need to download is mp.exe. Source code is mp.au3.

~~~
RachelF
nifty tool!

------
guangnan
Load test with parallel:

    
    
      cat urls | parallel --jobs 4 --load 6 'curl -s -w "%{time_total}\n" -o /dev/null {}'

------
ck2
I love pssh for simplicity but I guess I better look at fancier stuff too.

