GNU Parallel (gnu.org)
31 points by ingve 1 hour ago | 13 comments





As always gets brought up when GNU parallel is mentioned: xargs covers most of the use cases you'd need parallel for.

xargs -n1 -P4

This passes at most one argument from the argument list to each command invocation and runs up to 4 jobs at once. http://stackoverflow.com/questions/28357997/running-programs...
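
A minimal sketch of what those two flags do, with echo standing in for the real job and four made-up input words. Note that -P is a GNU/BSD extension rather than something every xargs guarantees, and the order in which jobs finish is not fixed, hence the sort:

```shell
# Four made-up inputs, one per line; echo stands in for the real job.
# -n1: one argument per invocation; -P4: up to four invocations at once.
result=$(printf '%s\n' alpha beta gamma delta | xargs -n1 -P4 echo "job:" | sort)
printf '%s\n' "$result"
```

Each input word ends up in its own echo invocation, which is the same shape as the simplest parallel usage.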



I have a small cluster of machines that I run experiments on. GNU parallel makes the dispatch of jobs on remote machines very easy.

In addition, I often use it to search for sequences by running grep in parallel. For example:

$ parallel 'grep {1} haystack.txt' :::: many_needles.txt

Here {1} is a single line from many_needles.txt.



Unless those patterns are regexes, you should just be using

    $ fgrep -f many_needles.txt haystack.txt
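
A small sketch of the difference, assuming two hypothetical sample files created on the spot (the file contents are invented for illustration). grep -F is the spelling that current grep versions prefer over fgrep; both treat each line of the pattern file as a literal string:

```shell
# Hypothetical sample files, created in a temporary directory.
tmp=$(mktemp -d)
printf '%s\n' 'ACGT' 'a.c' > "$tmp/many_needles.txt"
printf '%s\n' 'xxACGTxx' 'abc' 'a.c here' 'nothing' > "$tmp/haystack.txt"

# -F treats every line of the pattern file as a literal string,
# so "a.c" matches only a literal "a.c", not "abc".
matches=$(grep -F -f "$tmp/many_needles.txt" "$tmp/haystack.txt")
printf '%s\n' "$matches"
rm -r "$tmp"
```

Without -F, the needle "a.c" would be read as a regex and would also match the line "abc".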



How much faster is a plain-text search really than a regexp without special characters? You'd think this would be quite easy for a regexp engine to optimise.

I admit I try to use -F all the time, but your post suddenly made me realise I'd never actually measured the effect. :/



GNU Parallel - an amazing tool with the most user-unfriendly, brick-wall-in-your-face documentation imaginable. Shame really - it's great.



To be honest, the same can be said of many GNU tools. I still find myself totally lost in front of a man page from time to time. Parallel has a nice tutorial: https://www.gnu.org/software/parallel/parallel_tutorial.html Have you seen it?



Am I the only person who finds GNU parallel way too complicated? I tried to perform a very easy parallel task with it and spent hours reading the documentation and various tutorials. If a person with Unix command-line skills can't easily pick it up, what's the point of having it?



There do seem to be some very complex use cases. On the other hand, their example of parallel gzip of files seems straightforward:

find . -name '*.html' | parallel gzip --best

Generally, using it in places where you would normally use xargs seems uncomplicated.
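
For comparison, a hedged sketch of the same gzip job written with xargs, run against three throwaway HTML files in a temporary directory (the files and the job count of 4 are assumptions made for illustration):

```shell
# Hypothetical sample files in a temporary directory.
tmp=$(mktemp -d)
for name in a b c; do
  printf '<html>%s</html>\n' "$name" > "$tmp/$name.html"
done

# -print0 / -0 keep file names with spaces or newlines intact;
# -n1 -P4 compresses one file per gzip process, up to four at a time.
find "$tmp" -name '*.html' -print0 | xargs -0 -n1 -P4 gzip --best

remaining=$(find "$tmp" -name '*.html' | wc -l)
compressed=$(find "$tmp" -name '*.html.gz' | wc -l)
echo "uncompressed: $remaining, compressed: $compressed"
rm -r "$tmp"
```

The parallel version is shorter mainly because it picks the job count itself and reads one file name per line by default.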



What was it that you tried to achieve? I find it very easy for trivial parallelisation over files but am quickly lost when it becomes more complex.



There is also an in-development GNU Parallel clone/alternative written in Rust. https://github.com/mmstick/parallel



GNU Parallel rules! I made $100K with just a few lines of code involving Parallel.



cat 100GB_data_file_with_DUPLICATE_lines | parallel --pipe awk \'\!a[\$0]++\' > data_file_with_UNIQUE_lines

is BETTER since it uses less computer resources than

gawk '!a[$0]++' 100GB_data_file_with_DUPLICATE_lines > data_file_with_UNIQUE_lines
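
One thing worth checking before relying on the first form: as far as I understand --pipe, it splits the input into blocks and starts a separate awk for each block, so every awk only remembers the lines of its own block. A tiny sketch with made-up data, simulating two blocks by hand instead of calling parallel:

```shell
# Made-up input with a duplicate ("b") that ends up in different chunks.
input=$(printf '%s\n' a b a c b)

# a[$0]++ is 0 (false) the first time a line is seen, so !a[$0]++
# prints each distinct line once, in first-seen order.
whole=$(printf '%s\n' "$input" | awk '!a[$0]++')

# Simulate two independent chunks, each deduplicated by its own awk.
chunked=$( { printf '%s\n' a b a | awk '!a[$0]++'
             printf '%s\n' c b   | awk '!a[$0]++'; } )

printf 'whole:   %s\n' "$(printf '%s\n' "$whole" | tr '\n' ' ')"
printf 'chunked: %s\n' "$(printf '%s\n' "$chunked" | tr '\n' ' ')"
```

In this simulation the single awk yields a, b, c, while the chunked run still contains b twice, so the two commands are only equivalent when duplicates never straddle a block boundary.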



GNU Parallel sucks. Use xargs when possible and paexec when needing fancy features. BTW, paexec even supports piping next to process invocation.




