
I had the same issue and now I cannot use "F" and "Right arrow". It is a smart idea to disable the right arrow key and map capslock + J K L I!

> mapped capslock + J K L I

This is such a good idea that it makes other people's machines nearly useless for you.

All credit to https://tonsky.me/blog/cursor-keys/


FYI: the keyboard in my last MacBook Pro also failed. The Apple Store charged ~$500 to fix it.

Why can this be faster than GNU parallel?


GNU Parallel is extremely sluggish because it does all sorts of different things behind your back: it buffers output on disk (so the output from different jobs is not mixed and you are not limited by the amount of RAM - it will even compress the buffering if you are short on disk space), it checks if the disk is full for every job (so you do not end up with missing output), it gives every process its own process group (so a process with children can be killed reliably with --timeout and --memfree), and a lot of other stuff.
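A minimal sketch of the --timeout behaviour (assuming a recent GNU Parallel; output order may vary):

```shell
# Jobs exceeding the limit are killed - the whole process group,
# so any children die too. Here the 10-second sleep is terminated,
# while the short jobs finish and print normally.
parallel --timeout 5 'sleep {}; echo "slept {}"' ::: 1 3 10
```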

It lets you code your own replacement string (using --rpl), and lets you make composed commands with shell syntax:

    myfunc() { echo joe $*; }
    export -f myfunc
    parallel 'if [ "{}" == "a" ] ; then myfunc {} > {}; fi' ::: a b c
It does not need a special compiler, but runs on most platforms that have Perl >=5.8. Input can be larger than memory, so this:

    yes `seq 10000` | parallel true
will not cause your memory to run full.
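The --rpl feature mentioned above takes a tag and a Perl expression. A hedged sketch - the {base} tag here is a made-up name whose definition mirrors the built-in {.} (strip the last extension):

```shell
# Define a custom replacement string {base} that removes the last
# file extension, then use it like any built-in replacement string.
parallel --rpl '{base} s:\.[^/.]+$::' echo {base} ::: report.txt
# prints "report"
```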

You can read a lot more about the design in `man parallel_design` and see the evolution of overhead time per job compared to each release on: https://www.gnu.org/software/parallel/process-time-j2-1700MH...

In other words: Treat GNU Parallel as the reliable Volvo that has a lot of flexibility and will get the job done with no nasty corner case surprises.

It is no doubt possible to make a better specialized tool for situations where the overhead of a few ms per job is an issue and where you neither need brakes, seatbelts nor airbags. xargs is an example of such a tool, and you can have both GNU Parallel and xargs installed side by side.


One possible reason is that GNU parallel is a Perl script.


a version in C that I think was first released before GNU parallel is in "moreutils" https://joeyh.name/code/moreutils/

A few years ago, debian made GNU parallel provide the "/usr/bin/parallel" executable, instead of moreutils. The maintainer of moreutils had some interesting things to say about that: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=597050#75


> It is 5143 lines of code, and the antithesis of a simple, clean, well-designed unix command.

Only 5k? :D

But, yes, the criticisms are valid. I recommend moreutils.


He lost me at the point where he complained that GNU parallel "includes the ability to transfer files between computers". For me at least that is _the_ feature of GNU parallel that actually makes it really useful. Which I guess is the problem with all these discussions: one person's useless bloat is another person's no. 1 killer feature.


But in the spirit of Unix, shouldn't parallel exec some file transfer program?


The actual Perl code calls out to ssh and rsync (or can be configured to use something else) when it's time to actually connect and transfer files. It just does it in a way that is nice and reasonably transparent to the end user.

It felt like his complaint was that that was 'bloat' since Real Men can achieve almost the same thing by just piping some output through some bash scripts they just hacked together.


And it is exactly the hacking part that GNU Parallel tries to help with: A lot of the helper functions in GNU Parallel could be done by expert users (--nice, --tmux, --pipepart, env_parallel, --compress, --fifo, --cat, --transfer, --return, --cleanup, --(n)onall).

But non-expert users will invariably make mistakes (e.g. get quoting wrong, not getting remote jobs to die if the controlling process is killed, or re-scheduling jobs that were killed by --timeout), and why not just have small wrapper scripts built into GNU Parallel that are well-tested, so the non-expert users can enjoy the same stability as the expert users?
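As a sketch of the remote helpers listed above (the host name "server" is hypothetical, and working ssh keys plus gzip with -k on the remote side are assumed): --transfer copies each input file to the remote host, --return fetches the named result back, and --cleanup removes the remote copies afterwards.

```shell
# Compress files on a remote machine: each data file is shipped over,
# gzip runs remotely, and the resulting .gz comes back; --cleanup
# removes the temporary remote copies when the job is done.
parallel -S server --transfer --return {}.gz --cleanup 'gzip -k {}' ::: data1.txt data2.txt
```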


Having written my fair share of those hacky wrapper scripts before I discovered GNU parallel, I certainly am very happy that it offers everything I need in a single easy-to-use command.


It does: Rsync.


One of the reasons for the high line count is the decision not to depend on modules that are not part of the core of Perl 5.8.

Quite a few of the lines are to deal with different flavours of operating systems.

The benefit of this is that by copying the same single file you can have GNU Parallel running on FreeBSD 8, CentOS 3.9, and Cygwin.


> a version in C that I think was first released before GNU parallel

While this is technically correct, it is misleading: GNU Parallel existed before it became GNU. See details on: https://www.gnu.org/software/parallel/history.html


I'm sure that's the cause. Their test script is just a straight `echo` of the input, so each test process will exit essentially immediately - it's unlikely the parallel aspect actually kicks in to any decent degree. The majority of the test is spent in the Rust/Perl code vs actually running commands. That said, while the test isn't hugely useful, the fact that this Rust implementation has much less overhead is still a notable improvement.

To add to this, parallel mentions as much in its man pages (that there is a certain startup cost, and a certain job-startup cost), and offers tips for speeding up the processing of jobs which exit fairly fast. But there's no reason you couldn't also do those things in the Rust version, so it's going to win every time. When dealing with commands which take a while to complete though, the extra overhead of the perl script would probably be negligible.
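One such tip, sketched here: -X packs as many arguments as fit into each command invocation, so for trivially fast jobs far fewer processes are spawned than with one job per argument.

```shell
# One echo per batch instead of 1000 separate echo processes;
# the launcher overhead is paid per batch, not per argument.
seq 1 1000 | parallel -X echo | wc -l
```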


Err, no. It execs other processes. The runtime overhead of the interpreter is irrelevant, as it adds essentially nothing on top of the general runtime of the sub-processes.


GNU Parallel takes a surprising amount of CPU time. It does have various tasks (track its children, feed them input, gather all their output and print it to the screen in the correct order), but I'm still surprised how much CPU it takes.


Hmm... The example put up in the GitHub README is a bad test case (for performance comparison), as it does almost no processing in the actual subprocesses. A better one could be creating thousands of small files (with dd or something in a directory, though that is I/O-bound, not CPU-bound). Best might be some kind of repeated floating-point exponentiation.
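For instance, a CPU-bound benchmark along those lines might look like this (a sketch; the iteration counts are arbitrary):

```shell
# Each job burns real floating-point CPU time in awk instead of
# exiting immediately, so the measurement is dominated by the work
# and the parallelism rather than by the launcher's per-job overhead.
seq 1 100 | parallel 'awk "BEGIN { x = 1.0; for (i = 0; i < 1000000; i++) x *= 1.0000001; print x }"' > /dev/null
```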


Does this imply that to tune 11b params your basement needs to have 16 machines with 4 GPUs each (64 GPUs in all)?


Slides 28 and 35 suggest it was 3 servers, each with 4 GPUs, i.e. 12 GPUs total. If that's the case, then you can probably build a Google scale network (1 billion parameter, 9 layer neural network, which needed 1,000 machines and 16,000 cores running for a week in 2012) at home for around £4K (US$6.2K).


Does anyone know why it's called klib? I don't know about the prefix "k", but guess it borrowed a lot of kernel code?

