
Moreutils - signa11
http://joeyh.name/code/moreutils/
======
dredmorbius
This is a great set, but a request to Joey:

There are two conflicting implementations of a parallel utility, and from what
I can tell, GNU parallel is much more useful than the one in moreutils. Which
meant that when I was 1) doing processing which benefited greatly from
parallelization and 2) found that the moreutils version wasn't doing what I
wanted, nor could I figure out how to make it do so (compounded by confusion
over online searches turning up GNU parallel's syntax, which didn't work), I
had to remove the entire moreutils set to install GNU parallel under Debian.

The two versions aren't even candidates for /etc/alternatives resolution, as
their command-line syntax and behavior differ.

Either a name change or refactoring to a different package for the 'parallel'
utility would avoid much of this.

And I'd really like to see numutils packaged.

Also: 'unsort': sort -R (a.k.a. --random-sort)

(using GNU coreutils 8.23)

(I'm not familiar with a seed-based randomized sorting utility though.)

~~~
dima55
[https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=597050](https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=597050)

~~~
dredmorbius
I'm familiar with that.

Joey's apparent resistance _to simply splitting out 'parallel' into its own
package_ is ... disappointing. His final comment (regarding other utilities in
upstream and switches) is a mix of non sequiturs and red herrings.

~~~
moe
Welcome to Debian politics.

Where boneheads like Joey get to block trivial fixes for decades (5 years and
counting in this case). The project really needs a better process to terminate
'lame' maintainers.

~~~
ole_tange
Just for completeness:
[https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=518696](https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=518696)

Had parallel been packaged at that time (Mar 2009), it would have pre-empted
the parallel in moreutils (which was added June 2009).

------
falcolas
I find it entertaining that bash (zsh, ksh, fish, etc) itself provides ways to
do what many of these utilities do. The anonymous pipes, named pipes, and
process substitution mechanisms can replace many of these tools.

For example:

    
    
        pee ->
        some_process | tee >(command_one) | tee >(command_two) [...]
        # This one might need a bit more magic with named pipes to consolidate
        # the output without race conditions, since command_N will be executed
        # in parallel. Or take a note from the chronic replacement below and
        # use a temporary file to execute them serially.
    
        chronic ->
        TMPFILE=$(mktemp); some_process > $TMPFILE 2>&1 || cat $TMPFILE; rm $TMPFILE
    
        zrun ->
        command <(gunzip -c somefile)
    

Still, having a utility to abstract away the pipes makes sense.
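The serial variant hinted at above can be spelled out. This is a rough sketch
of the idea, not pee itself; the function name `serial_pee` and the example
commands are mine:

```shell
# Hypothetical serial stand-in for pee: buffer stdin once in a temp
# file, then feed it to each command in turn, so outputs never interleave.
serial_pee() {
    tmp=$(mktemp)
    cat > "$tmp"              # soak up all of stdin first
    for cmd in "$@"; do
        sh -c "$cmd" < "$tmp" # each command gets the full input
    done
    rm -f "$tmp"
}

printf 'one\ntwo\n' | serial_pee 'wc -l' 'head -n 1'
```

Unlike the real pee, the commands run one after another, which is exactly what
avoids the interleaving problem at the cost of parallelism.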

------
pjungwir
These are cool, and I use chronic all the time. But is there any more
documentation beyond this page? I can't find any, and I'd love to read more
about pee and see some examples. It seems there is more documentation for the
rejected utilities than the accepted ones!

~~~
avar
When you clone the Git repository and build the package all the utilities come
with manual pages built from DocBook, e.g. for chronic:

    
    
        $ man ./chronic.1 | col -b | grep -v ^$ | head -n 12
        CHRONIC(1)                                                                                                                              CHRONIC(1)
        NAME
               chronic - runs a command quietly unless it fails
        SYNOPSIS
               chronic COMMAND...
        DESCRIPTION
               chronic runs a command, and arranges for its standard out and standard error to only be displayed if the command fails (exits nonzero or
               crashes).  If the command succeeds, any extraneous output will be hidden.
               A common use for chronic is for running a cron job. Rather than trying to keep the command quiet, and having to deal with mails containing
               accidental output when it succeeds, and not verbose enough output when it fails, you can just run it verbosely always, and use chronic to
               hide the successful output.
                       0 1 * * * chronic backup # instead of backup >/dev/null 2>&1

~~~
pjungwir
Ah, thanks!

More people might use this if the author put those docs online and linked to
them from the main page. I don't know DocBook, but it looks like styling it as
HTML should be easy.

Now I understand pee. I wrote something less generalized here:

[https://github.com/pjungwir/stutter](https://github.com/pjungwir/stutter)

I guess `stutter foo` is equivalent to `pee cat foo`.

------
akkartik
Wait, is _sponge_ just another way to redirect to a file? What's the benefit
of:

    
    
      $ echo hi |sponge y
    

over:

    
    
      $ echo hi > y
    

?

~~~
mraison
From the man page:

> Unlike a shell redirect, sponge soaks up all its input before opening the
> output file. This allows constructing pipelines that read from and write to
> the same file.
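A quick way to see the difference, in case sponge itself isn't installed:
`mysponge` below is a toy stand-in of mine that captures the core trick (the
real sponge also supports appending with -a and is more careful with the
output file):

```shell
# Toy sketch of sponge's core idea: read ALL of stdin before
# touching the output file, so "read and write the same file" works.
mysponge() {
    tmp=$(mktemp)
    cat > "$tmp"   # buffer everything first
    mv "$tmp" "$1" # only now replace the target
}

printf 'b\na\n' > y
sort y | mysponge y   # with a plain "sort y > y", the shell would
                      # truncate y before sort ever read it
cat y                 # prints a then b
```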

~~~
akkartik
Ah! That is indeed useful.

------
boon
If you're good with vim (and particularly with vim macros), `vidir` is
indispensable.

~~~
fafner
GNU Emacs has something like that in Dired. It's called Wdired (writable
dired) and allows editing the Dired buffer and then applies the changes. Think
of it as editing `ls` output.

[https://www.gnu.org/software/emacs/manual/html_node/emacs/Wd...](https://www.gnu.org/software/emacs/manual/html_node/emacs/Wdired.html)

~~~
hk__2
`vidir` uses the EDITOR environment variable, so if you set it to emacs you
can use it with Emacs:

    
    
        EDITOR=emacs vidir .

~~~
boon
AWESOME. I didn't know that, but it makes sense and is very unixy.

------
davexunit
One weird thing about moreutils is that the source releases are only available
via the source package on debian.org.

~~~
fredsted
There's a link to the git repo in OP.

~~~
davexunit
source release != git repo

~~~
jerf
Well, yes, because git repo > source release so it's clearly not equal.

I just checked; the repo tags releases in some reasonably proper manner.
(Personally I prefer some prefix like "release_0.2" to a tag simply named
"0.2", but it does the job.)

~~~
davexunit
>because git repo > source release so it's clearly not equal.

No, they have different uses. I simply want to install the software, so I want
a source release tarball. Source releases include more than what a git repo
provides, such as pre-built configure scripts and Makefiles. A tag in a git
repo is no substitute for a proper release.

~~~
avar
That's true for something that uses e.g. autoconf, but moreutils doesn't build
any makefile or configure script, it's right there in the Git repository. So I
see what your objection is for packages in general, but it doesn't apply in
this case.

~~~
davexunit
We package moreutils in GNU Guix, and it's much preferable to download a
source tarball than to have to clone a git repo, so we download the tarball
from Debian. We clone the git repo when there's no other choice, but it's far
from ideal.

------
robmccoll
Shameless plug but I think it might be useful to others:
[https://github.com/robmccoll/bt](https://github.com/robmccoll/bt)

bt (between)

counts the time between occurrences of the given string on stdin. stdin is
consumed; output will be the times in floating-point seconds, one per line

------
stevekemp
Relatedly you might enjoy this collection of sysadmin-tools:

[https://github.com/skx/sysadmin-util](https://github.com/skx/sysadmin-util)

~~~
xai3luGi
Considered putting these in Debian?

------
jcoffland
I created just such a utility several years ago. It's called rlimit and is
basically a command line interface to the standard getrlimit() and setrlimit()
unix calls. You can find it here
[http://freecode.com/projects/rlimit](http://freecode.com/projects/rlimit).
I'd be happy to move the source to GitHub.

~~~
vezzy-fnord
ulimit(1) is an interface to precisely those *rlimit calls; it's a built-in
in the Bash shell, and presumably others.

~~~
jcoffland
ulimit can read and set limits for the current shell; rlimit sets limits for
a child process. Admittedly, you could do nearly the same thing by setting
limits with ulimit, running the target command, and then resetting the limits
to their former state, or by running a sub-shell, setting the limits there,
and then running your command in that environment. For example:

    
    
        (ulimit -d 1024; <command>)
    

Or you could do it in one normal looking command with rlimit.

    
    
        rlimit -d 1m <command>
    

Plus rlimit can set things like real-time priority which ulimit cannot.
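The sub-shell behavior above is easy to verify. Here -n (max open files) is
used only because lowering it requires no privileges; any resource ulimit
supports works the same way:

```shell
# Limits set inside ( ... ) die with the sub-shell.
(ulimit -n 64; ulimit -n)   # the child sees the lowered limit: prints 64
ulimit -n                   # the parent's limit is untouched
```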

------
hk__2
If you’re on OS X with Homebrew you can install it with `brew install
moreutils`.

------
anon4
Since I see no one has mentioned it, let me pipe in with one more
text-processing tool that is invaluable in our modern world: jq
([https://stedolan.github.io/jq/](https://stedolan.github.io/jq/)), the
command-line JSON processor. It consumes JSON input, and its power lies
somewhere between sed and awk.
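For a small taste (assuming jq is installed, e.g. via `apt-get install jq` or
`brew install jq`; the JSON here is made up):

```shell
# Filter and project JSON the way awk filters and projects columns:
echo '[{"name":"sponge","new":true},{"name":"tee","new":false}]' \
  | jq -r '.[] | select(.new) | .name'
# prints: sponge
```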

~~~
dredmorbius
jq _is_ fucking amazing. I think it counts as my Most Awesome Shell Tool
Discovery of 2014.

