
Moreutils – Unix tools that nobody thought to write (2012) - pmoriarty
https://joeyh.name/code/moreutils/
======
jfreax
Nice collection of quite useful tools. Some of these, like mispipe, can be
easily replicated using a more modern shell (bash, zsh); others are just
shortcuts (e.g. ifne, chronic).

But what immediately stood out to me is `vidir`. I really like the idea of
editing file names with an editor. Using loops and regex in a shell for mass
renaming can be a mess. It should be way easier with `vim`. This tool made me
install moreutils.
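
For comparison, here's the loop-and-suffix dance that vidir saves you from (a scratch-directory sketch; with vidir you'd instead open the directory listing in $EDITOR and do a `:%s/\.jpeg$/.jpg/`):

```shell
# rename every *.jpeg to *.jpg by hand, in a throwaway directory
cd "$(mktemp -d)"
touch a.jpeg b.jpeg
for f in *.jpeg; do mv -- "$f" "${f%.jpeg}.jpg"; done
ls -1   # lists a.jpg and b.jpg
```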

~~~
ainar-g
On most POSIX systems you can use fc(1) to edit a command in your $EDITOR. In
vi(1) and friends you can then use "%!ls" to replace the contents of the
buffer with a directory listing and edit the commands you want.

[https://pubs.opengroup.org/onlinepubs/9699919799/utilities/f...](https://pubs.opengroup.org/onlinepubs/9699919799/utilities/fc.html)

[https://pubs.opengroup.org/onlinepubs/9699919799/utilities/v...](https://pubs.opengroup.org/onlinepubs/9699919799/utilities/vi.html)

~~~
RMPR
> you can then use "%!ls" to replace the contents of the buffer with directory
> listing and edit the commands you want.

I often used :r !ls for that, thanks for the tip + it's shorter

~~~
gpanders
Note that in general this is not a one-to-one replacement. `%!cmd` sends the
current buffer contents to `cmd`’s stdin and replaces the buffer with `cmd`’s
output. In the case of ‘ls’ this works since it doesn’t take anything on stdin.
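
The filter behaviour is easy to picture in shell terms: for a buffer holding three unsorted lines, `:%!sort` amounts to this pipeline:

```shell
# the buffer's lines go to the command's stdin; its stdout becomes the buffer
printf 'c\na\nb\n' | sort   # prints a, b, c on separate lines
```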

------
stevekemp
Inspired by this project I put together sysadmin-tools:

[https://github.com/skx/sysadmin-util](https://github.com/skx/sysadmin-util)

Later they were replaced by a busy-box style collection of utilities written
in golang (mostly to ease installation):

[https://github.com/skx/sysbox](https://github.com/skx/sysbox)

------
rozab
There's a few programs I wish were standard on all Unix systems by now:

* tree - print directory structure

* bat - cat, but actually designed for reading files. Syntax highlighting, line numbers, automatic paging.

* rg - better grep

* direnv - local environment variables

~~~
beojan
> rg

Never going to happen because it's written in Rust. ag
([https://github.com/ggreer/the_silver_searcher](https://github.com/ggreer/the_silver_searcher))
on the other hand is written in C.

~~~
nindalf
Never say never :)

If Rust being added to the Linux kernel[1] isn't far-fetched, I don't think
adding a utility in Rust is crazy either.

[1] -
[https://lore.kernel.org/lkml/CAKwvOdmuYc8rW_H4aQG4DsJzho=F+d...](https://lore.kernel.org/lkml/CAKwvOdmuYc8rW_H4aQG4DsJzho=F+djd68fp7mzmBp3-wY--Uw@mail.gmail.com/T/#u)

~~~
chalst
Interesting.

Isn't it the case that LLVM doesn't support all architectures that Linux does?

~~~
saagarjha
I think this is just for kernel modules that you can choose to compile out.

~~~
geofft
That's what I expected too (as one of the people on the thread / one of the
authors of the farthest-along set of bindings for modules), but Linus and Greg
KH both seem fairly inclined for it to be on by default and used for core code
(small bits of core code, in the beginning) and not simply modules. I'm not
sure how that's going to play out with architecture support, and that's one of
the things I want to get out of the Plumbers session this upcoming week.

[https://github.com/fishinabarrel/linux-kernel-module-rust/is...](https://github.com/fishinabarrel/linux-kernel-module-rust/issues/112)
is a chart of Linux architectures and whether Rust and LLVM support them.

~~~
saagarjha
Oh, wow, that would be huge. Not just architecture support, but even having a
toolchain on some platforms might be problematic as I can imagine making rustc
a requirement to build Linux to be fairly controversial in and of itself. And
that architecture list is not looking very good…would be very interested to
know what they decide on as well!

------
parentheses
I'm still stuck on the fact that one of the utilities is called `pee`..

In fact the docs even say:

> make sponge buffer to stdout if no file is given, and use it to buffer the
> data from pee.

Translation: pee into a sponge!

------
krick
I seriously hate this package, and in general the manner of combining
different utils with different names in the same package. The reason being,
"sponge" is an actually useful tool and for me it's pretty much the only
useful tool in the package. So I need to install the whole moreutils package
on ubuntu to have "sponge" and I have to clog my bin namespace with all this
trash. That would be mildly annoying from the perfectionist point of view, but
ok. But there is also a "parallel" command in this package, which is trash as
well. Meanwhile GNU "parallel" is not trash, but an actually useful program,
maybe more useful than "sponge", and it comes in a separate package. So I
either have to do some tinkering, or I have to choose which one I need more:
"sponge" or "parallel", and install only one package. It's the year 2020. This
is stupid.

This is by the way the reason why GraphicsMagick is better than ImageMagick (I
still use the latter, though, because it doesn't cause the same problems for
me as moreutils does, and it's just more popular than GM).

~~~
appleflaxen
you raise a great point. why not install the package, then delete the /usr/bin
files you don't need?

would it solve the problem if the author namespaced the commands with a
hyphenated prefix? curious about these considerations, which it seems like
you've spent time thinking about. what do you see as the "best practices"
regarding a set of utilities that are maintained and published together?

~~~
krick
In the most general sense, this is not really a problem of moreutils, but of
how tools are installed/distributed in Linux distributions. But since we have
to be realistic, yes, package authors should take such problems into
consideration.

I think if tools are absolutely unrelated, they should just be distributed as
separate packages. GNU coreutils is tolerated mostly because it's so
ubiquitous (so much so that it causes Stallman to grumble about "you mean
GNU/Linux, not Linux"). moreutils is late to the party, it isn't ubiquitous,
and the usefulness of any tool in the package is questionable, so the author
really shouldn't be so bold as to assume that because he thought he needed all
of them, everyone will.

If there is reason enough to distribute a package with several separate
callable binaries (as with ImageMagick), I think git or GraphicsMagick are
perfect examples of how it should be done. A hyphenated prefix is also ok.
Even if your tool is supposed to be used 50 times a day, too long a name isn't
really a problem, since the user can always make an alias (as I do with most
of the tools I use frequently).

Sure you can always combine binaries from 2 packages manually, but as I said,
it just requires some tinkering, so I cannot simply have in some textfile a
list of utils to install on a new PC in a matter of minutes.

~~~
boomboomsubban
Stick the binary wherever your textfile is, copy it to path on setup.

------
rmetzler
I wish there was a short and simple command for “list all files of a directory
and no subdirectories“. ls doesn’t seem to have a switch for that kind of
functionality.

It’s discussed on stack-overflow. [0]

Someone even wrote a nodejs tool for this functionality [1], but I would
rather have something written in a compiled language.

[0]: [https://stackoverflow.com/questions/10574794/how-to-list-onl...](https://stackoverflow.com/questions/10574794/how-to-list-only-files-and-not-directories-of-a-directory-bash)

[1]: [https://github.com/mklement0/fls](https://github.com/mklement0/fls)

~~~
MaxBarraclough
But the top answer has it:

> find . -maxdepth 1 -type f

A little clunky, but there's certainly no need to mess about in JavaScript
land. If you use it often, create a shell macro.
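
For instance, a small function along these lines (the name `lsf` is made up):

```shell
# list only the regular files directly inside a directory (default: .)
# note: -maxdepth is not POSIX, but both GNU and BSD find support it
lsf() { find "${1:-.}" -maxdepth 1 -type f; }
```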

~~~
rmetzler
find doesn't return the same output as ls when run on another directory. Maybe
there is a switch to do this? I'm not sure.

    
    
        $ mkdir -p tmp && touch tmp/a tmp/b tmp/c
        
        $ find ./tmp -maxdepth 1 -type f
        ./tmp/a
        ./tmp/c
        ./tmp/b
        
        $ ls -1 ./tmp 
        a
        b
        c

~~~
fooblat
I think this is what you are looking for

    
    
        $ find ./tmp -maxdepth 1 -type f -printf '%f\n'

~~~
rmetzler
I'm on MacOS... :-/

    
    
       $ find ./tmp -maxdepth 1 -type f -printf '%f\n'
       find: -printf: unknown primary or operator

~~~
fooblat
If you use brew or macports you can install the coreutils package and then
you're good to go with gnu find.

I also use MacOS and I have setup the gnu userland on my local so it matches
the linux environment on our servers and containers.
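
For this particular case there is also a portable route that avoids the GNU-only `-printf` (a sketch; `-maxdepth` itself is non-POSIX but supported by both GNU and BSD find):

```shell
# strip the leading directory with basename instead of -printf '%f\n'
mkdir -p tmp && touch tmp/a tmp/b tmp/c
find ./tmp -maxdepth 1 -type f -exec basename {} \;
```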

~~~
gpanders
You’re right that defaulting to the GNU coreutils is probably more convenient.
However, NOT doing this is a good way to ensure that any scripts you write
remain portable and don’t use GNU extensions. That’s the reason I stick with
the default BSD coreutils on macOS.

~~~
fooblat
> You’re right that defaulting to the GNU coreutils is probably more
> convenient.

It's not about convenience. It is about having a development environment that
matches the target runtime environment.

You can run into nasty surprises when the default behavior of a tool in your
dev environment is different than in your production environment.

I've been writing shell scripts professionally for over 20 years and I have
always taken this approach and it has served me well.

~~~
saagarjha
You could also try restricting yourself to POSIX shell.

------
ORioN63
I use vipe frequently, to let me pipe from/into vim. This enables fun
shortcuts such as:

    
    
        xclip -o | vipe | xclip
    

This one in particular lets you edit your clipboard with $EDITOR.

------
dang
If curious see also

2016:
[https://news.ycombinator.com/item?id=12023277](https://news.ycombinator.com/item?id=12023277)

2015:
[https://news.ycombinator.com/item?id=9013570](https://news.ycombinator.com/item?id=9013570)

2015 (1 comment):
[https://news.ycombinator.com/item?id=9004302](https://news.ycombinator.com/item?id=9004302)

------
skrause
I use `chronic` in almost all of my cron jobs so that cron sends me an email
with the output only if the command has a return code of != 0.
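
A typical crontab entry for this pattern (path hypothetical; chronic swallows all output unless the command exits nonzero, at which point everything is printed and cron mails it):

```
# m h dom mon dow   command
0 3 * * *   chronic /usr/local/bin/backup.sh
```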

------
thomashabets2
ts looks like it's a subset of
[https://github.com/ThomasHabets/ind](https://github.com/ThomasHabets/ind)

sponge can be replaced with dd

Other than that, yeah nice ideas. And very much in the unix spirit of one tool
to do one thing.

~~~
pepve
sponge buffers the whole input before writing the output. Its utility would be
in reading from a file, working on it with other tools, and writing the result
back to the original file, all in one line.

~~~
kyran_adept
I know I've wanted to do this, but I am very happy such a utility doesn't
exist. Unix utilities usually work in a pipeline fashion: read line -> process
-> write to output. This allows them to allocate a limited amount of memory.
If you read all input into memory, you are asking for trouble: 'cat /dev/zero
| sponge a'.

~~~
zeroimpl
I routinely operate on machines with gigabytes of memory, but rarely write
pipelines which output gigabytes of data, so this has never been a concern for
me. But even then, there’s still swap space which is effectively like using a
temp file but lazier.

By the way, I think you’d be in even more trouble if you wrote:

    
    
        cat /dev/zero > a

------
JdeBP
Yes, they weren't written "when Unix was young", but people _had_ thought to
write some of them over a decade before this toolset. lckdo and ts came long
after setlock and cyclog (also the precursor to multilog) from the 1990s.

* [https://cr.yp.to/daemontools/setlock.html](https://cr.yp.to/daemontools/setlock.html)

* [http://cr.yp.to/daemontools/multilog.html](http://cr.yp.to/daemontools/multilog.html)

* [http://cr.yp.to/daemontools/upgrade.html](http://cr.yp.to/daemontools/upgrade.html)

* [http://jdebp.uk./Softwares/nosh/guide/commands/cyclog.xml#CO...](http://jdebp.uk./Softwares/nosh/guide/commands/cyclog.xml#COMPATIBILITY)

------
tarruda
The sponge example:

    
    
        sed "s/root/toor/" /etc/passwd | grep -v joey | sponge /etc/passwd
    

I think it can be rewritten as:

    
    
        sed -i '/joey/d; s/root/toor/' /etc/passwd

~~~
ISO-morphism
Yes, but -i on sed is specific to GNU, I don't think it exists on
BSD/OSX/busybox

~~~
lhoursquentin
-i is definitely not specified by POSIX, but it is supported on all those platforms with some small differences; for instance on OSX the backup extension (-i.bak) is not optional.

~~~
gpanders
Amazingly, there is no portable way to use -i that works on both GNU and BSD
sed implementations. Which means if you’re writing a portable script, you
can’t use -i at all.

(Would love to be proved wrong on this)

~~~
bewuethr
According to Stack Overflow [1], this works:

    
    
        sed -i.bak 's/foo/bar/' filename
    

[1]:
[https://stackoverflow.com/a/22084103/3266847](https://stackoverflow.com/a/22084103/3266847)

~~~
gpanders
That works when specifying a backup extension, but not if you don’t want to
create a backup file.

    
    
        sed -i '' ...
    

works on BSD sed but not GNU. Meanwhile:

    
    
        sed -i'' ...
        sed -i ...
    

both work on GNU but not BSD.

~~~
bewuethr
Well, yes, you can't use it without creating a backup file, but it uses "-i"
and is portable.

------
blauditore
What is the difference between `sponge` and redirecting with `>` (or `>>`)? Is
it about better compatibility with pipes or something?

~~~
JoelMcCracken
Well, one thing is that if you’ve tried it you’ll run into the problem he
illustrates there: you are writing to the same file you are reading from,
which won’t work right.

My guess is sponge buffers all the input and then sends it to output once
stdin is closed.

------
aasasd
While we're on the topic and on the wave of modern riffs on classic tools,
personally I'm pining for a remake of xargs—because I never can whack it into
submission with anything more complex than `xargs rm`. Specifically, passing
multiple arguments from the input to the called command apparently just can't
be properly done, at least not on OSX.
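
The closest I've come is grouping arguments through an inline sh, which works on both GNU and BSD xargs but hardly feels proper:

```shell
# -n 2 groups the input two-at-a-time; "_" fills $0 so items land in $1 and $2
printf 'a\nb\nc\nd\n' | xargs -n 2 sh -c 'echo "$1-$2"' _
# prints: a-b
#         c-d
```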

~~~
nick0garvey
parallel may be what you are looking for:
[https://www.gnu.org/software/bash/manual/html_node/GNU-Paral...](https://www.gnu.org/software/bash/manual/html_node/GNU-Parallel.html)

The defaults are a lot saner, and it's easier to pass arguments how you want.

~~~
aasasd
Hmm, I guess I considered Unix utils rather narrowly specialized, so
`parallel` meant ‘my CPU got too many free cores’ for me. I'll take a closer
look at it, thanks.

------
timonoko
Where is the MS-DOS utility "ncd"? It was like "cd" but guessed from a few
characters' hint where you wanted to go. If the choice was not immediately
obvious it offered a menu or a tree. I shortened it to "n". -- There was some
linux utility but setting it up was annoyingly tedious, with lots of useless
and cryptic options.

~~~
andrewshadura
cd cannot be an external command in Unix-like systems, it has to be a shell
built-in.

~~~
oweiler
But cd can be invoked from a shell function (which doesn't create a subshell).
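
A sketch (function name and target are arbitrary):

```shell
# defined as a function, cd runs in the current shell, so the change persists
goto() { cd "${1:-$HOME}" || return; }
goto /tmp && pwd
```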

------
donatj
A lot of these seem easily achieved in a POSIX shell manner, but most
annoyingly to me

> sponge: soak up standard input and write to a file

I do that all the time...

cat > file.txt

Update: reading the comments on here, apparently sponge sucks up all content
before opening the destination file which allows editing an input file in
place. Minor advantage there.

~~~
jpab
The purpose of sponge is that you can do this:

grep foo file.txt | sponge file.txt

If you do this with redirections then file.txt will be truncated before it's
been processed, leaving you with an empty file instead of what you wanted.
Sponge collects its input first and then writes everything out at the end, so
you can output to a file that was used as an input.
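
The hazard is easy to reproduce in a scratch directory:

```shell
cd "$(mktemp -d)"
printf 'foo\nbar\n' > file.txt
grep foo file.txt > file.txt || true   # shell truncates file.txt before grep reads it
wc -c < file.txt                       # 0 bytes: the data is gone
```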

(Parent updated while I was writing. Oh well)

~~~
tarruda
That doesn't seem like an efficient way to do it.

Commands that process file in place (sed -i) write to a temporary file in the
same filesystem and then rename to the target file, which works if you want to
process files that don't fit into memory.
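
The write-to-temp-then-rename pattern described above, sketched by hand (filenames are arbitrary):

```shell
cd "$(mktemp -d)"
printf 'keep\ndrop\n' > data.txt
tmp=$(mktemp data.txt.XXXXXX)    # temp file in the same directory/filesystem
grep -v drop data.txt > "$tmp" && mv "$tmp" data.txt
cat data.txt                     # keep
```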

~~~
jpab
I apologise, I was speaking loosely. I don't know if sponge collects input in
memory or in a file. (Frankly I've never needed to care since the files I've
ever needed this tool for have all been small)

------
aserafini
I wrote a util a bit like this when I realised there was no command for
emptying a directory:
[https://github.com/adamserafini/emptydir](https://github.com/adamserafini/emptydir)

~~~
jfreax
What's the difference between just deleting the whole directory and recreating
it with `mkdir`? Or just doing an `rm -rf dir/*`?

~~~
ggrrhh_ta
Hidden directories, links within dir... (usually, you want a "nofollow"
default for destructive operations)

------
iforgotpassword
So this is where errno comes from. I discovered it by accident on my system a few
years ago and have been using it ever since. Really handy when working in C.
Still no idea why I have that package installed though.

On another note

> pee: tee standard input to pipes

Goddammit guys!

------
vagab0nd
In case you are like me and didn't get the sponge example: the difference from
a shell redirect is that it doesn't truncate the input, so it allows in-place
modification of the input/output file.

------
siraben
Is it possible to make some of these moreutil commands aliases of some
combination of coreutils commands?

------
zzo38computer
Some of these I do not find so useful and stuff, although I do use ts, and
sometimes sponge.

------
forgotmypw17
It's funny that `vidir` spawns `nano` by default on my system.

~~~
jdub
That'd be the Debian alternatives system in action.

$ update-alternatives --list editor

~~~
JdeBP
Or sensible-editor.

* [http://jdebp.uk./FGA/unix-editors-and-pagers.html](http://jdebp.uk./FGA/unix-editors-and-pagers.html)

------
thekaleb
ts seems to fit the Unix philosophy of do one thing and do one thing well
quite nicely. It works pretty well for writing to logs.

------
cel1ne
> sponge

Why not use > file?

> mispipe

In bash there's PIPESTATUS for that.
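
A bash-only sketch of the PIPESTATUS approach (the array holds the exit status of every stage of the last pipeline, left to right):

```shell
# PIPESTATUS is bash-specific, so go through bash explicitly
bash -c 'false | true; echo "${PIPESTATUS[@]}"'   # prints: 1 0
```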

~~~
charliesome
Re sponge: the shell will open the output file for writing before invoking the
command so in the example joey provides, /etc/passwd will be an empty file by
the time sed opens it.

~~~
prakashdanish
That makes sense, but in the example provided, why can't we use sed with the
in-place flag?

~~~
gpanders
Because the -i flag works differently on GNU sed vs non-GNU sed and there is
no way to use it portably.

------
pkphilip
Can I use pee into a sponge?

------
rurban
With the annoying half-way implementation of GNU parallel: it does not support
the basic features, is not better nor faster, but still tramples over this
namespace.

I really like many of the moreutils tools, but dealing with constant parallel
breakage (kind of a Hadoop for dummies) is annoying.

