
Top Unix Command Line Utilities - coldgrnd
http://blog.coldflake.com/posts/2012-12-30-Top-10-Unix-Command-Line-Utilities-2012.html
======
VMG
These obviously aren't related to 2012 at all.

Some issues:

- don't forget that _/dev/random_ blocks

- it's easier to use dd_rescue to track progress than to signal dd

- using _dd_ to zero out a hard drive repeatedly doesn't increase
security[1]; using ATA secure erase does[2]

- an alternative for summing file sizes is

    shopt -s globstar   # bash 4+: make ** match recursively
    du -ch **/*.png

[1] <http://en.wikipedia.org/wiki/Data_erasure#Number_of_overwrites_needed>

[2] <https://ata.wiki.kernel.org/index.php/ATA_Secure_Erase>
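
For reference, a sketch of the secure-erase sequence from [2] (this destroys
all data; /dev/sdX is a placeholder and the drive must not be in the "frozen"
state):

    # set a temporary user password, then issue the ATA secure erase
    hdparm --user-master u --security-set-pass p /dev/sdX
    hdparm --user-master u --security-erase p /dev/sdX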

~~~
Breakthrough
I think the only "secure" way to erase the contents of a hard drive is to
repeatedly overwrite the disk surface with a mix of random/patterned data
(like Darik's Boot & Nuke does).

Also, for those wondering about the blocking of _/dev/random_: it will
restrict the number of bits you can copy using _dd_, but this won't be
apparent unless you attempt to copy more bits than the entropy pool has
available for random number generation. For more information, see this
question on Super User:
<http://superuser.com/questions/520601/why-does-dd-only-copy-128-bytes-from-dev-random-when-i-request-more>

~~~
dsr_
Secure erase depends on the technology of the drive and the prowess of your
attacker.

If you fear the NSA seizing your disks, consider the tradeoffs of explosive
disposal.

If you fear a technically-savvy reporter going through your trash bin,
overwriting your disk three or four times with patterns will be fine. But it's
probably faster to take a drill and make a couple of holes. Make sure you hit
the platters.

If you are selling your old hardware and just don't want your unencrypted
stuff to be recovered by a sixteen-year-old with no budget but lots of time,
overwrite the disk once.

If you're trashing an SSD, make sure any patterns you use for overwriting are
not compressed out of existence by the controller. Or pull off the controller
and crunch it.

~~~
rgbrenner
The whole "overwrite xx times" is a myth. Overwriting it once with random data
makes it completely unreadable. Overwriting it more than that is a complete
waste of time.

<http://computer-forensics.sans.org/blog/2009/01/15/overwriting-hard-drive-data/>

<http://www.howtogeek.com/115573/htg-explains-why-you-only-have-to-wipe-a-disk-once-to-erase-it/>

------
p4bl0
I'm a heavy command line user (I don't have a graphical file explorer/manager,
for instance). Here is my top 42, in order of usage:

    ls, cd, git, ssh, make, e, cat, veille, rm,
    wpa_supplicant, grep, evince, mv, x, dhclient,
    cp, echo, todo, mplayer, scp, man, mkdir, ack,
    pdflatex, apt-get, apt-cache, sed, less, feh,
    racket, gcc, wget, xrandr, bg, svn, pmount,
    for, gpg, halt, ping, tail, top.

"e" is an alias for emacsclient ; "veille" is a script which toggles between
"xset s 5" and "xset s default" ; "x" is an alias for "xinit" ; "todo" is a
script which manage a text file which I use as a todo list.

"ls", "cd", and "git" are far more used than any other commands : 14808, 13256
and 10078 times respectively, against 3919 times for "ssh" which is just
behind.

I obtained these data from my .bash_history. Here are the ranks of the
commands listed in the article:

    "tr" is 76th
    "sort" is 66th
    "uniq" is 120th
    "split" appears only once, like many other commands, so its rank is not relevant
    substitution operations are what most of my "for" loops do, so "for" is in my top 42 (but see (1) below about the article's example)
    file sizes are a mix of "ls" for a single file and "du" for multiple files; "du" is 65th
    "df" is 63rd
    "dd" is 473rd
    "zip" is 123rd (and funnily "gzip" is 122nd)
    I didn't use "hexdump"

(1) About the following line:

    for i in *.mp4; do ffmpeg -i "$i" "${i/.mp4}.mp3"; done

I have two remarks. First, using ffmpeg's "-vn" flag would speed up the
conversion by making ffmpeg ignore the video stream entirely. Second,
substituting with '/' in the Bash expansion is not the right way to do that:
"${i/.mp4}" is "$i" without the _first_ occurrence of ".mp4", wherever it
appears. It is '%', which strips a suffix, that you want here.

~~~
fqsxr
I wonder how large your HISTFILESIZE is, to get such accurate statistics?

~~~
p4bl0
My .bash_history weighs 1.7 MB (almost 100k lines). I set a very high HISTSIZE
since I don't see any reason to lose this data.
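
For anyone wanting the same, a minimal .bashrc sketch (the exact limits here
are arbitrary):

    # keep a large history, and append to the file instead of overwriting it
    HISTSIZE=100000
    HISTFILESIZE=200000
    shopt -s histappend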

~~~
dfox
One reason might be that bash reads this whole file into memory on interactive
startup and rewrites it completely when shutting down.

~~~
a3_nm
I second this; I think it is good practice to rotate the history file manually
when it reaches several MBs. (Using zsh, the main symptoms of a large history
file are sluggishness when using ^R and a lag of a few tenths of a second when
closing a terminal.)

~~~
p4bl0
For now I don't notice any annoyance: the shell starts instantaneously and
closes instantaneously as well. If I start to feel a slowdown I might rotate
the log, but I guess that's not happening soon.

------
visarga
My 2012 top, in order of usage:

joe, ls, cd, time, cdbdump, tail, more, cat, rm, grep, wc, apachectl restart,
find, curl, chmod, history, mv, locate, cpan, apt-get, pwd

But the most useful one is a command line Perl utility I called "flt" that
executes a block of Perl code for each line of stdin.

    cat file.txt | flt ' $line =~ s|\s+| |gsi; print $line."\n"; '

That collapses runs of whitespace into single spaces.

    find . | flt ' if (-f $line) { print((-s $line)."\n"); } '

This would print the size of all files in the current folder and subfolders.

So it works like awk, but with full Perl: no need to learn awk syntax. You can
do conditionals, loops and whatnot. I write 30% of my one-time throwaway
scripts directly on the command line.
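
For the curious, a minimal sketch of what such a wrapper could look like (the
real flt is my own tool; this merely mimics the behavior described above):

    #!/usr/bin/env perl
    # flt: run a block of Perl code for every line of stdin,
    # exposing the current (chomped) line as $line
    my $code = shift @ARGV or die "usage: flt 'perl code'\n";
    while (my $line = <STDIN>) {
        chomp $line;
        eval $code;
        die $@ if $@;
    }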

~~~
EvilTerran
Are you familiar with perl's -n and -p switches?

<http://perldoc.perl.org/perlrun.html#*-n*>

Those "flt" lines could be written

    
    
      perl -lpe 's|\s+| |gsi' file.txt
      find . | perl -ne 'if (-f) { print -s }'
    

(-l chomps the incoming newlines, and puts them back on the output)

Of course, "perl -ne" is longer than "flt", and I appreciate all this implicit
use of $_ is not to everyone's tastes.

~~~
visarga
Yes, but I bundled slightly better syntactic sugar in my tool. It's the same,
in essence.

------
cldwalker
A more interesting set of command-line utilities to know:
<http://www.cyberciti.biz/open-source/best-terminal-applications-for-linux-unix-macosx/>

~~~
IamBren
This list is definitely more interesting to me. I discovered a few of these
already this year and have been using them a lot (mtr, pv, curl for inspecting
headers) and several others that I know I'm going to start messing with
immediately (siege, multitail).

Another VERY useful tool I didn't see on this list is iperf. From the Debian
package description:

Iperf is a modern alternative for measuring TCP and UDP bandwidth performance,
allowing the tuning of various parameters and characteristics.

Features:

* Measure bandwidth, packet loss, delay jitter

* Report MSS/MTU size and observed read sizes.

* Support for TCP window size via socket buffers.

* Multi-threaded. Client and server can have multiple simultaneous connections.

* Client can create UDP streams of specified bandwidth.

* Multicast and IPv6 capable.

* Options can be specified with K (kilo-) and M (mega-) suffices.

* Can run for specified time, rather than a set amount of data to transfer.

* Picks the best units for the size of data being reported.

* Server handles multiple connections.

* Print periodic, intermediate bandwidth, jitter, and loss reports at specified intervals.

* Server can be run as a daemon.

* Use representative streams to test out how link layer compression affects your achievable bandwidth.

I use iperf initially when I'm troubleshooting poor file server transfer
speeds, for example. There's a pretty Java GUI too if you want that.
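
A minimal sketch of that first troubleshooting step (the hostname is a
placeholder; these are iperf 2 flags):

    # on the file server
    iperf -s

    # on the client: 30-second TCP throughput test against the server
    iperf -c fileserver.example.com -t 30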

------
minhajuddin
Not really a bash utility, but delegating to this script from Vim has been a
huge timesaver for long-running commands. For instance, I can trigger my Ruby
specs from Vim without Vim getting blocked. I wrote a blog post about it here:
<http://minhajuddin.com/2012/12/25/run-specs-tests-from-within-vim-without-blocking-your-flow>

    #!/bin/bash
    # ~/.scripts/runinbg
    # Author: Khaja Minhajuddin
    # Run a command in the background, redirecting its
    # STDOUT and STDERR to /tmp/runinbg.log

    echo "$(date +%Y-%m-%d:%H:%M:%S): started running $@" >> /tmp/runinbg.log
    rawcmd="$*"   # full command string, used by the notification variant below
    cmd="$1"
    shift
    $cmd "$@" 1>> /tmp/runinbg.log 2>&1 &
    # Comment out the above line and use the line below to get a notification
    # when the command completes:
    # ($cmd "$@" 1>> /tmp/runinbg.log 2>&1; notify-send --urgency=low -i "$([ $? = 0 ] && echo terminal || echo error)" "$rawcmd") &>/dev/null &

------
meaty
So basically the same as in 1985?

~~~
etrain
Basically. And that's a good thing.

------
FuzzyDunlop
You can also use Ctrl+T (on BSDs, including OS X, where it sends SIGINFO) to
get the status of a running command, instead of faffing about with pids and
kill.

~~~
helper
That doesn't work for me (on Linux), but sending SIGUSR1 to the dd process
does.
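
A sketch of that, assuming a single dd is running (GNU dd prints its I/O
statistics to stderr on SIGUSR1):

    kill -USR1 $(pgrep -x dd)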

~~~
kps
Linux still hasn't picked up 'stty kerninfo'.

------
stiff
"for i in *.mp3" doesn't work with spaces and some other "special" characters
in filenames, see here:

[http://unix.stackexchange.com/questions/9496/looping-
through...](http://unix.stackexchange.com/questions/9496/looping-through-
files-with-spaces-in-the-names)

~~~
decklin
The for loop works fine. Performing word splitting/wildcard expansion/etc on a
variable will, well, split it into words/expand wildcards/etc, whether it was
introduced by a for loop or not.

~~~
stiff
I am not sure I understand what you mean. Word-splitting is performed by the
"for x in y" construct, using the IFS variable, so by default, if you have a
file called "foo bar.mp4", the command line from the article:

    for i in *.mp4; do ffmpeg -i "$i" "${i%.mp4}.mp3"; done

will result in executing:

    ffmpeg -i foo foo.mp3
    ffmpeg -i bar.mp4 bar.mp3

Which is obviously not what was meant. So it's a good habit to learn to loop
over files in a directory in a different way.

~~~
decklin
Maybe you are confusing this with what happens when you do

    for i in `find -name '*.mp4'`; do # ...

or similar. In that case, the output of `find` is indeed split first, and
`for` sees "foo", "bar.mp4", and so on.

~~~
stiff
Ah yes, you are right, thanks!

------
tomaac
Combining sort and uniq is useless, since sort already has a -u flag.

~~~
etrain

    sort | uniq -c | sort -n

is something I use all the time to get sorted frequency tables.
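
For example, to tally HTTP status codes (assuming an Apache-style access.log
where the status is field 9):

    awk '{print $9}' access.log | sort | uniq -c | sort -rn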

~~~
etrain
And, by the way, what these two commands do could be done in O(n) (counting
with a hash table) rather than O(n*log(n)) - but this little pipeline is so
damn easy to write that on relatively thin inputs of less than 20M lines, I
usually just do this.
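
A sketch of the O(n) counting step in awk (input.txt is a placeholder; only
the distinct keys get sorted at the end):

    awk '{count[$0]++} END {for (k in count) print count[k], k}' input.txt | sort -n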

------
akurilin
Does anybody know of great guides for mastering the majority of the useful CLI
tools available on Unix and Linux? For me the challenge is that I don't even
know that some of these fantastic solutions are readily available. I'd really
benefit from knowing they're there in the first place, instead of hacking up a
homebrew solution every time.

Learn Linux the Hard Way talks about them quite a bit; is that the go-to guide
in 2012, or can I do better?

~~~
sea6ear
<http://www.commandlinefu.com/commands/browse> is a good resource to at least
browse through for inspiration.

I also learned a lot from "The Linux Cookbook" (Second Edition) by Michael
Stutz (this might be the first edition online:
<http://dsl.org/cookbook/cookbook_toc.html>).

------
jamescun
Got a feeling they ran out of utilities to mention by the end. Even very new
users of *nix will likely have heard of and used `find` and `zip`.

~~~
lucb1e
Probably dd and df too. But he's just listing the commands he found most
useful, not only the esoteric ones.

------
Zash
xxd is a pretty nice hexdump substitute that has a reverse mode of operation
(-r), turning a hexdump back into binary.
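
A quick sketch of the round trip:

    xxd file.bin > dump.hex      # hex dump with an ASCII column
    xxd -r dump.hex > file.bin   # reverse: rebuild the binary from the dump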

~~~
shabble
From what I recall, xxd is actually part of the vim distribution, so it's
common but not universal. An approximate standard (coreutils) alternative is
`od -x` (although it doesn't include the readable ASCII column on the right,
which can be annoying).
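
(GNU od can get close, though: a `z` suffix on the output type appends a
printable-character column:)

    od -A x -t x1z file.bin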

------
niggler
Where's the love for awk? It has been tucked away in a sub-item - doesn't it
deserve first-class status?

~~~
imglorp
That whole "find -ls | awk" is wicked slow anyway; try wc and xargs...

    
    
      $ time find -ls | awk '{s += $7} END {print s}'
      15970582120
    
      real	0m27.721s
      user	0m1.256s
      sys	0m1.780s
    
      $ time find | xargs wc -c 2> /dev/null | tail -1
      604260969 total
    
      real	0m0.332s
      user	0m0.068s
      sys	0m0.204s

~~~
aidos
You sure that's not just a cold cache? Once warmed up, the awk command is much
faster for me.

Also, the results are different - though I'm too lazy to figure out why right
now :)

~~~
cynwoody
You need to filter out directory entries.

    find -type f -ls | awk '{s += $7} END {print s}'
    find -type f -print0 | xargs -0 wc -c | tail -1
    find -type f -exec wc -c {} + | tail -1

------
4ad
The second #6 example, the one with xargs, is wrong: xargs(1) doesn't
necessarily create a single du process, it might create several.

~~~
coldgrnd
I was wondering about that, but is it really wrong? From what I found in the
"BSD General Commands Manual":

 _Any arguments specified on the command line are given to utility upon each
invocation, followed by some number of the arguments read from the standard
input of xargs. The utility is repeatedly executed until standard input is
exhausted._

and

 _-P maxprocs Parallel mode: run at most maxprocs invocations of utility at
once._

The way I interpret that is that you _could_ run xargs in parallel mode, but
by default the utility is "repeatedly executed" one invocation at a time.

~~~
cben
Even in sequential mode, for a sufficiently long input xargs will invoke the
command multiple times to comply with kernel limits on command line length.
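
A sketch of the resulting failure mode for du: capping the arguments per
invocation makes the split visible, since each du invocation prints its own
grand total:

    find . -name '*.png' -print0 | xargs -0 -n 2 du -ch | grep total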

------
andrewcooke

    cat ~/.bash_history | cut -f1 -d' ' | sort | uniq -c | sort -rn | head -10 | cut -b9-

~~~
p4bl0
That's not enough: you also need to split lines on "|" and ';', and on for's
"do" and if's "then".

~~~
andrewcooke
sure, and search through your file system to find all scripts written this
year, to include those too....

give me a break.

~~~
p4bl0
Hey, calm down, I didn't mean to be aggressive; sorry if my comment came
across that way.

You can count scripts as commands (I did in my other comment elsewhere on this
page), but the way you do it you will miss a lot. For instance you won't count
"uniq", "sort", … which are almost exclusively used as filters rather than as
the first command of a pipeline; you will also miss a lot of "less" and "grep"
invocations.


