
Stop Piping Cats - helwr
http://www.ibm.com/developerworks/aix/library/au-badunixhabits.html?ca=lnxw01GoodUnixHabits
======
jrockway
This is my least-favorite Internet meme. People "pipe cats" because they want
the entire pipeline to read left-to-right, like:

    
    
       cat file | xargs foo | grep bar | sort | wc -l
    

It just _looks nicer_ than:

    
    
       < file xargs foo | grep bar ...

~~~
blasdel
Speak for yourself, and I don't know anyone that puts the input redirection
first. I do this:

    
    
      (xargs foo | grep bar | sort | wc -l) < file
    

It makes the pipeline one command, both lexically (verb comes first) and
concretely (the pipeline is kicked off in a subshell)
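
For anyone who wants to check that the three spellings in this subthread really do the same job, here is a minimal sketch, assuming bash. The sample data is made up, and the hypothetical `xargs foo` stage is left out so the script actually runs.

```shell
#!/usr/bin/env bash
# Build a small sample file (made-up data, not from the thread).
sample=$(mktemp)
printf 'bar 1\nfoo\nbar 2\nbar 1\n' > "$sample"

# 1. cat at the head of the pipeline.
with_cat=$(cat "$sample" | grep bar | sort | wc -l)

# 2. Input redirection written first.
redirect_first=$(< "$sample" grep bar | sort | wc -l)

# 3. Whole pipeline grouped in a subshell, redirection last.
subshell_last=$( (grep bar | sort | wc -l) < "$sample")

# All three count the same matching lines.
echo "$with_cat $redirect_first $subshell_last"
rm -f "$sample"
```

The difference between the three is only in how many processes get started and where the file name sits on the line, not in the result.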

~~~
pyre
But does the time it takes to launch the subshell equal the amount of time it
takes to launch cat? If so your improvement is a NOOP.

~~~
blasdel
fork + 3x(fork+exec) is going to be cheaper than 4x(fork+exec), especially if
_cat_ isn't resident.

The point is a logical improvement anyway (not burying the input argument near
the beginning). I'm kind of surprised that the bash folks haven't turned _cat_
into a builtin like they did with _time_ and some of the other coreutils.
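
A quick way to see what the shell handles itself, as a sketch assuming bash (`type -t` is a bash feature). In bash, `time` is a reserved word rather than a builtin, `true` is a builtin, and `cat` is normally an external file.

```shell
#!/usr/bin/env bash
# Ask bash how it would resolve each name.
time_kind=$(type -t time)   # reserved word in bash
true_kind=$(type -t true)   # shell builtin
cat_kind=$(type -t cat)     # normally an external program found on PATH

printf 'time: %s\ntrue: %s\ncat: %s\n' "$time_kind" "$true_kind" "$cat_kind"
```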

It's too bad it's about 30 years too late to stem the tide of shit like _cat
-v_ : <http://harmful.cat-v.org/cat-v/>

~~~
pyre
By the reasoning in your link, your suggestion of bash making 'cat' one of the
built-in commands is the "cancer that's bloating UNIX."

~~~
blasdel
The shell has always been the kitchen sink full of glue holding the whole
thing together. There have always been builtins: language control structures,
job control, etc. -- there are some things you can't trust others not to fuck
up, and where the coupling would just get ridiculous.

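Here's cat written in pure sh, using only builtins:

    
    
      shcat() {
        for arg in "$@"; do
          exec 3<"$arg"                      # open read-only on fd 3
          while IFS= read -r line <&3; do    # keep whitespace and backslashes intact
            printf '%s\n' "$line"
          done
          exec 3<&-                          # close fd 3
        done
      }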
    

The shell by its very nature can't just do exactly one task well; it's a
programmable environment for living in. The _cancer that's bloating UNIX_ was
the way that the BSD and especially the GNU crews took simple tools and cross-
pollinated them randomly with stupid shit. Try running "/bin/true --help" on a
GNU system sometime -- there's a damn good reason why "your shell may have its
own version of true".

~~~
pyre

      % /bin/true --help
      Usage: /bin/true [ignored command line arguments]
        or:  /bin/true OPTION
      Exit with a status code indicating success.
    
            --help     display this help and exit
            --version  output version information and exit
    
      NOTE: your shell may have its own version of true, which usually supersedes
      the version described here.  Please refer to your shell's documentation
      for details about the options it supports.
    
      Report bugs to <bug-coreutils@gnu.org>.
    

I wouldn't necessarily call that 'bloated' unless you feel that any program
that uses one bit more than _absolutely_ necessary should be scrapped as
'bloated beyond belief.'

> _there's a damn good reason why "your shell may have its own version of
> true"._

Because why exactly?

------
Goladus
I use cat because often I wind up doing a number of different commands on the
same file. It's a lot easier to edit the end of the line, especially if you're
just adding another filter, than to go back and modify the beginning.

e.g.

    
    
         cat file | less
         cat file | grep thing
         cat file | grep otherthing
         cat file | grep otherthing | cut stuff
    

instead of

    
    
         less file
         grep thing file
         grep otherthing file
         grep otherthing file | cut stuff

~~~
swolchok
Sounds like you might want to learn about !$ (last token of previous command).
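
`!$` is a history expansion, so it is mainly useful at an interactive prompt. A related trick that also works inside scripts is bash's `$_` parameter, which holds the last argument of the previous command. This is a sketch assuming bash, and it swaps in `$_` for `!$`; the file and its contents are made up.

```shell
#!/usr/bin/env bash
# Made-up sample file standing in for "file" in the examples above.
log=$(mktemp)
printf 'thing\notherthing\nunrelated\n' > "$log"

grep -c thing "$log"     # the last argument of this command is the file name
last_arg=$_              # capture it right away, before running anything else

# Reuse the file name without retyping it.
matches=$(grep -c otherthing "$last_arg")
echo "$matches"
rm -f "$log"
```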

~~~
Goladus
Yeah that's interesting, but the second version is still more complicated to
assemble and takes more keystrokes, assuming up arrow gives the entire
previous command.

~~~
swolchok
Let's count for your example. In all cases, I'll exclude the actual name of
the file. We type in the first command:

    
    
         cat file | less
         less file
    

The second version is 5 fewer keystrokes. On to the next command:

    
    
         cat file | grep thing
         grep thing !$
    

The second example is the same or fewer keystrokes. In both cases, you have to
type "grep thing". In the first, you have to press the up arrow and backspace
over "less" (at least three keystrokes), and in the second, you have to type
an extra " !$".

I'll skip moving to "cat file | grep otherthing" or "grep otherthing !$", and
consider the change to get to

    
    
         cat file | grep otherthing | cut stuff
         grep otherthing file | cut stuff
    

In both cases, you have to type " | cut stuff". If you key in the second
example as "!! | cut stuff", that's an extra two keystrokes. If you key in the
first as up arrow + "| cut stuff", that's only 1 extra keystroke.

In total, my version saves keypresses in the specific example and doesn't seem
much worse in general.

------
aarongough
The most useful thing I took away from this was not 'piping cats' but instead
the interesting syntax for creating multiple directories in one go:

    
    
      ~ $ mkdir -p tmp/a/b/c
      ~ $ mkdir -p project/{lib/ext,bin,src,doc/{html,info,pdf},demo/stat/a}

~~~
jerf
It should be pointed out, since the article does not, that that is merely one
example of shell expansion that can be used anywhere. It is not a mkdir
feature.

    
    
        $ echo project/{lib/ext,bin,src,doc/{html,info,pdf},demo/stat/a}
        project/lib/ext project/bin project/src project/doc/html 
        project/doc/info project/doc/pdf project/demo/stat/a
    

(I added a linefeed to prevent wrapping.)
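
One way to convince yourself the shell is doing the work is to quote the braces, which stops the expansion. A minimal sketch, assuming bash (brace expansion is not part of POSIX sh):

```shell
#!/usr/bin/env bash
# Unquoted: bash expands the braces before printf ever runs.
expanded=$(printf '%s\n' doc/{html,info,pdf})

# Quoted: the braces reach printf untouched, and printf does nothing with them.
literal=$(printf '%s\n' 'doc/{html,info,pdf}')

printf '%s\n' "$expanded" "$literal"
```

The command only ever sees the final argument list, which is why the same trick works with mkdir, echo, or anything else.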

~~~
danudey
My favourite use is when installing packages using package managers,
especially when I need the dev packages.

Old fink example:

    
    
       sudo fink install lib{png,jpeg,ssl,whatever}{,-{dev,shlibs}}
    

which expands to:

    
    
       sudo fink install libpng libpng-dev libpng-shlibs libjpeg libjpeg-dev libjpeg-shlibs libssl libssl-dev libssl-shlibs libwhatever libwhatever-dev libwhatever-shlibs
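
If you want to preview a list like that before handing it to a package manager, you can expand it into an array first. A sketch assuming bash, using a shortened version of the pattern above:

```shell
#!/usr/bin/env bash
# {,-{dev,shlibs}} has an empty first alternative, so each library name
# appears once bare and once with each suffix.
packages=(lib{png,jpeg}{,-{dev,shlibs}})

printf '%s\n' "${packages[@]}"
echo "${#packages[@]} names"
```

Two library names times three variants gives six package names, in the same left-to-right order as the expansion shown above.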

~~~
jrockway
Brilliant.

I am ashamed to admit that I usually say "apt-get install libfoo.*", wait for
the downloads to start, hit Control-c, and then cut-n-paste the package names
I actually want onto the command-line.

It sounds bad when I type it out, but it's really not the most horrible thing
ever. But your way is definitely like 83x better.

~~~
yason
If you don't know the exact package names, Ubuntu happily auto-completes
apt-get and aptitude on package names. So I would just type "aptitude install
libfoo" and hit tab a couple of times to see what libfoos I can install.

~~~
jedbrown
Or `M-*` to expand all the completions and then, if necessary, delete the ones
you didn't want. Beats typing the extensions.

~~~
yason
This is golden, I didn't know that.

A perfect example of the countless features that can be found in the bash
manual page. I've been using bash since the early '90s and I've written lots
of non-trivial programs in bash, and I know a lot that many other people
don't, and yet I had somehow managed to miss this pearl.

------
pretz
That 0.005 seconds I saved by not piping cat will significantly increase my
productivity! No more wasting time!

------
jerf
Oh come on, you can't mention that without mentioning the Useless Use of Cat
Award: <http://partmaps.org/era/unix/award.html>

------
tyrmored
Great practical advice. I still pipe cats though because it's just more
intuitive to me for setting up complex pipes.

~~~
protomyth
especially if you're using a test file instead of the program you might
eventually use

------
ynniv
You also can't follow a file with grep, but you can with tail. Doing so
requires the grep option "--line-buffered", which sacrifices some performance,
but not much compared to actually viewing the log data.

    
    
      tail -f access.log | grep --line-buffered "GET /blog/post "

~~~
randallsquared
I do this a lot, and I've never used "--line-buffered", nor apparently needed
it. Google shows me a lot of people using that as a solution to a problem (not
getting results immediately) I've never seen. Weird.

~~~
silentbicycle
It's more likely if you have grep and several other operations piped together
- the normal buffering leads to grep only printing every time it gets a full
block of data, typically several lines.
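
A rough way to see the effect, sketched under some assumptions: bash, a grep that supports `--line-buffered` (GNU and BSD grep do), and a hypothetical `slow_log` function standing in for `tail -f access.log`. It measures how long a stage after grep waits for the first matching line.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for `tail -f`: emit one line, then keep the pipe open for 2 s.
slow_log() { printf 'GET /blog/post 200\n'; sleep 2; }

# Print the number of seconds until a stage downstream of grep sees the first line.
# Any arguments are passed through to grep as extra options.
first_line_delay() {
  local start seen
  start=$(date +%s)
  seen=$(slow_log | grep "$@" 'GET /blog/post ' | { IFS= read -r line; date +%s; })
  echo $(( seen - start ))
}

delay_line_buffered=$(first_line_delay --line-buffered)
delay_default=$(first_line_delay)   # typically about 2 with GNU grep, since output is block-buffered

echo "line-buffered: ${delay_line_buffered}s, default: ${delay_default}s"
```

When grep writes straight to a terminal its output is usually line-buffered already, which would explain never needing the flag for a plain `tail -f file | grep pattern`.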

------
almost
Why? The extra 3 letters of "cat" are usually going to take less time to type
than the mental overhead of "where do I put the input file for this command"
or the additional pipeline reasoning. I actually sometimes do it one way and
sometimes the other, whatever comes to my fingers first (which is going to
vary according to how I'm visualizing the pipeline of tasks in my head).

In general I far prefer the (it seems to me) more Unixy way of having lots of
very simple commands and chaining them together over the use of extra options
and arguments.

------
chuhnk
There is some really great info on that site. I've been a sysadmin for 3
years, and seeing the performance comparison is going to force me to change my
scripting habits.

~~~
mhansen
Really? You're changing your scripting habits based on gaining milliseconds?

~~~
chuhnk
You have to define usage here. They are giving an example of a single piped
command on a file of who knows what size, probably not very big, hence
milliseconds. However, when it comes to scripting large log processing,
backups, and other forms of automation, I think it will yield better
performance. Not to mention good coding practices: I am looking to improve in
any form, and something like this changes the way you think, which will prove
very useful when programming in Ruby, C, Java, etc. Not necessarily the use of
piping commands, but just thinking about how to be more efficient and less
wasteful.

------
Zarkonnen
<http://everything2.com/title/I+am+forced+to+smoke+my+cat> - sorry!

