
The Mighty Named Pipe - vsbuffalo
http://vincebuffalo.com/2013/08/08/the-mighty-named-pipe.html
======
aidos
Nice article. Really easy to follow introduction.

I only discovered process substitution a few months ago but it's already
become a frequently used tool in my kit.

One thing that I find a little annoying about unix commands sometimes is how
hard it can be to google for them. '<()', nope, "command as file argument to
other command unix," nope. The first couple of times I tried to use it, I knew
it existed but struggled to find any documentation. "Damnit, I know it's
something like that, how does it work again?..."

Unless you know to look for "Process Substitution" it can be hard to find
information on these things. And that's once you even know these things
exist....

Anyone know a good resource I should be using when I find myself in a
situation like that?

~~~
Kiro
A bit OT but I don't understand why Google doesn't supply a way to do strict
searches where everything you input is interpreted literally.

~~~
reitanqild
They have. I complained loudly about this[1] and never heard anything back
(which is SOP, I understand), but I have seen improvements in the last year.

Double quotes around part of a query mean: make sure this part is actually
matched in the index. (I think they still annoy me by including sites that are
linked to using this phrase[2], but that is understandable.)

Then there is the "verbatim" setting that you can activate under search tools
> "All results" dropdown.

[1]: And the reason they annoyed me was that they would still fuzz my
queries despite me double-quoting and choosing verbatim.

[2]: To verify this you could open the cached version and on top of the page
you'd see something along the lines of: "the following terms exist only in
links pointing to this page."

~~~
tokenizerrr
They still ignore any special characters:
[https://www.google.com/search?q=%22%3C()%22&tbs=li:1](https://www.google.com/search?q=%22%3C\(\)%22&tbs=li:1)

------
unhammer
Once you discover <() it's hard not to (ab)use it everywhere :-)

    
    
        # avoid temporary files when some program needs two inputs:
        join -e0 -o0,1.1,2.1 -a1 -a2 -j2 -t$'\t' \
          <(sort -k2,2 -t$'\t' freq/forms.${lang}) \
          <(sort -k2,2 -t$'\t' freq/lms.${lang})
        
        # gawk doesn't care if it's given a regular file or the output fd of some process:
        gawk -v dict=<(munge_dict) -f compound_translate.awk <in.txt
        
        # prepend a header:
        cat <(echo -e "${word}\t% ${lang}\tsum" | tr '[:lower:]' '[:upper:]') \
            <(coverage ${lang})

~~~
repsilat
> # gawk doesn't care if it's given a regular file or the output fd of some
> process:

Something wonderful I found out the other day: Bash executes scripts as it
parses them, so you can do all kinds of awful things. For starters,

    
    
        bash <(yes echo hello)
    

will have bash execute an infinite script that looks like

    
    
        echo hello
        echo hello
        echo hello
        ...
    

without trying to load the whole thing first.

After that, you can move onto having a script append to itself and whatever
other dreadful things you can think of.
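A minimal sketch of the self-appending variant (the file name t.sh is made up for the demo); because bash reads the script lazily, the line the script appends to its own file gets parsed and executed after the append happens:

```shell
# Generate a script that appends a new command to its own file while
# it is running; bash picks up the appended line and executes it.
cat > t.sh <<'EOF'
echo first
echo 'echo appended' >> "$0"
EOF

bash t.sh    # prints "first", then "appended"
rm -f t.sh
```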

~~~
unhammer
That's actually one of the things that I really dislike with bash, that it
doesn't read the whole script before executing it. I've been bitten by it
before, when I write some long-running script, then e.g. write a comment at
the top of it as it's running, and then when bash looks for the next command,
it's shifted a bit and I get (at best) a syntax error and have to re-run :-(

~~~
LukeShu
There are several ways to get Bash to read the whole thing before executing.

My preferred method is to write a main() function, and call main "$@" at the
very end of the script.

Another trick, useful for shorter scripts, is to wrap the body of the script
in {}, which turns the script into one giant compound command that is parsed
in full before any of it is executed, instead of a list of commands executed
as they are read.
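A sketch of the main() pattern (the function names are arbitrary): defining functions executes nothing, so by the time the last line runs, the whole file has been parsed and editing it mid-run can no longer shift what bash reads next.

```shell
#!/usr/bin/env bash

main() {
    greet "$@"
}

greet() {
    # plain function definition; nothing runs yet
    echo "hello, ${1:-world}"
}

# The only top-level command: by now the entire file is parsed.
main "$@"
```

The {} variant works the same way: wrap everything after the shebang in a single { ...; } block and bash must parse the whole compound command before running any of it.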

~~~
unhammer
Ah, thank you for that. I may just start using these tricks in all my scripts
:-)

------
larsf
Pipes are probably the original instantiation of dataflow processing (dating
back to the 1960s). I gave a tech talk on some of the frameworks:
[https://www.youtube.com/watch?v=3oaelUXh7sE](https://www.youtube.com/watch?v=3oaelUXh7sE)

And my company creates a cool dataflow platform -
[https://composableanalytics.com](https://composableanalytics.com)

~~~
noselasd
[http://doc.cat-v.org/unix/pipes/](http://doc.cat-v.org/unix/pipes/). And
there's a bit more about how pipes came to be in Unix here:
[http://cm.bell-labs.com/who/dmr/hist.html](http://cm.bell-labs.com/who/dmr/hist.html)

------
Malarkey73
Vince Buffalo is author of the best book on bioinformatics: Bioinformatics
Data Skills (O'Reilly). It's worth a read for learning unix/bash style data
science of any flavour.

Or even if you think you know unix/bash and data there are new and unexpected
snippets every few pages that surprise you.

------
dbbolton
In zsh, =(cmd) will create a temporary file, <(cmd) will create a named pipe,
and $(cmd) runs the command in a subshell and substitutes its output. There
are also fancy options that use MULTIOS. For example:
For example:

    
    
        paste <(cut -f1 file1) <(cut -f3 file2) | tee >(process1) >(process2) >/dev/null
    

can be re-written as:

    
    
        paste <(cut -f1 file1) <(cut -f3 file2) > >(process1) > >(process2)
    

[http://zsh.sourceforge.net/Doc/Release/Expansion.html#Process-Substitution](http://zsh.sourceforge.net/Doc/Release/Expansion.html#Process-Substitution)

[http://zsh.sourceforge.net/Doc/Release/Redirection.html#Redi...](http://zsh.sourceforge.net/Doc/Release/Redirection.html#Redirection)

------
amelius
If you like pipes, then you will love lazy evaluation. It is unfortunate,
though, that Unix doesn't support that (operations can block when "writing"
only, not when "nobody is reading").

~~~
falcolas
If nobody is reading, you will eventually fill the pipe buffer (which is about
4k), and the writing will stop. It's a bigger queue than most of us would
expect when compared to generator expressions, but it can and does create back
pressure while making reads efficient.

~~~
noselasd
*about 4k

64k on linux these days.
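A quick way to check the figure on your own machine (a sketch assuming Linux and GNU dd; 65536 is the default and can differ per system): non-blocking writes into a pipe that nobody drains fail once the kernel buffer fills, and dd's summary reports how many bytes fit.

```shell
fifo=$(mktemp -u)
mkfifo "$fifo"
exec 3<>"$fifo"    # hold a read end open so dd's open() doesn't block
# Write 1k blocks without blocking until the pipe buffer is full;
# dd then stops with EAGAIN and its summary shows the bytes that fit.
dd if=/dev/zero of="$fifo" bs=1k oflag=nonblock 2>&1 | grep 'bytes'
exec 3<&-
rm -f "$fifo"
```

Since Linux 2.6.35 a program can also grow (or shrink) a pipe's buffer with fcntl(fd, F_SETPIPE_SZ, size), up to the limit in /proc/sys/fs/pipe-max-size.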

------
baschism
AFAIK process substitution is a bash-ism (not part of the POSIX spec for
/bin/sh). I recently had to go with the slightly less wieldy named pipes in a
dash environment, putting the pipe setup, command execution, and teardown in a
script.
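For reference, a minimal sketch of that dance in plain POSIX sh (the input file and its contents are invented for the demo):

```shell
#!/bin/sh
# Stand-in for bash's: consumer <(sort demo-input.txt)
printf 'b\na\nc\n' > demo-input.txt

fifo=$(mktemp -u)
mkfifo "$fifo" || exit 1
trap 'rm -f "$fifo" demo-input.txt' EXIT   # teardown on exit

sort demo-input.txt > "$fifo" &   # producer writes into the named pipe
head -n 1 "$fifo"                 # consumer reads it like a file: prints "a"
wait
```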

------
mhax
I've used *nix for ~15 years and never used a named pipe or process
substitution before. Great to know about!

~~~
a3n
Named pipes have been rare for me, but simple process substitution is every
day.

Very often I do something like this in quick succession. Command line editing
makes this trivial.

    
    
      $ find . -name "*blarg*.cpp"
      # Some output that looks like what I'm looking for.
      
      # Run the same find again in a process, and grep for something.
      $ grep -i "blooey" $(find . -name "*blarg*.cpp")
      
      # Yep, those are the files I'm looking for, so dig in.
      # Note the additional -l in grep, and the nested processes.
      $ vim $(grep -il "blooey" $(find . -name "*blarg*.cpp"))

~~~
icebraining
That's actually command substitution, not process substitution :)
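For anyone else mixing the two up, a side-by-side sketch: $( ) splices the command's output text into the line, while <( ) splices in a file name connected to the command's output.

```shell
# Command substitution: the output text itself lands on the command line.
echo "got: $(echo hi)"    # prints: got: hi

# Process substitution: what lands on the command line is a file name
# (e.g. /dev/fd/63) whose contents are the command's output.
cat <(echo hi)            # prints: hi
ls -l <(echo hi)          # shows the /dev/fd entry itself
```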

~~~
a3n
Thanks for the correction. The unix is large and I'm so very small.

------
anateus
In fish shell the canonical example is this:

    
    
       diff (sort a.txt|psub) (sort b.txt|psub)
    

The psub command performs the process substitution.

~~~
frankerz
It seems like fish shell's ">" process substitution equivalent is not working
as well as bash's, though:

[https://github.com/fish-shell/fish-shell/issues/1786](https://github.com/fish-shell/fish-shell/issues/1786)

------
AndrewSB
Does anyone have a working link to Gary Bernhardt's The Unix Chainsaw, as
mentioned in the article?

~~~
agumonkey
That's the kind of video I might have downloaded. At least I hope so. Gonna
check my backups.

update 1 : found it, time to upload.

~~~
AndrewSB
Thank you kind sir. Waiting for a link

~~~
agumonkey
[http://goo.gl/X5jo83](http://goo.gl/X5jo83)

------
frankerz
How does the > process substitution differ from simply piping the output with
| ?

For example (from Wikipedia):

    
    
        tee >(wc -l >&2) < bigfile | gzip > bigfile.gz
    

vs:

    
    
        tee < bigfile | wc -l | gzip > bigfile.gz

~~~
unhammer
Say that you have a program that splits its output into two files, each given
by command line arguments. A normal run would be

    
    
        <input.txt munge-data-and-split -o1 out1.txt -o2 out2.txt
    

but since the output is huge and your disk is old and dying, you want to run
xz on it before saving it to disk, so use >():

    
    
        <input.txt munge-data-and-split -o1 >(xz - > out1.txt) -o2 >(xz - > out2.txt)
    

If you want to do several things in there, I recommend defining a function for
clarity:

    
    
        pp () { sort -k2,3 -t$'\t' | xz - ; }
        <input.txt munge-data-and-split -o1 >(pp > out1.txt) -o2 >(pp > out2.txt)

------
chuckcode
Anybody know of a way to increase the buffer size of pipes? I've seen cases
where piping a really fast program into a slow one made both go slower: the
OS pauses the first program's writes whenever the pipe buffer is full, which
seemed to ruin the first program's caching, even though pipes are normally
faster since you're not touching disk.

~~~
jquast
Both mbuffer and pv by default contain fairly large in-memory buffers for pipe
data, and accept parameters for particularly large buffers.

[http://www.maier-komor.de/mbuffer.html](http://www.maier-komor.de/mbuffer.html)
[http://www.ivarch.com/programs/pv.shtml](http://www.ivarch.com/programs/pv.shtml)

~~~
chuckcode
Thanks. I was hoping there was a built-in solution, but a buffer program
makes sense.

------
jamesrom
Is this guy a bioinformatician? I think he's a bioinformatician.

Can't be sure if he is a bioinformatician because he never really mentions
that he is a bioinformatician.

~~~
pmags
Seems entirely appropriate, given that his blog post, others like it on his
site, and the book he wrote are clearly aimed at people interested in
learning bioinformatics.

------
leni536
moreutils [1] has some really cool programs for pipe handling.

pee: tee standard input to pipes
sponge: soak up standard input and write to a file
ts: timestamp standard input
vipe: insert a text editor into a pipe

[1] [https://joeyh.name/code/moreutils/](https://joeyh.name/code/moreutils/)

------
hitlin37
I heard somewhere that Go follows Unix pipe-like interfaces.

------
Dewie
Pipes are very cool and useful, but it's hard for me to understand this common
_worship_ of something like that. Yes, it's useful and elegant, but is it
really the best thing since Jesus Christ?

~~~
Dewie
Wow. I guess that's what I get for not being totally enamoured of Unix.

~~~
JustSomeNobody
No, that's not why you were down voted. You were down voted because you were
condescending to the people who enjoy working with *nix.

~~~
Dewie
It really is a question I've been asking for a long time. I didn't just say
that to piss people off. I guess that's the risk you run of coming across
badly when you insert yourself into a conversation where the other
participants have already agreed on a set of shared opinions - this is
_great_ - and you try to question that common assumption/opinion.

I have honestly been questioning my own understanding of _pipe_ , since I've
failed to see the significance before; first I thought it was just `a | b` as
in "first do a, then b". So then it just seemed like a notational way of
composing programs. Then I thought, uh, ok say what? Composing things is the
oldest trick in the conceptual book. But then I read more about it and saw
that it had this underlying wiring of standard input and output and forking
processes that made me realize that I had underestimated it. So, given that, I
was wondering if there is even more that I've been missing.

I have for that matter read about named pipes before and tried it out a bit.
It's definitely a cool concept.

~~~
tedunangst
Perhaps try questioning shared opinions without using the word worship?

~~~
Dewie
I'll have to admit that that sounded unnecessarily judgy. :)

