
How do Unix pipes work? - v3gas
https://www.vegardstikbakke.com/how-do-pipes-work-sigpipe/
======
nneonneo
This article promotes bad practices for dealing with SIGPIPE.

1. Closing stderr in Python is not a good idea because that’ll swallow any
other errors that occur at exit. Redirecting stdout to devnull is really just
a way to prevent the flushed output from going to the now-closed stdout and
triggering another SIGPIPE. That’s preferable to closing stderr and losing
error output at exit.

2. Ignoring SIGPIPE is a terrible idea for a process that should do stream
processing. Try making a yes clone and ignoring SIGPIPE - your process will
likely run forever trying to shove “y” into a closed pipe. There’s a reason
SIGPIPE was invented! Very few programs bother to check the return value from
write/printf/etc.
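That failure mode is easy to reproduce. Below is a sketch (not from the thread) of a minimal `yes` clone in Python, with SIGPIPE restored to its default disposition since CPython ignores it at startup; `head -n 3` stands in for any reader that exits early:

```python
import signal
import subprocess
import sys

# A minimal `yes` clone. SIGPIPE is restored to SIG_DFL because CPython
# sets it to SIG_IGN at interpreter startup; with the default in place,
# the kernel kills the writer as soon as it writes into the closed pipe.
child_src = (
    "import signal, sys\n"
    "signal.signal(signal.SIGPIPE, signal.SIG_DFL)\n"
    "while True:\n"
    "    sys.stdout.write('y\\n')\n"
)

writer = subprocess.Popen([sys.executable, "-c", child_src],
                          stdout=subprocess.PIPE)
head = subprocess.Popen(["head", "-n", "3"], stdin=writer.stdout,
                        stdout=subprocess.PIPE)
writer.stdout.close()          # hand the read end over to head
out, _ = head.communicate()
writer.wait()

print(out)                     # b'y\ny\ny\n'
print(writer.returncode)       # negative: killed by SIGPIPE (-13 on Linux)
```

With SIGPIPE ignored instead, the clone would need to check every write for EPIPE or it would spin forever.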

~~~
seneca
Agreed. I like articles like this because they show the learning process,
which I think is super valuable. However, they really need a big disclaimer
stating the author is experimenting and doesn't know the correct answer.
Otherwise people stumble upon it and take it as authoritative.

~~~
v3gas
Great point, I should probably add that disclaimer!

------
cperciva

        If we cat this file, it will be printed to the terminal.
        > cat brothers_karamazov.txt
        ... many lines of text!
        ***FINIS***
        It takes a noticeable amount of time to finish.
    

The amount of time it takes for cat(1) to read and output the file is almost
certainly insignificant. The time the author is noticing is probably related
to how long it takes for his _console_ to process the text.

~~~
happytoexplain
This is the first thing I noticed too.

>how does cat know to stop when head is finished

I'm no expert on Unix, so correct me if I'm wrong, but surely this line of
reasoning is misleading because pipes create a unidirectional data flow, so
`cat` _can not_ know anything about `head`. It does not "stop" - it passes
the whole text along just as it did without the pipe. As you said, the delay
comes in printing to the console, not in the `cat` command.

~~~
kyuudou
This is a great example of Useless Use of cat and why it is bad - the full
text is indeed sent through the pipe just for head to take the first n lines.

I've actually had "developers" go "but, readability". Yea ok.

------
ur-whale
This article only shows basic usage of pipes (that is what they mean by "how
pipes work"), but doesn't explain at all how pipes work (as in: how they are
implemented).

~~~
userbinator
It's implemented as a buffer and some associated state. A process that writes
to the buffer can do so until it is full, at which point the thread is
suspended (blocked on the write() call) until it is not full. The read() side
is similar --- reads return successive data in the buffer unless it is empty,
at which point the read() call will block.
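That buffer is observable from userspace. A sketch (assuming a Linux-like default pipe capacity; the exact size is platform-dependent): put the write end in non-blocking mode so a full pipe returns EAGAIN instead of suspending the writer, then count how many bytes fit.

```python
import fcntl
import os

# A pipe is a kernel buffer plus state: writes succeed until it fills.
# Non-blocking mode turns the "writer suspends" case into BlockingIOError
# (EAGAIN), so we can measure the capacity instead of sleeping.
r, w = os.pipe()
flags = fcntl.fcntl(w, fcntl.F_GETFL)
fcntl.fcntl(w, fcntl.F_SETFL, flags | os.O_NONBLOCK)

total = 0
try:
    while True:
        # 4096 <= PIPE_BUF, so each write is all-or-nothing
        total += os.write(w, b"x" * 4096)
except BlockingIOError:        # buffer full; a blocking write would sleep here
    pass

print(total)                   # the pipe's capacity (65536 on typical Linux)

# Draining data makes the pipe writable again -- this is exactly what
# wakes up a writer that was blocked in write().
os.read(r, 4096)
n = os.write(w, b"y")
print(n)                       # 1
```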

~~~
emmelaich
ur-whale probably knows that

------
no_gravity
I had a nice surprise and learning experience, when I discovered that the
output of

    
    
         (echo red; echo green 1>&2) | echo blue
    

is indeterministic:

[http://www.gibney.de/the_output_of_linux_pipes_can_be_indete...](http://www.gibney.de/the_output_of_linux_pipes_can_be_indeter)

As it turns out, this short line and its behavior nicely demonstrate a bunch
of aspects that happen under the hood when you use a pipe.
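The mechanism behind that indeterminism can be made deterministic for illustration. In the shell line, `echo blue` may exit before the left-hand side writes, closing the read end. This sketch (not from the linked article) forces that ordering with `true` as a reader that exits immediately:

```python
import os
import subprocess

# Start a reader that exits without reading, wait for it, then write
# into the dead pipe. In the shell one-liner this is a race; here the
# wait() makes the "reader already gone" outcome certain.
reader = subprocess.Popen(["true"], stdin=subprocess.PIPE)
reader.wait()                  # reader exited; the read end is closed

got_epipe = False
try:
    os.write(reader.stdin.fileno(), b"red\n")
except BrokenPipeError:        # CPython ignores SIGPIPE, so we see EPIPE
    got_epipe = True

print(got_epipe)               # True
```

Depending on which side of the race a real pipeline lands on, the write either succeeds into the buffer or fails like this - hence the varying output.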

------
pierremenard
See Sections 1.2 & 1.3 of the MIT Unix teaching OS for a great intro to
FDs and pipes: [https://pdos.csail.mit.edu/6.828/2019/xv6/book-riscv-
rev0.pd...](https://pdos.csail.mit.edu/6.828/2019/xv6/book-riscv-rev0.pdf)

------
ryanmccullagh
Here's something that you should remember about using pipes and fork(2) in
Python 3: by default, O_CLOEXEC is passed to the pipe(2) system call by the
CPython runtime (PEP 446 made file descriptors non-inheritable).

This means that the pipe's file descriptors are closed when a forked child
calls exec, so the exec'd child cannot use them. Therefore you should
explicitly clear the close-on-exec flag first. Note it is an fd flag, so it
is changed with F_SETFD/FD_CLOEXEC, not F_SETFL:

    
    
      fcntl.fcntl(readfd, fcntl.F_SETFD, fcntl.fcntl(readfd, fcntl.F_GETFD) & ~fcntl.FD_CLOEXEC)
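The same effect is available through the higher-level API. A sketch, assuming a POSIX system, showing that pipe fds start out non-inheritable and that `os.set_inheritable` lets one end survive exec in a child:

```python
import os
import subprocess
import sys

# Pipe fds are non-inheritable by default (PEP 446, Python 3.4+), so a
# child that exec()s can't see them unless we opt back in.
r, w = os.pipe()
inherit_default = os.get_inheritable(w)   # False by default

os.set_inheritable(w, True)               # clear close-on-exec on the write end
subprocess.run(
    [sys.executable, "-c", f"import os; os.write({w}, b'hi')"],
    close_fds=False,                      # keep inheritable fds open in the child
    check=True,
)
data = os.read(r, 2)

print(inherit_default)   # False
print(data)              # b'hi'
```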

------
kccqzy
My own rule of thumb for whether or not to ignore SIGPIPE is simple:

* If you only deal with file descriptors provided to you (stdin, stdout, stderr) as well as some files that you open (including special files like FIFOs), do not ignore SIGPIPE.

* If you deal with sophisticated file descriptors (socket(2) and pipe(2) count as sophisticated), you'd better ignore SIGPIPE, but also make sure to check for EPIPE in every single write.

In my view, SIGPIPE is a kludge so that programs that are too lazy to check
for errors from write(2) (and fwrite(3) and related friends) will not waste
resources. But if you are dealing with sophisticated file descriptors, there
is a lot more happening than just open/read/write and a lot more error cases
you must handle, and at that point the incremental cost of handling EPIPE
isn't a significant addition.
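In practice the second rule looks something like this sketch: ignore SIGPIPE (CPython already does), then treat EPIPE from each write as an ordinary "reader went away" event. Here `head -c 4` stands in for any peer that stops reading:

```python
import errno
import os
import signal
import subprocess

# Ignore the signal and handle the error per write instead.
signal.signal(signal.SIGPIPE, signal.SIG_IGN)   # CPython's default anyway

# This reader consumes a few bytes and exits, closing the read end.
reader = subprocess.Popen(["head", "-c", "4"], stdin=subprocess.PIPE,
                          stdout=subprocess.DEVNULL)
fd = reader.stdin.fileno()

sent = 0
while True:
    try:
        sent += os.write(fd, b"data")
    except OSError as e:
        if e.errno == errno.EPIPE:   # the reader is gone; stop writing
            break
        raise
reader.wait()
print(sent >= 4)   # True: some writes landed before the pipe died
```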

------
RoutinePlayer
My favorite sentence from Brian Kernighan's latest book "UNIX: A History and
a Memoir": Pipes are the quintessential Unix invention, an elegant and
efficient way to use temporary connections of programs ... so I'll read this
article :-)

------
ilammy
Another point where you have to ignore SIGPIPE is concurrent code that handles
multiple fds (say, like a web server). In this case you _have to_ ignore the
signal and process EPIPE correctly, because the signal is not associated with
a particular fd so you cannot tell which one of them failed.
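A sketch of that situation (child processes standing in for network clients): broadcast to several pipes, and let the per-write EPIPE identify exactly which descriptor died, which the signal alone could never tell you.

```python
import os
import signal
import subprocess

# SIGPIPE carries no fd, so a server with many connections must ignore
# it and rely on per-write errors to learn which client disappeared.
signal.signal(signal.SIGPIPE, signal.SIG_IGN)   # CPython's default anyway

# Three "clients": one exits at once, two keep reading.
clients = [subprocess.Popen(cmd, stdin=subprocess.PIPE,
                            stdout=subprocess.DEVNULL)
           for cmd in (["true"], ["cat"], ["cat"])]
clients[0].wait()              # this client is gone; its pipe is dead

alive = []
for c in clients:
    try:
        os.write(c.stdin.fileno(), b"broadcast\n")
        alive.append(c)        # write succeeded: client still there
    except BrokenPipeError:    # EPIPE names exactly the fd that failed
        pass

print(len(alive))              # 2

for c in alive:                # clean shutdown of the survivors
    c.stdin.close()
    c.wait()
```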

