How do Unix pipes work? (vegardstikbakke.com)
269 points by v3gas on March 21, 2020 | 61 comments



This article promotes bad practices for dealing with SIGPIPE.

1. Closing stderr in Python is not a good idea, because that will swallow any other errors that occur at exit. Redirecting stdout to devnull is really just a way to prevent the flushed output from going to the now-closed stdout and triggering another SIGPIPE. That is preferable to closing stderr and losing error output at exit.

2. Ignoring SIGPIPE is a terrible idea for a process that should do stream processing. Try making a yes clone that ignores SIGPIPE and never checks write errors: your process will likely run forever trying to shove “y” into a closed pipe (see the sketch below). There’s a reason SIGPIPE was invented! Very few programs bother to check the return value from write/printf/etc.
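
To make point 2 concrete, here is a minimal sketch of that failure mode (a hypothetical yes clone, not code from the article):

    import os
    import signal

    # Ignore SIGPIPE, as warned against above (CPython already does this at startup).
    signal.signal(signal.SIGPIPE, signal.SIG_IGN)

    while True:
        try:
            os.write(1, b"y\n")
        except OSError:
            pass  # swallowing EPIPE as well: once the reader exits, this spins forever

Pipe it into head -1 and it keeps burning CPU long after head is gone; delete the try/except and the unhandled BrokenPipeError terminates it as expected.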


Agreed. I like articles like this because they show the learning process, which I think is super valuable. However, they really need a big disclaimer stating the author is experimenting and doesn't know the correct answer. Otherwise people stumble upon it and take it as authoritative.


Great point, I should probably add that disclaimer!


Agreed. Correct handling of SIGPIPE is quite subtle and often done incorrectly. I've some notes on SIGPIPE considerations at:

http://www.pixelbeat.org/programming/sigpipe_handling.html


> 1. Closing stderr in Python is not a good idea, because that will swallow any other errors that occur at exit. Redirecting stdout to devnull is really just a way to prevent the flushed output from going to the now-closed stdout and triggering another SIGPIPE. That is preferable to closing stderr and losing error output at exit.

I don't follow. The standard for EPIPE/SIGPIPE handling is to silently exit with an error status. It's fine to close stderr to prevent spurious warning messages about flushing stdout.

> 2. Ignoring SIGPIPE is a terrible idea for a process that should do stream processing. Try making a yes clone that ignores SIGPIPE and never checks write errors: your process will likely run forever trying to shove “y” into a closed pipe. There’s a reason SIGPIPE was invented! Very few programs bother to check the return value from write/printf/etc.

Programs can correctly handle lost pipes with SIGPIPE masked entirely, using error checking alone. Python's BrokenPipeError is raised on the basis of EPIPE, not SIGPIPE.

Re: programs not checking error returns of write() and close(): that is not really true in a language like Python, where exceptions are raised on I/O errors. It always does the check, and the unwinder aborts the program if nothing handles the error. SIGPIPE is completely unnecessary for Python programs. (It's also not necessary for C programs, but I guess AT&T didn't want to fix their programs to check for errors.)
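
A minimal illustration of that point (a sketch; since CPython ignores SIGPIPE at startup, no signal is involved here):

    import errno
    import os

    r, w = os.pipe()
    os.close(r)  # no reader left: the pipe is now broken

    try:
        os.write(w, b"y\n")
    except BrokenPipeError as e:
        print(e.errno == errno.EPIPE)  # True: the error comes from write(), not from a signal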


The suppression of SIGPIPE was done in the Go code, not in the Python code.

Does the POSIX standard mandate that programs receiving EPIPE/SIGPIPE die silently? I don’t know of such a rule, and there are plenty of programs that violate it. Python is a bit too verbose with the errors (a full traceback and two copies of the error), so suppressing those errors somehow seems like a good idea for a general-purpose command-line tool.


> Does the POSIX standard mandate that programs receiving EPIPE/SIGPIPE die silently?

https://pubs.opengroup.org/onlinepubs/7908799/xsh/signal.h.h...

Yes. The default action of SIGPIPE is to terminate the process.


The standard just specifies (a) that SIGPIPE exists, (b) when the OS raises it, and (c) what the default action is.

It does not specify that programs that mask or block or handle SIGPIPE/EPIPE must be silent when they do so.


> The suppression of SIGPIPE was done in the Go code, not in the Python code.

What? Your earlier comment I was responding to mentioned only Python, not Go. And Python suppresses SIGPIPE:

https://github.com/python/cpython/blob/master/Python/pylifec... ,

https://github.com/python/cpython/blob/master/Doc/library/si...

> Does the POSIX standard mandate that programs receiving EPIPE/SIGPIPE die silently?

Not to my knowledge.


Python "knows what it's doing" in the sense that it guarantees that it checks return values from write() and has an exception mechanism to ensure that errors aren't accidentally ignored. Therefore, they can get away with ignoring SIGPIPE because they have the appropriate error checking in place.

If you check the linked article, you'll see that it discusses both Python and Go, and mentions suppressing SIGPIPE in the Go code. My original comment did not mention Python for point #2; it was more of a general admonition. (And, like all Internet advice - there are always exceptions if you know what you're doing - it was more of a way to head off newbies who might see signal(SIGPIPE, SIG_IGN) and think it's a good idea!)


Useful to know: the exception is printed to stderr, not stdout. So if you continue the pipe after head, you still get the expected results.

cat doesn't die 100% silently either: check $PIPESTATUS after piping cat into a shorter head and you will see that its exit code is non-zero.


The biggest reason I can think of not to close stderr (or any other of the standard handles) is that the next open(2) call is likely to get that same descriptor recycled. So now some random open file is going to receive all the error messages from random libraries, or perhaps even from unrelated programs you may have forked with that file as fd 2.


The biggest takeaway here IMO is that Python breaks the standard contract regarding signals–at least for SIGPIPE. Python should not be catching and throwing an exception for SIGPIPE; it should simply exit immediately, which is literally the default POSIX behaviour... unless a script/program/process specifically installs a signal handler to perform cleanup before exit. Python has some pretty awful behaviours built into it, and this is one of them.

Half of this article is not "How do Unix pipes work", but "how to fix broken SIGPIPE handling in Python".


Python doesn't catch SIGPIPE, it ignores it.

The exception is raised from a -1/EPIPE return from libc write().

I fully agree that Python is often a bad citizen in terms of signal handling — it wants to only process signals on 'the main thread', but also wants end-users to fully control signal-handling. The two ideas are sort of at-odds and in general I find handling signals in Python frustrating.


Another bad practice, IMHO, is not quitting the program once the exception is thrown. Instead, the loop continues and the rest of the input gets fed to /dev/null.


Thanks! So the preferred way is to redirect to dev null?


It's better to just do nothing and allow SIGPIPE to kill your program. That is the reason this signal exists. Python is not a good unix citizen in this case. Compare it to Perl for example where nothing special is required to do the right thing:

    $ perl -E 'say "y" while 1' | head -1
    y
    $


Not an expert in any way, so I could be wrong, but in Python couldn't you use signal.SIG_DFL to get that expected behavior:

  # cat.py
  import sys
  from signal import signal, SIGPIPE, SIG_DFL

  # Restore the default disposition: die silently on a broken pipe.
  signal(SIGPIPE, SIG_DFL)

  for line in sys.stdin:
      print(line, end="")  # the line already ends with a newline


That seems to be discouraged by the official python docs.[0]

[0] https://docs.python.org/3/library/signal.html#note-on-sigpip...
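
For reference, the approach those docs recommend instead looks roughly like this (a paraphrased sketch, not the docs verbatim): catch BrokenPipeError, point stdout at /dev/null so the interpreter's exit-time flush can't trip a second error, and exit non-zero:

    import os
    import sys

    try:
        for line in sys.stdin:
            sys.stdout.write(line)
        sys.stdout.flush()
    except BrokenPipeError:
        devnull = os.open(os.devnull, os.O_WRONLY)
        os.dup2(devnull, sys.stdout.fileno())  # Python flushes stdout at exit
        sys.exit(1)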


> sys.exit(1) # Python exits with error code 1 on EPIPE

Should be 141 instead of 1. Convention is that when a program dies because of a signal, the exit code should be 128 + the signal number, in this case 13. So, 128 + 13 = 141.

Here are some examples:

  $ ( yes; >&2 echo $? ) | head -0 
  141
  $ ( ping localhost; >&2 echo $? ) | head -0 
  141
  $ ( perl -E 'say "y" while 1'; >&2 echo $? ) | head -0 
  141
In Python, when using signal(SIGPIPE, SIG_DFL), you get the correct behaviour (at least regarding the exit status):

  $ ( python -c '                                                  
  import sys                                                                    
  from signal import signal, SIGPIPE, SIG_DFL     
  signal(SIGPIPE,SIG_DFL)                    
  while True: print("y");                         
  ' ; >&2 echo $? ) | head -0
  141
> Do not set SIGPIPE’s disposition to SIG_DFL in order to avoid BrokenPipeError. Doing that would cause your program to exit unexpectedly also whenever any socket connection is interrupted while your program is still writing to it.

Regarding that socket behavior: if that's the standard in other programming environments, why is it an issue from Python's perspective? It doesn't seem worth eschewing standard unix conventions for. I mean, it says "unexpectedly", but isn't it actually the expected behavior? Unexpected is for Python to declare that the default (SIG_DFL) is unexpected.


> Convention is that when a program dies because of a signal, the exit code should be 128 + the signal number

Note that this is incorrect. This is a common misconception about POSIX/UNIX. Not every process exits with an exit code; that's right, a process can exit without an integer exit code! A process exits with either a) an exit code, _OR_ b) the signal which terminated the program (as a separate status in the underlying struct, rather than as an exit code).

Many shells–including bash–provide an abstraction that translates process exits caused by signals into 128+signo, but this is only to fake an exit code for processes as seen when executed within the shell; this is a nonstandard and shell-specific abstraction.

Look at the man page for waitpid() for details. There are macros to distinguish an exit with a status code from an exit due to a signal; e.g. WIFEXITED() and WIFSIGNALED(). WEXITSTATUS() will return the exit status... if one exists, i.e. if WIFEXITED() tests true. Hint: if you're used to seeing your shell report 128+signo, that's because your particular shell happens to check waitpid() for WIFSIGNALED() and sets $? to 128+signo for you, as an abstracted convenience.

tldr; An exit code, vs. an exit caused by a signal, are mutually exclusive statuses. Processes terminated directly by a signal do not have an integer exit code; your shell may provide a fabricated exit code to make things appear simpler, but that is shell-specific and NOT a standard convention provided by POSIX or UNIX.
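
For illustration, the same macros are exposed in Python via os; a minimal sketch of the two mutually exclusive cases:

    import os
    import signal

    # Case 1: the child exits normally with code 3.
    pid = os.fork()
    if pid == 0:
        os._exit(3)
    _, status = os.waitpid(pid, 0)
    print(os.WIFEXITED(status), os.WEXITSTATUS(status))  # True 3

    # Case 2: the child is killed by SIGPIPE; there is no exit code at all.
    pid = os.fork()
    if pid == 0:
        signal.signal(signal.SIGPIPE, signal.SIG_DFL)  # undo CPython's SIG_IGN
        os.kill(os.getpid(), signal.SIGPIPE)
        os._exit(0)  # never reached
    _, status = os.waitpid(pid, 0)
    print(os.WIFSIGNALED(status), os.WTERMSIG(status))  # True 13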


You're right. waitpid(2) does document this.

I was also surprised to find that, at least on my Linux system, the status integer returned by wait() encodes the signal number in the least significant bits and the exit status in the higher bits. So, when a process dies from SIGPIPE, the status (as set by wait(&status)) is 13; when it does exit(1), it's 256; when it does exit(2), it's 512; when it does exit(3), it's 768, etc. Maybe that encoding was chosen to avoid misuse by people seeing their exit codes in the raw status returned by wait() and thinking they didn't need to use the macros.

In any case, this does mean that, indeed, while the shell does display the same status for both `yes | head -0` and `(exit 141)`, they are in fact different.

TIL


That's why we love Unix shell: it takes everything we type and removes the types. :-)


In the exception handler, you should reset SIGPIPE's disposition and re-raise the signal:

   import os, signal

   signal.signal(signal.SIGPIPE, signal.SIG_DFL)
   os.kill(os.getpid(), signal.SIGPIPE)
(Then there's no need to fiddle with stderr.)
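
For contrast with the /dev/null approach above, the whole filter then becomes (a minimal sketch, assuming nothing else needs cleanup):

    import os
    import signal
    import sys

    try:
        for line in sys.stdin:
            sys.stdout.write(line)
        sys.stdout.flush()
    except BrokenPipeError:
        # Re-deliver the signal so the process dies "by SIGPIPE" and the
        # shell reports 141, just like yes or cat would.
        signal.signal(signal.SIGPIPE, signal.SIG_DFL)
        os.kill(os.getpid(), signal.SIGPIPE)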


It also gives rationale. The rationale is ... unconvincing.


True, but most Python programs don't do this, so they pollute your terminal with error messages when you attempt to use them as interior processes in a unix pipeline.


This is bad advice. If you are doing anything other than writing output to stdout/stderr (which you almost assuredly are, unless you are doing something very simple like "yes"), you want to switch to /dev/null. For instance, running

  rm -vfr folder/ | head
involves SIGPIPE causing problems because it may or may not kill rm before it finishes deleting the directory, based on its internal output buffer size.


The problem here is connecting a program you want to run to completion to the head command, not how rm handles SIGPIPE.


head is just an example; if you pipe rm -vfr (or any program that does more than just produce output) into _anything_, the reasonable and default behavior should be to run rm to completion. (And from a programmer's POV, the reasonable thing is for syscalls to handle errors by returning an error, the way every other error is handled.)


Wouldn't it be better to be explicit? If the intention is "display the first 10 lines, at most; discard the rest", you can write:

  rm -vrf directory/ | (head; cat >/dev/null)


I guess it depends on your semantics/mental model of what a pipeline should do. My mental model says that each part should run fully, and the pipeline should emulate sequential behavior, with "yes" being a bit of a hack in this regard. Cases where the behavior of a pipeline depends on the internal buffer size and operation ordering (e.g. what if rm decided to print everything it did only after removing everything?) seem like bugs to me. Maybe your mental model is different?

In any case, I think that in the vast majority of pipelines, the default SIGPIPE behavior is not what is desired.


Wishing programs would do extra work destroys performance. Some programs run forever unless terminated. If you don't want a program to terminate early, don't send it a termination signal!

The Unix philosophy is that users are smart and should be obeyed, not that programs should do what they think the user should have asked for instead. Forcing the user to negotiate with the program like a TARDIS is madness.


Piping the output of an "rm -v" is an oddball, with all the possible stdio buffering, I agree.

A more real-life example fitting your model would be "start this long-running process and show me the first 10 lines of output, just to make sure it didn't bail right away", but then, a more common pattern would be to keep a copy of the whole log too:

  myfoo >/var/log/foo & tail -f /var/log/foo | head
A counter-example would be "yes", but also unordered sampling. Let's say I want to know if there is a file with "foo" in its name somewhere in a very large directory, just a yes/no. Continuing the directory traversal after finding one is a waste of time. SIGPIPE is desirable here as well:

  find directory/ -name '*foo*' | head -1


rm doesn't print anything on stdout.


"rm -v" does.


> Python is not a good unix citizen in this case. Compare it to Perl for example where nothing special is required to do the right thing

How is this better? At least with the exception you can cleanly terminate your program if you need to.


In the unix pipeline model, components read input, perform some transformation, and print output. There is no need to do any clean-up. As nneonneo explained, because they rely on SIGPIPE, most unix programs don't even need to check for errors from write(2).

Naturally, in the rare case you want to do some clean-up you can:

    $ perl -E '$SIG{PIPE} = sub { die "cleanup" }; say "y" while 1' | head -1
    y
    cleanup at -e line 1.


I see


I disagree. Saying python is "not a good Unix citizen" here is akin to saying cars are a bad highway citizen because people can crash them.

Python provides mechanisms to handle signals. The point of a signal is to indicate something outside of your process has happened and you may want to respond to it. It's up to the program to respond to relevant signals and Python in no way stops a developer from doing that.


> Saying python is "not a good Unix citizen" here is akin to saying cars are a bad highway citizen because people can crash them.

I don't follow your analogy. The default signal disposition for SIGPIPE is to silently terminate the process. Normally processes compose nicely on the command-line and this requires no extra work from the programmer. By disregarding this convention, most Python programs pollute your terminal when used in pipelines, which is why I claim that in this case Python is a bad unix citizen.


Yep, I follow your logic. As you say, in other languages it "requires no extra work"; it's more a matter of convenience.

Python in no way stops you from having a typical response to SIGPIPE; it's as simple as handling the exception and doing a sys.exit(), but it doesn't "just happen". This makes the typical path a little more contrived, but I don't think having explicit handling makes it a bad citizen.

To torture my analogy, in my mind a "bad citizen" of the highway may be a car incapable of doing the minimum speed limit, whereas Python just has a manual transmission vs Perl's automatic (in this case).

I'm splitting hairs though, your point is fair. I just think your objection is a little strong for what comes down to convenience.


It's not a matter of "convenience" any more, when you know for a fact that nobody dashing off quick five-line scripts in your language is going to bother doing the extra thing, and so the default behavior will be the only behavior for those scripts.

Defaults matter. It's like the difference between opt-in and opt-out for organ donation, or the difference between encouraging and requiring cars to have airbags.


Most everyone agrees that defaults matter, but not on what the defaults should be.


    If we cat this file, it will be printed to the terminal.
    > cat brothers_karamazov.txt
    ... many lines of text!
    ***FINIS***
    It takes a noticeable amount of time to finish.
The amount of time it takes for cat(1) to read and output the file is almost certainly insignificant. The time the author is noticing is probably related to how long it takes for his console to process the text.


Agreed.

This can be easily verified by putting `time` in front of the cat to measure the time taken. Even for huge text files, the wall clock time might be significant but the "user" time is likely still zero.


or redirect cat to /dev/null and see how fast that is


This is the first thing I noticed too.

> how does cat know to stop when head is finished

I'm no expert on Unix, so correct me if I'm wrong, but surely this line of reasoning is misleading, because pipes create a unidirectional data flow, so `cat` cannot know anything about `head`. It does not "stop"; it passes the whole text along just as it did without the pipe. As you said, the delay comes from printing to the console, not from the `cat` command.


This is a great example of Useless Use of cat and why it is bad: the full text is indeed sent through the pipe simply for head to take the first n lines.

I've actually had "developers" go "but, readability". Yea ok.


Pipes aren't completely unidirectional. You get one bit of information flowing back: Whether the read end of the pipe is still open.
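
You can observe that one bit directly; a small sketch (Linux semantics assumed):

    import os
    import select

    r, w = os.pipe()
    p = select.poll()
    p.register(w, 0)  # event mask 0: only error conditions get reported

    print(p.poll(0))  # [] while a reader exists
    os.close(r)
    print(p.poll(0))  # [(w, POLLERR)]: the only news a writer ever gets back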


This article only shows basic usage of pipes (this is what they mean by "how pipes work"), but doesn't explain at all how pipes work (as in: how they are implemented).


It's implemented as a buffer and some associated state. A process that writes to the buffer can do so until it is full, at which point the thread is suspended (blocked in the write() call) until it is no longer full. The read() side is similar: reads return successive data from the buffer unless it is empty, at which point the read() call blocks.
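
A quick way to see that buffer from userspace (a sketch; the 64 KiB figure is the usual Linux default, not a guarantee):

    import fcntl
    import os

    r, w = os.pipe()
    # Make the write end non-blocking so we can observe the buffer filling up
    # instead of blocking in os.write().
    fcntl.fcntl(w, fcntl.F_SETFL, fcntl.fcntl(w, fcntl.F_GETFL) | os.O_NONBLOCK)

    total = 0
    try:
        while True:
            total += os.write(w, b"x" * 4096)
    except BlockingIOError:
        pass

    print(total)  # typically 65536: this is where a blocking write() would stop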


ur-whale probably knows that


Same, I was hoping for some mention of /proc/PID/fd*, but nothin'.


/proc/ is not fundamental to understanding Unix or Unix pipes, and it is not present on many Unixes.


You're right, but for what it's worth, the folks that I've taught the nuance of pipes seem to really only "get it" after pointing them to /proc/PID/fd* and having them cat in/out the files there. It directly leads into much deeper understandings of what "everything is a file" means.
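
For anyone who wants to poke at it, a tiny sketch that lists a process's own descriptors (Linux only; pipe fds show up as "pipe:[inode]"):

    import os

    pid = os.getpid()
    fd_dir = f"/proc/{pid}/fd"
    for fd in sorted(os.listdir(fd_dir), key=int):
        print(fd, "->", os.readlink(f"{fd_dir}/{fd}"))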


I had a nice surprise and learning experience when I discovered that the output of

     (echo red; echo green 1>&2) | echo blue
is nondeterministic:

http://www.gibney.de/the_output_of_linux_pipes_can_be_indete...

As it turns out, this short line and its behavior nicely demonstrate a bunch of aspects that happen under the hood when you use a pipe.


See Sections 1.2 & 1.3 of the MIT Unix teaching OS for a great intro to FDs and pipes: https://pdos.csail.mit.edu/6.828/2019/xv6/book-riscv-rev0.pd...


Here's something that you should remember about using pipes with fork(2)/exec(2) in Python 3: by default, the CPython runtime opens pipe(2) descriptors with O_CLOEXEC, i.e. they are non-inheritable.

This means that the descriptors will not survive an exec() in the forked child. Therefore you should explicitly mark the end you want to hand over as inheritable. Note that close-on-exec is a file descriptor flag, so it is cleared via F_SETFD/FD_CLOEXEC (not F_SETFL):

  os.set_inheritable(readfd, True)  # or, equivalently:
  fcntl.fcntl(readfd, fcntl.F_SETFD, fcntl.fcntl(readfd, fcntl.F_GETFD) & ~fcntl.FD_CLOEXEC)


My own rule of thumb of whether or not to ignore SIGPIPE is simple:

* If you only deal with file descriptors provided to you (stdin, stdout, stderr) as well as some files that you open (including special files like FIFOs), do not ignore SIGPIPE.

* If you deal with sophisticated file descriptors (socket(2) and pipe(2) count as sophisticated), you'd better ignore SIGPIPE, but also make sure to check for EPIPE in every single write.

In my view, SIGPIPE is a kludge so that programs that are too lazy to check for errors from write(2) (and fwrite(3) and related friends) will not waste resources. But if you are dealing with sophisticated file descriptors, there is a lot more happening than just open/read/write and a lot more error cases you must handle, and at that point the incremental cost of handling EPIPE isn't a significant addition.
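
A sketch of the discipline in the second bullet (the helper name here is made up for illustration): ignore SIGPIPE once, then treat EPIPE as an ordinary per-fd error:

    import errno
    import os
    import signal

    # A vanished peer now surfaces as EPIPE from the failing write call,
    # instead of a process-wide signal.
    signal.signal(signal.SIGPIPE, signal.SIG_IGN)

    def write_all(fd, data):
        """Write all of data to fd; return False if the peer has gone away."""
        while data:
            try:
                n = os.write(fd, data)
            except OSError as e:
                if e.errno == errno.EPIPE:
                    return False  # only this fd's peer is gone; clean up just this one
                raise
            data = data[n:]
        return True

Because the check happens per write, this also covers the multiplexed case mentioned further down the thread, where a signal couldn't tell you which fd failed.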


My favorite sentence from Brian Kernighan's latest book, "UNIX: A History and a Memoir": "Pipes are the quintessential Unix invention, an elegant and efficient way to use temporary connections of programs." So I'll read this article :-)


Another point where you have to ignore SIGPIPE is concurrent code that handles multiple fds (say, like a web server). In this case you have to ignore the signal and process EPIPE correctly, because the signal is not associated with a particular fd so you cannot tell which one of them failed.



