
A surprisingly arcane little Unix shell pipeline example - zdw
https://utcc.utoronto.ca/~cks/space/blog/unix/ShellPipelineIndeterminate
======
lnyng
There are some subtle points that the blog is not clear about:

- The `1>&2` redirects the stdout of `echo green` to stderr, so `echo green`
never writes to the pipe and never gets SIGPIPE.

- `echo` only echoes its arguments and always ignores stdin, so putting
`echo blue` on the RHS of the pipe only serves to run the two sides of the
pipe in parallel.

- `bash -c '(echo green 1>&2) | echo blue' 1>stdout 2>stderr` will show you
that green and blue are actually written to different files.
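
A minimal sketch of that last check, assuming bash and mktemp are available. The scratch directory and the variable names are made up for illustration:

```shell
# Hypothetical scratch directory so the current directory is left alone.
dir=$(mktemp -d)

# Same command as above: stdout and stderr of the whole pipeline go to
# two separate files.
bash -c '(echo green 1>&2) | echo blue' 1>"$dir/stdout" 2>"$dir/stderr"

stdout_text=$(cat "$dir/stdout")
stderr_text=$(cat "$dir/stderr")
echo "stdout file: $stdout_text"
echo "stderr file: $stderr_text"

rm -r "$dir"
```

Here blue should land in the stdout file and green in the stderr file, whichever side of the pipe happens to run first.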

Another observation:

> if it was a separate command (or even if the shell forked to execute it
> internally), only the 'echo red' process would die from the SIGPIPE instead
> of the entire left side of the pipeline.

Most Linux distributions have /bin/echo as a separate program. Running
`(/bin/echo red; echo green 1>&2) | echo blue` will always print both green
and blue.
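
A rough way to check this claim, assuming /bin/echo exists on your system. The loop count and variable names are arbitrary, chosen only for this sketch:

```shell
runs=20      # arbitrary number of repetitions
greens=0
i=0
while [ "$i" -lt "$runs" ]; do
  # Capture only stderr of the pipeline; blue (stdout) is discarded.
  err=$( { (/bin/echo red; echo green 1>&2) | echo blue; } 2>&1 >/dev/null )
  # /bin/echo may complain about a broken pipe, so match on green only.
  case $err in
    *green*) greens=$((greens + 1)) ;;
  esac
  i=$((i + 1))
done
echo "green appeared in $greens of $runs runs"
```

Only the separate /bin/echo process can be hit by the SIGPIPE, so the subshell goes on to run `echo green` every time.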

EDIT: fix typo of stdin/stdout/stderr as suggested

~~~
rashkov
Thanks, this is helpful and why I came to the comments. Another question I had
was what happens to red? You helped me answer that -- unlike echo green, the
output of echo red is still connected to the pipe. The right-hand side of that
pipe does nothing with it, so it disappears.

Also regarding one of your examples: maybe I'm misunderstanding, and anyway
this wouldn't change your overall point, but should it maybe read like this
instead: bash -c '(echo green 1>&2) | echo blue' 1>stdout 2>stderr

since stdin is normally associated with file descriptor 0

~~~
lnyng
Yes you're right. That's a typo. Thanks

------
empath75
I’ve been doing Linux stuff for ten years and I am apparently just learning
that the pipeline commands run in parallel not serially. If I had put any
thought into it I would have realized it because otherwise tailing logs to
grep wouldn’t work...
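
A small sketch that makes the same point without tailing a log, assuming the usual `yes` and `head` utilities are installed. `yes` never finishes on its own, so this could not return if the left side had to complete before the right side started:

```shell
# yes prints "y" forever; head reads three lines and exits, which closes
# the pipe and stops yes. This only terminates because both sides run
# at the same time.
first_three=$(yes | head -n 3)
line_count=$(printf '%s\n' "$first_three" | wc -l)
echo "$first_three"
```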

~~~
rconti
Twenty five here, and I still get confused about stdout/stderr redirection.

~~~
kangnkodos
Here's a cheat sheet which should cover the most common situations.

Send stdout to a file.

ls myFileWhichExists > myStdLog

- or -

ls myFileWhichExists 1> myStdLog

Send stderr to a file.

ls myFileWhichDoesNotExist 2> myErrLog

Send stdout to one file and stderr to a different file.

ls myFileWhichExists myFileWhichDoesNotExist 1> myStdLog 2> myErrLog

Send stdout and stderr to the same file

ls myFileWhichExists myFileWhichDoesNotExist 1> myBothLog 2>&1

I read that last part "2>&1" as "Send stderr (2) to the same place as stdout
(1) is already going to".

Notice that if you send stdout and stderr to the same file, because of
buffering and other issues, the output from stdout and stderr may interleave
in unpredictable ways.
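
The order of the redirections matters, since they are processed left to right. A minimal sketch, where `emit` is a made-up helper that writes one line to each stream and the file names are placeholders:

```shell
# Hypothetical helper: one line to stdout, one line to stderr.
emit() { echo out; echo err 1>&2; }

dir=$(mktemp -d)

# stdout goes to the file first, then stderr is pointed at the same place.
emit > "$dir/both" 2>&1

# Reversed order: stderr is pointed at the *current* stdout (here, the
# command substitution), and only afterwards stdout is sent to the file.
leaked=$(emit 2>&1 > "$dir/only_out")

both=$(cat "$dir/both")
only_out=$(cat "$dir/only_out")
rm -r "$dir"

echo "both file:     $both"
echo "only_out file: $only_out"
echo "leaked:        $leaked"
```

In the reversed form the file receives only stdout, and stderr still ends up wherever stdout pointed before the file redirection.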

~~~
rconti
Yeah, it's the ampersands and ordering and placement and numbers that always
throw me, not the concepts. Thanks!

------
userbinator
This "parallel execution" is one of the interesting things that distinguishes
Unix pipelines from DOS pipelines; in DOS a temporary file is used, which was
an old source of puzzlement for beginners doing a "dir | more" -- "what's
that extra file I see?"

In retrospect, getting non-preemptive pipes (a type of coroutining) working in
DOS would not have been all that difficult, if it weren't for the limited
memory available to PCs of the time and the fact that most programs assumed
they owned it all when they ran.

~~~
pjc50
"Apart from the fact that DOS wasn't a multitasking operating system,
concurrent execution of multiple tasks would have been easy"?

(Eventually people retrofitted a sort of multitasking with TSR programs, but
that's not really the same thing.)

------
derekp7
If you suspect a race condition between two operations, one way of testing it
is to add a sleep command to one side or the other. For example:

    
    
      (sleep 1; echo red; echo green 1>&2) | echo blue
    

vs

    
    
      (echo red; echo green 1>&2) | (sleep 1; echo blue)

~~~
mjevans
Offhand I'm unsure of the semantics of sleep...

It appears that sleep (at least on a typical modern Linux desktop) does behave
similarly to echo in that it does nothing with the pipeline instead of echoing
it to terminal.

It's curious because someone might assume the default behavior would be to
forward all file descriptors unless something was done to the data streams.
Clearly that isn't the case: the shell ties the previous program's standard
output to the next program's standard input regardless of whether anything
ever reads it.

~~~
jstimpfle
I don't know what you mean by "forward" but I have the impression you don't
understand how file descriptors (i.e. "open files or streams") work.

The shell sets up a pipe and connects the left side to the pipe's writing end,
and the right side to the reading end. Now, the right side is actually a
subshell (indicated by the parentheses "(...)"). And that subshell can spawn
as many other processes, sequentially or in parallel, as it wants. All of them
will inherit the open file descriptor (the pipe's reading end) from it.

If you had multiple processes in parallel reading from the pipe, the outcome
would be totally nondeterministic (dependent on the kernel's scheduling
behaviour). In the example case, none of the potential readers actually read
(not the subshell itself, not the sleep, and not the echo).

Here's a perhaps illuminating example:

    
    
        (echo h; echo hello; echo HALLO; ) | ( read firstline; echo "Firstline is $firstline"; grep A; )
    

Does that help?

------
contingencies
Arcane would be mixing the output of multiple commands into a single text
stream without any readily available means to determine their origin, then
writing code based upon that output that relied upon a specific ordering of
said output without a preliminary explicit sort, i.e. code that was reliant
upon this indeterminism to fail. In any event, _diff_ would _sort_ it out. ;)

------
ggm
thing | thing | thing > /file/I/didnt/mean/to/smash | thing

oops

thing | thing | thing > /another/file/I/didnt/mean/to/smash

oops

If you include a > redirection, it's parsed and processed before the pipeline
executes. Even if later stages of the pipeline fail, you have probably
smashed /thing/you/didnt/mean/to/smash if you > into it.
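
A minimal sketch of the effect, using a throwaway file from mktemp instead of anything you care about. The variable names are made up for this example:

```shell
target=$(mktemp)                   # hypothetical scratch file
echo "precious data" > "$target"
size_before=$(wc -c < "$target")

# `false` fails and writes nothing, but the shell has already opened
# (and truncated) the target before running it.
false > "$target" || true

size_after=$(wc -c < "$target")
rm -f "$target"
echo "bytes before: $size_before, bytes after: $size_after"
```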

~~~
masklinn
Which is why setting "noclobber" is really useful and oft time-saving.
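
A short sketch of what that looks like in practice, again on a scratch file from mktemp (names are made up for the example):

```shell
target=$(mktemp)                   # hypothetical scratch file
echo "precious data" > "$target"

set -o noclobber                   # same effect as `set -C`

# With noclobber set, `>` refuses to overwrite an existing regular file.
if ( echo "oops" > "$target" ) 2>/dev/null; then
  status=0
else
  status=1
fi
kept=$(cat "$target")

# `>|` deliberately overrides noclobber for a single redirection.
echo "replaced on purpose" >| "$target"
forced=$(cat "$target")

set +o noclobber
rm -f "$target"
echo "status=$status kept=$kept forced=$forced"
```

The refused redirection runs in a subshell here only so that its failure stays contained whatever shell options are in effect.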

------
seedie
While testing around I found out that enclosing the commands in parentheses
reverses the commonness of 'blue green' and 'green blue' output. Can anyone
explain why this happens?

------
kuwze
Disclaimer: I am an idiot on this subject

I recently found out that you can’t easily spawn a shell and then send
commands to it. It’s doable with tmux commands, but you’d think it would be
easier. I just wanted to write something that locates npm/virtualenv stuff in
bash, nothing fancy.

~~~
0xEFF

      echo ls | bash

~~~
yjftsjthsd-h
It even seems to work with named pipes, although in my first test it exited
after the first command (I suspect I'm accidentally sending an EOF when I echo
the command in).

    
    
        mkfifo testpipe1
        <testpipe1 bash  # in separate window
        echo ls > testpipe1
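
That suspicion matches how FIFOs behave: once the last writer closes the pipe, the reader sees EOF, and each `echo ... > testpipe1` opens and closes it. One way around it, sketched here with made-up file names and file descriptor 3 picked arbitrarily, is to hold the write end open across several commands:

```shell
dir=$(mktemp -d)                   # hypothetical scratch directory
fifo="$dir/testpipe1"
mkfifo "$fifo"

# Reader: a bash that takes its commands from the FIFO.
bash < "$fifo" > "$dir/out" &

# Writer: keep the write end open on fd 3 so bash does not see EOF
# between commands.
exec 3> "$fifo"
echo 'echo first' >&3
echo 'echo second' >&3
exec 3>&-                          # closing the last writer delivers EOF

wait                               # let the background bash finish
result=$(cat "$dir/out")
rm -r "$dir"
echo "$result"
```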

------
anamexis
Why does `red` never show?

~~~
xorcist
Why would you expect it to show?

Would you also expect "echo something < file.txt" to show the contents of
file.txt?

Perhaps you are thinking of cat or some other command, because piping things
to echo is such a strange and unexpected thing to do that you normally won't
encounter it.
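
A two-line sketch of the difference, with arbitrary strings:

```shell
# echo prints its arguments and never reads stdin, so the piped text is lost.
from_echo=$(echo "piped text" | echo "argument text")

# cat copies stdin to stdout, so the piped text comes through.
from_cat=$(echo "piped text" | cat)

echo "echo gave: $from_echo"
echo "cat gave:  $from_cat"
```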

~~~
anamexis
Because the other output of the LHS (sometimes) shows.

