
For the Love of Pipes - ingve
https://blog.jessfraz.com/post/for-the-love-of-pipes/
======
mothsonasloth
[Quote]

The Unix philosophy is documented by Doug McIlroy as:

    
    
        Make each program do one thing well. To do a new job, build afresh rather than complicate old programs by adding new “features”.
    
        Expect the output of every program to become the input to another, as yet unknown, program. Don’t clutter output with extraneous information. Avoid stringently columnar or binary input formats. Don’t insist on interactive input.
    
        Design and build software, even operating systems, to be tried early, ideally within weeks. Don’t hesitate to throw away the clumsy parts and rebuild them.
    
        Use tools in preference to unskilled help to lighten a programming task, even if you have to detour to build the tools and expect to throw some of them out after you’ve finished using them.
    
    

I really like the last two; if you can do them in development, then you have a
great dev culture.

~~~
quietbritishjim
Reformatted to be readable:

> Make each program do one thing well. To do a new job, build afresh rather
> than complicate old programs by adding new “features”.

> Expect the output of every program to become the input to another, as yet
> unknown, program. Don’t clutter output with extraneous information. Avoid
> stringently columnar or binary input formats. Don’t insist on interactive
> input.

> Design and build software, even operating systems, to be tried early,
> ideally within weeks. Don’t hesitate to throw away the clumsy parts and
> rebuild them.

> Use tools in preference to unskilled help to lighten a programming task,
> even if you have to detour to build the tools and expect to throw some of
> them out after you’ve finished using them.

~~~
majewsky
If bots were not discouraged on news.yc, I would have implemented a bot for
this long ago. Code-block quotes are so atrocious, esp. on mobile devices.

~~~
Leace
It seems "white-space: pre-wrap" on code blocks would solve most of the
problem. There is also an additional "max-width" on the pre that I think is
not needed.

~~~
masklinn
That would break actual code snippets.

What would solve most of the problems is HN actually implementing markdown
instead of the current half-assed crap.

~~~
abraae
I would hate to see the day HN allowed any way to bold sections of text.

It's way more restful perusing a page of uniformly restrained text.

~~~
masklinn
> I would hate to see the day HN allowed any way to bold sections of text.

HN already has shitty italics (shitty in that it commonly matches and eats
things you don't want italicised, e.g. multiplications, pointers, …, in part
though not only because HN doesn't have inline code). "bold" could just be
styled as italics, or as a medium or semibold weight. It's not an issue, and
even less of one given what absolute garbage the current markup situation is.

~~~
yen223
For a site that's meant to target programmers, HN's handling of code blocks is
pretty poor.

Just give me the triple-tilde code block syntax please!

~~~
masklinn
> For a site that's meant to target programmers, HN's handling of code blocks
> is pretty poor.

Meh. It does literal code blocks, they work fine.

That's pretty much the only markup feature which does, which is impressively
bad given HN only has two markup features: literal code blocks and emphasis.

It's not like they're going to add code coloration or anything.

And while fenced code blocks are slightly more convenient (no need to indent),
pasting a snippet in a text editor and indenting it is hardly a difficult
task.

------
nojvek
I’m surprised JessFraz, who is employed by Microsoft, doesn’t talk about
PowerShell pipes at all.

PowerShell pipes are an extension of Unix pipes. Rather than just being able
to pipe a stream of bytes, PowerShell can pipe a stream of objects.

It makes working with pipes so much fun. In Unix you have to cut, awk and do
all sorts of parsing to get some field out of `ls`. In PowerShell, ls outputs
a stream of file objects and you can get the field you want by piping to
`Get-Item`, or sum the file sizes, or filter only directories. It’s very
expressive once you’re manipulating streams of objects with properties.
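
For contrast, a sketch of the byte-stream way of doing "sum the file sizes"
in a Unix shell, scraping columns out of `ls -l` with awk (famously fragile,
since it depends on the exact output format):

    
    
      # sum the sizes of non-directory entries by scraping column 5 of ls -l
      ls -l | awk 'NR > 1 && $1 !~ /^d/ { total += $5 } END { print total }'
    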

~~~
nijaru
She is actually employed at GitHub now.

[https://twitter.com/jessfraz](https://twitter.com/jessfraz)

~~~
lordgrenville
Still MSFT, in a way!

------
yaakushi
I'm probably nitpicking, but if you're using cat to pipe a single file into
the stdin of another program, you most likely don't need the cat in the first
place; you can just redirect the file to the process' stdin. Unless, of
course, you're actually concatenating multiple files, or maybe a file and
stdin together.
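
A quick illustration (filenames hypothetical):

    
    
      # equivalent for a single file:
      cat access.log | grep ERROR
      grep ERROR <access.log
      # cat earns its place when genuinely concatenating:
      cat part1.log part2.log | grep ERROR
    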

Disclaimer: I do cat-piping myself quite a bit out of habit, so I'm not trying
to look down at the author or anything like that! :)

~~~
arendtio
In fact, I don't like people optimizing shell scripts for performance. I
mean, shell scripts are slow by design, and if you need something fast, you
chose the wrong technology in the first place.

Instead, shell scripts should be optimized for readability and portability,
and I think it is much easier to understand something like 'read | change
>write' than 'change <read >write'. So I like to write pipelines like this:

    
    
      cat foo.txt \
        | grep '^x' \
        | sed 's/a/b/g' \
        | awk '{print $2}' \
        | wc -l >bar.txt
    

It might not be the most efficient processing method, but I think it is quite
readable.

For those who disagree with me: You might find the pure-bash-bible [1]
valuable. While I admire their passion for shell scripts, I think they are
optimizing to the wrong end. I would be more a fan of something along the
lines of 'readable-POSIX-shell-bible' ;-)

[1]: [https://github.com/dylanaraps/pure-bash-bible](https://github.com/dylanaraps/pure-bash-bible)

~~~
Hello71
This is a very silly way of writing it though. grep|sed can almost always be
replaced with a simple awk: awk '/^x/ { gsub("a", "b"); print $2 }' foo.txt.
This way, the whole command fits on one line. If it doesn't, put your awk
script in a separate file and simply call it with "awk -f myawkscript
foo.txt".

~~~
ncallaway
I would disagree that their way of writing it is silly.

It is instantly plainly obvious to me what each step of their shell script is
doing.

While I can absolutely understand what your shell script does after parsing
it, its meaning doesn't leap out at me in the same way.

I would describe the prior shell script as more quickly readable than the one
that you've listed.

So, perhaps it's not a question of one being more silly than the other—perhaps
the author just has different priorities from you?

------
alkonaut
I love the idea of simple things that can be connected in any way. I'm not so
much a fan of "everything is a soup of bytes with unspecified encoding and
unknown formatting".

It's an abstraction that has held up quite well, but it's starting to show
its age.

~~~
Jyaif
100% agree. Having to extract information with regular expressions is a waste
of time. If the structure of the data was available, you would have type
safety / auto-completion. You could even have GUIs to compose programs.

~~~
jenscow
I hear what you're saying.

However, how can you ensure the output type of one program matches the input
type of another?

~~~
testvox
Allow programs to specify the type of data they can consume and the type of
the data they emit. This is how PowerShell does it (using the .NET type
system).

~~~
jenscow
And the problem is _how can you ensure the output type of one program matches
the input type of another_.

A program emits one type, and the other program accepts another.

Something will be needed to transform one type into another. Imagine doing
that on the command line.

~~~
NikolaeVarius
cat file1 | convert | dest

------
icebraining
For an alternative view, don't forget to read the section on Pipes of _The
Unix-Haters Handbook_ :
[http://web.mit.edu/~simsong/www/ugh.pdf](http://web.mit.edu/~simsong/www/ugh.pdf)
(page 198)

~~~
bagrow
> When was the last time your Unix workstation was as useful as a Macintosh?

Some of that discussion has not aged well :)

~~~
OliverJones
macOS is layered on a UNIX-like OS. You can use pipes in your terminal
windows.

~~~
DavidWoof
This comment makes me feel really old.

macOS wasn't always layered on Unix, and the Unix-Haters Handbook predates
the switch to the Unix-based Mac OS X.

~~~
chrisfinazzo
Of course not, but the switch to BSD fixed a bunch of the underpinnings in the
OS and was a sane base to work off of.

Not to put too fine a point on it, but they found religion. Unlike Classic
(and early versions of Windows, for that matter), there was more to be gained
by ceding some control to the broader community. Microsoft has gotten better
too (PowerShell, adapting UNIX tools to Windows, and later WSL, where they
went all in).

Still, for Apple it meant they had to serve two masters for a while - old
school Classic enthusiasts and UNIX nerds. Reading the back catalog of John
Siracusa's (one of my personal nerd heroes) old macOS reviews gives you some
sense of just how weird this transition was.

------
darrenf
I twitched horribly at the final sentence, screaming inwardly "you don't
_pipe_ to /dev/null, you _redirect_ to it". And now I feel like an arsehole.

~~~
analpaper
redirect your feelings to /dev/null, because a pipe will just give us a
Permission denied

~~~
benj111
chmod +x /dev/null

(haven't tried the above; not sure I recommend that you do)

~~~
jgtrosh
Well, you can't read anything from /dev/null either, and I don't think that's
just a question of permissions. I'm pretty sure it's impossible to get
/dev/null to behave like an executable.

~~~
benj111
Interesting question.

You could write an executable that accepts piped input and throws it away.

When would it exit though? Would it exit successfully at the end of the input
stream? That sounds sensible.

That would be behaving like an executable wouldn't it?

~~~
majewsky
> When would it exit though? Would it exit successfully at the end of the
> input stream?

Yes. Once the previous program in the pipeline closes its stdout (either
explicitly, or implicitly by just exiting) and the pipe's buffer has been
drained, our program's read() from stdin returns 0, signalling end-of-file,
and a well-behaved sink exits successfully at that point. That is exactly
what `cat >/dev/null` does. (SIGPIPE is the mirror-image case: it is
delivered to a _writer_ whose reader has gone away, and its default
disposition is to terminate the program, similar to SIGTERM or SIGINT.)

 _However_, our program could also just ignore the end-of-file and keep
calling read() forever, in which case it could run indefinitely. But at this
point, you're way past normal behavior.
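
A minimal sketch of such a sink in shell; it behaves like a writable
/dev/null and exits successfully at end of input, just as `cat >/dev/null`
does:

    
    
      #!/bin/sh
      # drain: discard stdin until read reports end-of-file, then exit 0
      while IFS= read -r line; do
        :  # throw the line away
      done
    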

------
crazygringo
Pipes are awesome and infuriating.

Sometimes they work great -- being able to dump from MySQL into gzip, send it
across the wire via ssh into gunzip and then into my local MySQL, without
ever touching a file, feels nothing short of magic... although the
command/incantation to do so took quite a while to finally get right.
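
(For the curious, the rough shape of that incantation, with host and database
names hypothetical:)

    
    
      ssh user@remote 'mysqldump --single-transaction remotedb | gzip -c' \
        | gunzip -c | mysql localdb
    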

But far too often they inexplicably fail. For example, I had an issue last
year where piping curl into bunzip2 would just inexplicably stop after about
1GB, but at a different spot every time (between 1GB and 1.5GB). No error
message, no exit, my network connection was fine, just an infinite timeout.
(While curl by itself worked flawlessly every time.)

And I've got another 10 stories like this (I do a lot of data processing).
With any given combination of pipe tools, there's a kind of random chance
that they'll actually work in the end or not. Even more frustrating, they'll
often work on your local machine but not on your server, or vice versa. And
I'm just running basic commodity macOS locally and out-of-the-box Ubuntu on
my servers.

I don't know why, but many times I've had to rewrite a piped command as
streams in a Python script to get it to work reliably.

~~~
invsblduck
> Any given combination of pipe tools, there's a kind of random chance they'll
> actually work in the end or not.

While this may be your experience, the mechanism of FIFO pipes in Unix (which
is file handles and buffers, basically) is an old one that is both elegant
and robust; it doesn't "randomly" fail due to unreliability of the core
algorithm or components. In 20 years, I never had an init script or bash
command fail due to the pipe(2) call itself being unreliable.

If you misunderstand the detailed behavior of the commands you are stitching
together -- or the details of how you're transiting the network in the case
of an ssh remote command -- then yes, things may go wrong. Especially if you
are creating Hail Mary one-liners, which become unwieldy.

~~~
laumars
I've got to agree. I can’t recall a pipe ever failing due to unreliability.

One issue I did used to have (before I discovered ‘-o pipefail’[1]) was the
annoyance that if an earlier command in a pipeline failed, all the other
commands in the pipeline still ran, albeit with no data or garbage data being
piped to them.
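
For example, in bash:

    
    
      set -o pipefail
      false | cat
      echo $?   # prints 1; without pipefail it prints 0 (cat's exit status)
    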

[1] [https://stackoverflow.com/questions/1550933/catching-error-c...](https://stackoverflow.com/questions/1550933/catching-error-codes-in-a-shell-pipe#comment28578902_4959616)

------
jarpineh
I recently came across Ramda CLI's interactive mode [1]

It essentially hijacks the pipe's input and output into the browser, where
you can play with the Ramda command. Then you just close the browser tab and
Ramda CLI applies your changed code in the pipe, resuming its operation.

Now I'm thinking of all kinds of ways I use pipes that I could "tee" through
a browser app. I can use the browser for interactive JSON manipulation,
visualization and all-around playing. I'm now looking for ways to generalize
Ramda CLI's approach. Pipes, Unix files and HTTP don't seem directly
compatible, but the promise is there. The Unix tee command doesn't "pause"
the pipe, but one could probably just introduce a pause/resume passthrough
command into the pipe after it. Then a web server tool can send the tee'd
file to the browser and catch output from there.
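
A rough sketch of that pause/resume idea with a named pipe (producer/consumer
are hypothetical stand-ins; data beyond the pipe buffer only flows after
resuming):

    
    
      mkfifo /tmp/resume
      producer | tee /tmp/snapshot | { cat /tmp/resume >/dev/null; cat; } | consumer &
      # inspect or visualise /tmp/snapshot in the browser, then resume:
      echo go >/tmp/resume
    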

[1] [https://github.com/raine/ramda-cli#interactive-mode](https://github.com/raine/ramda-cli#interactive-mode)

~~~
avodonosov
You can just store the first pipeline's results in a file, edit it, then use
the file as input for the second pipeline.

~~~
jarpineh
Well, yes, but this kind of defeats the transient nature of data moving
through the pipe. Testing, debugging and operating on pipe-based processing
benefit from a close feedback cycle. I’d rather keep that as much as
possible.

------
mpweiher
Yes, pipes are awesome, and the concepts actually translate well to in-process
usage with structured data.

[https://github.com/mpw/MPWFoundation/blob/master/Documentati...](https://github.com/mpw/MPWFoundation/blob/master/Documentation/Streams.md)

One aspect is that the coordinating entity hooks up the pipeline and then
gets out of the way; the pieces communicate amongst themselves, unlike FP
simulations, which tend to have to come back to the coordinator.

This is very useful in "scripted-components" settings where you use a
flexible/dynamic/slow scripting language to orchestrate fixed/fast components,
without the slowness of the scripting language getting in the way. See sh :-)

Another aspect is error handling. Since results are actively passed on to the
next filter, the error case is simply to not pass anything. Therefore the
"happy path" simply doesn't have to deal with error cases at all, and you can
deal with errors separately.

In call/return architectures (so: mostly everything), you have to return
_something_, even in the error case. So we have nil, Maybe, Either, tuples or
exceptions to get us out of Dodge. None of these is particularly good.

And of course | is such a perfect combinator because it is so sparse. It is
obvious what each end does, all the components are forced to be uniform and at
least syntactically composable/compatible.

Yay pipes.

------
cmsj
pipe junkies might like to know about the following tools:

* vipe (part of [https://joeyh.name/code/moreutils/](https://joeyh.name/code/moreutils/)) - lets you edit text partway through a complex series of piped commands

* pv ([http://www.ivarch.com/programs/pv.shtml](http://www.ivarch.com/programs/pv.shtml)) - lets you visualise the flow of data through a pipe
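
For instance (filenames hypothetical):

    
    
      pv big.log | gzip -c >big.log.gz     # progress bar while compressing
      grep ERROR app.log | vipe | wc -l    # hand-edit the matches in $EDITOR first
    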

~~~
foreigner
Yes! I love pv. Besides that and tee, can anyone else suggest some more
general pipe tools?

~~~
olejorgenb
[http://joeyh.name/code/moreutils/](http://joeyh.name/code/moreutils/) has a
couple more:

* pee: tee standard input to pipes (`pee "some-command" "another-command"`)

* sponge: soak up standard input and write to a file

Though in zsh and bash you can recreate pee using tee: `tee >(some-command)
>(another-command) >/dev/null`

------
sudhirj
Sanjay Ghemawat (the other, less visible half of Jeff Dean) wrote a pipe
library in Go; I learnt quite a bit from it.

[https://github.com/ghemawat/stream](https://github.com/ghemawat/stream)

Edit: Jeff Dean, not James Dean

------
timvisee
Cool. The pipe must be one of the most essential things in Unix/Linux-based
systems.

I would have loved to see some awesome pipe examples though.

~~~
YesThatTom2
Ok, here are some example pipelines:

A simple virus scanner in one line of pipe:

[https://everythingsysadmin.com/2004/10/whos-infected.html](https://everythingsysadmin.com/2004/10/whos-infected.html)

And a bunch of pipe tricks that are oh so wrong but oh so useful:

[https://everythingsysadmin.com/2012/09/unorthodoxunix.html](https://everythingsysadmin.com/2012/09/unorthodoxunix.html)
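
And of course the classic of the genre, Doug McIlroy's word-frequency
pipeline, which prints the ten most common words of a document using six
small tools:

    
    
      tr -cs A-Za-z '\n' <input.txt | tr A-Z a-z | sort | uniq -c | sort -rn | sed 10q
    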

------
benj111
Why isn't the pipe a construct that has caught on in 'proper' languages?

~~~
duckerude
Many functional languages have |> for piping, but chained method calls are
also a lot like pipelines. Data goes from left to right. This JavaScript
expression:

    
    
      [1, 2, 3].map(n => n + 1).join(',').length
    

Is basically like this shell command:

    
    
      seq 3 | awk '{ print $1 + 1 }' | tr '\n' , | wc -c
    

(the shell version gives 6 instead of 5 because the trailing newline gets
translated into a trailing comma, but close enough)

~~~
OJFord
But each successive 'command' is a method on what's constructed so far; not an
entirely different command to which we delegate processing of what we have so
far.

The Python:

    
    
      len(','.join(map(lambda n: str(n + 1), range(1, 4))))
    

is a bit closer, but the order's now reversed, and then jumbled by the
map/lambda. (Though I suppose arguably awk does that too.)

~~~
duckerude
That's true. It's far from being generally applicable. But it might be the
most "mainstream" pipe-like processing notation around.

Nim has an interesting synthesis where a.f(b) is only another way to spell
f(a, b), which (I think) matches the usual behavior of |> while still allowing
familiar-looking method-style syntax. These are equivalent:

    
    
      [1, 2, 3].map(proc (n: int): int = n + 1).map(proc (n: int): string = $n).join(",").len
      
      len(join(map(map([1, 2, 3], proc (n: int): int = n + 1), proc (n: int): string = $n), ","))
    

The difference is purely cosmetic, but readability matters. It's easier to
read from left to right than to have to jump around.

~~~
ticklemyelmo
C# extension methods provide the same syntax, and they are used for all of
LINQ's pipeline methods. It's amazing how effective syntactic sugar can be
for readability.

------
msravi
awk, grep, sort, and pipe. I'm always amazed at how well thought out, simple,
functional, and fast the Unix tools are. I still prefer to sift through and
validate data using these tools rather than use Excel or any full-fledged
language.

Edit: Also "column" to format your output into a table.
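
For example:

    
    
      printf 'NAME SIZE\nalpha 10\nbeta 2000\n' | column -t   # aligns into neat columns
    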

~~~
gpderetta
Although I probably use it multiple times every day, I hate column. At least
the implementation I use has issues with empty fields and a fixed maximum
line length.

Edit: s/files/fields/

------
heinrichhartman
I have wondered for a long time why pipes are not used more often in
production-grade applications.

I have seen plenty of pipe use in bash scripts, especially for build and ETL
purposes. Other languages have ways to do pipes as well (e.g. Python
[https://docs.python.org/2/library/subprocess.html#replacing-shell-pipeline](https://docs.python.org/2/library/subprocess.html#replacing-shell-pipeline)),
but I have seen much less use of it.

It appears to me that for more complex applications one rather opts for
TCP-/UDP-/UNIX-domain sockets for IPC.

- Has anyone here tried to plumb together large applications with pipes?

- Was it successful?

- Which problems did you run into?

~~~
aidenn0
The biggest issue is that pipes are unidirectional, while not all data flow is
unidirectional.

Some functional programming styles are pipe-like in the sense that data-flow
is unidirectional:

    
    
      Foo(Bar(Baz(Bif(x))))
    

is analogous to:

    
    
      cat x | Bif | Baz | Bar | Foo
    

Obviously the order of evaluation will depend on the semantics of the language
used; most eager languages will fully evaluate each step before the next.
(Actually this is one issue with Unix pipes; the flow-control semantics are
tied to the concept of blocking I/O using a fixed-size buffer)

The idea of dataflow programming[1] is closely related to pipes and has
existed for a long time, but it has mostly remained a niche, at least outside
of hardware-design languages.

1:
[https://en.wikipedia.org/wiki/Dataflow_programming](https://en.wikipedia.org/wiki/Dataflow_programming)

~~~
JdeBP
Pipes are not unidirectional on FreeBSD, interestingly.

------
kradroy
I built an entire prototype ML system/pipeline using shell scripts that glued
together two python scripts that did some heavy lifting not easily reproduced.

I got the whole thing working from training to prediction in about 3 weeks.
What I love about Unix shell commands is that you simply can't abstract
beyond the input/output paradigm. You aren't going to create classes, type
classes, tests, etc. It's not possible, or not worth it.

I'd like to see more devs use this approach, because it's a really nice way to
get a project going in order to poke holes in it or see a general structure. I
consider it a sketchpad of sorts.

~~~
noir_lord
My backup system at work is mostly bash scripts and some pipes.

If you write them cleanly they don’t suck, and crucially for me, bash today
works basically the same way as it did 10 years ago and likely will in 10
years; that static nature is a big damn win.

I sometimes wish language vendors would just say ‘this language is complete;
all future work will be bug fixes and libraries’. A static target for
anything would be nice.

Elixir did say that recently, except for one last major change, which moved
it straight up my list of things to look at in the future.

~~~
anthk
- My IRC notification system is a shell script with entr, notify-send and
dunst.

- My mail setup uses NMH; everything is automated. I can MIME-encode a
directory and send the resulting mail in a breeze.

- GF's photos from IG are backed up with a Python script and crontab. Non-IG
ones are geotagged too, with a script. I just fire up some CLI GPS tools if
we hike some mountain route, and gpscorrelate runs on the GPX file.

- Music is almost all chiptunes; I haven't felt interested in any mainstream
music since 2003-04. I mirror a site with wget and it's done. If only they
offered rsync...

- Hell, even my podcasts are fetched via cron(8).

- My setup is CWM/CLI-based, except for mpv, emulators, links+ and vimb for
sites that need JS. Noice is my file manager, or the pure shell. find(1) and
mpg123/xmp generate my music playlist. Street View is maybe the only service
I use on vimb...

The more you automate, the fewer tasks you need to do. I am starting to avoid
even taskwarrior/timew, because I am almost task-free: I don't have to track
a trivial <5m script, and spt
[https://github.com/pickfire/spt](https://github.com/pickfire/spt) is
everything I need.

Also, now I can't stand any classical desktop; I find bloat in everything.

------
talkingtab
I'm not a Clojure person, but there are transducers,
[https://clojure.org/reference/transducers](https://clojure.org/reference/transducers).

~~~
bjoli
They are comparable in a way, but I'd say that pipes are more like the
threading arrow ->.

Transducers are composable algorithmic transformations that, in a way,
generalize map, filter and friends, but they can be used anywhere you
transform data. Transducers have to be invoked and handled in a way that
pipes do not.

Anyone interested should check out Hickey's talks about them. They are
generally a lot more efficient than chaining higher-order list processing
functions, and since they don't build intermediate results they have a lot
better GC performance.

------
roelb
Fully agree, pipes are awesome; the only downside is the potential duplicate
serialization/deserialization overhead.

Streams in most decent languages closely adhere to this idea.

I especially like how node does it; it's in my opinion one of the best things
in node. You can simply create CLI programs that apply backpressure, the same
as when you work with binary/file streams, while also supporting object
streams.

    
    
        process.stdin
          .pipe(byline())
          .pipe(through2(transformFunction))
          .pipe(process.stdout)

~~~
mweberxyz
Node streams are excellent, but unfortunately don't get as much fanfare as
Promises/async+await. A number of times I have gotten asked "how come my node
script runs out of memory" \-- due to the dev using await and storing the
entirety of what is essentially streaming data in memory in between processing
steps.

------
WhompingWindows
Pipes have been a game-changer for me in R with the tidyverse suite of
packages. Base R doesn't have pipes, requiring a bit more saving of objects or
a compromise on code readability.

One criticism would be that ggplot2 uses the "+" to add more graph features,
whereas the rest of tidyverse uses "%>%" as its pipe, when ideally ggplot2
would also use it. One of my most common errors with ggplot2 is not utilizing
the + or the %>% in the right places.

~~~
vharuck
I've always thought of ggplot2's process as building a plot object. Most steps
only add to the input.

Of course, Hadley admitted it was because he wrote ggplot2 before adopting
pipes into his packages.

------
adem666
Unix's philosophy of “do one thing well” and “expect the output of every
program to become the input to another” lives on nowadays in
"microservices".

------
stirner
> “Perhaps surprisingly, in practice it turns out that the special case is the
> main use of the program.”

This is, in fact, a Useless Use of Cat [1]. POSIX shells have the < operator
for redirecting a single file to stdin:

    
    
        figlet <file
    

[1]
[http://porkmail.org/era/unix/award.html](http://porkmail.org/era/unix/award.html)

------
jcelerier
If you want to see what the endgame of this is when taking the reasoning to
the maximum, look at visual dataflow languages such as Max/MSP, PureData,
Reaktor, LabVIEW...

Like always, simple stuff will be simple
([http://write.flossmanuals.net/pure-data/wireless-connections...](http://write.flossmanuals.net/pure-data/wireless-connections/static/PureData-DataFlow-sendreceive1-en.png))
and complicated stuff will be complicated
([https://ni.i.lithium.com/t5/image/serverpage/image-id/96294i...](https://ni.i.lithium.com/t5/image/serverpage/image-id/96294i47A75666708C8ECA?v=1.0)).

No silver bullet, guys, sorry. If you take the complexity out of the actual
blocks by having multiple small blocks, then you just put that complexity at
another layer in the system. Same for microservices, same for actor
programming, same for JS callback hell...

~~~
jancsika
> Like always, simple stuff will be simple
> ([http://write.flossmanuals.net/pure-data/wireless-connections...](http://write.flossmanuals.net/pure-data/wireless-connections...))

That is not actually simple because the data is flowing across two _completely
different message passing paradigms_. Many users of Max/MSP and Pd don't
understand the rules for such dataflow, even though it is deterministic and
laid out in the manual IIRC.

The "silver bullet" in Max/MSP would be to _only_ use the DSP message passing
paradigm. There, all objects are guaranteed to receive their input before they
compute their output.

However, that would make a special case out of GUI building/looping/branching.
For a visual language designed to accommodate non-programmers, the ease of
handling larger amounts of complexity with impunity would not be worth the
cost of a learning curve that excludes 99% of the userbase.

Instead, Pd and Max/MSP have the objects with thin line connections. They are
essentially little Rube Goldberg machines that end up being about as
readable. But they can be used to do branching/looping/recursion/GUI
building. So users typically end up writing as little DSP as they can get
away with, then use thin-line spaghetti to fill in the rest. That turns out
to be much cheaper than paying a professional programmer to re-implement
their prototype at scale.

But that's a design decision in the language, not some natural law that visual
programming languages are doomed to generate spaghetti.

Edit: clarification

~~~
jcelerier
> The "silver bullet" in Max/MSP would be to only use the DSP message passing
> paradigm. There, all objects are guaranteed to receive their input before
> they compute their output.

On the one hand, this simplifies the semantics (and it's the approach I've
been using in my visual language, [https://ossia.io](https://ossia.io)), but
on the other it tanks performance if you have large numbers of nodes... I've
worked on Max patches with thousands and thousands of objects - if they were
all called in a synchronous way, as is the case for the DSP objects, you
couldn't have as many; the message-oriented objects are very useful when you
want to react to user input, for instance, because they don't have to execute
nearly as often as the DSP objects, especially if you want low latency.

~~~
jancsika
That is certainly true. My point is that this is a drawback of the
_implementation_ of one set of visual programming languages, not necessarily
a drawback of visual programming languages.

I can't remember the name of it, but there's a Pd-based compiler that can take
patches and compile them down to a binary that performs perhaps an order of
magnitude faster. I can't remember if it was JIT or not. Regardless, there's
no conceptual blocker to such a JIT-compiled design. In fact there's a version
of [expr] that has such a JIT compiler backing it: the user takes a small
latency hit at _instantiation time_, but after that there's a big performance
increase.

The main blocker as you probably know is time and money. :)

------
kgr
I made a simulation and video which shows why the pipe is so powerful:
[https://www.youtube.com/watch?v=3Ea3pkTCYx4](https://www.youtube.com/watch?v=3Ea3pkTCYx4)
I showed it to Doug McIlroy, and while he thought there was more to the story,
he didn't disagree with it.

------
djhworld
I love writing little tools and scripts that use pipes; I've accumulated a
lot of them over the years and some are daily drivers.

It's a great exercise when learning a new programming language as well, since
the interfaces at the boundaries of the program are very simple.

For example I wrote this recently
[https://github.com/djhworld/zipit](https://github.com/djhworld/zipit) - I'm
fully aware you could probably whip up some awk script to do the same, or
chain some existing commands together, or someone else has written the same
thing, but I've enjoyed the process of writing it and it's something to throw
in the tool box - even if it's just for me!

------
wglb
This is cool and useful, but not all Unix programs follow this convention:

    
    
      * find
      * cal
      * vi
      * emacs
      * ls
    

These don't use one of standard input/standard output (edited) and are not
fully pipeable.

I don't recall seeing a list of programs -- tools, in the original
description -- that distinguishes between pipeable and non-pipeable programs.

Also, none of the corrective cat/grep code in these threads points out that
grep in fact takes file names, so "cat foo | grep stuff" is just a silly
no-op.

~~~
fooblat
Well, find, cal, and ls don't take input at all. And vi (well, vim) in fact
does: if you invoke it with a - for the file name argument, it will read
standard input into the buffer. I can't comment on emacs as I don't use it
much.

~~~
wglb
Ah, correct on that point.

What about standard output? That is, can vi be used in a pipe?

~~~
rmilk
I use vim as a visual pipe, or a pipe debugger with undo, when I need to
perform a series of transformations (grep/sort/cut/lookup/map/run another
data tool). Obviously it's geared towards text files, because it is vim.

The ! command sends the selection through external command(s) as STDIN and
then replaces the selection with STDOUT from the command(s). For example,
grep or sort, but it can be any command that works with pipes. The buffer is
replaced with the output (a sorted file, for example). Undo with u to go back
to the original data. Redo with Ctrl-R to go forward to the transformed data.
Command-line history is available to add more commands or make corrections
when you type ! again.

Edit a file. Select a block (Visual mode, shift-V) and type !, or use the
whole file with the gg!G command. Type in the commands you need to run.

Vim also reads stdin if you give - as the filename, like `ls -l | vim -`, so
we can use it at the end of a pipe instead of redirecting to a file.

Like I said, I use it as an interactive debugger to assemble pipelines and see
the results.

vi / vim filter commands:
[http://vimdoc.sourceforge.net/htmldoc/change.html#!](http://vimdoc.sourceforge.net/htmldoc/change.html#!)

------
MichaelMoser123
I think the Unix pipeline works because of this moldable and expressive text
substance that is exchanged. GStreamer also has pipelines, but the result in
my opinion is quite awkward because events of many types are exchanged.
Windows PowerShell also has a pipeline, where objects are exchanged, but it
somehow also failed to become a huge success.

I think the Unix pipeline concept doesn't quite scale to other domains that
try to exchange a different unit of information between the pipeline
elements.
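
For reference, gst-launch borrows the shell's pipe idea with ! as the
connector; a typical pipeline looks something like this (a sketch, with the
filename hypothetical):

    
    
      gst-launch-1.0 filesrc location=song.mp3 ! decodebin ! audioconvert ! autoaudiosink
    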

~~~
kingosticks
While I do agree they can sometimes be tricky to use, GStreamer pipelines are
also its best feature. Is there a better way?

~~~
MichaelMoser123
ffmpeg took a different approach (sort of a single program/system with many
command-line options). Some say it's easier to use (at least easier than
gst-launch).

~~~
kingosticks
I find the ffmpeg options dizzying whereas the single syntax for gst-launch
and parse_bin_from_description is pretty neat. But then I guess you've still
got a selection of randomly named properties to discover and correctly set.

~~~
MichaelMoser123
Occasionally they have events exchanged that can't be implemented with
gst-launch; you have to do it in C. Go figure.

------
j_b_s
I built a little tool to audit objects in your R pipeline here:

[https://jabustyerman.com/2017/09/29/auditing-your-r-data-pipeline/](https://jabustyerman.com/2017/09/29/auditing-your-r-data-pipeline/)

This basically allows you to tag (wrap) intermediate functions in your pipe,
and then inspect the objects generated by that function later.

------
timesntimes
In Unix, everything is a file. /dev/null is a file but not an executable, so
it can't even become a process, and you can't pipe anything into it. If you
consider that last sentence a "figure of speech", then you should probably
have avoided the "pipe" vs "redirect" usage and just said "send them to
/dev/null".

------
flexer2
One of my most-used tools with pipes on the Mac is `pbcopy`, which copies
whatever you pipe into it to the clipboard.

Copying your public key somewhere? `cat ~/.ssh/id_rsa.pub | pbcopy`

Pretty-printing some JSON you copied from a log file? `pbpaste | jq . |
pbcopy`

It's one of those little tools I use daily and don't think about much, but
it's incredibly useful.

~~~
saagarjha
And of course, its cousin pbpaste is also quite useful: pbpaste | clang -x c
-; ./a.out

------
yboris
I love pipes so much! In particular, I love pipes in Angular.

I gave a talk that includes a bit about pipes in Angular:
[https://youtu.be/Gv7IfU78vxw?t=578](https://youtu.be/Gv7IfU78vxw?t=578)

------
bonquesha99
Elixir/Unix style pipe operations in Ruby!
[https://github.com/LendingHome/pipe_operator](https://github.com/LendingHome/pipe_operator)

------
agumonkey
Have there been proposals/research on a non-linear / branching pipe syntax?

    
    
        a |1 b | c | d, |1 e
    

i.e. piping a to both the b|c|d and e branches?
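
In bash/zsh you can get close today with tee plus process substitution, e.g.:

    
    
      # a's output is duplicated: one copy feeds b | c | d, the other feeds e
      a | tee >(b | c | d) | e
    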

~~~
masklinn
Smalltalk's "method cascading" (the `;` operator) did roughly that. I don't
think you can fork an entire sequence of methods though; it just allows
performing operations on the same root object, e.g.

    
    
        a foo
        ; bar
        ; baz
    

would sequentially send "foo", "bar" then "baz" to "a", then would return the
result of the last call.

The ability to fork iterators (which may be the semantics you're actually
interested in) also exists in some languages, e.g. Python (itertools.tee) or
Rust (some iterators are clonable).

~~~
agumonkey
I see... but I wonder if Smalltalk syntax allows for more than methods:

    
    
        a (foo duh meh) <- classic composition here
        ; bar
        ; baz

~~~
masklinn
I don't think the first line is even valid; it translates to

    
    
        a.(foo.duh.meh)
    

in e.g. ruby.

------
elchief
I wish there were an operator to allow a branch and then a join: send the
output to a lookup, do the lookup, join the results.

You can do it with named pipes, but an operator would be nice.
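
For reference, a named-pipe sketch of that branch-then-join (lookup-cmd is a
hypothetical stand-in that emits one output line per input line):

    
    
      mkfifo /tmp/branch
      producer | tee /tmp/branch | stdbuf -oL lookup-cmd | paste /tmp/branch -
      # paste joins each original line with its lookup result, side by side
    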

~~~
JdeBP
GNU awk co-processes are another way to do this.

~~~
elchief
I'll look into it thx

------
TuringTest
It's sad that developers have had access to pipes for decades, yet the most
similar feature they provide to end users is still copy/paste.

------
JustSomeNobody
Pipes are awesome. Even in more general programming, the pipes-and-filters
pattern is very important, and yet a lot of devs still don't know it.
------
peter_retief
I never really associated the Unix philosophy with Unix pipes; now it seems
so obvious.

------
zshrdlu
Pedantic, but: you don't pipe to /dev/null, you _redirect_ to it.

------
Kagerjay
Piping is fairly common when working with databases and importing CSV files.
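
For example, streaming a compressed CSV straight into Postgres without a
temporary file (table and file names hypothetical):

    
    
      gunzip -c data.csv.gz | psql mydb -c '\copy people FROM STDIN WITH CSV HEADER'
    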

------
josemober
Fantastic article!

------
ggm
Danny boy, the pipes, the pipes are calling!

(I love pipes too btw)

------
aaaaaaaaaab
If only there was some standard format to interchange structured data over
pipes, other than plain text delimited with various (incompatible)
combinations of whitespaces, using various (incompatible) escaping schemes.

~~~
Kipters
That's one of the design goals of PowerShell: you don't pass streams of text
between cmdlets, you pass objects, which have a type, properties and methods
you can use in standard ways (they may also be strings)

~~~
peterwwillis
A while back I made a PoC that used pipes to create bidirectional
JSON-speaking connections between applications and then distribute the
communication between distributed nodes. The idea being: don't just
distribute TCP streams between distributed processes, but give 'em objects,
to make data exchange more expressive. I don't know where my PoC code went,
but it wasn't very stable anyway. I just figured it was something we should
have adopted by now (and that Plan 9 probably natively supports)

------
lincpa
The Pure Function Pipeline Data Flow

[https://github.com/linpengcheng/PurefunctionPipelineDataflow](https://github.com/linpengcheng/PurefunctionPipelineDataflow)

Using the input and output characteristics of pure functions, pure functions
are used as pipelines; a dataflow is formed by a series of pure functions in
series. A dataflow code block acts as a function, equivalent to an integrated
circuit element (or board). A complete integrated system is formed by serial
or parallel dataflows.

The data flow is the current, a function is a chip, a threading macro (->>,
->, etc.) is a wire, and the entire system is an integrated circuit that is
energized.

~~~
jacquesm
Yes, there is a strong connection between pipes and functional programming:
both transform what passes through them and, when implemented properly,
should not retain state.

~~~
kitd
They also both rely on very generic data types as input and output, which is
why I was surprised to see the warning about avoiding columnar data. Tabular
data is a basic way of expressing data and its relationships.

------
alias_neo
Anyone else disappointed that this wasn't some in-depth, Docker-related pipe
usage post?

For those of us who know who Jess is, the post is a little lacking. I went
there to read about some cool things she's doing with pipes, but was
disappointed by a post that contained nothing.

~~~
Insanity
I disagree - the pipe is trivial, at least to those of us who have used Linux
(or Unix) based systems for a long time.

But to people who follow her and are rather new to this way of doing things,
this could be a good read.

Either way, it is nice to be reminded of the Unix philosophy of chaining
tools together, for junior and senior alike.

Edit: typo fixes

~~~
alias_neo
Fair enough. I'm a little surprised at the response. I didn't expect that
many people here aren't already intimately familiar with pipes, and for those
who aren't *nix users, I'm not sure the content is of interest.

I guess being honest around here is the wrong thing.

~~~
Insanity
For what it's worth, I had no issue with your opinion, and I didn't downvote
either of your comments.

And I also didn't upvote the article, because it is indeed quite basic.
Though apparently enough people found it interesting enough for it to make
the front page ^^

