https://www.slideshare.net/divyekapoor/linux-kernel-implemen...

Long story short: pipes and FIFOs are implemented on a virtual pipefs, and an internal 64K buffer is used to hold the data in memory while it is transferred between processes. Locks on the VFS inodes on pipefs are used for synchronization across threads/processes.
(Full disclosure: I'm the author; this work was done as part of my Master's degree and discusses pipes and FIFOs as implemented on pipefs in a kernel from around 2011.)
Hope this is interesting to the people on the thread.
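(For anyone curious about that 64K figure: here's a minimal C sketch, assuming Linux, that asks the kernel for a pipe's capacity with F_GETPIPE_SZ; the constant is Linux-specific, available since 2.6.35, and needs _GNU_SOURCE.)

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int p[2];
        if (pipe(p) < 0) { perror("pipe"); return 1; }

        /* Ask the kernel how much this pipe can buffer before writers block. */
        int cap = fcntl(p[1], F_GETPIPE_SZ);
        if (cap < 0) { perror("fcntl"); return 1; }

        printf("pipe capacity: %d bytes\n", cap);  /* typically 65536 on Linux */
        return 0;
    }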
Interesting. What exactly did you do for your Master's project? Did you contribute code to the Linux kernel, or was it literature work (such as making a presentation about the code)?
Can you elaborate on how to identify the filename for a pipe in procfs or sysfs in a simple "echo hello | wc -c" example?
(There's an interesting note in Documentation/filesystems/vfs.txt about how pseudo-filesystems like pipefs can generate these names only when someone asks for them, since they're not used for anything otherwise. The only way I know of to ask for the name in this case is to call readlink() on the pipe fd under /proc/$pid/fd, as ls does.)
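(To make that concrete, here is a rough C sketch of the readlink() approach on Linux: the target of /proc/self/fd/N for a pipe fd is the generated pipefs name.)

    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int p[2];
        char path[64], target[64];

        if (pipe(p) < 0) { perror("pipe"); return 1; }
        snprintf(path, sizeof path, "/proc/self/fd/%d", p[0]);

        /* ls and readlink(1) do the same thing on /proc/$pid/fd entries. */
        ssize_t n = readlink(path, target, sizeof target - 1);
        if (n < 0) { perror("readlink"); return 1; }
        target[n] = '\0';

        printf("%s -> %s\n", path, target);  /* e.g. "pipe:[123456]" */
        return 0;
    }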
If I remember correctly, this work was an end-of-semester presentation for an advanced OS/networking class. I picked pipes; others picked the filesystem, device drivers, CUDA, process scheduling, etc. For the master's thesis, my work was on active indoor localization and tracking (early work before indoor Google Maps was launched).
Re: your second question, I guess you're referring to identifying a FIFO on the filesystem. That's a simple ls -la: FIFOs show up with the 'p' file type in the mode string. For procfs or sysfs, just ls /proc or ls /sys.
You should see the 'p' type for FIFOs on the filesystem. Sysfs and procfs map onto internal (in-memory) kernel data structures, so their entries may just show up as regular files in the "virtual" filesystem. So cat, grep, etc. on these files are just reads/writes from/to the appropriate memory in the kernel, or, for read-only files, they may be "code-generated output" that allows inspection of some internal kernel state.
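(A quick sketch of the FIFO side of that, using a hypothetical /tmp path: mkfifo(3) creates the node, and stat() reports it as S_ISFIFO, which is what ls renders as the leading 'p'.)

    #include <stdio.h>
    #include <sys/stat.h>
    #include <sys/types.h>

    int main(void)
    {
        const char *path = "/tmp/demo_fifo";   /* hypothetical path for the example */
        struct stat st;

        /* Create a named pipe; "ls -l" shows it with 'p' as the file type. */
        if (mkfifo(path, 0644) < 0) { perror("mkfifo"); return 1; }

        if (stat(path, &st) == 0 && S_ISFIFO(st.st_mode))
            printf("%s is a FIFO\n", path);
        return 0;
    }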
Thanks for adding context, that wasn't clear at all to me; it seemed like the classic "read the headline and a few comments, then post 'smart' stuff the original article already covered" response to me.
Now I'm confused and curious as to what the ‘library filing scheme’ means. Does it mean shared libraries? IIRC Unix stuff was statically-linked at first. And then again, what are ‘indexing’ and ‘data path switching’?
+1 for Kernighan's book. I'm not one for impulse purchases, but when I saw a link on HN, I immediately bought it and read through it. I lent it to a professor of mine and he's using it as a text for his history course next semester.
Pipes are byte streams. Communication between actors, as well as over CSP channels, consists of object streams.
Pipes exercise backpressure on the writer - if the pipe is full the writer is blocked. Actor systems mostly use seemingly unbounded queues. The sender will not get blocked.
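(A small C sketch of that blocking point, assuming Linux: with O_NONBLOCK the writer gets EAGAIN instead of sleeping, which makes the capacity easy to observe.)

    #define _GNU_SOURCE
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int p[2];
        char chunk[4096] = {0};
        size_t queued = 0;

        /* Non-blocking, so the write loop returns EAGAIN instead of sleeping. */
        if (pipe2(p, O_NONBLOCK) < 0) { perror("pipe2"); return 1; }

        /* With a blocking descriptor this is the point where the writer would
         * be put to sleep until the reader drains the pipe. */
        for (;;) {
            ssize_t n = write(p[1], chunk, sizeof chunk);
            if (n < 0) break;
            queued += (size_t)n;
        }
        if (errno == EAGAIN)
            printf("pipe full after %zu bytes\n", queued);  /* 65536 by default */
        return 0;
    }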
> Pipes exercise backpressure on the writer - if the pipe is full the writer is blocked. Actor systems mostly use seemingly unbounded queues. The sender will not get blocked.
Erlang does suspend (block) processes that send to ports or nodes when the buffers for those get full, but not when sending to local processes. There used to be an optional reduction-count penalty for senders when sending to processes with large mailboxes, but it seems that may have been removed. I don't think it would be too hard to add a feature where sending to a local mailbox over a specified size caused the sender to be suspended, but tracking it might be a little difficult.
TL;DR: McIlroy was applying the concept of coroutines, which was described by Melvin Conway in 1963. Two processes communicating over a pipe are basically coroutines, except instead of passing structured data they're just passing bytes.
A Unix pipeline is a set of multiple coroutines. In fact, Tony Hoare's 1978 Communicating Sequential Processes paper cites the UNIX shell[1] for the concept of coroutines, and discussion of coroutines figures prominently in that paper. See https://www.cs.cmu.edu/~crary/819-f09/Hoare78.pdf CSP basically models the behavior of a large set of coroutines.
AFAIU, Erlang was partly inspired by CSP. You can draw a straight line from coroutines, through Unix pipelines and CSP, to Erlang's processes.
Modern pipes have to serve many lords. Besides bog-standard stdio redirection, they act as MPMC queues of semaphore tokens in build systems[0] and as handles to kernel-owned io-vecs for zero-copy DMA via sendfile, splice and friends.
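(Roughly what that zero-copy path looks like from user space, as a sketch: splice(2) requires a pipe on one side, so a file is spliced into a pipe and the pipe into stdout (or a socket) without the data passing through a user-space buffer.)

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        int p[2], in;

        if (argc < 2) { fprintf(stderr, "usage: %s file\n", argv[0]); return 1; }
        in = open(argv[1], O_RDONLY);
        if (in < 0 || pipe(p) < 0) { perror("setup"); return 1; }

        for (;;) {
            /* file -> pipe: the pipe ends up referencing page-cache pages */
            ssize_t n = splice(in, NULL, p[1], NULL, 65536, SPLICE_F_MOVE);
            if (n <= 0) break;

            /* pipe -> stdout (or a socket): still no copy through user space */
            while (n > 0) {
                ssize_t m = splice(p[0], NULL, STDOUT_FILENO, NULL, n, SPLICE_F_MOVE);
                if (m <= 0) return 1;
                n -= m;
            }
        }
        return 0;
    }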
That commit is a fascinating illustration of how pipes allow a model of inter-process interaction that goes beyond what you can do with temporary files, as well as how much further things have evolved from the days of the "very conservative locking" in readp/writep from the 6E Unix kernel.
Nice write-up. The logic of the sleep and wakeup code, which explicitly passes control from writer to reader and back again, clearly shows how pipes led to the more refined Communicating Sequential Processes (CSP) concept.
> whose troff dialect still underlined words with a string of literal ^H backspaces followed by underscores!
... which is still how roff tools do it today. Manual formatters still send, even today, TTY Model 37 style input from 1969 to the manual pager: underlining with BS and the underscore character, boldface with BS and overprinting, and bullet points formed by printing a plus sign over the letter "o"; all of which less/more/pg/most have to recognize (but ironically actually do not).
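(The encoding itself is easy to reproduce; here is a small C sketch that emits the same overstrike sequences, which you can pipe into less to see the pager's interpretation, or into col -b to strip them.)

    #include <stdio.h>

    int main(void)
    {
        const char *word = "pipe";

        /* Underline: "_ BS char" for every character, as nroff emits it. */
        for (const char *c = word; *c; c++)
            printf("_\b%c", *c);
        putchar(' ');

        /* Boldface: "char BS char", i.e. overprinting the same character. */
        for (const char *c = word; *c; c++)
            printf("%c\b%c", *c, *c);
        putchar('\n');
        return 0;
    }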
The relatively modern (1976!) capabilities of GNU groff were deliberately turned off at the turn of the 21st century.
By the way: file-based pipes were later created on a specific "pipe device", whose device number was configured by the /etc/config program and was not necessarily the root device.
> Manual formatters still send, even today, TTY Model 37 style input from 1969 to the manual pager
Thanks, you are perfectly right. What I found charming was that the 3E pipe.2 manual page contains «word^H^H^H^H____» written out by hand in the roff _source_. The 4E one switched to using ".it word" instead.
So do modern Linux pipes work bi-directionally? Of course the shell doesn't use them like that, but that does not necessarily mean the kernel wouldn't support it. I vaguely remember that a colleague used a pipe bi-directionally between two C programs he wrote. To my surprise it mostly worked. IIRC there were some minor issues that made him give up the approach. The big surprise for me was that it worked at all. Or is it just racy beyond all control, in that you can read back your own data if the other end happens not to have emptied the buffer?
No, they do not (nor on the modern BSD kernels, as far as I can tell). The Linux pipe(7) manpage says (under "Portability notes"):
«On some systems (but not Linux), pipes are bidirectional: data can be transmitted in both directions between the pipe ends. POSIX.1 requires only unidirectional pipes. Portable applications should avoid reliance on bidirectional pipe semantics.»
I believe the systems that supported bidirectional pipes were SysV kernels that implemented pipes using STREAMS and 4(?)BSD kernels that implemented it using socketpair.
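(For comparison, a minimal sketch of the socketpair() flavour, which is what gave those BSD pipes their bidirectional behaviour: both ends can be written and read.)

    #include <stdio.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        int sv[2];
        char buf[8];

        if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) { perror("socketpair"); return 1; }

        write(sv[0], "ping", 4);
        read(sv[1], buf, sizeof buf);   /* data flows one way... */
        write(sv[1], "pong", 4);
        read(sv[0], buf, sizeof buf);   /* ...and back the other way on the same pair */

        printf("%.4s\n", buf);          /* prints "pong" */
        return 0;
    }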
That said, it is not based on socketpair any more. sys/kern/sys_pipe.c says:
/*
* This file contains a high-performance replacement for the socket-based
* pipes scheme originally used in FreeBSD/4.4Lite. It does not support
* all features of sockets, but does do everything that pipes normally
* do.
*/
Yes, xv6 is great. The original post here already has a brief section on the code from xv6/pipe.c, which made for pleasant reading after I had just finished working my way through the 6E code.
I also looked at the pipe implementation in Minix, which is a (non-trivial) variant of John S. Dyson's implementation that the BSDs share. It is implemented as a server (in the microkernel sense), so there's quite some added complexity there in handling "vmount"s and locking, but there are still some familiar elements of the code too, such as the "put it all together with flags" code in create_pipe().
For something even further along these lines, there's also the pipe implementation from Plan9, which at first glance felt so unfamiliar that I wasn't sure I was looking in the right place:
And it has a pipe implementation, which is saying something given how ascetic it is. It only has a couple dozen syscalls in total, and is smaller than seL4, at least in lines of code.
I'd greatly appreciate pipes in the next version of Python. Pandas is huge now, and the R programming language has implemented this for its dataframes. There are Python packages for this, but they were buggy for me.
What would that mean? Just extra syntax for passing the output of one iterator to a function that accepts an iterable as input? Like, syntactic sugar for nested calls such as qux(bar(foo(x)))?
The %>% notation might not be very short, but there are other short ways of avoiding the parenthesis mess in foo = qux(bar(foo(x))). For instance, the Mathematica/Wolfram language lets you write basically x // foo // bar // qux with its postfix // operator.