
Ask HN - Why was there ever, fork() ? - billpg
Hi everyone. I'd like to please figure out something that I don't quite understand. Why did we ever have the fork function call?

Back in my youth when I was first taught the fork call, I got how it worked: two copies of the process return from the function call and they both continue in parallel. The only thing that bothered me was the long list of caveats my textbook discussed.

It told me that a copy of the process was made, except for file handles, and that the memory wasn't really copied until one of the two processes tried to modify something. It seemed all terribly complicated, but I figured there was a good reason for it that I didn't yet understand, grasshopper.

Time passed and I started working in embedded systems and later coding for Windows. I never used fork beyond those juvenilia programs I made. These OSs started new processes by passing an executable filename to the OS and telling it to start a new program. That new process started with a clean slate: no memory, empty stack, no open file handles except stdin/out/err. Simple.

Now, I've just been reminded of the fork call in Unix, and I'm prompted to ask: why was it ever there? Who _wants_ the ability to do a fork when simpler ways of starting a new process exist?

Nearly all the uses of fork I've seen are followed by an exec call. So the OS goes to all that trouble setting up a duplicate process only for all that hard work to be thrown away by running exec.

Even when concurrency within a program is needed, the thread model seems far more useful with a lot less complications.

So please, I need to know: why fork?
======
jacquesm
Because a 'fork' is the most natural way for one process to transform itself
into another process while continuing to run by itself.

I actually think it is one of the most elegant system calls in unix.

Think of all the alternative clunky ways that OS's before unix had to use to
start a process at a given point in the program. Lots of flags to make sure
that you started off where you left off in the 'parent', to recreate all or
most of the state required for the child process. Fork passes all that state
'for free'. And copy-on-write makes it fast.

It's a bit like biology. Split the cell, then let them both specialize a bit
towards what they have to become. The moment of splitting is almost 100%
symmetrical, the only difference being who is the 'parent' and who is the
'child' process.

Other ways of starting new processes feel clunky in comparison: you have to
specify a binary to run, you have to know all kinds of details about
parameters to pass, and so on.

Fork essentially abstracts the creation of a sub-process to the absolute
minimum.

Fork is atomic, it's got 0 parameters and it returns only one integer (or it
fails for a simple reason, no more process slots).

~~~
cousin_it
Needing an exact copy of the current process is an atypical use case in my
experience. And the typical use case of running another binary isn't made any
more elegant by fork(), it just shuffles the complexity into exec().

~~~
nwatson
Here's where using fork() without an ensuing exec() is very natural ...

A server process that must reply to hundreds of network requests per second
but still remain single-threaded pretty much needs fork(). The persistent
parent process sets up a common configuration, accepts incoming network
requests, and calls fork() (which is very fast) to do the real work for each
request in a child process.

That child's work involves looking up relevant data or recording relevant data
or doing a transaction that may involve only local memory, the filesystem, a
DB, other processes, or other network services. The child must also reply to
the network client. Doing this all in the single-threaded parent would
preclude it from handling other requests and make it impossible to respond to
hundreds of clients (unless the parent uses complicated asynchronous
processing and the per-request work is mostly I/O and low on CPU ... but that
complicates the server process). Starting a separate process per client
instead (as in CreateProcess() or spawn()/fork-exec()) complicates the
architecture and is very expensive because all info needed for the client
reply can often be housed in the original parent process and inherited by the
child (e.g. locally derived/cached DNS results). fork() leads to the simplest
and most efficient architecture.

~~~
bayareaguy
_the simplest and most efficient architecture_

Actually I think in most cases the simplest and most efficient architecture is
when your server creates a process for each specific type of thing it needs to
do and then simply delegates incoming requests.
<http://www.okws.org/doku.php?id=okws> is a good example.

~~~
jerf
Ah, but now it's a lot harder to share anything that isn't simply coming out
of a shared library; pre-cached computations, any code that isn't in a shared
library (like Perl or other interpreted code), any expensive computation that
must be done per-process, etc. Forking off children is virtually impossible to
beat on the efficiency front; people tend to grossly overestimate the expense
of a "fork".

The only way to win with the approach you suggest is if you have some sort of
massively complicated server that you only ever need some small part of at any
given point in time, allowing you to do a lot of swapping in and out of
memory. I've never seen such a beast and can't really come up with a non-
contrived use case. YMMV, but it certainly isn't a common case.

(Yes, Perl isn't exactly interpreted, but from this point of view it certainly
isn't a shared binary library.)

~~~
bayareaguy
Although I believe you're wrong about the actual utility of this kind of data
sharing across processes (and are also making exaggerated claims about
complexity), I might change my mind if you provided a specific real-world
reference example.

------
jbert
Forking predates threads. It also makes more sense as a system call, since a
library writer can implement spawn() in terms of fork() and exec() but not
vice versa.
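That layering can be sketched in a few lines; `spawn_and_wait` is a hypothetical helper name, not a standard API:

```c
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] with arguments argv, wait for it, and return its exit
   status (or -1 on error). A library-level spawn() built from the two
   primitives; the reverse construction isn't possible. */
int spawn_and_wait(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {                    /* child: replace ourselves with the program */
        execvp(argv[0], argv);
        _exit(127);                    /* only reached if exec failed */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)  /* parent: wait for the child */
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```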

It's also still a good model when you want to run N copies of the same code
concurrently, since the processes are isolated from each other, making it
easier to reason about correctness. There are some wrinkles (primarily due to
inherited global state such as filehandles), but they're reasonably well
understood (by Unix+C programmers).

If you need significant shared state between concurrent paths of execution in
the same code, then threads are probably easier.

~~~
billpg
> Forking predates threads.

Suddenly, everything becomes clear.

~~~
jacquesm
Maybe clearer than it should be then :) You would _still_ need something like
fork, even if unix had had threads from day 1. It's just that we wouldn't
have been using fork+communication to simulate multi-threading in those cases
where multi-threading is more appropriate.

------
vinutheraj
On a related note, the Plan 9 system call rfork() lets you create new
processes or lightweight processes (threads) with a single system call,
deciding which resources are to be shared between the new processes. There
are no two discrete entities called processes and threads, just processes!

 _"In Plan 9, fork is not a system call, but a special version of the true
system call, rfork (resource fork) which has an argument consisting of a bit
vector that defines how the various resources belonging to the parent will be
transferred to the child. Rather than having processes and threads as two
distinct things in the system, then, Plan 9 provides a general process-
creation primitive that permits the creation of processes of all weights."_ -
Rob Pike.

You can read more about it here -
[http://groups.google.com/group/comp.os.research/browse_threa...](http://groups.google.com/group/comp.os.research/browse_thread/thread/a7f0d27ad2af5a26/c9d4071bf0e2ae4c?lnk=gst&q=Rob+Pike&rnum=10)

~~~
jacquesm
Plan 9 still is in many ways the most interesting thing to happen to the world
of systems design after unix.

It deserves more attention.

------
jgrahamc
fork() is commonly used where you want to communicate to the sub-process using
a pipe. Given how common piping from one process to another is in Unix it's
not surprising that fork() was implemented (nor that fork then exec was
common).

It's true that Windows tends not to use this paradigm, but it is common in the
Unix world specifically because of the simple ability to share file handles
between a parent and child process. And also the parent and child processes
share most everything else (for example, they have the same environment
settings).

------
DougWebb
fork() is great when you're writing a service. A pattern that I've used
repeatedly is:

- The initial process reads in configuration files, sets up an environment,
and opens a listening socket for the service. It then forks several times to
create service processes. From this point, the initial process' only job is
to start new service processes when/if they exit or when load increases, and
to shut down the whole service when told to.

- The service processes run in a loop waiting for requests to come in on the
socket, which they all share. The service can handle as many concurrent
requests as you've got service processes, and they all operate independently.
Thanks to copy-on-write, they all access the same configuration information
stored in the initial process' memory. When a request comes in, the service
process accepts it (which creates a new socket) and does some initial sanity
checking to make sure it's a valid request, and then forks to create a handler
process to actually process the request. It then goes back to listening for
requests.

- The handler process is the workhorse. It gets the connection socket from
its parent, and it's still got access to all of the config info. It's an
independent process, so it's free to do whatever it needs to, without risk of
impacting the continued operation of the service. Once it's done handling the
request it can simply exit, freeing up whatever resources it consumed while
handling the request.

In this pattern, the initial process and service processes have very simple
jobs and very little code, which makes them easier to make bug-free and
robust. Having lots of independent processes instead of threads adds
robustness, because a crashing process can't take down the other processes in
the service (unless it takes the whole machine down, of course.) This is
rarely a problem in the initial or service processes, but the handler
processes are exposed to the world and are much more likely to encounter
unanticipated input, so they're the hardest to make robust. With the pattern,
they don't need to be as robust, because they're allowed to exit unexpectedly
without harming the service.

~~~
billpg
That right there, the perfect use case for fork. Spinning a task off to do its
thing in isolation.

------
barrkel
fork() is strictly superior to an API like Win32 CreateProcess(), because it
can do more with less.

Processes normally inherit lots of context from their parent: the user
identity, the window station (Win32-speak), security capabilities, I/O handles
/ console, environment variables, current directory, etc. The most logical way
to inherit everything is to make a logical copy, which is very cheap owing to
the memory architecture.

Because of this, things that would normally need two APIs, one synchronous and
one asynchronous, can be programmed easily. If you need the synchronous
version, call it directly; otherwise, fork and call it, and wait on the pid
(at a later point) if necessary in the parent.

And I rather vehemently disagree with you saying that the threading model has
less complications than the process model. I believe there's almost universal
agreement that the problem with threading is mutable shared state, and the
process model avoids it.

~~~
thwarted
To see some of the hoops that need to be jumped through to emulate fork() on
systems that don't have it, and the limitations of doing so, check out the
perlfork man page.

<http://perldoc.perl.org/perlfork.html>

------
rtm
fork() and exec() work well as separate system calls for the common situation
where the child (but not the parent) needs to adjust something before
executing the new program. Changing file descriptors to implement > and < in
the shell, for example. It's common to see sequences like

      pid = fork();
      if(pid == 0){
        close(1);
        dup(pipefds[1]);
        exec(...);
      }
~~~
caf
Aye. This way, we can control a large number of aspects of the child's
environment in which execve() is called, without having to have execve() do
all that work for us. We can open files, change the session id, reparent the
process to init, alter environment variables, lower process limits, change
credentials, change the root directory... the possibilities are legion.

You wouldn't want to have to design a way to pass all of those things to
execve(), would you?

------
Erwin
An interesting question. I don't know of the design decisions or whether the
fork idea predated UNIX. But to me, it's just the sheer simplicity. Compare to
the basic process creation function on Windows, taking 10 arguments:
<http://msdn.microsoft.com/en-us/library/ms682425(VS.85).aspx>

Another useful application of the implicit descriptor sharing is
<http://en.wikipedia.org/wiki/Privilege_separation>

~~~
cousin_it
Except fork() doesn't do what CreateProcess() does. I'd actually be pretty
interested to see a fair comparison: a list of Unix system calls, with
arguments, that covers all the functionality of CreateProcess(). Are you
completely sure that it'll be smaller and more elegant?

~~~
abrahamsen
Well, it is certainly the UNIX way. Have many simple tools that do simple
stuff, plus the ability to combine them for doing complex operations, rather
than few complex tools that do complex operations directly.

I'd say it is a more elegant design, whether or not it results in smaller or
more elegant code.

------
yan
> So the OS goes to all that trouble setting up a duplicate process only for
> all that hard work to be eliminated by running exec.

That's just it: creating a process that's an exact copy is the path of least
resistance. Due to the way the VM system works on most modern hardware, it's
much cheaper to create an exact copy of a virtual address space (you're just
copying page-table entries and marking them copy-on-write) than it is to
build a brand new one.

------
gcv
Samba uses fork to create new instances of itself to handle each connected
user, and I am often grateful that it does that instead of using threads.
Since each user has a separate process, something going wrong inside the
process means that the process dumps core, but none of the other users ever
notice. Even the user whose smbd process crashed doesn't notice much except a
brief delay while he reconnects.

No real need to monitor (except to try to catch the bug that caused the crash
in the first place), and no need to manually restart anything. It all just
keeps going.

Basically, a server process using fork has a lot of resilience built in. In
contrast, a crash in a threaded process will kill all the threads at once, and
all users feel the pain.

------
vii
fork is really handy when you actually want multiple copies of your program
(sharing only initial state) -- it can be used like creating a working thread,
but you don't have to worry about sharing (subsequent) state. This means you
can exploit multiple cores without the complexity (and bugs) of shared memory
threads.

Actually, fork(2) has now evolved into clone(2) on Linux, so you can choose in
quite a fine grained way what the threads/processes will share.

The separation of the functionality of spawn between fork and exec is
surprisingly handy (even though people occasionally still come up with
vfork(2)).

------
kniwor
Here is a use case... We have a shell that must execute

      $ cmd_a | cmd_b | cmd_c

The simplest way for the shell to accomplish this request is to fork itself
multiple times. Doing so without fork would be difficult. I figure that since
multitasking and pipes are as old as eternity in the Linux world, fork must
have been an early necessity, and this use case might have something to do
with their prominence, but then again I am just guessing.

~~~
billpg
If you'll excuse the pseudo code...

      Pipe pipe1 = new Pipe()
      Pipe pipe2 = new Pipe()
      NewProcess("cmd_a", null, pipe1)
      NewProcess("cmd_b", pipe1, pipe2)
      NewProcess("cmd_c", pipe2, null)

~~~
kniwor
You are not passing the current shell state, though. So you could start 3 new
shell processes with enough data to set the state right and then start the
individual processes, but that is both inelegant and inefficient.

Instead if we fork thrice and each fork execs a command and does the pipeline
plumbing, all three processes start simultaneously and inherit the exact same
shell state all for free. And copy on write means we did not waste any memory
replicating the shell state for 3 processes.

------
tybris
Servers & Shells.

i.e. the things Unix systems are good at, but Windows systems are not.

------
clord
Epic troll. Bravo.

But seriously: fork(2) is natural and mathematical. There is no IO involved.
That is to say, when you call it, you don't have to activate some spinning
mechanical thing and wait several million or billion cycles while it clatters
and bumbles along, filling the higher caches with code and data.

fork is blazing fast; effectively an O(1) operation. It's just about as light-
weight as process creation can get.

fork is useful. It allows one to manage complicated families of processes,
complete with pre-fork and post-fork activity. Threads can't match it here.
The only thing I can think of that surpasses the multiprocessing capability of
a forking process is modern async IO. And then you have to implement all the
management stuff by hand.

With all due respect, someone who claims to have embedded experience shouldn't
have to ask hacker news about the benefits of fork, unless your embedded
experience is all on Windows Mobile and its ilk, where CreateProcess rules the
day.

~~~
billpg
> With all due respect,

"Epic troll. Bravo."

> someone who claims to have embedded experience shouldn't have to ask hacker
> news about the benefits of fork, unless your embedded experience is all on
> Windows Mobile and its ilk, where CreateProcess rules the day.

Why do I feel an urge to defend my experience to someone who openly insults
me? I should just walk away.

(sigh)

I've worked on OS-less systems, where we have a short bit of assembly to pass
control over to the "Main" function written in C. We implement concurrency by
hooking into the timer-tick interrupt. I'd describe those as having two shared
memory threads, one pre-emptive and one co-operative.

As well as that, I've used pSOS and VxWorks. These have pre-emptively switched
processes, but I'd describe them as threads, as they share memory and have no
protection between them. There's no memory management or virtual memory: load
memory location 42 and 42 goes out on the address bus.

~~~
clord
I'm not trying to insult, but on re-reading I guess one can see it that way,
what with the troll remark and all. You have to admit your post had most of
the effects your typical internet troll's would.

But on topic; your original post suggests that there are "simpler ways of
starting a new process" and that using threads "seems far more useful with a
lot less complications."

I think this is wrong on both counts. There is no simpler way to start a
process, and using threads leads people towards manually reproducing many of
the things fork provides for free, leading to more complicated and difficult
to understand code.

I understand when the average programmer misunderstands fork, but systems
programmers should know better. Since your experience is at the hardware
level, and not operating systems, it makes more sense that you're not aware of
the advantages of fork. But I still cannot fathom what you consider simpler
than fork. Perhaps your definition of process creation differs from mine, and
most others'? I'd like to understand more, in any case.

------
DarkShikari
_Nearly all the uses of fork I've seen are usually followed by an exec call.
So the OS goes to all that trouble setting up a duplicate process only for all
that hard work to be eliminated by running exec._

<http://en.wikipedia.org/wiki/Copy-on-write>

~~~
cousin_it
The OP was asking why we need to copy the state at all, not how to optimize
it.

~~~
jacquesm
Because plenty of times what a child will do depends on what the parent was
doing just before the fork, and in fact it may simply be a bit of code to run
a background task related to the parent's foreground task.

Also, and this is a very important bit I think, fork started out before
'threads' were common, so another process to run the same code was a common
solution. The communication between parent and child was through a unix pipe.
That way you could write one single program, with all the state shared between
the two sides of the fork, so both parties have access to all the context.

The copy-on-write bit set on all the pages with state in them in the child
guarantees that fork is very fast and pushes the copying of the state as far
into the future as it can get away with. So forking a process with 10M
resident is as fast as forking a process with only 100K resident. When you
modify the memory in the child you get to pay 'bit-by-bit' for the cost of the
cloning of the parent, but never more than you actually need.

Clever programmers make sure that the state variables that are going to be
modified by the child live close to each other.

An alternative to that is to use shared memory and mutexes, that way you can
get pretty close to the 'threaded' model using only processes.

~~~
ryanpetrich
well, page-by-page technically

------
paulmcl
Don't forget that fork predates not just threads but virtual memory, so
cloning all of the volatile memory of a process was really cheap (because
there was only a few K of it). Look at the source code for fork in Lions'
book on the Version 6 kernel and you'll see how simple it used to be.

------
hapless
In the systems programming languages of old, fork is just easier than
threading.

Fork model:

Step 1: Write a program to accept a single connection to a single TCP socket,
then handle the request.

Step 2: Judiciously place a fork() call at the time of the new connection
coming into the socket.

Step 3: Add an "if" statement to wait for another request if you happen to be
the parent process after the fork.

You're done!

You just wrote a program capable of handling thousands of concurrent requests,
with none of the concurrency nightmares that keep sensible men up at night.
Going from the simplest case to the finished version was a two-line code
change.

~~~
billpg
If the new task can work in isolation, then yes, fork seems ideal. If the
tasks need to interact, then threads seem (to me) more useful.

I've written web services in the past with databases storing data (like most
web applications do) and I've often wished that the potentially many processes
could just be multiple threads in a single process instead, so I could have
them just share an array of objects without the overhead of a database server.

------
klodolph
So I've got a process running as root. I want it to spawn a new process
running as user1, in /var/fred, with a pipe to stdin and direct stdout to
/var/log/greg.

First I create the pipe, then call fork. In the child, I chdir to /var/fred,
open /var/log/greg, run dup2 on the pipe and on the handle to /var/log/greg,
setuid to user1, and then finally call exec.

Show me an API that can do that without fork.

All the popen / spawn / system functions are not system calls but rather
library functions which operate by calling fork.

~~~
billpg
> Show me an API that can do that without fork.

[http://msdn.microsoft.com/en-
us/library/ms682429%28VS.85%29....](http://msdn.microsoft.com/en-
us/library/ms682429%28VS.85%29.aspx)

~~~
caf
Exactly the point - that call takes 11 parameters, many of which are complex
structures themselves. Compare that to:

      pid_t fork(void);
      int execve(const char *filename, char *const argv[], char *const envp[]);

The idea is to have several simpler system calls that you can wire together to
get the complex effect you need, rather than trying to build an ultimate
CreateProcess function that can handle any case of infinite complexity.

------
toddh
System V was process based. You start processes, monitor processes, kill
processes. Processes share data through message queues. Your entire
architecture is process based, so you need a way to start processes.

Simple embedded systems don't have processes or threads. They are just loops.
More complex embedded systems are real-time oriented and will use threads as
the locus of control, because the whole memory space is shared amongst the
threads. No need for processes at all.

------
coliveira
The UNIX methodology is to have a large number of processes cooperating to
provide functionality. "fork" is the fastest way to create a new process; so
the reason for fork is to provide a system call that creates a process really
fast.

Windows philosophy, on the other hand, is to have monolithic programs that
solve everything by themselves. They infrequently need to start new processes,
so fork is not viewed as important.

------
ErrantX
On the subject of fork() and exec: I have used that in the past so that the
parent and the child can share I/O, thus allowing the parent to monitor the
exec'd program more closely.

In the end I gave it up as a lost cause; whilst the general idea of fork() is
appealing, we found much "better" ways for fine-grained process control.

------
bengtan
I vaguely (possibly incorrectly) recall that fork is the only way to create a
new process, and that, no matter what system call you use, deep down, it still
needs to call fork().

Someone with a better memory may correct me.

I'm not sure if this holds true for Windows though.

~~~
jng
no fork() in windows

------
bediger
fork() by itself - inelegant. fork() - do ARBITRARY stuff in child process -
exec(), now that does all kinds of things that a spawn()-type process creation
cannot do.

fork() allows the ability to do anything before exec(), setting up lighter-
weight process creation, and whatever flexibility the programmer desires.

I'd turn it around: why do the designers of spawn() or CreateProcess() think
they've got the foresight to cover all of the bases for programmers? Why don't
those systems do fork()/stuff/exec() to simplify?

~~~
jacquesm
Why is fork by itself inelegant ?

Text editor, hit 'save', editor forks, saves in the background and quietly
exits, no matter how long the save will take. Meanwhile the user continues to
type in more text in the foreground.

Just one example.
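A sketch of that editor trick, with a stand-in for the slow save (the function names and path are illustrative, not from any real editor):

```c
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

/* Stand-in for a long-running write of the buffer to disk. */
static void slow_save(const char *path, const char *text)
{
    FILE *f = fopen(path, "w");
    if (!f) _exit(1);
    fputs(text, f);
    fclose(f);
}

void save_in_background(const char *path, const char *text)
{
    signal(SIGCHLD, SIG_IGN);        /* don't leave a zombie behind */
    if (fork() == 0) {               /* child gets the buffer via copy-on-write */
        slow_save(path, text);
        _exit(0);                    /* quietly exits when done */
    }
    /* parent: back to handling keystrokes immediately */
}
```

The child's snapshot of the buffer is frozen at the moment of the fork, so the user can keep typing without corrupting the save.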

------
axod
fork is a really elegant call.

Maybe you should be using higher level calls, but back when I was writing
linux assembly programs fork was awesome.

There aren't really that many caveats to using it at all.

