
Why Ruby App Servers Break on MacOS High Sierra - mef
https://blog.phusion.nl/2017/10/13/why-ruby-app-servers-break-on-macos-high-sierra-and-what-can-be-done-about-it/
======
saagarjha
Greg Parker, who works on the Objective-C runtime, has a blog post that goes
into more detail:
[http://www.sealiesoftware.com/blog/archive/2017/6/5/Objectiv...](http://www.sealiesoftware.com/blog/archive/2017/6/5/Objective-C_and_fork_in_macOS_1013.html)

~~~
hinkley
So, basically Objective-C doesn't like the RAII pattern if the resource is a
process? Ouch.

I got the gist of his summary but his writing is a bit... awkward. Is English
not his first language?

~~~
valleyer
C has this same problem. Most of the C standard library is unsafe to use
between fork and exec of a multithreaded program. This usually includes malloc
and printf.

~~~
dozzie
How so? After fork() you are guaranteed to only have one remaining thread in
the child process, and the parent process doesn't care if there was a fork()
or not.

~~~
valleyer
malloc() and printf(), for example, have mutexes internal to their
implementation. Suppose the parent process has two threads. One is about to do
a fork, and the other is in the middle of a malloc(). The fork occurs. The
parent process continues on as normal. In the child process, there is only one
thread -- a clone of the forking thread, but the memory state of the child
process is a full clone of the parent (except for the return value of fork(),
of course).

The single thread in the child calls malloc(). But the malloc mutex is already
held, because in the parent process, a thread was executing there.
Unfortunately, since that thread does not exist in the child, it will never be
able to release the mutex. The thread in the child is deadlocked.
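
A minimal sketch of that race (my own illustration, not from the thread;
whether it actually hangs depends on the libc -- some allocators register
atfork handlers to protect themselves -- and on timing):

    
    
        #include <pthread.h>
        #include <stdio.h>
        #include <stdlib.h>
        #include <sys/wait.h>
        #include <unistd.h>
        /* Second thread keeps the allocator's internal lock busy. */
        static void *alloc_loop(void *arg) {
            (void)arg;
            for (;;)
                free(malloc(4096));
            return NULL;
        }
        int main(void) {
            pthread_t t;
            pthread_create(&t, NULL, alloc_loop, NULL);
            pid_t pid = fork();
            if (pid == 0) {
                /* Child has one thread, but its memory image may contain a
                   malloc lock held by the (now missing) worker thread, so
                   this call can block forever. */
                void *p = malloc(16);
                printf("child survived: %p\n", p);
                _exit(0);
            }
            waitpid(pid, NULL, 0);
            return 0;
        }
    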

~~~
dozzie
Well, I hadn't thought about locked mutexes. It makes sense.

------
joshmn
This issue has been addressed on ruby-head:

[https://github.com/ruby/ruby/commit/8b182a7f7d798ab6539518fb...](https://github.com/ruby/ruby/commit/8b182a7f7d798ab6539518fbfcb51c78549f9733)

~~~
akvadrako
It's been addressed, but it's a hack. To be safe, apps must stop all other
threads before forking.

The only correct fix I can see is to have pre- and post-fork hooks in every
library, which make that guarantee.
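
The standard hook mechanism for this is pthread_atfork(3). A rough sketch of
how a library could protect its own lock around fork (this only helps if the
library controls every acquisition of that lock; the names here are made up):

    
    
        #include <pthread.h>
        /* Hypothetical library-internal lock. */
        static pthread_mutex_t lib_lock = PTHREAD_MUTEX_INITIALIZER;
        static void before_fork(void)       { pthread_mutex_lock(&lib_lock); }
        static void after_fork_parent(void) { pthread_mutex_unlock(&lib_lock); }
        static void after_fork_child(void)  { pthread_mutex_unlock(&lib_lock); }
        /* Call once from the library's init path. */
        void lib_install_fork_hooks(void) {
            pthread_atfork(before_fork, after_fork_parent, after_fork_child);
        }
    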

~~~
nateberkopec
In Puma, we emit a warning if we detect any additional threads running prior
to fork.

------
dep_b
> This cryptic error

Well....I don't think I've seen many programmer-to-programmer errors that are
less cryptic than the one described in the article. It's actually quite
amazing how much explanation you sometimes get from Cocoa!

~~~
semanticfact
It's only cryptic if you don't read it, right?!?

------
the_mitsuhiko
Python suffers from the same issue but the response there has largely been to
stop using modules that use objc.

~~~
captainmuon
I was thinking about Python when reading this; the real culprit is not objc
but the third-party modules.

If I understand correctly, this problem is caused when you call initialize in
another thread, and while that is running, you fork.

It affects these app servers because this is triggered at "require" (or
"import") time. This is madness; no module should run code just because you
import it. OK, I have been guilty of that, too. But if it's absolutely
necessary, keep it to a minimum. It breaks all kinds of things when module
imports are not side-effect-free.

Launching a thread and (indirectly) taking a mutex is definitely not something
you should do on module import!

~~~
dchest
As far as I understand it, the module isn't launching threads at import time.
On the contrary, the problem is that pg is linked against macOS's Kerberos
and LDAP frameworks, so as soon as the Objective-C runtime notices that the
NSPlaceholderDictionary +initialize method was called after fork(), it crashes:

 _One of the rules that Apple defined is that you may not call Foundation
class initializers after forking._

~~~
captainmuon
In that case, the new behavior doesn't make sense. You could call `fork` as
the very first thing in a program, and then do all the stuff in the
subprocess, and I wouldn't expect it to make a big difference.

Maybe there is an internal `mark_exec_as_called()` function that you could
exploit, if you are forking without exec...

------
oblio
This is so incredibly Apple :)

The breakage, I mean. To clarify a bit, for better or for worse, this is what
Microsoft does, totally different psychology:
[https://blogs.msdn.microsoft.com/oldnewthing/20031223-00/?p=...](https://blogs.msdn.microsoft.com/oldnewthing/20031223-00/?p=41373)

~~~
geocar
So Unix people had this function called `gets` that was defined like this:

    
    
        char *gets(char *at);
    

In the early days, if someone wanted a string you could do:

    
    
        x = gets(sbrk(0));
        brk(x + strlen(x) + 1);
    

And this is perfectly safe, but it is perhaps the only safe way to use `gets`.
See, most people wanted to write:

    
    
        char buf[99];
        x = gets(buf);
    

And this is _not_ safe because `gets` doesn't know that `buf` only has room
for 99 bytes.

The API has a choice:

a) They can make it harder to do the wrong thing: make `gets` generate errors,
warnings, crash if you use it, etc. This requires people to fix their programs.
That's what GNU did for `gets`, and it's what Apple is doing here.

b) They can change the rules: it's possible to modify the ABI and require
markers or a mechanism to detect where the edge of the buffer is. This
requires people to recompile their programs. I think Zeta-C had "fat
pointers", so gets should be safe there.[1]

c) They can work around it: if you have enough resources, you can look at the
programs using this API, figure out what their buffer sizes are, and hardcode
that into the API provider. Microsoft has famously done this for SimCity.[2]

That's it. They can't really do anything else: the function is difficult to
use right, and programmers do the easiest, most obvious thing they can. _oh i
need to get a string, so i'll use gets... but i need two strings so...._

Anyway, I'm not aware of any good rules for choosing which way to go: Less
total effort seems like a good metric, but this is very hard to estimate when
you live on an island and don't know how other people use your software.

Memory corruption is serious though: It's often very easy to turn into a
security hole, so I generally advocate doing _something_. All of the people
just disabling this security feature make me nervous. I wonder how many of
them run websites that I use...

[1]:
[http://www.bitsavers.org/bits/TI/Explorer/zeta-c/](http://www.bitsavers.org/bits/TI/Explorer/zeta-c/)

[2]:
[https://news.ycombinator.com/item?id=2281932](https://news.ycombinator.com/item?id=2281932)

~~~
fanf2
Surely that gets(sbrk(0)) will not work since sbrk(0) returns a pointer to
unmapped memory. Maybe you wanted sbrk(BUFSIZ)?

~~~
geocar
Well, no actually: That still limits you to a gets of BUFSIZ.

Just as we often grow the stack with page faults, you could grow the heap with
page faults. Modern UNIX doesn't, because this is another one of those "almost
certainly a mistake" cases, but:

    
    
        void grow(int _) { sbrk(PAGESZ); }
        /* in main(): */
        signal(SIGSEGV, grow);
    

should work.

------
mhandley
Interesting discussion. If these were user-space threads, like FreeBSD ~20
years ago, there'd be no problem. When fork() is called, the whole user-space
threads package would be forked, along with all the threads.

So the obvious question is whether it's _fundamental_ that with kernel threads
the fork() system call doesn't clone all the other threads in the process?
Yes, that's not how it's done, but could Apple choose to implement a new
fork_all() system call? I imagine it wouldn't be easy - you'd need to pause
_all_ the running threads while you copied state, but is there a reason it's
actually not possible?

~~~
geocar
Is this what you want most of the time?

If you're just going to fork() && exec(), then why would you copy all that
state just to run some subprogram?

Is that what you want _any_ of the time?

This prefork implementation is silly. Either do the prefork after
initialisation so you get the benefits of COW, or don't bother: just use
SO_REUSEPORT and run 10 copies of your server. This distributes the TCP
traffic _and_ gives you an excellent way to upgrade in place (just roll up the
new version as you roll down the old ones).
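
A rough sketch of the SO_REUSEPORT approach, assuming a kernel that supports
it (Linux 3.9+, recent BSDs); every one of the N server processes runs the
same code:

    
    
        #include <netinet/in.h>
        #include <string.h>
        #include <sys/socket.h>
        #include <unistd.h>
        /* Each server process opens its own listening socket on the same
           port; SO_REUSEPORT lets them all bind, and the kernel spreads
           incoming connections across them. */
        int open_listener(unsigned short port) {
            int fd = socket(AF_INET, SOCK_STREAM, 0);
            int one = 1;
            setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof one);
            struct sockaddr_in addr;
            memset(&addr, 0, sizeof addr);
            addr.sin_family = AF_INET;
            addr.sin_addr.s_addr = htonl(INADDR_ANY);
            addr.sin_port = htons(port);
            if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
                listen(fd, 128) < 0) {
                close(fd);
                return -1;
            }
            return fd;
        }
    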

~~~
mhandley
I agree - most of the time, the current fork() implementation is what you
want, so it's the right choice. The conventional wisdom is that you shouldn't
spawn threads and then fork(), and so long as you stick to this, you're fine.

Today though, everyone loves threads. It's hard to know precisely what happens
in some low-level library, so it's hard to be sure that you don't have any
threads left running after initialization. This is the subject of the OP.

My question though is broader - if we chose, _could_ we add a version of
fork() that did clone all the threads? I'm not entirely sure what it would be
used for, but I'm sure there would be uses. Likely some of those would be for
increased security, as processes provide stronger isolation.

~~~
geocar
> I'm not entirely sure what it would be used for, but I'm sure there would be
> uses.

I suspect strongly there aren't.

One of the biggest problems you have to contend with is mutexes (and the like)
and their state. If you copy them, you may double-book an external resource
(like storage or something), and if you don't, you almost certainly deadlock.

Another is any thread waiting on a resource (like reading from a file
descriptor) or writing to the disk (do you write twice?), and so on.

------
mkj
Do developers just call whatever function seems to work without reading the
docs? That doesn't work for low-level programming.

    
    
      ~ man fork
    

CAVEATS: There are limits to what you can do in the child process. To be
totally safe you should restrict yourself to only executing async-signal safe
operations until such time as one of the exec functions is called. All APIs,
including global data symbols, in any framework or library should be assumed
to be unsafe after a fork() unless explicitly documented...
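
In practice, "only async-signal-safe operations until exec" means the child
does nothing between fork() and the exec except the exec itself, plus _exit()
if it fails -- a minimal sketch:

    
    
        #include <sys/wait.h>
        #include <unistd.h>
        /* Spawn a subprogram the "safe" way: nothing but async-signal-safe
           calls between fork() and execvp(). */
        pid_t spawn(char *const argv[]) {
            pid_t pid = fork();
            if (pid == 0) {
                execvp(argv[0], argv);
                _exit(127);   /* exec failed; skip stdio/atexit cleanup */
            }
            return pid;       /* parent reaps it later with waitpid() */
        }
    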

~~~
ProAm
Welcome to the stackoverflow era of programming, where technical specs don't
matter and frameworks are life.

~~~
mrguyorama
This has been around since well before Stack Overflow.

Here are a plethora of examples from the Windows 95 days:
[http://ptgmedia.pearsoncmg.com/images/9780321440303/samplech...](http://ptgmedia.pearsoncmg.com/images/9780321440303/samplechapter/Chen_bonus_ch02.pdf)

------
throwaway613834
fork() _fundamentally does not make sense_ as the de-facto method of starting
a new process. Why aren't people using posix_spawn() by default?
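
For comparison, a minimal posix_spawn() sketch (my own illustration); it
starts the child without ever running your code in a forked-but-not-yet-exec'd
process:

    
    
        #include <spawn.h>
        #include <stdio.h>
        #include <sys/wait.h>
        #include <unistd.h>
        extern char **environ;
        int main(void) {
            /* Launch "ls -l" as a child process without an explicit fork. */
            char *argv[] = { "ls", "-l", NULL };
            pid_t pid;
            int err = posix_spawnp(&pid, "ls", NULL, NULL, argv, environ);
            if (err != 0) {
                fprintf(stderr, "posix_spawnp: error %d\n", err);
                return 1;
            }
            int status;
            waitpid(pid, &status, 0);
            return 0;
        }
    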

~~~
segmondy
The behavior of fork() is defined by the POSIX standard. Apple should respect
it. There are tons of apps that have been written using fork; are you going to
change all of them to use posix_spawn()?

~~~
tedunangst
The POSIX standard says you can't call anything but async-signal-safe
functions after fork() in a threaded program. Application developers should
respect that.

------
booleanbetrayal
For other fun low-level High Sierra issues, see the PostgreSQL msync() thread:
[https://www.postgresql.org/message-id/flat/13746.1506974083%...](https://www.postgresql.org/message-id/flat/13746.1506974083%40sss.pgh.pa.us#13746.1506974083@sss.pgh.pa.us)

~~~
salarycommenter
Interesting. Mapping 64k at a time seems a bit excessive. If I call mmap it's
to map everything I'm going to need, but maybe I am missing context.

~~~
snuxoll
PostgreSQL still supports 32-bit platforms, and heap and index files can get
large enough that it's not feasible to mmap them into a 32-bit process. As for
the specific number of 64k, though, I don't know why they chose that; writing
8 pages to disk at a time (especially sequentially) doesn't really take
advantage of modern hardware well.
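
Roughly what the map-a-window-at-a-time pattern looks like (a generic sketch,
not PostgreSQL's actual code):

    
    
        #include <sys/mman.h>
        #include <sys/types.h>
        #include <unistd.h>
        #define WINDOW (64 * 1024)   /* map and flush 64 kB at a time */
        /* Flush one window of a (possibly huge) file without needing
           address space for the whole thing -- the concern on 32-bit.
           offset must be a multiple of the page size. */
        int flush_window(int fd, off_t offset) {
            void *p = mmap(NULL, WINDOW, PROT_READ | PROT_WRITE,
                           MAP_SHARED, fd, offset);
            if (p == MAP_FAILED)
                return -1;
            int rc = msync(p, WINDOW, MS_SYNC);
            munmap(p, WINDOW);
            return rc;
        }
    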

------
olivierlacan
The discussion on the Ruby core team issue tracker is also very informative:
[https://bugs.ruby-lang.org/issues/14009](https://bugs.ruby-lang.org/issues/14009)

------
bpicolo
I've hit similar issues with uwsgi in recent memory (though pre-High Sierra),
where an OS upgrade caused it to start segfaulting somewhere inside
CoreFoundation when using the `requests` lib (though of course entirely
unrelated to the new forking changes).

Maybe this? Though the resolution was to disable uwsgi proxying globally...
[https://stackoverflow.com/questions/35650520/uwsgi-segmentat...](https://stackoverflow.com/questions/35650520/uwsgi-segmentation-fault-when-using-flask-and-python-requests)

------
krisives
What does Linux do right now?

~~~
dchest
If you fork after launching threads, your program is in an unexpected state,
so it may corrupt data or crash.

The same happens on macOS; it's just that Apple added a contract that you
can't initialize ObjC classes after forking.

------
lima
Why do people run application servers on macOS?

~~~
tonyedgecombe
[https://www.apple.com/uk/macos/server/](https://www.apple.com/uk/macos/server/)

~~~
giancarlostoro
I've mostly seen it used for managing multiple Macs in a classroom
environment. Not sure I've seen it used heavily in production to run web
servers, though. I was curious and was able to find sites that let you rent
Macs that you remote into; an interesting setup, since you know exactly what
hardware you're getting.

------
throwme211345
LOL. macOS breaks fork() to avoid state inconsistency in threaded
applications. How about pthread_atfork() semantics? But, as usual, Apple
heavy-hands userspace and breaks things. Nothing new to see here, move on.

~~~
throwme211345
What, you don't like the fact that Apple sucks for breaking userspace (as
usual), or that pthread_atfork()-type approaches should be in every
programmer's toolbox?

~~~
oshepherd
POSIX has deprecated pthread_atfork because it is unworkable. In particular,
atfork handlers can only call async-signal-safe functions, which means they're
useless.

~~~
throwme211345
As you say, except I explicitly noted '..type approaches' and '..semantics'.
If a library designer does things in a way that makes you doubtful of its
state, then don't fork. If you do fork, block signals and exec. It doesn't
help the race, but it does help your peace of mind (I did what I could).

Apple still sucks, BTW. Heavy-handed nonsense. Let developers deal with the
consequences of their actions.

------
ransom1538
I agree the bug should be fixed. But why not just use Docker, then run Rails
on your Mac as if it were on Ubuntu/Linux? It's miserable having
Windows/Mac/etc.-specific issues.

~~~
sillysaurus3
Docker grows to >8GB. I have 13GB free, 486GB used. My free space fluctuates
by as much as 20GB depending on how much RAM I'm using. (And by "I'm" I mean
"Chrome.")

I've never felt the need to install Docker in my local dev environment. It's
great for production, and I'm sure it's great for people who can afford the
disk space. But when space became tight, Docker was the first to get the axe.
I haven't missed it yet.

~~~
jasonjei
I'm just curious how you would deploy to a containerized production
environment if you develop directly on the Mac instead of using Docker, and so
can't test all the platform-specific issues. Isn't the whole point of using
Docker to minimize the differences between production and development?

~~~
matwood
Deploy to a container test environment first? I hope no one is going from
development -> production.

------
pantulis
I love this geek-porn stuff, and the Phusion guys never fail to deliver it ;)

But my question is: is this really that important? I mostly use macOS for
development, and I don't feel that the preforking model has that much impact
on the development cycle.

~~~
FooBarWidget
It is important for dev-prod parity. You will want to test whether your code
is compatible with preforking, which you will likely use in production.

