
Parallelizing the Naughty Dog Engine Using Fibers [video] - Splines
http://www.gdcvault.com/play/1022186/Parallelizing-the-Naughty-Dog-Engine
======
corysama
Similar talk: CppCon 2014: Jeff Preshing "How Ubisoft Develops Games for
Multicore - Before and After C++11"

[https://www.youtube.com/watch?v=X1T3IQ4N-3g](https://www.youtube.com/watch?v=X1T3IQ4N-3g)

Ubisoft's talk spends more time getting into the weeds with atomic ops.
Naughty Dog's is more of an architecture discussion. If you can only watch
one, I'll recommend Naughty Dog's.

------
k__
Cool.

A friend and I tried this at university with the Ogre 3D engine and failed
miserably, because I didn't know much about C++ or thread safety.

We tried the actor model: every game object became an actor and sent messages
around. The actors would then be spread across the available CPUs, and
performance was supposed to scale with every CPU. In the end we got it running
on multiple cores, but the message overhead killed the performance, haha.
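
To make the actor idea concrete, here is a minimal single-threaded Python sketch. It is not the Ogre project described above; the `Actor` class, the `("move", amount)` message shape, and the `dispatch` helper are all hypothetical names chosen for illustration.

```python
from collections import deque

class Actor:
    """A game object that is only ever changed through its mailbox."""
    def __init__(self, name):
        self.name = name
        self.mailbox = deque()
        self.position = 0

    def send(self, message):
        # Every interaction costs an enqueue here...
        self.mailbox.append(message)

    def handle(self, message):
        kind, amount = message
        if kind == "move":
            self.position += amount

def dispatch(actors):
    """Drain every mailbox; return how many messages were processed."""
    handled = 0
    for actor in actors:
        while actor.mailbox:
            # ...and a dequeue plus a handler call here.
            actor.handle(actor.mailbox.popleft())
            handled += 1
    return handled

actors = [Actor(f"obj{i}") for i in range(3)]
for actor in actors:
    actor.send(("move", 1))
    actor.send(("move", 2))
handled = dispatch(actors)
print(handled, [a.position for a in actors])
```

Even in this toy version, each tiny state change pays for a message object, an enqueue, and a dequeue. Once the mailboxes are shared between threads they also need synchronization, which is plausibly where overhead like the kind described above comes from.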

~~~
wolfgke
> Me and a friend did this at university with the Ogre 3D engine and failed
> miserably, because I didn't know much about C++ or thread safety.

Fiber safety is much easier to ensure than thread safety. Under Windows,
using fibers is very simple (just call ConvertThreadToFiber on your current
thread if you have not done so already, and then call CreateFiber), unlike on
POSIX systems, where you have to roll your own implementation of fibers
(or use some existing library that is not part of the POSIX standard).

~~~
gpderetta
makecontext/swapcontext was the POSIX-approved way to do fibers. It was
removed from the last POSIX standard because who uses coroutines or fibers
today? Around the same time every other mainstream language was adding support
for coroutines/generators/fibers.

~~~
wolfgke
makecontext/swapcontext doesn't give you an automatically growing stack (unlike
the WinAPI equivalents). Yes, you can do this yourself, I know. But if you
don't want to go into the gory details, you have to allocate a "large enough"
buffer for each of your coroutines, thus probably wasting memory.

~~~
gpderetta
Technically you only waste address space, of which there should be plenty, at
least for a while.

------
djhworld
Naughty Dog must be doing something completely off the charts in comparison to
other developers. I challenge anyone to look at Uncharted 4 and show me
a better-looking game.

~~~
sangnoir
I think CD Projekt RED are definitely in the same league as Naughty Dog - The
Witcher 3 looks amazing (disclaimer: I haven't seen it on the PlayStation)

~~~
Narishma
It has poor performance on consoles.

------
dpc_pw
For fibers (AKA Coroutines) in Rust, see mioco
[https://github.com/dpc/mioco](https://github.com/dpc/mioco)

------
rawnlq
An older but similar approach from the Doom III engine:
[http://fabiensanglard.net/doom3_bfg/threading.php](http://fabiensanglard.net/doom3_bfg/threading.php)

I wonder if there is a good open-sourced C++11 project for this pattern? (a
job/task queue)

Also how does this pattern compare with just using future/promises with a
parallel executor?

[https://code.facebook.com/posts/1661982097368498/futures-for...](https://code.facebook.com/posts/1661982097368498/futures-for-c-11-at-facebook/)

Or Grand Central Dispatch from Objective-C?
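
For a feel of what the futures-with-executor style looks like, here is a minimal Python sketch using the standard `concurrent.futures` module. It is not the Naughty Dog or Doom 3 job system, and `update_object` is a hypothetical stand-in for real per-object work; it only shows jobs being handed to a pool of worker threads and joined through futures.

```python
from concurrent.futures import ThreadPoolExecutor

def update_object(object_id):
    """Hypothetical per-object job; stands in for real game work."""
    return object_id * object_id

with ThreadPoolExecutor(max_workers=4) as pool:
    # Submitting a job returns a future immediately.
    futures = [pool.submit(update_object, i) for i in range(8)]
    # result() blocks the calling thread until that particular job is done.
    results = [f.result() for f in futures]

print(results)
```

One difference to keep in mind: here `result()` parks the whole calling thread while it waits. As far as I can tell from the talk, the point of the fiber-based design is that a waiting job gets switched out instead, so the worker thread can pick up another job in the meantime.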

~~~
lbrandy
Our (facebook's) folly library has lots of components for doing stuff using
fibers[1]. That sits with folly::futures[2] and wangle[3].

That said, without proper language/compiler support it's difficult to
make them "safe" (for some definition). Even within fb, given these libraries,
we are pretty wary of using fibers with C++ unless absolutely necessary. You
gotta really need it. This is where the coroutines proposals working their way
through the C++ committee can make lives better.

[1]
[https://github.com/facebook/folly/tree/master/folly/fibers](https://github.com/facebook/folly/tree/master/folly/fibers)

[2]
[https://github.com/facebook/folly/tree/master/folly/futures](https://github.com/facebook/folly/tree/master/folly/futures)

[3] [https://github.com/facebook/wangle](https://github.com/facebook/wangle)

------
warmwaffles
I still don't quite understand what he means by a "fiber". Is this basically
160 blocks of memory allocated on the heap?

~~~
corysama
A fiber is like a thread in that it executes code and has its own stack, but it
does not have preemption. The OS won't switch between fibers for you.
You must manually switch in and out of executing it. Running in a fiber
doesn't mean you've stopped using threads. A fiber runs in the context of a
thread, and that thread still has preemption. But inside the thread, you can
call functions to switch stacks in order to manually pause and resume
execution inside the fiber.
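
As a rough illustration of that manual switching, here is a minimal Python sketch. It uses generators as a stand-in for fibers, which is an assumption made for brevity: a generator is a stackless coroutine rather than a real fiber with its own stack, but the control flow is the same idea. Nothing preempts a task; it runs until it yields, and a hypothetical `run_round_robin` scheduler decides what resumes next.

```python
from collections import deque

def worker(name, steps, log):
    """A cooperative task: runs until it explicitly yields control."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # manual switch point; nothing interrupts us between yields

def run_round_robin(tasks):
    """Resume each task in turn until all of them have finished."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)        # switch into the task until its next yield
        except StopIteration:
            continue          # task finished; do not reschedule it
        ready.append(task)    # task yielded; put it back in the queue

log = []
run_round_robin([worker("a", 2, log), worker("b", 3, log)])
print(log)
```

The interleaving in `log` is fully determined by where the tasks yield, not by an OS scheduler, which is the property that makes this style easier to reason about than preemptive threads.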

~~~
2bitencryption
I've noticed that lots of recent advances in threading are all just ways to
avoid context switching of threads and instead handle those switches manually
(e.g. "green threads" in Python).

~~~
pvg
These aren't particularly recent. Java originally had green threads; I think
the term itself is from there.

~~~
wavesplash
Green threads were a thing long before Java.

~~~
pvg
The term, not the variety of threads. I can't find any mention of it pre '95
or so, at least.

~~~
markkanof
'95, so about when significant amounts of information started being put on the
internet?

~~~
pvg
I'm not so sure about the term 'green threads' but the internet was definitely
around well before '95. There doesn't seem to be any mention of 'green
threads' in available USENET archives before then, for instance. But if you
know of one, anywhere, by all means, I'd love to hear about it.

~~~
fit2rule
I think MIPS RISC/Os had 'green' threads in the 80's, or I could be mixing it
up with Tandem terminology, where Fiber was a constant type, I seem to recall
.. either way, the idea of a userspace-managed thread scheme is as old as the
hills.

The question has always been: who deals out the work, the OS or the App? and
as we can see, the question will continue to be asked, and un-answered,
probably ad infinitum ..

~~~
pvg
Did it call them 'green threads'? If not, the terminology itself probably
doesn't come from there.

~~~
fit2rule
I do remember Tandem and/or Wang talking about green (transportable) threads
that were okay to suspend/resume across processor units .. I wish I could find
more info, but I really do recall the term being applicable way back when ..

------
aidenn0
One note: input latency will be 17ms longer with the pipelining versus
running at 30fps. That's still almost certainly a huge win though, as the
smoother rendering at the higher rate will make it perceived as more
responsive, particularly since input latency doesn't go much below 100ms these
days.

~~~
w0utert
Why? As he explains in the Q&A session after the talk, two frames of latency
at 30Hz (2 x ~33ms) is still more than 3 frames latency at 60Hz (3 x ~16ms).
You will always have at least 2 frames latency if you split the game logic and
render stages like they did for TLoU on PS3, so in fact their PS4 engine has
better framerate _and_ latency.
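
The arithmetic behind that comparison, as a small Python check. The frame counts (2 at 30Hz, 3 at 60Hz) are the ones quoted above; the helper name `pipeline_latency_ms` is just for illustration, and it only accounts for frames in flight, not display or controller delay.

```python
def pipeline_latency_ms(frames_in_flight, refresh_hz):
    """Latency contributed by a pipeline that is `frames_in_flight` frames deep."""
    return frames_in_flight * 1000.0 / refresh_hz

latency_30hz = pipeline_latency_ms(2, 30)  # two frames at ~33ms each
latency_60hz = pipeline_latency_ms(3, 60)  # three frames at ~16.7ms each

print(f"2 frames @ 30Hz: {latency_30hz:.1f} ms")
print(f"3 frames @ 60Hz: {latency_60hz:.1f} ms")
```

That works out to roughly 66.7 ms versus 50 ms, so the deeper pipeline at the higher frame rate still comes out ahead.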

~~~
aidenn0
Before they implemented pipelining, they had all of the stages running in
under 33ms on the ps4. Yes, this is lower input latency than on the ps3, but
that's beside the point.

------
tomlu
Super interesting, but the audio level is too quiet for my laptop even with
everything set to max. Can it be downloaded from anywhere?

~~~
wallacoloo
FWIW, youtube-dl can easily download the video, i.e.
`youtube-dl http://www.gdcvault.com/play/1022186/Parallelizing-the-Naughty-Dog-Engine`.
You can use it almost everywhere, despite the name.

------
hyperpallium
Is there a downloadable version?

~~~
felixguendling
Not directly. I think it is part of the PlayStation SDK (but I'm not a game
developer). There are libraries that are based on the same concept (many
execution contexts per thread):

      * https://github.com/RichieSams/FiberTaskingLib
      * https://swtch.com/libtask/
      * https://github.com/halayli/lthread
      * https://github.com/stevedekorte/coroutine

RethinkDB are using something like this:
[http://rethinkdb.com/blog/improving-a-large-c-project-with-c...](http://rethinkdb.com/blog/improving-a-large-c-project-with-coroutines/)
[http://rethinkdb.com/blog/making-coroutines-fast/](http://rethinkdb.com/blog/making-coroutines-fast/)

