
Writing an MMO with Web Actors - pron
http://blog.paralleluniverse.co/2014/02/13/web-spaceships/
======
stcredzero
I'm currently writing an MMO.

[http://www.reddit.com/r/roguelikes/comments/1xa9rj/i_have_be...](http://www.reddit.com/r/roguelikes/comments/1xa9rj/i_have_been_writing_a_pseudoroguelike_mmorpg_with/)

[http://www.reddit.com/r/Clojure/comments/1x4ycd/i_have_been_...](http://www.reddit.com/r/Clojure/comments/1x4ycd/i_have_been_writing_a_pseudoroguelike_mmorpg_in/)

I am intrigued by actor models, but I can't get my head around what that would
mean in terms of architecture. A game loop is easy to understand, especially
for functional programming. You just make the world at tick n+1 a function of
the world at tick n.

For one thing, this makes server load easy to evaluate. It's just a percentage
of simulation tick duration spent in calculation. How would I evaluate that in
the actor model?

~~~
pron
Well, with (concurrent) actors there are no ticks – just like in the real
world, and every actor has its own loop.

This game also computes an FPS rate. Whenever an actor completes an update, it
atomically increments a counter. Every N increments, with N being the number
of actors, we say a frame is done. But this is unnecessary.
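
That counting scheme can be sketched in a few lines of plain Java (a hypothetical stand-in, not the game's actual code):

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the frame counter described above: each of
// the N actors bumps a shared counter when it finishes an update, and
// every N increments are counted as one completed "frame".
class FrameCounter {
    private final AtomicLong updates = new AtomicLong();
    private final int actorCount; // N

    FrameCounter(int actorCount) { this.actorCount = actorCount; }

    // Called by an actor when its update completes; returns the
    // number of whole frames finished so far.
    long onActorUpdated() {
        return updates.incrementAndGet() / actorCount;
    }
}
```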

The best way to estimate load is by looking at Quasar's internal execution
queues. Also, every actor in this game has a loop of 30ms, i.e. it immediately
responds to messages it receives, but if there are none, it updates every
30ms. If you miss that deadline, namely your current run is over 30ms from the
previous one, you know the server is overloaded.
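
In code, each actor's loop might look something like this (a plain-Java sketch with a `BlockingQueue` standing in for the actor's mailbox; Quasar's actual receive API differs in the details):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of a single actor's 30ms loop: react to messages as soon as
// they arrive, run the periodic update when none do, and flag an
// overload whenever an update starts more than 30ms late.
class ActorLoop implements Runnable {
    static final long PERIOD_MS = 30;
    final BlockingQueue<Object> mailbox = new LinkedBlockingQueue<>();
    final AtomicInteger handled = new AtomicInteger();
    final AtomicInteger updates = new AtomicInteger();
    volatile boolean running = true;

    public void run() {
        long deadline = System.currentTimeMillis() + PERIOD_MS;
        while (running) {
            try {
                long wait = deadline - System.currentTimeMillis();
                // Block at most until the next scheduled update.
                Object msg = wait > 0
                        ? mailbox.poll(wait, TimeUnit.MILLISECONDS)
                        : null;
                if (msg != null) {
                    handled.incrementAndGet();  // respond to the message immediately
                } else {
                    long now = System.currentTimeMillis();
                    if (now - deadline > PERIOD_MS)  // missed a whole period
                        System.err.println("overloaded: update " + (now - deadline) + "ms late");
                    updates.incrementAndGet();  // periodic update
                    deadline = now + PERIOD_MS;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }
}
```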

Of course, the main advantage of working like that is that you get really good
parallelism. In a previous blog post where I discussed the standalone
simulation[1], I mentioned that you can run it in lockstep using a Phaser,
still with actors and still in parallel. After each cycle, every actor waits
for all the others to complete. This gives you "classic" predictability, but
you're not making the most out of your hardware as all cores basically become
idle after each step, and then you have to start them up again. When the
actors run asynchronously, all cores are busy all the time.

[1]:
[http://blog.paralleluniverse.co/2013/10/16/spaceships2/](http://blog.paralleluniverse.co/2013/10/16/spaceships2/)
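
For reference, the lockstep variant is essentially `java.util.concurrent.Phaser` used as a reusable barrier; this toy version (not the post's actual code) shows the wait-for-everyone step:

```java
import java.util.concurrent.Phaser;

// Toy lockstep simulation: each "actor" thread runs its cycle, then
// waits at the Phaser until every other actor has finished that
// cycle before any of them may start the next one.
public class Lockstep {
    static int run(int actors, int cycles) throws InterruptedException {
        final Phaser barrier = new Phaser(actors); // one party per actor
        Thread[] threads = new Thread[actors];
        for (int i = 0; i < actors; i++) {
            threads[i] = new Thread(() -> {
                for (int c = 0; c < cycles; c++) {
                    // ... update this actor's entity for cycle c ...
                    barrier.arriveAndAwaitAdvance(); // the lockstep point
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) t.join();
        return barrier.getPhase(); // advances once per completed cycle
    }
}
```

After `run(4, 3)`, the phase has advanced three times, one per completed cycle.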

~~~
stcredzero
_The best way to estimate load is by looking at Quasar's internal execution
queues._

Yes, but then how do I break things down in specific terms? With the game
loop, it's easy for me to just take millisecond timestamps before and after
each subroutine. For example, I can just time how long it takes for me to run
(prepopulate-monsters).

 _Also, every actor in this game has a loop of 30ms, i.e. it immediately
responds to messages it receives, but if there are none, it updates every
30ms. If you miss that deadline, namely your current run is over 30ms from the
previous one, you know the server is overloaded._

So then, I have to aggregate the loop running-times for every entity, or take
a census of entities overrunning?

 _Of course, the main advantage of working like that is that you get really
good parallelism._

I've always understood this part. However, consistent game ticks are useful
for various aspects of simulation and game AI.

 _but you're not making the most out of your hardware as all cores basically
become idle after each step_

A laudable goal, but what if one is prioritizing "never lag" over efficiency?
My game is to have some of the same "a-life simulation with a game attached"
quality that Dwarf Fortress has. To ensure that such calculation never
interferes with player responsiveness, I was planning to put such background
processing on an entirely different instance, then import the resulting
evolved monster species into player instances. How would this be architected
differently using actors?

~~~
pron
> With the game loop, it's easy for me to just take milliseconds timestamps
> before and after each subroutine. For example, I can just time how long it
> takes for me to run (prepopulate-monsters).

Well, you can do the same here, for each actor's loop. With a single loop
you're letting your hardware go to waste and limiting your scalability.
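
As a sketch, the per-actor equivalent of those timestamps is just a small timer wrapped around each iteration of the actor's loop (a hypothetical helper, not a Quasar API):

```java
// Hypothetical per-actor timer: wrap each iteration of the actor's
// loop body and keep the same kind of stats you would collect around
// a subroutine in a single-threaded game loop.
class LoopTimer {
    private long totalNanos, maxNanos, iterations;

    void time(Runnable loopBody) {
        long start = System.nanoTime();
        loopBody.run();
        long elapsed = System.nanoTime() - start;
        totalNanos += elapsed;
        if (elapsed > maxNanos) maxNanos = elapsed;
        iterations++;
    }

    long iterations()  { return iterations; }
    double avgMillis() { return iterations == 0 ? 0 : totalNanos / 1e6 / iterations; }
    double maxMillis() { return maxNanos / 1e6; }
}
```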

> So then, I have to aggregate the loop running-times for every entity, or
> take a census of entities overrunning?

You can do both. Whatever makes sense for you.

> However, consistent game ticks are useful for various aspects of simulation
> and game AI.

Well, it's a tradeoff. I think that given the current state of hardware, you
need to use all your cores.

> A laudable goal, but what if one is prioritizing "never lag" over
> efficiency?

I don't think the two are mutually exclusive. How will you achieve "never lag"
if your single thread can't perform all the computation in time?

> My game is to have some of the same "a-life simulation with a game attached"
> quality that Dwarf Fortress has.

Sounds like a perfect fit for the actor approach.

> To ensure that such calculation never interferes with player responsiveness,
> I was planning to put such background processing on an entirely different
> instance, then import the resulting evolved monster species into player
> instances. How would this be architected differently using actors?

Well, you could always still use a single-threaded loop and offload
computation to actors. But a better way would be to split each entity to two
actors. One for sending feedback to the player, and one for slow computation.
If you want to ensure little interference, you can run the different kinds of
actors in different Quasar schedulers, each using a different thread-pool, and
control the number of threads in each of the pools. Quasar also supports
distributed actors, so you can place the slow, computation-heavy ones on
different machines, just as you have planned.
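
A rough shape of that split, with plain thread pools standing in for the two Quasar schedulers (the names and pool sizes here are made up for illustration):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch of the two-actor split: player-facing feedback actors run on
// one pool, slow simulation actors on another, so heavy background
// work can never starve the responsive path. With Quasar you would
// use two separate schedulers instead of raw executors.
class EntityPools {
    final ExecutorService responsive = Executors.newFixedThreadPool(2); // feedback actors
    final ExecutorService background = Executors.newFixedThreadPool(4); // slow computation

    Future<?> feedback(Runnable task) { return responsive.submit(task); }
    Future<?> simulate(Runnable task) { return background.submit(task); }

    void shutdown() {
        responsive.shutdown();
        background.shutdown();
    }
}
```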

~~~
stcredzero
_I don't think the two are mutually exclusive. How will you achieve "never
lag" if your single thread can't perform all the computation in time?_

My plan is to determine the max number of players supportable in each
instance, which I call an "interaction pool" and basically ensure no instance
exceeds this number. All players will be in the same world, but can only
really interact within the same instance. I think this will be okay since the
shared world is huge and not every player will be interacting. (And I will
write my way out of the sticky details. When players have to change instances,
they will experience a "phase storm")

Let's say I want to move my Clojure code on top of spacebase, how would I go
about it?

------
danso
Ha! I clicked on this expecting it to be how to make a story-driven MMO game
in which the actors are all non-union/guild actors making a living off of
webepisodes. Considering that even guild actors get paid beans for what they
are used to in commercial/tv/film work, I was interested in what cost-savings
might have been actually attained. But the actual material of the OP is even
more interesting :)

~~~
ChuckMcM
I had a similar thought. My guess is that the "next generation" of MMOs will
have humans playing the 'Boss' in dungeon scenarios. This makes replaying the
content interesting (the talent of the boss can change) and offsets a fairly
large technical burden (scripted encounters). Of course, it adds issues with
potential collusion, balancing difficulty, and continuity (you need someone
online as the 'boss' for the encounter to happen).

I sketched out an update for WoW that would allow RBCs (remote Boss
characters) to be statues in game until someone logged in and "inhabited"
them, whereupon there would be an in-game effect (like 20-30 seconds of a
glow around the statue) and then, 'wham', the boss comes alive, until it is
either dead or the player logs off and it returns to being a statue.

It's a complex mechanic, but it has the potential of providing a lot of
quality gameplay for the subscribers. I'd need an economist, though, to work
through the models of expected subscriber return value vs. cost.

------
tmarthal
This looks super interesting: a great example of functional programming and an
implementation of WebSockets.

However, whenever something like this comes up, I always wonder about the new
Java implementations of WebSockets / actors. Do you have a comment on the
difference between using Akka for your web interface versus Comsat
WebActors?

I guess if this is a proof of concept for the comsat classes, it works quite
well.

~~~
pron
I think the main difference is that Comsat (which runs on top of Quasar) has
true lightweight threads, just like Erlang and Go. This means that actor code
can block, you can have simple selective receive, and all in all the code is
simpler and easier to follow.

Also, web sockets are not directly related to actors. Comsat's Web Actors
simply let you write HTTP and WebSocket using the actor model.

------
peregrine
The demo barely works at all: with sub-100 players, hits won't register and
flying is jerky.

I get that it's only a demo, but it's not a very convincing one :/

~~~
pron
Flying is jerky because we do no client-side processing. A real MMO would use
dead reckoning, etc.

At the top left of the screen you can see your network latency. Obviously, it
depends on where you are. The server is a single US-EAST EC2 instance, so this
is entirely a network issue. I can tell you that the server isn't starting to
feel the load yet, and is currently running at 32.5Hz (pretty much the maximum
rate). This is a much higher rate than most MMOs, which run at 4-10Hz.

~~~
stcredzero
I have to concur with Ryan, but for slightly different reasons. For me, the
biggest scaling concerns are not with efficient use of hardware, but with
guaranteeing user experience. I can see how the actor model can ensure
efficient use of hardware. I can also see how it could be leveraged to give
responses to user input almost as fast as network latency will allow. I'm not
yet sure that it will make performance monitoring or understanding performance
bottlenecks any easier or that it won't make such activities more complicated.

~~~
pron
I don't think it will make identifying performance bottlenecks easier, but I
can't see why it would make it harder, either. Modern software systems are
complicated; they're built on complicated middleware on top of complicated OS,
on top of complicated hardware. There are tools to help you find your
bottlenecks, but handicapping your application just so you'll understand it as
we used to understand performance in the 80s and 90s seems like the wrong
choice. If you want to have a simple mental model of your application's
performance, just run it on a 386. With instruction-level parallelism,
multiple cores with MESI coherence protocols, optimizing JITs, automatic
memory management and more, the days of running the code in our head to find
problems are long gone. You have to measure with the new tools and learn to
trust them.

~~~
stcredzero
_I don't think it will make identifying performance bottlenecks easier, but I
can't see why it would make it harder, either._

Well, in the example of your demo, the system can detect individual actor-loop
overruns. How do I know that this will result in graceful degradation? What
if all of the overruns happen so close together in time that the system has
insufficient time to mitigate and keep the problem out of the notice of
players? If having game loops on individual actors is more efficient but
otherwise semantically equivalent to having a single game loop, I would like
to know more about this. However, my understanding of the actor model is that
it comes with built-in indeterminacy.

[http://en.wikipedia.org/wiki/Actor_model#Unbounded_nondeterm...](http://en.wikipedia.org/wiki/Actor_model#Unbounded_nondeterminism_controversy)

 _Modern software systems are complicated; they're built on complicated
middleware on top of complicated OS, on top of complicated hardware. There are
tools to help you find your bottlenecks, but handicapping your application
just so you'll understand it as we used to understand performance in the 80s
and 90s seems like the wrong choice._

A system one can analyze is always preferable to a system that's potentially
more efficient but less understandable.

I'm not trying to shoot down your architecture. I considered it for a couple
of weeks, but I turned it down because I couldn't understand the risks
involved. You can leave this as "an exercise for the reader" but we both
understand this isn't going to win most people over. (Maybe that's your goal?)

~~~
pron
> A system one can analyze is always preferable to a system that's potentially
> more efficient but less understandable.

Yes, but analyze how? In your head? If that's the case, then the last
hardware+software architectures you could analyze in your head were about ten
years ago. If that kind of performance is always enough for you then fine. But
no one is able to _mentally_ analyze complex software performance on modern
OSes and hardware any more[1]. On the other hand, there are tools that analyze
performance for you.

Because game developers are among those who care the most about analyzing
performance, and because they like working close to the metal, they like to
continue using that approach, but it's no longer viable. There is no more
"close to the metal" when the CPU itself is so complex it's almost
non-deterministic. Unless your game runs on a console, there is so much
interference from other processes that even believing you understand how your
code works in a vacuum is not good enough to guarantee real-world behavior.

What I'm trying to say is that writing single-threaded code combined with
network calls and queues for asynchronous processing is not really more
understandable. It's just similar to 90s code, so people have the illusion
that they can understand its performance as they did back in the 90s.

For example: suppose your thread is now idle. What do you do? sleep or spin?
If you sleep you're at the mercy of the OS scheduler; if you spin, modern
Intel CPUs can actually decide to power down your core. In either case you'll
have increased latency when waking up.
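
The tradeoff shows up directly in code; this sketch contrasts the two idle strategies (parking cedes the core to the OS and typically overshoots, spinning keeps the loop tight at the cost of keeping the core busy):

```java
import java.util.concurrent.locks.LockSupport;

// The two idle strategies: park (sleep) hands the core back to the OS
// scheduler, so the wake-up time is in its hands; spinning polls the
// clock in a tight loop, at the cost of keeping the core busy (and
// modern CPUs may still clock a spinning core down).
public class IdleStrategies {
    static long parkFor(long nanos) {
        long start = System.nanoTime();
        LockSupport.parkNanos(nanos);     // may wake late (or spuriously early)
        return System.nanoTime() - start;
    }

    static long spinFor(long nanos) {
        long start = System.nanoTime();
        while (System.nanoTime() - start < nanos)
            Thread.onSpinWait();          // JDK 9+ hint that this is a spin loop
        return System.nanoTime() - start; // tight bound, but the core never idled
    }
}
```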

[1]: forget virtual memory and multi-level caches. There's ILP, TLB, core
boosting and power management, SSDs, modern thread schedulers and more and
more.

~~~
stcredzero
_Yes, but analyze how? In your head?_

Of course not. But I can work out in my head how the software tools for
performance measurement are going to work. I can also work out, with some
degree of reliability, whether the performance degradation of a game loop
will be gradual, and I know how the measurement tools to verify that are
going to work.

 _There is no more "close to the metal"_

Long-time Smalltalker and Clojurist here. Preaching to the choir you are.
However, to minimize risk, one should still understand what one is doing as
well as one can beforehand. I'm not saying that your architecture is bad; I'm
trying to explain my decision-making process with regard to it. Furthermore,
if I had more information about how to implement pragmatic performance
guarantees, or at least a high probability of gradual degradation, and if I
knew enough about performance monitoring in the actor architecture, I would
probably change my assessment of the risks involved in trying it.

 _Unless your game runs on a console, there is so much interference from other
processes_

It's probably going to run on a dedicated server.

Basically, what I've been trying to get across in these threads is: Your
architecture sounds really cool but I don't quite have enough information
about it to make me want to try it.

Or to put it another way:

"It's really great because of Z!"

"Yes, but I also really need X and Y and I don't know how it would be with X
and Y."

"Yeah, you should be able to work all that out, and can't you see that it's
really great with Z!?"

"Yes, it's great with Z, but..."

