

MMO xkcd WebSockets fixed by PubNub - pubnub
http://www.pubnub.com/static/pubnub-xkcd/index.html
Hello all hackers from HackerNews. We noticed a new MMO was released by n01se and xkcd yesterday (September 26th, 2012) with multiple users flying around with a balloon figure. If you got stuck, you can click your balloon guy and turn into a ghost to seamlessly move through the landscape unhindered by mortal barriers like trees and hills. There was a problem, however, with the scaling of users on the system. The max concurrency could only be 20 users at a time, leaving many to wonder where the MMO part of the MMO was. We ripped out the non-scaling Node.JS code. ENJOY.

Enter the xkcd World with Friends: http://www.pubnub.com/static/pubnub-xkcd/index.html
======
pubnub
Hello all hackers from HackerNews. We noticed a new MMO was released by n01se
and xkcd yesterday (September 26th, 2012) with multiple users flying around
with a balloon figure. If you got stuck, you can click your balloon guy and
turn into a ghost to seamlessly move through the landscape unhindered by
mortal barriers like trees and hills. There was a problem, however, with the
scaling of users on the system. The max concurrency could only be 20 users at
a time, leaving many to wonder where the MMO part of the MMO was. We ripped
out the non-scaling Node.JS code. ENJOY.

Enter the xkcd World with Friends:
<http://www.pubnub.com/static/pubnub-xkcd/index.html>

~~~
chrishouser
Any reason you copied the files manually instead of forking on github? The
link to <https://github.com/n01se/1110> was right there on the main page.

~~~
pubnub
Hi Chris! We meant to fork it. Just forgot. We included the link in our
README.md, on the game page itself, and on the blog.

------
JTxt
Neat and your service looks interesting, but I don't call this 'fixed' yet. It
did not seem to scale well.

Everyone is spawning at the same point. I just see a flood of dead users. Some
jerk around a little and appear and disappear. But very little interaction.

Is it possible to have multiple spawn points when demand is high, and separate
channels/servers for various areas that you switch between as you move? (like
Second Life)

I believe a node.js (or something else) version could be federated, and/or
clients connect to various servers as they travel.

I know this is just a toy, but it would be interesting to see this work well
at a large scale.

~~~
pubnub
Hi JTxt. Good points regarding the fix. In fact the system is scaling great
and you can see a lot of people on the screen. The issue is in drawing that
number of players on the client computer. We have improved the source code
just a moment ago and now you can see disconnected users in a stopped state
and live users are actively moving around. It was wonderful to see this
earlier this morning with about 2K active live users all in the same world.
Updates have been posted. Enjoy.

~~~
kanaka
Actually, I've done some analysis of the stream of messages and it appears the
current problem is that the vast majority of messages never arrive (even as of
7:30am CST). You can verify this easily by connecting with two browser
windows. The first time I tried it took almost 30 seconds before my second tab
even saw my other character even though I was moving both around. And the name
in the second tab was never updated even after a minute.

The second time I tried, my first player window saw the other player fairly
quickly, but it never registered the change to a ghost or my name change and
it just showed the second player as a balloon guy floating up forever (even
after the second client window disconnected).

I did this testing at 7:30am CST. There were about 90 other players that I was
getting updates from. However, this is the same type of behavior I've seen
whenever I have connected since it went live. And friends in other locations
see the same behavior so it's not just my environment.

------
kanaka
I think your service is having some trouble. The updates appear fairly smooth
because the client continues rendering the last vector seen for each avatar.
However, when comparing with several friends also connected it's clear that
very few of the messages are getting through and sometimes only in one
direction.

Also the dynamic poll/sample interval you implemented seems to hit 1000ms (1
second) and stay there.

~~~
pubnub
Hi Kanaka! Good question regarding sample rate. We auto-scale the sample rate
based on the number of occupants. It is currently at its 1000ms ceiling
because there are so many of you! If there were only a few, it would go to 50ms.
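
A minimal sketch of what such occupancy-based scaling could look like, in Python for illustration. Only the 50ms floor and the 1000ms ceiling come from this thread; the linear rule and the `MS_PER_OCCUPANT` factor are assumptions, not taken from the PubNub source.

```python
MIN_INTERVAL_MS = 50     # floor quoted in the thread
MAX_INTERVAL_MS = 1000   # ceiling quoted in the thread
MS_PER_OCCUPANT = 10     # hypothetical scaling factor (assumption)


def sample_interval_ms(occupants: int) -> int:
    """Return the publish interval in ms for a given occupant count.

    The interval grows linearly with the number of occupants and is
    clamped to the [MIN_INTERVAL_MS, MAX_INTERVAL_MS] range.
    """
    scaled = occupants * MS_PER_OCCUPANT
    return max(MIN_INTERVAL_MS, min(MAX_INTERVAL_MS, scaled))
```

Under this assumed factor, any world with 100 or more occupants stays pinned at the ceiling, which would look like the "hits 1000ms and stays there" behavior described above.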

~~~
JTxt
Is it possible to divide the world into multiple channels?

So I only see updates from those in my area. Then change channels as you move?

And start users in different spawn points when usage is high?

~~~
pubnub
Hi JTxt! Yes, this is possible! Just update the channel value in this line
from "/xkcd" to "/whatever_you_want" and you can take on new parallel worlds.
The source code you need to change is here:
[https://github.com/pubnub/pubnub-xkcd/blob/master/network.js...](https://github.com/pubnub/pubnub-xkcd/blob/master/network.js#L4)

~~~
JTxt
Thanks for responding!

I'm talking about dividing the map into multiple channels so that users only
receive updates from those around them, which would hopefully make it more
responsive.

So for 1000x1000 areas, if I'm at "x":3033521.3,"y":-3025356.4, my client
subscribes to /xkcd_x3033_y-3025

And adjoining area/channels if they're close to a border. (but only sends
updates to the current area/channel.)

...IF it's easy enough for the client to dynamically change channels, and be
subscribed to multiple (up to 4) channels.

Then combine this with multiple spawn locations at other interesting locations
when demand is high. But they can still explore and meet others.

So it should be more responsive, with less data sent, than everyone getting
updates from everyone.
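
The area-to-channel mapping described above can be sketched roughly as follows (Python for illustration; the game itself is JavaScript). The `/xkcd_x…_y…` naming and the 1000x1000 area size come from the comment; the `margin` value and all function names are hypothetical. The sketch uses `floor` rather than truncation so that areas stay the same size on both sides of zero, which means the y index for the example position comes out as -3026 rather than -3025.

```python
import math

CELL = 1000  # area size from the comment above


def cell_of(x, y, cell=CELL):
    """Return the integer (column, row) of the area containing (x, y)."""
    return math.floor(x / cell), math.floor(y / cell)


def channel_name(cx, cy, prefix="/xkcd"):
    """Build a channel name such as /xkcd_x3033_y-3026."""
    return f"{prefix}_x{cx}_y{cy}"


def publish_channel(x, y):
    """Updates are only sent to the area the player is currently in."""
    return channel_name(*cell_of(x, y))


def subscribe_channels(x, y, margin=100, cell=CELL):
    """Return the current area plus neighbors within `margin` of a border.

    At most two columns and two rows are selected, so the result holds
    between one and four channel names.
    """
    cx, cy = cell_of(x, y, cell)
    off_x = x - cx * cell  # position inside the area, 0 <= off_x < cell
    off_y = y - cy * cell
    dxs = [0]
    if off_x < margin:
        dxs.append(-1)
    elif off_x >= cell - margin:
        dxs.append(1)
    dys = [0]
    if off_y < margin:
        dys.append(-1)
    elif off_y >= cell - margin:
        dys.append(1)
    return {channel_name(cx + dx, cy + dy) for dx in dxs for dy in dys}
```

A client would recompute `subscribe_channels` as it moves and subscribe or unsubscribe on the difference between the old and new sets. Whether that is cheap enough depends on how quickly the messaging service can switch channels, which this sketch does not address.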

~~~
pubnub
Hi JTxt, this is a fantastic idea for saving resources. Also, streaming all
the data to the clients, while possible with PubNub, slows down the clients
because they are busy downloading the streams of all player movements on the
screen. Would you be able to work with us and coordinate a way to do this?
Here is the starting line for segregating channel data:
[https://github.com/pubnub/pubnub-xkcd/blob/master/network.js...](https://github.com/pubnub/pubnub-xkcd/blob/master/network.js#L4)
(see the "xkcd2" at the end). This is the CHANNEL ID. You can take that and
change it based on the region of the world. You can even use the names of the
world slices, such as
<http://www.pubnub.com/static/pubnub-xkcd/images/1n1e.png> (1n1e, "1 north 1
east"). So the channel name would be "1n1e".

~~~
JTxt
Sure, I'll try to play with it this weekend. I'll have some questions for you
if I can work on this. Thanks!

------
ch0wn
Great use of a marketing opportunity. This actually runs very smoothly. Well
done!

~~~
donpark
Agreed. It would be nice to see running charts or stats showing # of users
online, average messages/second, and average $/hr at $1 per 1M rate.

~~~
kanaka
For our original server, after the first full day after the HN post, we racked
up an AWS bill of about $1.62. Most of that was in the first few minutes after
the announcement before we implemented the user cap.

From earlier today for about 20 hours of run time: we had 13,000 successful
connections to the server, 22,000 failed attempts to connect. The server is
averaging about 100k per second outbound when fully loaded with 20 clients.
With the quick improvements we made after the HN flood we got the CPU down to
about 1-5% on a t1.micro when 20 clients are connected and interacting. The
real issue for us is the bandwidth.

The server and protocol could be MUCH more efficient (visibility pruning,
binary messages, etc). But it was something chouser and I hacked up over the
weekend and it's now working quite well within the cap.

~~~
donpark
Your stats are in line with what I expected. Thanks.

What I was actually interested in was PubNub stats, to see how such a service
can help with apps like yours and to help in figuring out how much it'll cost.

------
joshaidan
I like this a lot. But I wish it had some physics added to it. Maybe that will
be my fork.

------
meritt
I'm running Chrome yet it's still long-polling instead of using websockets.

Why?

~~~
pubnub
We fixed this by using a reliable transport that bundles, compresses and
delivers the data more efficiently. Check it out!
<https://github.com/pubnub/pubnub-api/tree/master/websocket> (PubNub
WebSocket Emulation). PubNub offers full RFC 6455 support for the WebSocket
client specification. PubNub WebSockets enables any browser (modern or not) to
support the HTML5 WebSocket standard APIs. Use the WebSocket client directly
in your browser; now you can use new WebSocket anywhere!

~~~
meritt
While emulating WebSocket on non-supporting browsers is definitely awesome,
the point I am making is that I am using a WebSocket-capable browser, yet it's
just doing a series of non-stop HTTP requests.

Shouldn't the emulation only occur when necessary and be a graceful
degradation not forcing all clients to downgrade to a less efficient
transport?

------
pubnub
Added iPad and iPhone / Android Support with Touch! -
<http://www.pubnub.com/static/pubnub-xkcd/index.html>

------
mparlane
"No server required courtesy of PubNub."

There most definitely is a server, so can anyone explain what they meant by
that instead?

~~~
kanaka
They mean that you don't have to run a server yourself, instead you use their
PubNub messaging service (which has a free tier but costs after that).

------
danielweber
Okay, I could probably answer this if I read all the old WebSockets posts, but
how can you have people communicate without a server?

~~~
elisee
From what I understand, PubNub provides servers around the world that take
care of the heavy lifting, and all you have to do is deal with publishing and
subscribing to messages.

So there are servers, you just don't run any yourself and it is all abstracted
away by their API.

What they did, in this case, is mock the WebSocket interface (so that they
didn't need to rewrite the whole original app) and make it use the PubNub API
internally. See
[https://github.com/pubnub/pubnub-xkcd/blob/master/websocket....](https://github.com/pubnub/pubnub-xkcd/blob/master/websocket.js)
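
The general shape of that approach, an object with a WebSocket-like surface that forwards to a publish/subscribe service, might look like the sketch below. It is Python for illustration and is not the code in websocket.js: `InMemoryBus` is a hypothetical stand-in for a hosted messaging service, and `PubSubSocket` is an invented name.

```python
from collections import defaultdict


class InMemoryBus:
    """Hypothetical stand-in for a hosted publish/subscribe service."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, channel, callback):
        self._subscribers[channel].append(callback)

    def publish(self, channel, message):
        # Deliver the message to every subscriber of the channel.
        for callback in list(self._subscribers[channel]):
            callback(message)


class PubSubSocket:
    """Exposes send() and onmessage like a WebSocket client, but every
    message travels through the pub/sub bus instead of a socket."""

    def __init__(self, bus, channel):
        self._bus = bus
        self._channel = channel
        self.onmessage = None  # assigned by the application
        bus.subscribe(channel, self._deliver)

    def _deliver(self, message):
        if self.onmessage is not None:
            self.onmessage(message)

    def send(self, data):
        self._bus.publish(self._channel, data)
```

Because the application only touches `send` and `onmessage`, the rest of the game code can stay unchanged. One difference from a client/server socket is that every subscriber on the channel, including the sender, receives each published message.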

~~~
pubnub
Excellent investigation! You are correct; thank you for the explanation here
on this thread. Check it out: we also needed to remove the server.js file. No
more Node needed now. Enjoy!

------
pubnub
<https://twitter.com/PubNub/status/251440715572326400/photo/1> - Team Photo
[IMG]

------
pubnub
Recorded a video - <http://vimeo.com/50320757> - [VIDEO] of live action
during the early moments of the release.

~~~
JTxt
Nice work. It's an interesting problem but I don't think it's solved yet.

In the video and from my experience I see a bunch of dead users spawning from
a single point making a 'pillar of death' and a few users moving that are
skipping around, but none in any kind of fluid motion.

Perhaps have multiple spawns, offset by a small random amount.

Not sure how your system works but perhaps have areas in separate channels,
and users only subscribe to updates from that area?

I'd rather see a few users in my area that can interact with me quickly than a
flood of non-responsive ones.

~~~
kanaka
JTxt. We had a similar problem (but for different reasons) with the original
(1110.n01se.net) when the initial rush from HN happened. After a few minutes
we moved the server to an AWS t1.micro instance (our previous hosting company
didn't like CPU intensive processes and was killing the server).

However, one of the problems with a t1.micro server on AWS is that the
hypervisor will detect heavy load and start throttling the VM using a heavy
handed approach which basically just pauses the VM for several seconds. This
would cause buffers to build up and when the VM started running again you
would have a burst of traffic.

The symptom was that you would see other avatars stop moving for a few seconds
and then suddenly jump around and then be back to smooth. All clients would
still receive all the updates (WebSockets is reliable over TCP), but they
would come in bursts whenever the VM was throttled. So a few minutes after
moving to the AWS server we realized what was happening (we use AWS for
development and knew to recognize the behavior) so we implemented a few easy
fixes to the server (such as sending change deltas and JSON-encoding the data
once instead of once per client, etc.) to bring the CPU usage down to a reasonable level
so that it wouldn't get throttled. Combined with the 20 connection cap that
resulted in a very smooth and low latency experience for the players that were
able to connect.
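
A rough sketch of those two fixes, in Python for illustration (the original server is Node.js and its code is not shown in this thread). The state layout and the function names are assumptions. The point is that only changed fields are sent and that the payload is encoded a single time before being handed to every client.

```python
import json


def changed_fields(prev, curr):
    """Return only the fields of curr that differ from prev."""
    return {key: value for key, value in curr.items() if prev.get(key) != value}


def build_delta(prev_state, curr_state):
    """Map player id -> changed fields, skipping players with no changes."""
    delta = {}
    for player_id, curr in curr_state.items():
        diff = changed_fields(prev_state.get(player_id, {}), curr)
        if diff:
            delta[player_id] = diff
    return delta


def broadcast(delta, clients):
    """Encode the delta once, then hand the same string to every client."""
    payload = json.dumps(delta, separators=(",", ":"))
    for send in clients:  # each client is represented by a send callable
        send(payload)
    return payload
```

This sketch ignores players that have left; a real server would also need some way to announce removals.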

One thing we are considering with the original is what you suggested where
players only get updates for things that are visible to them. However, this
means more processing on the server because each client gets a different data
set which has to be generated. So it's sort of a tradeoff between decreasing
bandwidth and increasing CPU (surprisingly often the case in the real world).
If the CPU increase causes throttling then we would start encountering
burstiness again. So it might be worth it or it might not.
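
To make the tradeoff concrete, here is a minimal sketch of visibility pruning in Python. The viewport half-sizes and the state layout are assumptions. Each client gets its own filtered set, so the server does work proportional to clients times players and can no longer reuse one payload for everyone.

```python
def visible_updates(viewer, players, half_width=800, half_height=600):
    """Return the players inside the viewer's (assumed) viewport rectangle."""
    vx, vy = viewer["x"], viewer["y"]
    return {
        player_id: player
        for player_id, player in players.items()
        if abs(player["x"] - vx) <= half_width
        and abs(player["y"] - vy) <= half_height
    }


def per_client_payloads(players):
    """Build one pruned data set per client.

    This is the extra CPU cost: every client's set must be computed
    (and later encoded) separately.
    """
    return {
        player_id: visible_updates(player, players)
        for player_id, player in players.items()
    }
```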

Anyways, if we have some free time we may play with some ideas. One of the
first things is being able to simulate load manually. Doing a mad dash to try
and implement improvements while thousands of HN users are trying to connect
(we didn't know that a friend had posted it there) is certainly an adrenaline
rush but not the most ideal way to do development :-)

Another solution would be to use a larger AWS instance. Right now on the
t1.micro with 20 users the CPU stays around 1-5% (which is a safe zone for
throttling). Larger instances don't have the throttling problem so in addition
to more horse power we would be able to use a much higher percentage of it.
However, it's just a spare-time project for us and I'm still trying to figure
out how to explain to the wife why we are paying (mostly bandwidth) for other
people to play an online game. :-)

And surprisingly AWS doesn't seem to have a way to accept donations or gift
cards for AWS costs. Seems like a logical way to encourage free and open
source development projects like this.

~~~
JTxt
Thanks for getting this going and sharing your experience. It's really fun
when it's interactive.

I'd like to hope that having 1000's in the same map with fast updates is
possible without being a huge burden to any one person.

Perhaps run a master instance at 1110.n01se.net. Invite others to run a slave
instance that is configured to connect to your master, report health, and
serve areas as directed.

The master serves the client and, when load is high, assigns it a spawn point
at one of the other interesting locations. It also manages the list of slaves
and the areas they cover, and updates clients as that list changes.

(But there's a potential bottleneck here. Not sure how to do this yet...

Also how do you subdivide the map so high traffic areas are smaller? I'm
guessing Quadtrees so the subdivide data can be very small.

How well can clients be connected to multiple servers?

And is a master/slave relationship needed?)
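
As a sketch of the quadtree idea, the Python below splits a square region into four quadrants whenever it holds more than `capacity` players, so crowded spots end up in smaller areas. Every name here is hypothetical, and the `path` string is only one possible way to derive a compact area or channel identifier. `min_size` stops the subdivision when many players sit on the same point, as happens at a shared spawn.

```python
class QuadNode:
    """Square region that splits into four quadrants when it gets crowded."""

    def __init__(self, x, y, size, capacity=4, min_size=1, path="r"):
        self.x, self.y, self.size = x, y, size  # lower-left corner, side length
        self.capacity = capacity
        self.min_size = min_size
        self.path = path          # e.g. "r03": root, quadrant 0, quadrant 3
        self.points = []
        self.children = None      # None while this node is a leaf

    def _child_for(self, px, py):
        half = self.size / 2
        east = px >= self.x + half
        north = py >= self.y + half
        return self.children[(2 if north else 0) + (1 if east else 0)]

    def insert(self, px, py):
        if self.children is not None:
            self._child_for(px, py).insert(px, py)
            return
        self.points.append((px, py))
        if len(self.points) > self.capacity and self.size > self.min_size:
            self._split()

    def _split(self):
        half = self.size / 2
        # Quadrant order: 0 = south-west, 1 = south-east,
        # 2 = north-west, 3 = north-east.
        offsets = [(0, 0), (1, 0), (0, 1), (1, 1)]
        self.children = [
            QuadNode(self.x + dx * half, self.y + dy * half, half,
                     self.capacity, self.min_size, self.path + str(i))
            for i, (dx, dy) in enumerate(offsets)
        ]
        points, self.points = self.points, []
        for px, py in points:
            self._child_for(px, py).insert(px, py)

    def leaf_for(self, px, py):
        """Return the smallest area that currently contains (px, py)."""
        node = self
        while node.children is not None:
            node = node._child_for(px, py)
        return node
```

A client could subscribe to the channel named by `leaf_for(x, y).path`. How the tree would be kept in sync across servers and clients is the open question raised above, and this sketch does not attempt it.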

Well I'll leave it there. It's an interesting problem and I think it's part of
what node.js is trying to become.

------
vyrotek
Cheers! We're using PubNub in production and it's been great so far. We were
one of those noisy customers that wanted the presence API. :)

------
carimura
Nice work guys.

~~~
pubnub
Thank you Carimura! Enjoy.

------
wschott
Awesome!!!

~~~
dirkk0
+1

