
Riot Games Messaging Service - puzza007
https://engineering.riotgames.com/news/riot-messaging-service
======
matt_oriordan
That's an impressive bit of technology, and nice to see it's making extensive
use of Erlang (we're using Elixir).

I'd be interested to know if the service works across multiple regions though.
Some of the biggest challenges we've faced when engineering our realtime
platform have been avoiding any single point of congestion and achieving
effectively peer-to-peer routing within the cluster. This is not that important
if all your servers can be in a single region. However, if you want your users
in Australia, for example, to have a similar latency profile to those in the
USA, then clients in Oz need to connect to servers in Oz, and traffic between
customers in Oz should not traverse continents; only traffic to customers in
the US should. I'd be interested to know if that was tackled in the
design. Michal, are you following this thread?

Matt Ably realtime - [https://www.ably.io](https://www.ably.io)

~~~
grey
Riot accounts are tied to different geographical regions, so I would expect
the service doesn't have to span across them,
[http://leagueoflegends.wikia.com/wiki/Servers](http://leagueoflegends.wikia.com/wiki/Servers)

~~~
huac
Correct, a "North American" account would not be able to message or otherwise
interact with an "Oceania" account (the region which contains Australia). That
being said, an Australian could register an NA account, but shouldn't expect
bearable ping/latency.

------
sanqui
I realize the article isn't actually about a chat service, but I continue to
be confused because there exists Riot[1], the messaging client of the Matrix
network.

[1] [https://riot.im/](https://riot.im/)

~~~
trqx
Off-topic: do you guys at matrix/riot have a bot that automatically creates a
ticket when it finds keywords such as {IRC|messaging|chat} in any hn thread
without any mention of {riot.im|matrix.org} URL being also in the comments?
Honest question.

Isn't there a publicly hosted database of hn comments so someone could make
stats on this topic? I'm sure you are close to 100% match.

~~~
Arathorn
The only folks who work on matrix/riot who comment on HN are basically me
(project lead) and sometimes jkire (synapse maintainer). And we don't plug the
project, but respond to stuff like this & questions when they come up.

Anyone else commenting about Matrix is doing so completely independently and
without any input or knowledge from the actual dev team. For instance, I have
absolutely no idea who sanqui is (although I share their concern that Riot
Games moving into the messaging space is awkward).

We are categorically not spamming or sockpuppeting HN: I'm afraid that folks
who bring Matrix up must be doing so because they're interested or
enthusiastic about the project and consider it relevant (which unfortunately
it is on this thread, albeit for the wrong reasons :|)

~~~
jameskegel
like Always_Good said before you, it's just an in-trend for readers to mistake
headlines.

------
shuntress
Edit: Upon more careful reading, it appears that a response is sent from the
server to the client via WebSocket which contains the resource and method that
the client can request using HTTP if it wishes to update.

Original question: Can someone explain to me why the JSON message sent to the
REST interface seems to contain URL and Type as data?

Is it because they are transmitted using WebSockets, but the concepts of
"resource" (e.g. clubs/v1/clubs/665632A9-EF44-41CB-BF03-01F2BA533FE7) and
"method" (e.g. GET) still just make sense and happen to be very similar to
HTTP?
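
To make the pattern in the edit above concrete, here is a minimal sketch of a client turning a pushed notification into a follow-up HTTP request. The field names `resource` and `method`, the base URL, and the helper name are assumptions for illustration only; the thread does not give the actual message schema.

```python
import json
from urllib.parse import urljoin
from urllib.request import Request

# Hypothetical base URL; the real service address is not given in the thread.
BASE_URL = "https://example.invalid/"


def request_from_notification(raw: str) -> Request:
    """Build (but do not send) the HTTP request described by a pushed message.

    Assumes the message is a JSON object with "resource" and "method" keys.
    """
    note = json.loads(raw)
    url = urljoin(BASE_URL, note["resource"])
    return Request(url, method=note["method"])


# A message shaped like the one discussed above (field names assumed).
pushed = json.dumps({
    "resource": "clubs/v1/clubs/665632A9-EF44-41CB-BF03-01F2BA533FE7",
    "method": "GET",
})
follow_up = request_from_notification(pushed)
```

The request object is only constructed here; whether and when to send it stays the client's decision, which matches the "if it wishes to update" reading.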

------
pjmlp
Nice to see yet another game studio making use of Erlang.

~~~
shoover
I thought the post might shed more light on their use of Erlang for this
service, but it focuses more on high level topology, AWS infrastructure, and
load balancing. It's interesting to get a sense of the level of high level
deployment tooling they used, because in a way Erlang marketing gives the
impression that the language and VM are all you need.

~~~
pjmlp
There are some presentations from Wooga about their use of Erlang.

[http://www.gdcvault.com/play/1016648/Why-
Erlang](http://www.gdcvault.com/play/1016648/Why-Erlang)

------
javitury
This article reminds me of a previous HN post. The program discussed was
called pushpin or pinpush.

These solutions are the glue between stateless and stateful services. I guess
they will become increasingly important as some services specialize towards
responsiveness. They use WebSockets or HTTP/2 without giving up the
simplification provided by RESTful architecture.

~~~
biot
Pushpin:
[https://news.ycombinator.com/item?id=9005724](https://news.ycombinator.com/item?id=9005724)

------
je42
wondering why they have the edge nodes and not just a load balancer. looking
at their responsibilities, it looks like a lot of them overlap with the
responsibilities of a load balancer.

~~~
dorfsmay
Looks like a load issue, a single server can only take care of so many conn,
auth, etc... So having several solves the problem and can scale horizontally
easily, but each edge maintains conn information, so a given user needs to be
routed to that same server every time. On the other hand, all LBs share a
single address and all should be replaceable by each other, they exist purely
to route the user to the right edge server (existing conn or least busy).

Question to OP: do you run your edge servers in pairs or some kind of cluster?
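
The routing rule described above (existing connection first, otherwise least busy) can be sketched as follows. This is an illustrative model only: the class name, the in-memory session table, and the connection counts are assumptions, not how Riot's load balancers are implemented.

```python
from typing import Dict, List


class EdgeRouter:
    """Toy load balancer: sticky routing with a least-busy fallback."""

    def __init__(self, edges: List[str]) -> None:
        # Number of live connections per edge server (assumed load metric).
        self.load: Dict[str, int] = {edge: 0 for edge in edges}
        # Which edge currently holds each user's connection.
        self.sessions: Dict[str, str] = {}

    def route(self, user: str) -> str:
        edge = self.sessions.get(user)
        if edge is None:
            # No existing connection: pick the edge with the fewest connections.
            edge = min(self.load, key=self.load.get)
            self.sessions[user] = edge
            self.load[edge] += 1
        return edge

    def disconnect(self, user: str) -> None:
        edge = self.sessions.pop(user, None)
        if edge is not None:
            self.load[edge] -= 1


router = EdgeRouter(["edge-a", "edge-b"])
first = router.route("alice")
second = router.route("bob")
again = router.route("alice")  # same edge as the first call
```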

~~~
cyunker
Not the OP, but I work on the same team.

The edge servers are not clustered and share no state. We require a minimum
of 2 servers for fault tolerance.

~~~
je42
Did the need for edge servers arise from the fact that a service like
[https://aws.amazon.com/elasticloadbalancing/applicationloadb...](https://aws.amazon.com/elasticloadbalancing/applicationloadbalancer/)
didn't exist back then?

~~~
cyunker
Well, we need edge servers to handle the persistent WebSocket connections,
which last through the life of the player session.

~~~
je42
Yeah, the ALB provides that feature as well. I just checked: it was released in
Aug 2016. So clearly, before that time, your setup makes a lot of sense.

I am wondering if the ALB would be the preferred method now, if you were to
redo it?

~~~
michalptaszek
Right, ALBs were introduced after we built RMS. We would have to re-evaluate
ALB stability/cost/scalability, but it's definitely something to consider.

------
Zekio
Their JSON has a key called payload which holds a String rather than an
Object. Are there any benefits to doing this, given that they also have to
escape the " in the String?

~~~
marksomnian
Probably that services aren't required to use JSON (since they use many
different service languages, it's not unlikely that they use many different
serialisation methods - JSON, protobuf, etc.)

~~~
michalptaszek
Exactly this - each publisher can encode their own payload, including
protobufs, JSON, plain text or base64'd binaries. RMS itself is completely
oblivious to the format of payload used in the message.
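
A minimal sketch of what an opaque string payload looks like in practice. Only the `payload` key comes from the discussion above; the `resource` key, the helper name, and the sample values are assumptions for illustration.

```python
import base64
import json


def make_envelope(resource: str, payload: str) -> str:
    """Wrap an already-encoded payload string in a JSON envelope."""
    return json.dumps({"resource": resource, "payload": payload})


# Publisher A encodes its payload as JSON text.
json_envelope = make_envelope("clubs/v1/clubs", json.dumps({"name": "example club"}))

# Publisher B sends binary data, base64-encoded into a string.
binary_envelope = make_envelope(
    "clubs/v1/clubs", base64.b64encode(b"\x00\x01\x02").decode("ascii")
)

# A router only needs the outer envelope; the payload stays an opaque string.
outer = json.loads(json_envelope)

# Only the consumer that knows the publisher's format decodes the payload.
inner = json.loads(outer["payload"])
```

The escaped quotes in the wire format are the cost of this design: the payload is JSON text nested inside a JSON string, so the outer parser never has to understand it.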

------
nodivbyzero
I'd like to see more info on their customized Erlang/OTP mnesia.
michalptaszek mentioned it in the comment:
[http://disq.us/p/1gltwfs](http://disq.us/p/1gltwfs)

~~~
michalptaszek
Sure, a few words on the customizations we've built into mnesia:

- We integrated it with our eureka-based service discovery mechanism, so that
  it can automatically cluster with other servers that are spun up in the
  process of cluster bootstrap/resizing. We also relaxed constraints when
  merging 2 identical schemas of separate clusters (when table cookies don't
  match, but everything else matches, we still want to merge and take a union
  of already existing data).
- We've added a bunch of auto-merge code (heavily inspired by Ulf's wonderful
  [https://github.com/uwiger/unsplit](https://github.com/uwiger/unsplit)
  library) in case of network partitions.
- We've also added support for maintaining pools of processes for each table
  for dirty updates (as opposed to going through mnesia_tm for every single
  operation, including transactions as well as dirty_asyncs).

I'm 100% aware that these changes are RMS/Riot specific and won't work in many
other situations (e.g. they violate certain transaction isolation properties).

------
shoover
Do people use zeromq for this kind of work? It seems the various socket types
are tailored to building such load balancing and routing architectures, but
I'm not aware if zeromq is in use at this scale.

~~~
jkarneges
Yup, we use ZeroMQ for internal routing for this kind of thing. Each instance
of our edge server (Pushpin) binds on a SUB socket to receive messages
destined for external clients. This makes it possible for internal publishers
(using PUB) to route among edge servers using a brokerless mesh.
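
The topology described above can be modelled in plain Python. To be clear, this is an in-process stand-in that makes no ZeroMQ calls: the class names and the `client.<id>` topic scheme are assumptions, and it only mimics two properties of ZeroMQ PUB/SUB, namely that a PUB socket fans out to every connected peer and that SUB subscriptions match on topic prefix.

```python
from typing import List, Set, Tuple


class EdgeSub:
    """Stands in for an edge server's bound SUB socket."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.prefixes: Set[str] = set()
        self.inbox: List[Tuple[str, str]] = []

    def subscribe(self, prefix: str) -> None:
        # An edge subscribes to topics for the clients connected to it.
        self.prefixes.add(prefix)

    def deliver(self, topic: str, body: str) -> None:
        # Prefix matching, as ZeroMQ SUB subscriptions do.
        if any(topic.startswith(p) for p in self.prefixes):
            self.inbox.append((topic, body))


class Publisher:
    """Stands in for an internal publisher's PUB socket connected to all edges."""

    def __init__(self, edges: List[EdgeSub]) -> None:
        self.edges = list(edges)

    def send(self, topic: str, body: str) -> None:
        # No broker: the publisher offers the message to every edge directly.
        for edge in self.edges:
            edge.deliver(topic, body)


edge_a, edge_b = EdgeSub("edge-a"), EdgeSub("edge-b")
edge_a.subscribe("client.1")
edge_b.subscribe("client.2")

publisher = Publisher([edge_a, edge_b])
publisher.send("client.1", "hello")
```

One caveat of prefix matching: a subscription to `client.1` would also match `client.10`, so a real topic scheme would need a terminator or fixed-width identifiers.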

