> the matchmaker shuffles the queue to avoid the same players being matched together repeatedly
This is actually something I hate about multiplayer games with matchmaking nowadays. I made the majority of my childhood friends only because we stayed on the same server and played together for hours on end. I don't think it's a stretch to say that a key reason we play multiplayer rather than single player games is to socialize. This has become increasingly difficult when you just get a new set of people every 10-30 minutes.
This is why smaller AA games still keep this alive - to build community. For instance: Squad, an 80-100 player multiplayer game that primarily depends on privately hosted servers run by communities that actually care about the game and its community. You'll still run into jerks like in any online game, but it's a world of difference from the one-match/single-serving strangers you usually play with in other games.
This is something I really like about Rocket League’s matchmaking. Competitive is random, but Casual play allows you to keep the same lobby. You can even vote to rematch to keep the same teams. I’ve had numerous runs of 3-4 or more matches with the same people. It’s fun to get into a back and forth, especially if you rematch and the other team wins one - then you’ve _gotta_ have a tie breaker!
Yeah, way back when I used to play Counter-Strike, I eventually landed on one server I really liked and kept going to. It would be interesting to see who was on it at different times of the day, and eventually I got to be friendly with most of the regulars that showed up, and they knew me.
Turns out several lived near the server's location, in Texas, and at one point my friends and I just happened to be going there to visit a friend who was stationed at the nearby military base, and so I ended up meeting up with them for lunch. Nice guys.
> Turns out several lived near the server's location, in Texas, and at one point my friends
Also ended up in a similar situation multiple times (bunch of randoms found some server we liked, stuck around for matches across weeks, became regulars, and eventually figured out we lived nearby). Sometimes we'd bump into each other on other servers too.
After hanging out a couple of times, we figured out why (probably, at least) we kept coming across each other: we all default to sorting the server list by ping (latency), and since we were all geographically close, we tended to end up on the same servers.
Texas wasn't that close to me (I was in Illinois), but the server was hosted within a data center, and had pretty low latency for me anyway.
Also, it was consistently up and low latency compared to other servers, which is why I kept going there at first. Later on I kept showing up because I got to know the people on the server.
You can queue with people you already know, and some games do have a 'stay as a team for the next queue' option, but there's no real opportunity to actually get familiar with anyone, let alone a community, and keep playing with them the way you can on community servers
Glenn Fiedler is the best blogger on game networking. We used your stuff to guide our netcode development ~6 years ago. We now have a great stack on top of Godot for multiplayer games on the web over WebRTC or WebSocket, and ENet on mobile and native.
We're a small-time company (3 people) and we have several multiplayer games... our netcode is a serious competitive advantage, largely due to Unity's failure to provide such a system to our competitors.
Thanks for helping us, Glenn.
Our matchmaker needs a rewrite, and these posts are nice.
> We now have a great stack on top of Godot for multiplayer games on the web over WebRTC or WebSocket, and ENet on mobile and native. We're a small-time company (3 people) and we have several multiplayer games... our netcode is a serious competitive advantage
Yeah, we use libdatachannel. TBH we didn't see much (if any) network improvement using WebRTC over WebSockets, but our testing/analysis is pretty shallow (3-person company - who's got time for that, let's make products)
https://en.wikipedia.org/wiki/Express_Data_Path is a feature of recent Linux kernels that lets you bypass/replace most of the normal Linux networking stack for higher performance. Some network cards will even run your network code directly on their hardware.
(not sure what GP meant, just providing the reference)
I think they were confused by the term “netcode”, which in game developer parlance means “code which implements networked gameplay in a multiplayer game”, rather than “code that implements networking in an OS”.
Yeah, our netcode forward predicts from the server state to the local time to allow the local player to see their own character without network artifacts, and see other players in approximately their true location. It's the 'server authoritative, with client side prediction and rollback' technique. Consult Gaffer On Games' blog posts to learn more.
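For readers who haven't seen the technique, here's a minimal, hypothetical sketch of the 'server authoritative, with client side prediction and rollback' idea (not our actual Godot code; all the names and the 1D position are made up for illustration):

```python
# Toy sketch of client-side prediction with rollback (illustrative only;
# a real game would predict full entity state, not a 1D position).
from dataclasses import dataclass

@dataclass
class InputCmd:
    tick: int
    move: float  # simplified 1D movement input


def simulate(pos: float, cmd: InputCmd, dt: float = 1.0 / 60.0) -> float:
    """One deterministic simulation step, shared by client and server."""
    return pos + cmd.move * dt


class PredictedClient:
    def __init__(self) -> None:
        self.tick = 0
        self.pos = 0.0
        self.pending: list[InputCmd] = []  # inputs the server hasn't acked yet

    def local_input(self, move: float) -> None:
        # Apply input immediately so the local player sees no delay...
        cmd = InputCmd(self.tick, move)
        self.pending.append(cmd)
        self.pos = simulate(self.pos, cmd)
        self.tick += 1

    def on_server_state(self, server_tick: int, server_pos: float) -> None:
        # ...and when an authoritative snapshot arrives, roll back to it and
        # replay (re-simulate) every input the server hasn't processed yet.
        self.pending = [c for c in self.pending if c.tick > server_tick]
        self.pos = server_pos
        for cmd in self.pending:
            self.pos = simulate(self.pos, cmd)
```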
1. Let the game client apply to servers in datacenters where they will get low latency if that's relevant for your game
2. Wait 1 second on the server, then shuffle the list of clients and group every set of 4 into a match.
3. If no match is found after 10 seconds, add the client to queues in suboptimal datacenters
4. If still no match after another 10 seconds, put them in all the queues and join them to the first game found so they can at least play at all, even if they'll have lag no matter where they go (steps 1-4 are sketched in code after this list)
5. Determine latency either by using the same IP:port as the game server will have, because the route might differ from another port (citation needed imo. The author links another blog post for further info, but it's mainly a statistics lecture about averages hiding outliers and doesn't back up the claim that the port number makes a significant difference for latency in any but the most exceptional situations, e.g. being sent eastward instead of westward around the globe), or by collecting statistics to find which server was best for a given location and ISP over the past 30 days (which seems over-engineered and comes with extra downsides, like gaps in the data and the possibility that routes have since changed)
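For what it's worth, here's a minimal sketch of steps 1-4 under some assumptions (4-player matches, the 10-second windows from above; the client/queue structures and all names are made up for illustration):

```python
import random

GROUP_SIZE = 4  # assumption: 4-player matches, as in step 2

def matchmaker_tick(queues, now):
    """One matchmaking pass over steps 1-4, run roughly once per second.

    queues: dict mapping datacenter name -> list of client dicts with keys
    'id', 'joined' (enqueue time, seconds) and 'datacenters' (acceptable
    datacenters from step 1, best first). All names are illustrative.
    """
    matches, taken = [], set()
    for dc, clients in queues.items():
        random.shuffle(clients)  # step 2: vary who gets matched together
        pool = [c for c in clients if c["id"] not in taken]
        while len(pool) >= GROUP_SIZE:
            group = [pool.pop() for _ in range(GROUP_SIZE)]
            taken.update(c["id"] for c in group)
            matches.append((dc, group))
        queues[dc] = pool  # matched clients leave this queue
    for dc, clients in queues.items():
        for c in clients:
            waited = now - c["joined"]
            extra = c["datacenters"][1:] if waited > 10 else []  # step 3
            if waited > 20:
                extra = [d for d in queues if d != dc]  # step 4: everywhere
            for other in extra:
                if all(o["id"] != c["id"] for o in queues[other]):
                    queues[other].append(c)
    # A production version would also purge matched clients from every queue.
    return matches
```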
Did I miss anything? The whole thing seems somewhere between "do the most naïve implementation" (1–4) and doubtful (5). It's useful for devs to confirm that their intuition about the simplest implementation is correct, but the other comments saying this is a great blog surprise me when looking at these two posts
The hardest part about matchmaking seems to me to be finding players well-matched in skill without making them wait long, especially during off-peak hours, and avoiding off-putting loss streaks where players keep getting matched (a 50% chance each game) against a slightly stronger or equal team and losing. The article doesn't even mention that these aspects exist, let alone offer advice on tackling them
> 5. Determine latency either by using the same IP:port as the game server will have, because the route might differ from another port (citation needed imo.
No citation, but here are some things to think about.
You can easily have two servers in the same facility, in the same rack, with very different ip addresses. Most of the time you'll get the same general route for all destinations within a /24 (v4) or /48 (v6), but not always. Two /24s at the same facility might get different routes, especially if there are capacity issues on the better route at any point.
From same IP to same IP, anywhere there's a redundant link, which link to use is generally selected with a hash on the 5-tuple {protocol, destination ip, sender ip, destination port, sender port} (sometimes on a smaller tuple). If traffic is unbalanced or if one link is lossy, you can get drastically different experiences with a change in port number.
If you've done traceroutes and seen multiple similar IPs at a given hop number, it's probably the second effect; this is pretty common. Typical traceroute sends UDP packets to the destination and varies the port number per probe (classically the destination port). Sometimes, rarely, you can see drastically different routes within a traceroute. It's easier to do traffic engineering at a /24 level though.
Traffic on redundant links is balanced by hashing flows rather than on a per-packet basis so that packets within a flow tend to arrive in order; out-of-order arrival is very expensive for endpoints, so it's better to occasionally have unbalanced usage than to regularly have out-of-order arrival.
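If you want to play with the idea, here's a toy model of that per-flow link selection (the hash is a stand-in; real routers use their own, often proprietary, hash functions):

```python
import hashlib

LINKS = ["link-A", "link-B"]  # two redundant links between the same routers

def pick_link(proto, src_ip, src_port, dst_ip, dst_port):
    """Choose a link by hashing the 5-tuple, ECMP-style. The hash here is a
    stand-in; real routers use their own hash functions."""
    five_tuple = f"{proto}|{src_ip}|{src_port}|{dst_ip}|{dst_port}".encode()
    return LINKS[hashlib.sha256(five_tuple).digest()[0] % len(LINKS)]

# Same endpoints, different source port -> possibly a different link, and
# therefore possibly a very different latency/loss experience:
print(pick_link("udp", "203.0.113.7", 50000, "198.51.100.9", 27015))
print(pick_link("udp", "203.0.113.7", 50001, "198.51.100.9", 27015))
```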
It's a shame that you missed the key point of the article in #5. Many games today have suboptimal inputs to their matchmaker because, like you, they refuse to believe that the internet can route a significant percentage of players incorrectly.
Source: Ran a network accelerator with more than 50 million unique players for 5 years. Bad network performance is much more pervasive than you think. https://networknext.com
Love the articles but I think you should disclose Network Next is not merely "a" network accelerator. It's the network accelerator that you're the CEO/founder of, which makes this read a bit like an advertisement.
If it can route incorrectly, that seems like a strong signal not to rely on your maps. It doesn't matter if you know what the correct latency is, if the users only experience the incorrect one.
The solution is to pair the map-based matchmaker with a network acceleration product that fixes the bad routes; then the latency map and the actual latency experienced are very close.
The reason you use the maps in the matchmaker is so you can look up the latency from client to datacenter in constant time and you don't have to waste any time on the client pinging ping servers to get the results.
Plus, it costs money to run pings through the network accelerator, and they always (from experience) converge to the latency map anyway.
You know, I really loved your previous blog [1]. So many great tips and tidbits in that series. With your new blog you seem like a completely different man, posting those surface-level overviews full of gaps that seem to only push your product. I can see what happened.
Sure, you don't owe me anything, don't reply if you don't have time. Don't pretend that you've answered my questions already though, everyone can see you didn't. Honestly why reply at all, if you are going to lie and use the "I know better noob" line?
> Don't pretend that you've answered my questions already though, everyone can see you didn't.
Jesus Christ dude. I'm just a guy who posted an article with a nice dataset and source code to help people out.
While at Network Next, I dug into matchmakers with multiple customers who were having weird problems like "why are my ping server pings good, but the player gets high latency when they connect to the server?" and "why are my players in Peru all getting high latency?".
If you don't want to accept what I've shared about these experiences being true, that's fine. Write a matchmaker that ignores my advice, ship a game with it and see what happens when you hit 100k CCU.
Again, you don't have to reply if you don't have time to reply. You don't even have to do 50*3600 in your head.
I ask the questions that I think follow logically from the conversation. This is what this site is for! You are the one who decides to come here and post comments claiming that you have "explained it enough already", that all you've done is post an article (not comments?), and that I'm somehow "ignoring your advice".
The article clearly explains why the latency maps provide benefit over haversine and ping server approaches. It even explicitly mentions why the latency map approach works and how in combination with a network accelerator the latency maps line up with the actual network performance you'll get.
But then you asked a question "but, if the network is going to do a bad route, doesn't this mean the latency maps are a bad idea?" paraphrased, and this indicates to me that you didn't really read the article.
And then I try my best to provide an answer to you, but you don't accept it and continue to ask questions about how long it will take to perform pings, and how expensive they are. Nitpicking on answers that I provided.
At this point I realize that if I answer these questions, they'll create even more questions from you. You just want to argue. Each answered question will result in more nitpicks.
This is tedious and I really don't care for it. If you doubt my results, I've given you sufficient information to try out my experiment and see if you can reproduce my results.
The scientific method.
But you take this as some sort of personal insult. And this I can't really help. I don't know you, and I don't know if you are a professional game developer or not, but telling you to just try it out is not meant as an insult. I literally mean, if you doubt my results, please check them yourself.
Again, the response provided is "avoid sending a ping packet to each location", when you are about to send 50 packets per second for the duration of the game. This does not make sense to me, or the people who upvoted my remark.
But then you decide to believe we didn't read the article, and we wouldn't understand if you clarified. Ok dude! Have a rotten day too, what am I supposed to say?
Your game launches with 1M+ players joining at the same time. Can you think of any reason why having 1M players all pinging the same ping servers in the datacenter at the same time might not be a great idea?
What if a ping server goes down? Great. Now people can't join servers in that datacenter. You've created an additional component to your architecture that needs to be up and working.
Oh, you'll just have multiple ping servers per-datacenter to fix this via redundancy? Not so fast: if the hash modulo n effect described in the article is real, each of these ping servers can have different performance, even though they're in the same physical datacenter, because they have different IPs.
Oh you'll just have one anycast address per-datacenter? No. Most game devs don't have the resources to actually create and maintain their own network for their games, and instead host servers in a mix of bare metal and cloud providers. Implementing a unified anycast approach across multiple providers is probably a no-go. Game devs make games, not networks.
OK you'll just have multiple ping servers per-datacenter and take the minimum value? It can work, and it's better than taking the average, but now you're sending n x m pings, where n is the number of datacenters and m is the number of ping servers per-datacenter.
You'll only ping ping servers near the player? Unfortunately, you can't just ping the nearest few datacenters, because ip2location isn't foolproof and a non-trivial number of players will end up at the wrong location or even null island (0,0 lat/long), so 50ms is not sufficient time to wait for the RTT. Even if you sent only one ping packet, it would be more like 250ms at minimum, and realistically at least 1 second, because you're sending packets over UDP and it's not reliable.
You'd probably also want to extend this to 10 seconds, because Wi-Fi often has significant low frequency jitter that gives a false high RTT reading if you're unlucky, and the period of this jitter is often several seconds long. This brings you up to around 10 seconds of pinging time realistically, across which you'll take the minimum RTT seen, which is most indicative of the true RTT between client and ping server for the current route.
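To make that sampling concrete, a rough sketch of "ping for a window, keep the minimum" (send_ping and recv_pong are hypothetical stand-ins for real socket code):

```python
import time

def measure_rtt_ms(send_ping, recv_pong, window_s: float = 10.0):
    """Ping for window_s seconds and keep the minimum RTT, since the minimum
    is most indicative of the true route under Wi-Fi jitter and noise."""
    best = None
    deadline = time.monotonic() + window_s
    while time.monotonic() < deadline:
        sent = time.monotonic()
        send_ping()
        if recv_pong(timeout=0.25):  # UDP is unreliable: pings/pongs get lost
            rtt = (time.monotonic() - sent) * 1000.0
            best = rtt if best is None else min(best, rtt)
    return best  # None if every ping was lost
```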
Also, many network accelerators (not anycast ones like AWS Global Accelerator, but active probing based ones) require spin-up time and need to perform their own pings, perhaps for an extended period like 10 seconds for the same reasons as above, to find the correct route. Add this to your own ping time as above and now you could be pinging for 20 seconds before getting a realistic route to the ping server. How many players do you know that are willing to wait 20 seconds in a lobby before playing a game?
This spin-up time and probing can also become quite expensive for network accelerators that use active probing to find the best route, and is best avoided. Perhaps the key thing you are missing here is that many active probing network accelerators don't accelerate all players, just the ones that are having bad network performance at any time (around 10% of players), and thus the load of all players doing pings is non-trivial relative to this. Think egress bandwidth for pongs as well.
Next, if the internet has the hash modulo n property I describe in the article, even if you could do the pings in a reasonable amount of time, and the cost wasn't an issue for you -- WHY would you then do it in a way that results in a significant disconnect between the measured latency to the ping server and the actual in-game latency for some players? And why would you want to put in a noisy input that can fluctuate when you can have a rock solid, steady input, averaged across a long period like 1 day, that is actually representative of the topology of the internet?
Especially, why would you do it, when you'll be tracking latency per-player per-match at minimum for your own visibility already, and you could just batch process this data at the end of the day to take the average latency at each (lat, long) and output a greyscale bitmap where [0,255] indicates the latency from that lat/long square to the datacenter. You just need one greyscale image per-datacenter.
Now you can look up latency from any lat/long square to any datacenter in constant time. It's a steady input, and it's the most accurate post-acceleration RTT estimate you're likely to get for players in that lat/long square to each datacenter in question.
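A sketch of what that batch job and lookup could look like (the 1-degree grid, the 255 = "no data" sentinel, and numpy are my assumptions, not from the article):

```python
import numpy as np

W, H = 360, 180  # assumption: one cell per degree of longitude/latitude

def build_latency_map(samples):
    """samples: iterable of (lat, long, rtt_ms) measured during real matches.
    Returns a uint8 'image' where each cell holds the mean RTT in ms,
    clamped to 255; 255 doubles as the 'no data' sentinel here."""
    total = np.zeros((H, W), dtype=np.float64)
    count = np.zeros((H, W), dtype=np.int64)
    for lat, lon, rtt in samples:
        y, x = int(lat + 90) % H, int(lon + 180) % W
        total[y, x] += rtt
        count[y, x] += 1
    mean = np.divide(total, count, out=np.full((H, W), 255.0), where=count > 0)
    return np.clip(mean, 0, 255).astype(np.uint8)

def lookup_latency(latency_map, lat, lon):
    """Constant-time latency estimate for one datacenter's map."""
    return int(latency_map[int(lat + 90) % H, int(lon + 180) % W])

# One map per datacenter; the matchmaker just indexes into them.
m = build_latency_map([(40.7, -74.0, 35.0), (40.7, -74.0, 45.0)])
print(lookup_latency(m, 40.7, -74.0))  # -> 40
```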
And finally, why would you even care so much about the ping server approach, when I've already told you it converges to the latency map anyway. FFS.
> Especially, why would you do it, when you'll be tracking latency per-player per-match at minimum for your own visibility already, and you could just batch process this data at the end of the day to take the average latency at each (lat, long) and output a greyscale bitmap where [0,255] indicates the latency from that lat/long square to the datacenter. You just need one greyscale image per-datacenter.
Given lat,long is a guess and, even if it is accurate, doesn't correspond very well to network latency, why wouldn't you use something like the source /24 or /48 rather than lat,long? You don't get a pretty picture that way, I guess.
The algorithm just has a tick rate of 1 second. Assume requests come in continuously, but only do work at some rate, e.g. once every second for a matchmaker that is meant to find games very quickly, and maybe once every 10-15 seconds for a higher-quality matchmaker that is going to do a lot of work to match players together by skill etc. (longer tick times mean you'll have more players to consider in each iteration)
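Something like this hypothetical driver loop, where requests are accepted continuously but matching work only happens on the tick (all names are made up):

```python
import time

TICK_S = 1.0  # ~1s for a fast matchmaker; maybe 10-15s for a high-quality one

def run_matchmaker(drain_new_requests, enqueue, do_matching_tick):
    """Hypothetical driver: accept requests continuously, but only do the
    actual matching work once per tick."""
    next_tick = time.monotonic()
    while True:
        for request in drain_new_requests():  # requests arrive continuously
            enqueue(request)
        do_matching_tick(time.monotonic())    # work happens once per tick
        next_tick += TICK_S
        time.sleep(max(0.0, next_tick - time.monotonic()))
```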
I've written matchmaking for games before and found the portions related to matching groups by skill far more challenging. If you're hyper latency-sensitive then perhaps this blog post is really useful? But for the games I worked on, we would trade 10ms worse average ping for 10% better skill pairings without question. If you have any advice on improving skill matchmaking I would be quite interested.
Thanks for the link, I hadn't read this one yet. Very interesting. I feel like as an industry we're still in the stone ages on how we do skill matching systems. A lot of the current and even future system they describe is really not great, and these flaws are definitely not unique to them (the systems I worked on are even more flawed)! Trying to matchmake dozens of players at a time is such a cool, challenging problem.
I really think it's a cursed mission to chase perfectly balanced matches.
In terms of overall experience, the best approach in general is a purely random one. This is how you avoid the experience of getting trapped in a 20-game losing streak, even if you are a really good player.
If you allow the natural balance of good players winning more often and bad players losing more often, you will find things get a lot less messy in between. This also provides your bad players an opportunity to occasionally witness what they could become. When you only play someone 1% stronger than yourself, you probably don't have a great idea of what the upper bound actually feels like. This can become a serious trap for players who are seeking to grow their skills.
There are different levels of SBMM (skill-based matchmaking), though; most games nowadays offer a choice between ranked (where the appeal is well-matched games and the possibility of increasing your rank) and quick play (where the appeal is the low queue time and more casual play, with the matchmaker basically just making sure the lobby is filled with players who are very roughly in the same skill range)
While I also have fond memories of pre-SBMM Halo/CoD from the 2000s, they're more focused on pure shooter mechanics rather than forcing players to work together to win an objective, so doing well but losing still feels fine. I find SBMM is needed in more objective-focused esport games like Overwatch, because the game is designed such that it's harder to do well individually (and thus have any fun) if your teammates aren't doing well.
> it's harder to do well individually (and thus have any fun) if your teammates aren't doing well.
Perhaps this is the actual cursed aspect. In a free-for-all context (i.e. a team size of one), you would never have this kind of a problem. As you increase the mandatory team size to six, you are creating an entirely new universe of effects to compensate for.
In my experience, 99.99%+ of the frustration in Overwatch and League of Legends emerges from dealing with your own teammates, not your opponents.
A lot of what you're saying depends greatly on the exact game you're talking about and what winning and losing mean, for instance if the game has discrete placements instead of a binary winner/loser outcome.
For a few of the games I worked on, random matchmaking like you describe is a non-starter. If you're a 90th percentile player in one of our games, you effectively never lose to a 70th percentile player or below. Your rating will be so high that we can't give you any rating system points for the win. So the person who won got no reward other than the feeling of winning, and the person who lost played a match they had no hope of winning. It ends up feeling pointless to play as a top player because only 1 or 2 matches in 10 on average have any meaning for you. Needless to say, it also feels worse as a low rated player because you simply lose more often.
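Assuming an Elo-style rating (the commenter doesn't say which system their games used, and the 1800/1400 ratings are made-up examples), the numbers fall out of the expected-score formula:

```python
# Why random pairing starves top players of rating points, assuming an
# Elo-style system (the games in question may use something else).

def expected_score(r_a: float, r_b: float) -> float:
    """Probability that player A beats player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def rating_delta(r_a: float, r_b: float, won: bool, k: float = 32.0) -> float:
    """Rating change for player A after one game."""
    return k * ((1.0 if won else 0.0) - expected_score(r_a, r_b))

# A 90th-percentile player (say 1800) against a 70th-percentile one (say 1400):
print(round(expected_score(1800, 1400), 3))       # ~0.909: expected to win
print(round(rating_delta(1800, 1400, True), 1))   # ~+2.9 points for winning
print(round(rating_delta(1800, 1400, False), 1))  # ~-29.1 points for losing
```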
I'm confused about using the greyscale map tiles to estimate ping.
You don't want to have the users ping the servers themselves because those pings could be inaccurate or noisy, so you use historical average data for users in that region instead to get a nice simple number. But... how do you know where the user is? IP Geolocation? Can't that be wrong also?
Isn't it better to have a direct measurement which could be a little wrong than an average of a guess which could be really wrong?
Google has Open-Match[1] for matchmaking, where you just have to provide an API that takes a batch of players and returns groups, and it handles the surrounding stuff; it also integrates with Agones[2] to automatically provision servers on k8s.
Is matchmaking the process of players "always [being] sent to the datacenters with the best chance of having low latency"?
EDIT: No, it seems that it's literally grouping players together in multiplayer games (but doing so in a way that minimizes latency between players).
The example here is really simplified to focus mostly on the problem of finding datacenters with low latency, but matchmaking could also include things like matching players of similar skill together, finding a set of players that would make balanced teams, making sure that players who are partied up together play in the same match, and so on.
Basically, just like matchmaking in real life. It's the thing that works out groups of players who should play together in a match.