As context, this was a big international fiasco. Nobody was acting maliciously, but fans of RNG and other teams felt wronged. RNG didn't want to attend due to covid restrictions on travel, but Riot really wants Chinese viewers in international tournaments, hence all this online latency work. Pros on other teams were misled by the false latency readout, which must be almost paranoia-inducing: being told it's only 35ms even when it (correctly) feels higher. In the end RNG had to replay their games, which was not as bad as it could have been since they were against much weaker teams, but still annoying.
Anyways, by "scenarios where the actual ping was significantly lower than the target latency" do they mean like if the ping from Busan-server is 10 but they want it to feel like 35? I feel like that's the one scenario you'd care about, making the Busan players have 35 ping was the whole point.
Edit - also the other scenario, where the actual ping is significantly higher than the target latency... well in that case your latency service would be some sort of magical time travel box! Adding latency is literally the only thing it can do, right? It's not possible to lower it. Maybe it depends what they mean by "significantly", but idk, the target latency was 35ms, so at most the actual ping could be 35ms lower than that.
I guess they didn't catch it because the bug was in the latency service itself, which gave the wrong value to both the in-game display and to the old network monitoring system. Idk, in hindsight it's easy to say that if you're going to introduce this new ping-equalizing service, it might break how you test ping, especially since it hasn't been tested in soloq. But meh, it's a lot easier to see what happened in hindsight after they write it all up.
You can in fact have that magical time-travel approach to get apparent latency lower than the real number. Keep a history of the game state at every tick, so you can roll back to any previous instant. If real latency from one client is 50 ms, and you're targeting 35 ms, you roll back the game state by 15 ms each time input comes in from that client, and re-run the game logic from that point forwards and send the resulting state update to all clients. Those clients will see a bit of jumpiness in the game state, which isn't ideal, but may be the lesser evil compared to enduring more latency.
(I have no idea what LoL in particular supports or implements, but that's the general idea.)
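A minimal sketch of that general idea, assuming a tick-based server; every name here is made up:

```python
# Toy rollback server: history[t] is the state at tick t, inputs[t] are the
# inputs consumed when stepping from tick t to t+1. A late input is applied
# "in the past" and the ticks after it are re-simulated.
from copy import deepcopy

class RollbackServer:
    def __init__(self, state, step):
        self.step = step                      # step(state, inputs) -> next state
        self.history = [deepcopy(state)]
        self.inputs = []

    def advance(self, tick_inputs):
        self.inputs.append(list(tick_inputs))
        self.history.append(self.step(deepcopy(self.history[-1]), self.inputs[-1]))

    def late_input(self, ticks_ago, inp):
        # e.g. hiding 15 ms on a 5 ms tick -> ticks_ago = 3
        t = len(self.inputs) - ticks_ago
        self.inputs[t].append(inp)
        state = deepcopy(self.history[t])
        for i in range(t, len(self.inputs)):  # re-run the affected ticks
            state = self.step(state, self.inputs[i])
            self.history[i + 1] = state
        return state  # broadcast; clients see a small corrective jump
```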
This strategy is commonly referred to as a form of “latency hiding”, since it doesn’t actually improve latency, it just reduces apparent latency. In practice it’s still valuable, mainly because it allows local input to be buffered significantly less than if you compensated for lag by adding buffering alone. However, that isn’t always applicable. It mostly helps in games where netplay is handled in a lock-step fashion, where each peer needs to keep every action coherent frame-wise. That makes sense for things like fighting games, but it doesn’t make all that much sense for, say, puzzle games, even though those can be quite latency sensitive too.
Either way, there is nothing magic. At the end of the day, the soonest you can see an action is still physically unchanged. Especially since the prediction model that makes the most sense is just assuming no action.
That said, I dunno enough about MOBAs to try and postulate on how much it matters, and I have no idea if LoL is using lockstep or some other approach.
LoL is not using lockstep. Lockstep has two big disadvantages for a competitive game. It requires both clients to have the full game state to run the simulation, which makes creation of maphacks trivial. The second issue with lockstep is that if one player is lagging, all players are lagging, which is also not something you'd like to have in a fast paced competitive game.
Maybe I'm just a fuddy-duddy, but for me there's no room for remote competitors in a serious competition. Either it all happens on the same LAN or it's just playing for funsies.
> It requires both clients to have the full game state to run the simulation,
> The second issue with lockstep is that if one player is lagging, all players are lagging
Where are you getting this?
My game Nebulous uses lockstep, and it's a MMOish mobile multiplayer game. Plenty of clients lag, but one client lagging doesn't cause any other to lag. Furthermore, clients do not have full game state, as an optimization and data transfer reduction first, anti-cheat second.
Path of Exile actually lets you choose lockstep or client-predictive. "Serious" and "competitive" ladder players choose lockstep because, naturally, you want your client's representation of the game state to be as close to the server's representation of the game state as possible, UX be damned.
This is independent of lockstep vs. predictive and more to do with topology. SC2 (and most RTSs) are peer to peer; there is no server game state, just two clients that synchronize to each other. In that case, one client lagging will absolutely affect another. Also in that case, it's incredibly difficult to not have both clients simulating the full game state.
Prediction could mitigate that (and possibly does, even in SC2), but if one client is suspended and the other is running fine, either everything that the running client does is moot (and will be reverted) or it gets suspended too, regardless of lockstep or prediction.
The thread you linked reasonably concludes it's "routed" P2P. In other words, P2P with (1) a forced route through, and (2) a form of network address translation at, Blizzard's server.
My speculation: this is almost exclusively to protect one client's private IP from being exposed to another client. There is no server processing happening beyond connection verification to help mitigate desync attacks. Deliberate desyncs are still possible though, since I've been the victim of one.
I can definitely see why the maphack thing would be an issue generally, but in the case of a broadcast esports competition is it so significant? They seem to have put an awful lot of engineering effort into their special lag system for this tournament, so I assume they at least have some pretty decent budget. So, presumably they could create a special client (possibly even sending the competitors locked down hardware). This seems like it would let them do some things we'd normally consider impossible or 'magic' -- assume both sides of the pipes are going to tell you the truth about everything...
There are pretty regular esports scandals involving cheating. Heck, we've seen players bring their cheats in via special drivers embedded in their peripherals (which is why almost every serious in-person tournament provides peripherals + PC and does not allow the player access to them outside of supervision).
I really dislike games which rewind state based upon latency. It penalizes people with good ping and can lead to very weird game states where a large rollback can cause teleportation of a laggy player as seen by non laggy players.
Team Fortress 2 is particularly bad: high-latency players can teleport around, or you can shoot and hit a laggy player but it isn't registered by the server due to a rollback. Some players purposely exploit this by artificially increasing their ping to absurd levels (700ms or more), which makes them very hard to hit. Many community TF2 servers enforce maximum ping limits for this reason.
Most notably, high pings are exploited when playing the Spy class, since the fundamental purpose of the class is to get behind you and backstab you. With 700 ping you can abuse the fact that the game is out of sync to get behind players, and if you land a backstab the server will roll back and kill the non-laggy player even though, from their non-laggy perspective, you were nowhere near behind them.
While the exact logic TF2 uses leaves a lot to be desired, you absolutely need this kind of lag compensation. Without rewinding, any amount of latency means having to aim where an enemy will be by the time your packet reaches the server, instead of where they are on your screen. If you remember, before they patched the Pyro class you had to aim significantly in front of players because the flames weren't lag-compensated.
There's a difference between backwards reconciliation (rolling the game world back to process a lagged player's input) leading to "shot around a wall", and packet loss/jitter leading to jumpy player and teleporting.
I make this distinction because I actually think backwards reconciliation is table stakes for any kind of precision game and is usually pretty easy to implement. It can also help solve teleporting: you run backwards reconciliation on movement commands also (not just shooting or other actions), and combine with a little player position extrapolation.
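A sketch of the shooting half (names illustrative, not any particular engine's API):

```python
# Lag-compensated hit registration: keep historical position snapshots and
# rewind the world to what the shooter actually saw when they clicked.
import bisect

class LagCompensator:
    def __init__(self):
        self.times = []    # snapshot timestamps (ms)
        self.worlds = []   # positions per snapshot: {player_id: (x, y)}

    def record(self, now_ms, positions):
        self.times.append(now_ms)
        self.worlds.append(dict(positions))

    def resolve_shot(self, now_ms, shooter_latency_ms, hit_test):
        fire_time = now_ms - shooter_latency_ms
        i = max(bisect.bisect_right(self.times, fire_time) - 1, 0)
        return hit_test(self.worlds[i])  # test against the rewound world
```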
Nothing fixes 700ms; actually, I would guess the ceiling for competitive play is somewhere around 200ms with incredible netcode. But 200ms gets you pretty far: across the US or the Atlantic Ocean for sure (packet loss over those distances is another story though).
every game in existence where you can shoot at someone by simply clicking where they are either uses lag compensation or clientside hitreg, it's simply impossible to get around this
also players only teleport around on your screen in tf2 if you set your cl_interp very low
I’m not a game dev but I don’t really buy that being able to shoot other players requires compensation or client side hit reg. If you assume players have 35 ping (as Riot did) that is only 2 frames at 60fps which may make a small difference in edge cases, but I believe the game would still be very playable.
Spoiler alert: it wouldn’t be “very playable”. Not in a FPS. There’s a reason that basically every single FPS out there has it. You want headshots to happen when people click on people’s heads, not clicking ahead of them.
I stated a detailed reason why I think playing a game with 35 ping without rollback would be “playable”. If you would like to explain technically why it would not be playable I am willing to listen and learn, but posting some low effort Reddit-esque reply does not bring any value to this conversation.
That's just another kind of latency. Additional time has still passed from you (or the enemy player) doing an action, and you seeing the results of that action on your screen.
By contrast, the server rolling back the game state is indistinguishable from the server always being 15ms behind and the clients running the simulation ahead of time (which they do anyway for fluency [1]). The "jumpiness" is called rubberbanding [2].
> I feel like that's the one scenario you'd care about, making the Busan players have 35 ping was the whole point
I think the key word is "significantly". The post states that the tool was originally developed for matches where both teams would be remotely connected, so neither team would have a significantly different latency (you can usually engineer this by the first approach they suggested, of picking a server in between the two locations).
I love the intent and actions behind this post. Finding an unfair bug mid-tournament is serious, and it takes courage and integrity on the part of the technical team to
- listen to the players
- investigate past what your own analytics are saying
- find and fix the bug under time pressure
- and disclose the (potentially embarrassing!) initial mistake
I think the article itself could have been worded better though. It didn't feel very clear. As a reader, I want to know what went wrong, how it was handled, and to feel confident that you are on top of the issue. All the stuff about "why we chose our network topology" reads as details at best, justification/excuses at worst.
The technical team acted honorably, but the upper stewardship at Riot games behaved horrendously.
They sacrificed the competitive integrity of their sport because they didn't want to exclude RNG from the tournament, because they wanted the audience of the Chinese market.
This article goes into more detail, giving the larger context for why this situation should never have happened in the first place.
> because they wanted the audience of the Chinese market.
Or maybe because the tournament would've been shit without them? Who wants to watch an international tournament with only 1 of the clearly top 2 regions represented? Whole thing would've been a foregone conclusion at that point.
Yes, the tournament would have been worse without them. What I'm saying is that does not somehow justify penalizing the other teams and the tournament writ large to "even" the playing field if you value competitive integrity.
As the article states, increasing the ping doesn't make it more fair for the other teams because they actually traveled to the event. They're physically there. It only makes it more fair for RNG who can't attend because of Covid restrictions.
To paraphrase, it "punishes the teams who are there for a team that isn't."
From a business perspective of course, I agree with you. It's definitely better to sacrifice the level of play for viewership (and it does sacrifice the level of play because artificially increasing the ping at a LAN event makes the game worse).
But if you're running a competitive eSport and you want a fair sport, it's a terrible decision.
Fairness isn't a perk applied to a team, it is a property of the game.
Unless "ability to travel during a pandemic" is considered a skill for this game, it should not be factored into the competition. So, if teams who can't travel to be game are going to be included, then the host must find a way to provide a fair game. This means that remote and local players should have the same latency.
> Fairness isn't a perk applied to a team, it is a property of the game.
Could you clarify what this means? I’m not entirely sure I understand what you mean. I think I just need a little more context.
> if teams who can't travel to the game are going to be included, then the host must find a way to provide a fair game.
I agree with the logic but disagree with the premise: teams who can’t travel to play the game shouldn’t be included in the tournament in the first place, because it places an undue burden on the other teams and lowers the quality of play.
> Unless "ability to travel during a pandemic" is considered a skill for this game
I don’t think it’s a skill; it’s a basic obligation individuals and teams must fulfill.
>> Fairness isn't a perk applied to a team, it is a property of the game.
> Could you clarify what this means? I’m not entirely sure I understand what you mean. I think I just need a little more context.
You have described fairness in terms of punishing or rewarding players for certain non-game behavior. But, this is not the right way to think of fairness, in the context of a competitive game. A competitive game has a set of rules, which the players compete under. In a fair competitive game, those rules don't favor a particular set of players. Regardless of whether they deserve it or not, players who couldn't show up in person would be playing under a handicap if there was no lag correction.
>> if teams who can't travel to the game are going to be included, then the host must find a way to provide a fair game.
> I agree with the logic but disagree with the premise: teams who can’t travel to play the game shouldn’t be included in the tournament in the first place, because it places an undue burden on the other teams and lowers the quality of play.
>> Unless "ability to travel during a pandemic" is considered a skill for this game
> I don’t think it’s a skill; it’s a basic obligation individuals and teams must fulfill.
Sure, it is an understandable position to hold that players who can't show up shouldn't be able to play. Riot clearly disagrees with you (probably mostly for business reasons). I sort of disagree with you too, mostly because while we're in this weird liminal stage of the pandemic, I think we should try to accommodate people who want to engage from home. But this is a soft disagreement; I think there are good arguments either way.
However, once the decision is made about which players will be included in the tournament, if the tournament is to be taken seriously it must present a level playing field to the players who've been let in.
My only rebuttal would be that I don't think the actual game itself should be affected if we want to preserve competitive integrity. That's where I draw the line. Raising the ping at a LAN does affect gameplay and rather significantly.
I do agree with you the host should make a decent effort of accommodating players who can't travel though, but I thought this was pushing it a bit too far for my taste.
What's funny about all this is that since the Asian Games aren't going to be held anymore, and they were one of the mitigating factors for RNG not travelling, technically RNG could physically travel to Korea now.
It is a funny position that esports are in. Normal, recreational competition takes place online with random pings. Pro competitions take place with set, usually minimal ping. For everybody but the pros, adjusting to lag is a crucial skill. Maybe tournament games should be played as 'sets', where everybody on each team gets randomized ping, to test that skill. (just kidding (although it would be interesting)).
> increasing the ping doesn't make it more fair for the other teams because they actually traveled to the event
Can't agree. Those teams had an advantage when playing RNG. Taking away that advantage does actually make the event more fair for them, even though they are being disadvantaged.
RNG already have an advantage when they are allowed to play remotely from the comfort of their home. Yes it's not the team's fault they can't travel but it is what it is. There's another tournament in 6 months.
There's a reason this Frankenstein ping is unheard of in bigger esports.
Could you clarify what you mean by "those teams had an advantage when playing RNG"? I may have missed something. I'm not sure what advantage you're referring to.
Yeah, the technical details are cool and all but it's unreal how this option was even on the table. It sidesteps one of the biggest reasons for a LAN. What's the point of gathering everyone together if 35ms is going to be added to the ping.
Funnily enough, all this hoohaa was for nothing as the Asian Games got postponed anyway.
I don't have close observations either. I read r/dota2 regularly, and Valve's laziness pops up fairly often. Things like promised items getting delayed until buyers actually forgot about them, a serious bug at TI (https://www.reddit.com/r/DotA2/comments/9aexks/a_morphling_b...), etc.
Reminds me of the cable lengths for black boxes connected to the network in Wall Street. Each cable is the same length regardless of which computer is closer to the access point.
Perhaps apocryphal/silly, but amusing nonetheless. Story goes that this means you want to be on the computer furthest from the interconnect, because light travels slightly faster in straight fiber than in coiled fiber.
I work in this space and although true (this is called modal dispersion), there are compensation techniques always used in termination equipment to 'handicap' or mitigate these occurrences. Not perfect, but very, very, very small deltas. Unsure if it's enough to act upon without knowing the length from output to next input.
I never understood why they didn't use randomized length micro batches to solve this.
Instead of processing orders instantly, wait between 200ms and 500ms and then process all orders that came in that window in random order. Then being 5ms closer to the server wouldn't matter.
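Mechanically it could look something like this sketch (window bounds and names are just the ones from the comment above):

```python
# Toy randomized micro-batch auction: orders accumulate until a randomly
# timed cutoff, then the whole batch is processed in shuffled order.
import random

def run_batches(orders, min_win_ms=200, max_win_ms=500):
    """orders: list of (arrival_ms, order_id); returns processing order."""
    processed, batch = [], []
    cutoff = random.uniform(min_win_ms, max_win_ms)
    for arrival, oid in sorted(orders):
        while arrival >= cutoff:          # window closed: flush the batch
            random.shuffle(batch)
            processed.extend(batch)
            batch = []
            cutoff += random.uniform(min_win_ms, max_win_ms)
        batch.append(oid)
    random.shuffle(batch)                 # flush whatever is left
    processed.extend(batch)
    return processed
```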
That sounds like a complex solution. Sometimes a dumb solution that works well enough is better than a complex solution that _probably_ can't be exploited.
It's complex on one end to reduce complexity on the other -- the trading companies wouldn't have to worry about millisecond optimizations if the trading batches were 200ms windows. So the wire lengths wouldn't matter, but neither would the trading companies' processors, memory, software, etc. Seems like a good tradeoff.
And honestly the wire thing probably isn't real. Light moves 30cm in a nanosecond. The wire lengths could be 3 meters different and only make a 10ns difference. Not sure that would matter all that much.
I used to work in HFT. I promise you that companies would still try to exploit randomized batches. There is an advantage to being the very last entrant into a batch (most up to date information). Truly random batches are not trivial to implement and any statistical pattern in the batching could be exploited.
"Truly random batches are not trivial to implement and any statistical pattern in the batching could be exploited."
If an HFT manages to exploit a correctly implemented entropy-pool random number generator using AES-256 to extend the stream as needed, they're welcome to pick up a few more bucks as far as I'm concerned.
Yes, problems have existed before but I'm sure in this case we can assume high-assurance and careful programming, not some fresh grad being assigned the problem and thinking linear congruential generators sound really cool.
Wait, come back, I'm serious! Finding an arbitrary hash of adjustable complexity is a scalable solution to batching transactions across multiple servers with a consistent throughput.
I think it just moves the goalposts of the problem. If I now have a 200ms window, then I need to optimize for that 200ms cutoff. I want to do as much work as possible while _still making the cutoff_, so minimizing "non-work" still matters. Not only that, but now you incentivize filling up that 200ms with 200ms of analysis.
You could, but then if say the delay is 439ms, then you'll sit idle for 239ms. It would make more sense from a game theory perspective to simply process for as long as necessary and then place the order.
I don't think I get how that solves the issue: you would have set a fixed cutoff time, where you switch from one window/batch to the next. It doesn't matter when you arrive within the window. But statistically, even for random window lengths, if you have a smaller latency, you will make the cutoff for the earlier window more often.
Say I know exactly when the window ticks over. I still want to put my order through immediately anyway since it's to my advantage to get it in the earliest possible window. There's no advantage in waiting for the next window to start.
So I put in my order, and my closer competitor puts in their order at the same time, and they get in the current window, while I have to wait for the next one. And now my disadvantage is even bigger because my order is delayed on top of my base latency by the remaining time until the next window ends.
-----
Even if we remove the window idea entirely and just add a random 100-500ms delay to every command immediately, being closer still has an advantage since 10 + Rand(100,500) is still lower on average than 100 + Rand(100,500).
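A quick toy check of that last point:

```python
# Even with a random added delay, the closer party's effective arrival
# time stays lower on average; the proximity advantage doesn't vanish.
import random

def mean_arrival(base_latency_ms, trials=100_000):
    return sum(base_latency_ms + random.uniform(100, 500)
               for _ in range(trials)) / trials

print(mean_arrival(10))    # ~310 ms for the close firm
print(mean_arrival(100))   # ~400 ms for the distant firm
```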
Ok but you don't know the window size for any given batch, so game theory-wise your best bet is to just process as fast as you can and then place your order.
• Pre-publish, for each time batch, a public key. You could publish lists of these well in advance.
• Let everyone submit, alongside each order, a number arbitrarily selected by them. It does not matter how they select the number, but it would be simpler if everyone chose distinct numbers.
• When order processing is done, do it by the order of closeness of the submitted number to the private key. Publish the private key after order submission is closed.
• Everybody can now verify that the order of processing is indeed by closeness to the previously secret private key, and that the private key corresponds to the previously published public key.
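A sketch of the same idea, swapping the keypair for a simple hash commitment (hypothetical, but it gives the same publish-then-reveal property):

```python
# The exchange publishes H(secret) ahead of time; orders carry a number;
# processing order is by closeness to the secret, revealed after close.
import hashlib, secrets

def make_commitment():
    secret = secrets.randbits(128)
    commitment = hashlib.sha256(str(secret).encode()).hexdigest()
    return secret, commitment            # publish only the commitment

def process(orders, secret):
    # orders: list of (order_id, chosen_number)
    return sorted(orders, key=lambda o: abs(o[1] - secret))

def verify(commitment, secret):
    return hashlib.sha256(str(secret).encode()).hexdigest() == commitment
```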
Something very similar is done on smart contracts to introduce randomness (at least to the practical extent) on blockchain. Good thought experiment nevertheless to provably inject randomness into a system with untrusting parties.
That seems unnecessary. Today the traders have to trust the exchange to process fairly and in order -- there is no verifiability. The verifiability would be nice, but unnecessary.
I'm going to guess this effect is lower because it's being marginalized across all player skill levels (ELO). In Riot's post you can see that the higher tier the player is at, the more likely they are to use Ethernet. If you conditioned on ELO I suspect you'll find a much larger effect.
The part I was most interested in was glossed over.
In what situation can this happen?
> we realized that there was a calculation error that only manifested in scenarios where the actual ping was significantly lower than the target latency. In this situation the actual latency would be considerably higher than what is displayed on the overlay on the player's screens.
Knowing the classic Riot Games MO: I’d bet money there was a plus instead of a minus that made it through code review, and they’re being intentionally vague to save face.
They added too much latency because they forgot to account for the latency they added in the client. You can see from their architecture diagram (Figure 4) that the latency measurement didn't include the client delay.
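Pure speculation, but here's a toy model of how that omission could produce exactly the reported symptom (none of this is from Riot's post):

```python
# If the artificial delay is split between client and server but the
# latency probe only sees the server half, a loop chasing the target
# overshoots: the gap (actual - displayed) grows as real ping shrinks.
TARGET_MS = 35.0

def converge(real_ping_ms, client_share=0.5, iters=50):
    added = 0.0
    for _ in range(iters):
        measured = real_ping_ms + added * (1 - client_share)  # client half invisible
        added = max(0.0, added + (TARGET_MS - measured))
    displayed = real_ping_ms + added * (1 - client_share)
    return displayed, real_ping_ms + added   # (overlay value, experienced value)

print(converge(10.0))   # ~ (35, 60): big hidden gap when real ping is low
print(converge(30.0))   # ~ (35, 40): barely visible when real ping is close
```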
But the blog post states: "a calculation error that only manifested in scenarios where the actual ping was significantly lower than the target latency".
Sadly, my guess is that they are transparently lying about that. Since they apply the lag half on the client and half on the server, their lag compensation would have been off by 2x (a 100% relative error), and they may just be claiming that a 100% error isn't very large when the total lag difference is small. "Yes, we were supposed to add 2ms and we added 4ms instead, but in the end we were still only wrong by 2ms (in the other direction), which is not a big deal."
Distributed consensus is hard. Very hard. Anyone dismissing the problem as "should never have happened" hasn't a clue as to how hard and error-prone reliable distributed consensus is. 35 ms of lag in overall responsiveness is an eternity in competitive game play.
There are videos floating around that show drawing on a tablet surface with various input latencies (perhaps someone has a link, I can’t find them at the moment). 35ms latency is very noticeable to anyone never mind professional competitors.
Because of the tricks game designers must pull off in lieu of proper distributed consensus, which has hard requirements bound by the laws of physics, it is entirely likely there are lots of bugs in the system. I think Riot did the best anyone could reasonably have expected of them, and the write-up is particularly informative and helpful.
I think you meant https://youtu.be/vOvQCPLkPt4 . It was indeed not too easy to find, due to SEO tactics polluting YT search with gaming-oriented stuff from recent times (where one may actually reach the frame rates necessary for such low latencies in a frame-by-frame display technology).
Sure there is. The need to introduce a unified delay across all game clients in the first place, reliable measurement and comparison of latency data, and the effects of latency, artificial or otherwise, on game state that is not strongly consistent. There could be many other areas in which this bug intersects with distributed consensus in the actual implementation as well.
In fact, neglecting the impact of distributed consensus is one of the biggest obstacles to mitigating it.
They built a complex system to tweak tens of ms (at most) of ping to equalize it. And it had a bug that disadvantaged one team. Thus, they had to replay matches due to unfairness introduced by Riot itself.
They could've gone Option 1 - all teams at their natural ping. While it wouldn't be as perfect in theory, it wouldn't have resulted in replaying matches. Plenty of other esports and fighting games play at natural ping and have real results. There is unfairness due to location, but only scrubs blame 5-15ms ping advantage for a loss in a game as strategic as LoL.
People play Melee with ping differences bigger than that and it works fine lol. Waaay more technical game shows that there's tolerance.
Not to mention that the server option could've had some system to distribute across advantageous servers over the course of a series.
This shouldn't detract too much from your point, but for fighting games the situation is slightly different from most other genres:
Modern fighting games, as well as older ones that have been modded to include this (such as Melee), have a netcode model called "rollback" that can negate network-induced latency on local inputs. The trade-off is that what you see is usually inaccurate until you receive inputs from the remote player's machine.
Despite fighters often requiring fast reaction speeds, this downside is not a big deal provided the ping is low. The first few frames of many actions are generally quite subtle, enough that they are hard to distinguish from each other; it's usually specific keyframes/poses that people actually react to.
Melee is probably hurt the most though, as movement is extremely fast and very important, and the difference between wavedashing left or right is not masked by subtle animations.
> People play Melee with ping differences bigger than that and it works fine lol. Waaay more technical game shows that there's tolerance.
Sure, people play with ping disparity in tons of competitive games, but that's because there's not really an alternative for online play. It's "play with a ping disparity" or "don't play at all".
Once there is more widespread reliable alternative, which Riot seems to be working toward in their own game tournaments, maybe we'll see if this tolerance persists in competitive tournaments.
I think you are completely incorrect when you say that a 5-15ms ping advantage doesn't cause a massive effect in a game of LoL. Even the greatest player of the game (Faker) mentioned the difference between "35"ms and the near 0 that Korean players are used to playing.
Low ping is part of players' muscle memory in how they react to visual signals. Changing that slightly creates a big difference, especially in the early and mid game where the outplays are less strategic and can often be more mechanical.
Even in later game teamfighting, the familiarity with latency between key/mouse input and actual in-game movement is mechanically intense (ex. auto-spacing in a close teamfight)
There is a huge difference between 0 ping and 35 ping though, particularly in pro-level LoL. It is strategic but that ping can be the difference between flashing a skillshot or not, and swinging the whole course of a teamfight.
Of course 35 ping is not THAT bad and I agree that, while unfair, it wouldn't be the end of the world. It's not even Worlds, just MSI, an invitational event.
I think ping differences lead to very interesting effects. One example is, imagine you have a "favor the shooter" system, and you have a 1v1 where one player has a sniper rifle that takes a while to charge to full power when they put it up to their eye to shoot, and you have the ability to become instantly invulnerable when you press the shift key. If the packet that says "she is putting the rifle up to her eye now" arrives 100ms late, then you have a false sense of security, because your mental model of her state of charge is 100ms behind reality. It means you use your invulnerability too late when you think it's time for her to fire a fully-charged shot, and you die instead of live. (This happens in Overwatch; if you play Mei you'll know what it's like to die inside your ice block.)
Another interesting scenario is projectile-based weapons. For reasonable Internet latencies and reasonable in-game distances, the projectile can travel faster than the packet announcing that it has been fired. The game state that gets sent from the shooter over the network is "I fired my weapon, I killed them". At the time you die, your client has none of that information, and since the shooter is favored, you just die out of nowhere. (This is less annoying, because I don't think you could move out of the way fast enough for it to matter. "My opponent's WiFi is bad, so I die half a second later for no reason" still feels pretty bad though. In the intervening milliseconds you had grand plans to win the game!)
All of this, to me, is a great lesson in eventual consistency in distributed systems. People break their keyboards over this stuff. Thinking it will be fine for your more-critical-than-a-casual-video-game system means this stuff will be happening to you multiple times a day, and you can't break your keyboard because you can only blame yourself for bringing that system into the world. Tread carefully! Tread very carefully. Shit gets weird when your system has a split brain.
> People play Melee with ping differences bigger than that and it works fine lol. Waaay more technical game shows that there's tolerance.
I dunno; I'd imagine people would be mad if there was a 15ms ping differential in Melee. I think online/netplay tournament results are already looked at differently than offline, console results.
Moreover, I'm pretty sure that you can't have a differential ping in a P2P game with rollback, so the competitive issues are different between the two scenes. I understand Riot wanting to equalize them. In Melee specifically, I think that the players have to decide on a common frame delay to play, right? If so, then the Melee community has come to the same solution for competitive netplay as Riot has.
Artificial or natural delays aside, I am somewhat confused that one end measuring/calculating ping can end up with a wrong value. Transmit at local time x, receive at local time y, subtract. It only needs a working local clock.
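i.e., nothing more than something like this sketch (assumes an endpoint that echoes the payload back):

```python
# One-ended RTT measurement: stamp on send, subtract on receive; only the
# sender's own monotonic clock is involved.
import socket, time

def measure_rtt_ms(host, port, payload=b"ping"):
    with socket.create_connection((host, port), timeout=2) as s:
        t0 = time.monotonic()
        s.sendall(payload)
        s.recv(len(payload))
        return (time.monotonic() - t0) * 1000.0
```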
Second surprising thing is artifical latency added on the client. Anything on the client I would avoid, if possible.
Third, this all looks like magically connecting two ends. But it is packets flowing from one to the other, and the other to the one. Latency could be added to each independently, even on one end, by artificially buffering to-be-send and/or received packets.
What are the implications for game-play of an asymmetric latency, e.g. theoretical 0 delay to receive vs. a long delay to transmit, or vice versa?
The article doesn’t seem to discuss transmit-versus-receive latency asymmetry, nor did I see it mention jitter (although I admit I skimmed the article).
With two cities, a dedicated common connection with latency measurements in both directions, plus real-time tit-for-tat latency adjustment, would be appropriate (if remote city A gets latency X, ensure local city B gets approximately latency X on the next game tick).
Without that:
0. latency fluctuations could penalise the remote team even though average latency was kept consistent over longer periods,
1. local players could advantage themselves by injecting delays or jitter into remote players during critical periods of play (DoS like network traffic between local and remote cities),
2a. local players could get server updates 10s of milliseconds before remote players (if all latency adjustment added on transmit from player to server), or
2b. local players could get their events updated to the server 10s of milliseconds before remote players (if all latency adjustment added on transmit from server to player).
I think the symmetric delay here makes sense, since you're trying to simulate the latency of a (likely) symmetric connection between another client and the server.
If you only add latency on the transmission, then 2 actions taken at the same time by players with different amounts of physical latency will be processed at different times. If we assume the game clocks on each client and the server are in sync, then this creates a competitive advantage for the player with lower physical latency. For instance, if we have a timing battle (ie: Zhonya's hourglass), where both players know that at time 'T' they must each take an action, and the first to do so comes out on top, then the player with higher physical latency is at a disadvantage.
You can accomplish this by putting it all on the client, all on the server, or a mix of both (what they chose).
For all on the server, you could add latency right after receiving data from the client, but before game code processes it, and before sending data to the client.
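For instance (a sketch, all names hypothetical), one timed queue per direction on the server:

```python
# Hold packets in a delay queue on both the receive path (before the game
# loop reads input) and the send path (before updates hit the socket).
import heapq, time

class DelayQueue:
    def __init__(self, delay_ms):
        self.delay = delay_ms / 1000.0
        self.q, self.seq = [], 0          # entries: (release_time, seq, pkt)

    def push(self, packet):
        heapq.heappush(self.q, (time.monotonic() + self.delay, self.seq, packet))
        self.seq += 1

    def pop_ready(self):
        now, out = time.monotonic(), []
        while self.q and self.q[0][0] <= now:
            out.append(heapq.heappop(self.q)[2])
        return out

inbound, outbound = DelayQueue(12.5), DelayQueue(12.5)  # 25 ms round trip
```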
That said, there likely were technical considerations involved in their decision to do a mix of both.
That's a good point, I hadn't thought of having independent send and receive delays on the server, simulating a delayed connection from the client's perspective without having it participate.
I now wonder too what those technical considerations are, since having the client add latency requires trusting the client to add the delay and not cheat.
It looks like the error they made was this: They added too much latency because they forgot to account for the latency they added in the client. You can see from their architecture diagram (Figure 4) that the latency measurement didn't include the client delay.
I read that section as comparing what they logged before vs. what the player experiences: the time taken for an input to be registered and its effect communicated to the user. Maybe I'm wrong, but I don't think the intention was a precise description of the error, but rather the hazard of not having logs which reflect what the user experiences.
They stated that they use the same calculation for the logs as for other purposes such as latency compensation. From near the top of the blog post:
"The reason we did not find it sooner is that the cause of the issue was a code bug that miscalculated latency, which meant that the values in our logs were also wrong."
And from later in the post:
"Our logs were not displaying the issue because the calculation was wrong. It explained why the latency was worse in the venue than on the internet servers"
Right, but my point was that I think you're reading into the diagram beyond its intended purpose. I'm not sure how you can tell that they're logging only the server-side delay, and not the client-side delay, from a systems diagram which shows each of those as a box connected by a bi-directional arrow. For instance, it's clear to me that they don't include the latency introduced by the game engine in that metric, but not whether they include the server-added latency or whether they're aware of it and subtract it off.
They used a complex and buggy solution when a simple solution exists: Use the `tc` traffic control [0] program to configure the Linux kernel to add a fixed amount of latency to traffic from the local players [1]. If the game server does not run on Linux, they could put a Linux bridge/router in front of it.
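For reference, the kind of netem invocation meant here (the interface name and delay amount are placeholders):

```python
# Thin wrapper over tc/netem, which adds a fixed per-packet delay at the
# qdisc layer; equivalent to running the tc command by hand.
import subprocess

def add_fixed_delay(interface="eth0", delay_ms=25):
    subprocess.run(["tc", "qdisc", "add", "dev", interface,
                    "root", "netem", "delay", f"{delay_ms}ms"],
                   check=True)
```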
As I understand the blog post, the difficult (and buggy) part is not the addition of latency, but calculating how much latency to add and where to add it. I'm not sure how tc would help much here, and actually don't see anything to indicate they weren't using tc already.
`tc` would give you non-deterministic microbursts that would be hard to quantify as to whether they impacted competitive integrity. This is an incredibly complex problem if they actually care about millisecond level precision in small duty cycles.
That's exactly what I'm saying. I'm describing an implementation detail of traffic shaping. In order to traffic shape, you need to buffer packets. And at some point, you flush that buffer. There is a duty cycle (read: loop) by which you decide when to buffer and when to flush, the timing of this duty cycle is arbitrary/configurable. But no matter the duration of that cycle, when the flush occurs, you naturally get microbursts. So on average, you might be able to target a given millisecond goal over time. But in competitive gaming, average network performance doesn't matter if critical win-or-lose moments are lost to latency fluctuations. The microbursts I spoke of can lead to one team winning the input race over the other as a latent effect of the artificial buffering mechanism (namely the flushing).
Now consider on top of the fact that there are more buffers than just the `tc` buffers involved. This leads to non-linear, non-deterministic fluctuations in latency performance when bursting occurs.
tc is purpose-built to shape throughput, not for tuning precision latency.
(for some context: traffic shaping is a big part of what I work on in my day-to-day helping build and run a wireless ISP)
The problem is that you can't just add a fixed amount of latency to hit a goal of 35ms, you need some sort of algorithm that adaptively changes the amount of added latency to account for occasional spikes in network latency (this is a game that takes >30min so those spikes are guaranteed to happen over the course of a game)
Why not? Their system simulates latency between their facilities in two cities. Latency does not change much. Their solution adjusts to realtime latency variance, but they don't show that such realtime adjustment is needed. The article doesn't say whether they even measured latency variance.
And they also don't show that their dynamic adjustments are more fair than a static adjustment. I think it's likely that their system penalizes local players for too long after a short remote latency spike.
> The difference was .01 seconds. It's not immediately obvious that there is a problem.
in games you really feel ping differences
if you were playing e.g. for a year on 5 ping and then jumped to 30, then you'll feel that in games like LoL and probably CSGO? idk about the second one.
They said the ping was 45 instead of 35. I'm not denying that the pros can tell the difference, I'm saying it wouldn't be obvious to whatever tester they had that it was 10ms off.
As others have said, the pros noticed. Having an incorrect calculation and seeing large jitter spikes can probably push the latency into something like 50ms momentarily, enough for pros to lose their "immersion".
well I disagree. in many situations there's only so much you can do. time the flash, drop the combo and that's about it. bash 5-10 buttons and you are mostly done for the next 10-20 seconds. good reflexes do matter but not as much. what matters way more is game / map sense, presence of an actual strategic / tactical plan, team synergy, experience
I suck at lol, the amount of ping it takes for me to be able to tell it's different is only like 30-40 ms. Pro players being able to tell and care at half that seems pretty likely.
well adjusting the latency is the easy part.
doing it in real time while observing the connection and making adjustments, that is when it becomes hard.
and they did it basically after the network stack, so if a packet took 40ms they added no latency, if it took 20ms they added 15ms, and so on.
so EACH packet was evaluated, and of course this is not an easy feat, because LoL is server-side authoritative, so the latency must be correct inside the server and the client for each packet.
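in toy form, the per-packet rule described above:

```python
# Pad fast packets up to the 35 ms target; slow packets get nothing added.
TARGET_MS = 35.0

def pad_ms(observed_ms):
    return max(0.0, TARGET_MS - observed_ms)

assert pad_ms(40.0) == 0.0    # took 40 ms: add no latency
assert pad_ms(20.0) == 15.0   # took 20 ms: add 15 ms
```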
Yea, I guess the question is whether or not to try to do it at the packet level or just keep a running average of ping latency and adjust it over time. They said `adjust the latency (ping) to 35 ms` so I am assuming they are just trying to do something like attempting to maintain an average ping latency of 35 ms.
This is a pretty classic PID here. You're attempting to keep latency at 35ms and based on the latency on each packet as observed by the server, and previously recorded latencies, you can add corrections. With some simulations and tuning I'll bet you can get a pretty good algorithm going.
Simulating this stuff is pretty simple. Model each incoming stream as an MM1 queue and play around with your PID algorithm until it gives you the results you need.
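A minimal PID loop of that kind, with made-up gains:

```python
# Hold measured ping at a 35 ms setpoint by adjusting the artificial delay.
class PIDDelay:
    def __init__(self, setpoint=35.0, kp=0.5, ki=0.05, kd=0.1):
        self.setpoint, self.kp, self.ki, self.kd = setpoint, kp, ki, kd
        self.integral = self.prev_err = 0.0
        self.added = 0.0                      # artificial delay, ms

    def update(self, measured_ms, dt=1.0):
        err = self.setpoint - measured_ms
        self.integral += err * dt
        deriv = (err - self.prev_err) / dt
        self.prev_err = err
        self.added = max(0.0, self.added + self.kp * err
                              + self.ki * self.integral + self.kd * deriv)
        return self.added                     # apply to the next packets
```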
I'd argue the complexity of trying to achieve this at all is too much to bear in practice, especially considering the type of customers this would be inflicted upon. Competitive gamers will now also be wondering if the "fake lag" system is bugged, in addition to all of the other problems that could still exist.
Some problems cannot be solved with clever tricks.
I'm into music production. Try processing some sound through a compressor and adjusting its attack: my ears can quite easily discern a difference as small as 2ms. Another exercise: try to sing in a group over the internet (you gonna rage quit in five minutes, guaranteed).
Anyways, by "scenarios where the actual ping was significantly lower than the target latency" do they mean like if the ping from Busan-server is 10 but they want it to feel like 35? I feel like that's the one scenario you'd care about, making the Busan players have 35 ping was the whole point.
Edit - also the other scenario, where the actual ping is significantly higher than the target latency...well in that case your latency service would be some sort of magical time travel box! Adding latency is literally the only thing it can do right? It's not possible to lower it. Maybe the it depends what they mean by "significantly" but idk the target latency was 35ms so like at most they could be 35ms lower than that.
I guess they didn't catch it because the bug was in the latency service itself which gave the wrong value to both the in game display and to the old network monitoring system. Idk in hindsight it's easy to say if you're going to introduce this new ping equalizing service it might break how you test ping especially because it hasn't been tested in soloq. But meh a lot easier to see what happened in hindsight after they write it all up.