
Compressing Networked State Data - ingve
https://blog.demofox.org/2018/06/04/a-neat-trick-for-compressing-networked-state-data/
======
golergka
So, I'm in the middle of rewriting our game networking right now (in the huge
crunch before the upcoming Steam sale), and a couple of months ago I've
encountered exactly this problem. Thankfully, I've put in measuring before
going and implementing the compression algorithm.

Turns out, if you have a previous version of the state, you can have an
enormous results with the most basic algorithm ever: prefix + suffix. I don't
even know if it has any official name. Simply put, you find a length of common
prefix of the two byte arrays, a common suffix, and send only the bytes in-
between with the lengths of the suffxies. Since in our entity model most
changes are contained to a couple of bytes in the middle most of the time,
it's not uncommon to see compression factor of 10 and more.

And, of course, after compressing any object we compare the message length to
the length of uncompressed state change message, and send the shortest one.

------
jascenso
Better yet is to design an arithmetic encoder that models the state
probabilities accordingly. Also, the state of the game should be refreshed
periodically by compressing it without XORing, otherwise, when packet losses
occur you may have losses or unacceptable delays.

------
speps
If you're interested in this, Charles Bloom has a lot of articles on
compression applied to games, in particular UDP packets:
[http://cbloomrants.blogspot.com/2013/11/11-14-13-oodle-
packe...](http://cbloomrants.blogspot.com/2013/11/11-14-13-oodle-packet-
compression-for.html)

------
HoochieKoo
Fun fact. Some flight data recorder (black box) manufacturers use a method
like this to compress the data on the recorder. They record a base, full frame
of data and then take delta frames and use Huffman (variable length) coding to
reduce the amount of stored data.

------
rollulus
I believe this technique is called "delta encoding".

~~~
hedora
Delta encoding often means “frame of reference”, where the operations are +
and - instead of xor and xor, and the data types are integers.

Using + and - enables some cute tricks, like binary searches of compressed,
sorted data.

------
szemet
Its ok if the data have fixed length/layout.

Otherwise a similar trick can be done if you have a compressor with stream
support, which can be operated in a manner that:

Compress(String1 + String2) == Compress(String1) + X

Where + is concatenation and String1/2 are the serialized states.

Then if you want to send String2, you just send X (which is small if String1
and String2 similar enough).

And on the receiving side just run: Decompress(Compress(String1) + X)

After stripping String1 from the beginning you get String2.

~~~
eboyjr
Very cool. What kind of compression algorithms support this?

~~~
hedora
rsync is the market leader, I think.

------
infogulch
This seems like an ideal use case for a zstd custom dictionary.

------
radiospiel
just XOR-ing doesn't cut it. The client might receive the state out of order,
some updates might be missing completely - all of this must be handled somehow

~~~
golergka
This is usually handled on a lower layer - you only use this algorithm if you
can assume the updates are guaranteed and ordered. This of course means using
more costly network protocols (either TCP or something custom on top of UDP),
but for most game systems it's good enough, since they don't get updated that
often. And for objects positions you combine the two: have reliable object
state synchronized via guaranteed and ordered protocol a couple times per
second, but also have UDP updates on top of it, while keeping the "reliable"
copy to use for compression with the next reliable update.

~~~
radiospiel
The OP states: "When it’s time to send an update to the state, XOR it against
the previous state, .. and send it." and this is not going to work if the
previous state just isn't there.

It would be easy to fix: add a base-version number to the update. If the
client doesn't know about the base-version it knows it must not update its
state, but rather fetch a full state again or something.

Your suggestion "make sure there is a safe base version available and then
send a diff against that" makes a lot more sense.

~~~
noonat
A common solution is to have the server track which state frames that the
clients have acknowledged receiving. The server can then delta compress from
that client’s frame, for each client. It’s a bit more costly on the server
side, as it means you need to maintain state frames back to the oldest
acknowledged frame across all clients, but it guarantees you will be sending a
packet a client can understand. Usually this is combined with some sort of
limit which kicks a client off as timed out if its acknowledged frame strays
too far from the most recent one.

------
bcaa7f3a8bbc
"One Weird Trick for Compressing Networked State Data"

~~~
ardillaroja
"ISP's hate him!"

------
bjpbakker
tl;dr Send the full initial state, and then send deltas where each delta is
the current state xor the previous state.

This reminded me of an article I once read about the Doom3 network
architecture [0]. Optimizing state transfer to network clients is a lot more
difficult in the real world than xor-ing two states.

[0] [http://fabiensanglard.net/doom3_documentation/The-DOOM-
III-N...](http://fabiensanglard.net/doom3_documentation/The-DOOM-III-Network-
Architecture.pdf)

