There is a theorem that it is impossible, in general, to remotely compare two files with less network traffic than it takes to simply send one of them over the wire. But rsync does it. How? By accepting a theoretically possible but wildly unlikely chance of coming up with the wrong answer.
What rsync does is compare hashes of ranges of the data. If the hashes come out the same, the two ranges are assumed to be identical with no further investigation. So it is possible for different files to be declared identical. But it is unlikely that any two hashes ever generated by rsync have accidentally been the same when the underlying data differed.
(I do have to say "accidentally" because rsync uses MD5 as a hashing algorithm, and people have deliberately created collisions in MD5.)
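Here is a minimal sketch of that idea in Python. It is not rsync's actual protocol (which also uses a cheap rolling checksum so matches can be found at arbitrary offsets, not just block-aligned ones); the helper names and block size are made up for illustration:

```python
import hashlib

def block_hashes(path, block_size=4096):
    """Hash fixed-size blocks of a file (toy model, hypothetical helper)."""
    hashes = []
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            hashes.append(hashlib.md5(block).hexdigest())
    return hashes

def changed_blocks(local_hashes, remote_hashes):
    """Indices of blocks whose hashes differ; only these need transferring.
    Blocks with equal hashes are *assumed* identical -- the small gamble
    described above."""
    return [i for i, (a, b) in enumerate(zip(local_hashes, remote_hashes))
            if a != b]
```

Only the hash lists cross the network: a few dozen bytes per block instead of the blocks themselves.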
There is an even simpler idea at the core of many modern backup programs (e.g. bup, attic, borg)! It was possibly pioneered by `gzip --rsyncable`, though I'm not sure.
Each side slices files into chunks at points where a rolling hash over a small sliding window satisfies `hash % N == 0`, giving variable-size chunks that average N bytes. Because the cut points depend only on nearby content, they usually self-synchronize after insertions and deletions: an edit shifts the boundaries around it, but chunks further away keep falling in the same places.
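A sketch of the boundary finding, using a Rabin-Karp-style rolling hash (the real tools use buzhash/Rabin fingerprints and also enforce minimum and maximum chunk sizes, which I omit here):

```python
def chunk_boundaries(data: bytes, avg_size=4096, window=48):
    """Split `data` into content-defined chunks (toy sketch)."""
    BASE, MOD = 257, (1 << 61) - 1
    pow_out = pow(BASE, window - 1, MOD)  # weight of the byte leaving the window
    h, start, chunks = 0, 0, []
    for i, b in enumerate(data):
        if i >= window:
            # slide the window: drop the contribution of the outgoing byte
            h = (h - data[i - window] * pow_out) % MOD
        h = (h * BASE + b) % MOD
        # cut where the window's hash hits the boundary condition
        if i + 1 >= window and h % avg_size == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # final partial chunk
    return chunks
```

Inserting a byte near the start of `data` only disturbs chunks up to the next surviving boundary; everything after it hashes to the same chunks as before.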
This allows not just comparing two files for common parts, but also deduplicated storage of ANY number of files: each file is stored as a list of chunk hashes, each pointing to just one copy of that chunk!
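Putting it together, a content-addressed store is just a map from chunk hash to chunk. A hedged sketch, reusing the `chunk_boundaries` helper above and assuming SHA-256 (the actual tools vary in their hash choice):

```python
import hashlib

def store_file(data: bytes, store: dict) -> list:
    """Save a file into `store` (a stand-in for an on-disk chunk store)
    and return its manifest: a list of chunk hashes."""
    manifest = []
    for chunk in chunk_boundaries(data):
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # dedup: one copy per distinct chunk
        manifest.append(digest)
    return manifest

def restore_file(manifest: list, store: dict) -> bytes:
    """Reassemble a file from its manifest."""
    return b"".join(store[d] for d in manifest)
```

Two files that share most of their content share most of their chunks, so the store grows only by what is genuinely new.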