It's possible to be dramatically more communication-efficient than Bloom filters when synchronizing sets.
Bloom filters require O(my_set_size) bandwidth, with a good constant factor. This work optimizes it somewhat by excluding members that existed at the time of last synchronization, but it still wastes bandwidth when those commits came in from another source.
It's possible to instead realize O(symmetric_difference_between_sets) when synchronizing a pair. Applied to this kind of usage, the bandwidth required would be reduced by whatever the redundancy factor is (e.g. the number of different repos you'd learn the same commit from).
This can be accomplished via error-correcting codes; for a high-performance industrial implementation I contributed to, see https://github.com/sipa/minisketch/ (I tried submitting this to HN a while back, but no joy--).
For our paper on applying this to Bitcoin transaction relay to render rumouring bandwidth largely independent of peer count:
https://arxiv.org/abs/1905.10518
[Similar to the motivations of the post, we consider immunity to attackers to be critical-- and most prior efficient gossip work is extremely non-robust.]
For the specific case of synchronizing the blockchain itself, these fancy techniques aren't needed because there is only a single best chain 'list'-- there is only one relevant tip-- and it can be efficiently synchronized with bisection. Bitcoin sends up to 100 hashes at once (though usually more like 30), the first few dense around the best tip and the rest exponentially spaced to extremely efficiently find the highest point of commonality.
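(To make the locator idea concrete, here's a rough Python sketch of generating those exponentially spaced heights; the function name and the exact spacing schedule are illustrative, not Bitcoin Core's actual code.)

    # Rough sketch of the "locator" described above: heights dense near the tip,
    # then exponentially spaced back toward genesis.  Illustrative only.
    def build_locator(block_hashes):
        heights = []
        height = len(block_hashes) - 1   # start at our best tip
        step = 1
        while height > 0:
            heights.append(height)
            if len(heights) >= 10:       # after the first ~10, double the gap each time
                step *= 2
            height -= step
        heights.append(0)                # always include genesis
        return [block_hashes[h] for h in heights]

    # The peer scans this list for the highest hash it also has; that block is at
    # (or just below) the fork point, so the chain can be synchronized from there.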
Wow. If I understood its capabilities right, minisketch seems indistinguishable from magic.
Would you please clear up a bunch of things:
0. Is my understanding right that the total number of elements in a set doesn't matter (as far as reconciliation is concerned)? See #1.
1. If Alice knows for sure there have been, say, 50 new additions (or rather, updates to existing elements) of elements of size 10 bits each (to a set of, say, 100M total other unchanged elements) since the last sync with Bob, does a sketch(10, 0, 50) by Alice cover all changes Bob needs to see, or is a sketch(10, 0, 100) needed?
2. Would a sketch(10, 0, 100) occupy at most 10x100 bits? Or at least 10x100 bits?
3. What's the state of the art in set-reconciliation research right now, apart from PinSketch?
0. Indeed. Though if your set consists of 100 million elements, you'll indirectly need at least 27-bit elements just to be able to identify them. So for practical problems you can probably say that the sketch size does grow logarithmically with the set sizes.
1. The capacity scales with the symmetric difference between the sets. So if one party has [A,B,C] and the other has [B,C,D,E], you need capacity 3 (for A,D,E). So every "change" to an element costs at most 2 capacity; insertions/deletions each cost 1 capacity.
2. Sketch size is exactly (element size in bits)*capacity. So 1000 bits for a sketch with a capacity of 100 10-bit elements; see the worked numbers after this list.
3. I'm not really up to date here. I believe that PinSketch is the best known deterministic reconciliation (meaning that reconstruction is guaranteed to succeed if the capacity is sufficient). There are more performant algorithms and more trade-offs to make (e.g. between capacity and probability) if you allow probabilistic recovery.
4. I just wrote a (slow) native Python re-implementation for demonstration/testing purposes: https://github.com/sipa/minisketch/pull/26. The C++ code relies on many low-level optimizations that are hard to replicate in higher level languages, so expect a very serious performance drop.
1. Shouldn't be hard to compile to WASM with emscripten, but I haven't tried.
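(To make answers 1 and 2 concrete for the questioner's scenario, here's the back-of-the-envelope arithmetic in Python; just plain arithmetic, not the minisketch API.)

    element_bits = 10          # size of each set element, in bits
    updates = 50               # in-place updates since the last sync

    # Each update removes one old value and adds one new value, so it contributes
    # up to 2 to the symmetric difference; plain insertions/deletions cost 1 each.
    capacity_needed = 2 * updates                  # -> 100, i.e. sketch(10, 0, 100)

    # Sketch size is exactly element_bits * capacity, independent of the 100M
    # unchanged elements sitting in both sets.
    sketch_bits = element_bits * capacity_needed   # -> 1000 bits (125 bytes)
    print(capacity_needed, sketch_bits)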
Thanks a lot. I have use for something exactly like minisketch but need to interface it with JavaScript. I'll take a stab at it with Emscripten.
Can you point towards other performant algorithms? IBLT, from the minisketch GitHub readme at least, seems wasteful in terms of space.
If you're aware: what would be a comparable data structure to PinSketch for membership checks (whether "X" exists in a set of, say, 100M records)? Minimal Acyclic Finite State Automata (MA-FSA) and succinct tries come to mind, but those aren't optimal, though they are deterministic.
Yeah, IBLT has a size overhead which mostly matters for small differences/capacities. It's also probabilistic, so the higher your probability of recovery has to be, the larger it gets. Assuming you don't actually know a strict upper bound on the symmetric difference size before reconciliation, PinSketch of course also becomes probabilistic.
On the other hand, IBLT is very fast (and linear in the capacity; PinSketch is quadratic in the capacity).
A data structure for set lookups? Sounds like a Bloom filter. If you want it deterministic, just send the set in sorted order, say? I don't think there is much fancy possible here.
That does sound like absolute magic. One hundred million? I didn't know about this algorithm, and it looks revolutionary for a ton of things.
If sketch size grows with only log(N), where N is the set size, then we are talking about practical, efficient, non-thrashing replication of billions of records at broadband speeds.
I think the most mind-blowing aspect about it is that (PinSketch) sketches are identical in size to the bandwidth that would be required to just send the difference if you did know it ahead of time.
Perhaps this intuition helps: imagine we have sets of positive numbers, and we know that the difference is at most 1 element. So there is one element that you have but I don't, or the other way around.
In this case there is a very easy solution: just send the sum of all your elements. If it differs from the sum of the receiver's elements, the (absolute value of the) difference is exactly the element that one party has and the other doesn't.
In a way, PinSketch is generalizing this "send sum of your elements" in such a way that it works for more than one difference.
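(A toy version of that intuition in Python, assuming sets of positive integers that differ by at most one element:)

    # "Send the sum" reconciliation for sets of positive integers that differ by
    # at most one element.
    def single_difference(my_set, their_sum):
        diff = sum(my_set) - their_sum
        return None if diff == 0 else abs(diff)   # None means the sets are identical

    alice = {3, 7, 19}
    bob = {3, 7, 19, 42}

    # Alice sends only sum(alice); Bob learns the one differing element is 42.
    print(single_difference(bob, sum(alice)))     # -> 42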
In my understanding, a Bloom filter, while being O(set_size), nonetheless has a small multiplier for the linear factor (in the OP I think they mentioned using 10 bits). As I read the PinSketch setup, it looks like it would actually need the full bit size of the set members, so for SHA-256 you need the full 256 bits per set member. So while PinSketch would only need O(symmetric_difference_between_sets), the element size factor is ~25 times larger?
And also, in the example from the minisketch readme, it looks like A still needs to send the full sketch (not just the symdiff) in the first message, right?
In other words, there are a bunch of trade-offs here that might very much depend on your application?
You can make the hashes you use as small as you like-- subject to the limitation that you'll get false positives if you make them too small-- the same problem that small hashes get in Bloom filters.
Bloom filters also have poor communication efficiency for small numbers of entries: you have to oversize the filter a lot. This can be ameliorated by entropy coding the oversized Bloom filter, but once you're going that far you're better off using a Golomb-coded set or something more efficient.
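(For readers unfamiliar with Golomb-coded sets, here's a rough Python sketch of the idea -- hash elements into a range, sort, delta-encode, and Golomb-Rice code the deltas; the parameters and helper names are illustrative, not from any particular implementation.)

    import hashlib

    def golomb_rice_encode(values, p):
        # Each value becomes a unary quotient plus a p-bit remainder.
        bits = []
        for v in values:
            q, r = v >> p, v & ((1 << p) - 1)
            bits.append("1" * q + "0")
            bits.append(format(r, "0%db" % p))
        return "".join(bits)

    def golomb_coded_set(elements, m=2**20):
        # Hash each element into [0, n*m), sort, and code the gaps between hashes.
        n = len(elements)
        hashes = sorted(int.from_bytes(hashlib.sha256(e).digest()[:8], "big") % (n * m)
                        for e in elements)
        deltas = [h - prev for prev, h in zip([0] + hashes, hashes)]
        return golomb_rice_encode(deltas, m.bit_length() - 1)

    # Membership queries hash the candidate the same way and look for its value
    # among the decoded hashes; the false positive rate is roughly 1/m.
    print(len(golomb_coded_set([b"apple", b"banana", b"cherry"])), "bits")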
I think the only case where an approximate set-filter approach will be more efficient is where you want to communicate a small set one way to someone whose set is much larger. The issue there is that the sketch needs one entry per item it removes, while a Bloom filter can be smaller (assuming the difference is big enough to overcome the Bloom overheads).
Information theoretically it should be possible to decode a minisketch 'beyond the limit' using knowledge of the specific items in the local set... but the only algorithm I know for this has exponential time (brute force: remove items from your set and retry the reconstruction). This problem is mathematically pretty close to list-input decoding an error-correcting code, so efficient algorithms may be possible-- but so far there hasn't been any research to that end.
[You can, of course, combine a filter with efficient reconciliation in cases where set size differences are expected. The filter's FP rate just needs to be low enough to get the receiver's set size close to the sender's.]
> it looks like A still needs to send the full sketch
The sketch is exactly the same size as the symdiff it can recover.
An interesting challenge is that you likely don't know in advance how big the symdiff is... If you're too conservative you waste bandwidth, but never more bandwidth than you'd use sending the whole set. You can reduce your waste at the expense of round-trips, in an obvious manner: fetching a little more of the sketch each time until you have enough.
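(Sketching that round-trip schedule in Python; fetch_sketch_prefix and try_decode are hypothetical stand-ins for the transport and for sketch decoding, and the doubling schedule is just one obvious choice.)

    # Hypothetical helpers: fetch_sketch_prefix(offset, length) returns more bytes
    # of the peer's (large, precomputed) sketch; try_decode(data) returns the
    # symmetric difference on success, or None if the capacity wasn't enough.
    def reconcile_incrementally(fetch_sketch_prefix, try_decode, element_bytes, initial_capacity=8):
        capacity = initial_capacity
        received = b""
        while True:
            want = capacity * element_bytes                # a capacity-c sketch is c field elements
            received += fetch_sketch_prefix(len(received), want - len(received))
            diff = try_decode(received)
            if diff is not None:
                return diff                                # decoded: diff is the symmetric difference
            capacity *= 2                                  # too small; fetch more next round-trip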
In the Erlay paper (the Bitcoin application of this stuff), we give a scheme to bisect when the difference is bigger than expected without wasting much bandwidth. The bisection loses a little communication efficiency but avoids quadratic computation costs which become problematic for big set differences.
It depends on the protocol requirements. If you have 256-bit data elements to reconcile, and want 100% guarantee that reconciliation will succeed with 1 round-trip, then you'll indeed need to use sketches with 256-bit elements.
However, if you're fine with more roundtrips, or probabilistic reconciliation, you can typically use far smaller elements. Have the parties agree on a random salt, and construct sketches of short salted hashes of your data elements (where the size of these hashes depends on your tolerance for failure due to collisions in your sets). Run the reconciliation algorithm to figure out the difference, and then send/request the real elements based on their short hashes.
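(A rough Python sketch of that salted-short-hash approach; the 32-bit hash size and helper names are assumptions for illustration, and the short hashes themselves would be reconciled with a sketch rather than the direct set arithmetic used as a stand-in here.)

    import hashlib

    def short_hash(element, salt, bits=32):
        digest = hashlib.sha256(salt + element).digest()
        return int.from_bytes(digest, "big") >> (256 - bits)

    def reconcile(my_elements, their_short_hashes, salt):
        # Map our own elements by short hash, then work out what to send/request.
        mine = {short_hash(e, salt): e for e in my_elements}
        my_hashes = set(mine)
        to_send = [mine[h] for h in my_hashes - their_short_hashes]   # full elements the peer lacks
        to_request = sorted(their_short_hashes - my_hashes)           # short hashes we lack
        return to_send, to_request

    salt = b"per-session random salt"
    alice = [b"tx1", b"tx2", b"tx3"]
    bob = [b"tx2", b"tx3", b"tx4"]
    bob_hashes = {short_hash(e, salt) for e in bob}
    print(reconcile(alice, bob_hashes, salt))   # Alice sends tx1, requests tx4's short hash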
From what I can see in the graphs on GitHub, it seems really computationally intensive: only a few thousand 30-bit elements per second (for decoding, if I understand it correctly).
So I guess that is the catch which makes it impractical for more widespread use. Or am I wrong, is there an alternative speed wise?
Depends on how you split it up: if you're synchronizing 100 differences at a time, it's on the order of 100,000 differences per second per core, which is plenty fast for many applications.
If you're working on thousands of differences at a time, then IBLT's communication efficiency may be acceptable, and it is considerably faster.
[You can always decrease the amount you feed into reconciliation at once by using hashing to subdivide your inputs, at a small loss of efficiency due to imbalance.]
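(A small Python illustration of that hashing-to-subdivide idea; the bucket count and hashing scheme are arbitrary choices for the example.)

    import hashlib

    def bucketize(elements, num_buckets=16, salt=b""):
        # Both sides bucket with the same salt, so a differing element lands in the
        # same bucket on each side; each bucket is then reconciled separately with a
        # smaller-capacity sketch.
        buckets = [[] for _ in range(num_buckets)]
        for e in elements:
            h = hashlib.sha256(salt + e).digest()
            buckets[h[0] % num_buckets].append(e)
        return buckets

    # The expected difference per bucket shrinks by ~num_buckets, at a small
    # efficiency cost because the buckets won't be perfectly balanced.
    alice_buckets = bucketize([b"tx1", b"tx2", b"tx3", b"tx4"], num_buckets=4)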