
Eventually Consistent: How to Make a Mobile-First Distributed System - astigsen
https://realm.io/news/eventually-consistent-making-a-mobile-first-distributed-system/
======
bitwiseand
I liked the article but I feel the CAP theorem is again misquoted; it often
leads to misunderstanding. From the article - "but they would still confront
the reality of the CAP theorem: your system can be consistent, available, or
partition-tolerant, and you can only pick two"

The CAP theorem states that in the event of a network partition you have to
choose one of C or A. More intuitively, any delay between nodes can be modeled
as a temporary network partition, and in that event you have but two choices:
either wait to return the latest data at a peer node (C) or return the last
available data at a peer node (A).

Edit: Switched C and A
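
A minimal sketch of that choice, assuming a toy `Replica` class with a hypothetical `mode` flag ("CP" or "AP") and a `partitioned` switch. None of these names come from the article or from any real database. It only illustrates that the trade-off shows up while peers are unreachable.

```python
class Unavailable(Exception):
    """Raised by a CP replica that cannot confirm it holds the latest value."""


class Replica:
    def __init__(self, mode, value):
        self.mode = mode            # "CP" or "AP" (hypothetical labels)
        self.value = value          # last value this replica has seen
        self.partitioned = False    # True while peers are unreachable

    def read(self):
        if self.partitioned and self.mode == "CP":
            # Consistency first: refuse rather than risk a stale answer.
            raise Unavailable("cannot reach peers to confirm latest value")
        # Availability first (or no partition): answer from local state.
        return self.value


cp = Replica("CP", "v1")
ap = Replica("AP", "v1")
cp.partitioned = ap.partitioned = True

stale_but_available = ap.read()     # "v1", possibly out of date
try:
    cp.read()
    cp_answered = True
except Unavailable:
    cp_answered = False             # the CP replica declined to answer
```

Once the partition heals (`partitioned = False`), both replicas answer and the distinction disappears.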

~~~
bigfish24
Agree that the CAP theorem isn't as strict as that statement implies. The
trade-off only occurs at certain points. Sorry it was a little unclear!

~~~
bitwiseand
No need to be sorry. I liked the rest of the article. Thanks for that.

~~~
bitwiseand
Edit: I also wrote my comment because the article is beginner friendly. A lot
of beginners will get the right idea about CAP once they go through comments
and then back again to your article.

------
bsaul
After having built an app that was supposed to work offline and resync when
the connection is back up, using OT as well (a basic one), I have the feeling
that the _general_ problem is hard, but if you stick to trying to solve _your
specific_ problem, things become much simpler. As an example, OT operations can
be either generic, such as "update a property of an object" (and then good luck
solving conflicts), or very qualified with a business context, such as
"transfer money from account a to b", in which case the server-side code with
which you synchronize will be much better able to resolve any issue case by
case.
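
The difference between a generic and a business-qualified operation can be sketched as follows. `SetBalance`, `Transfer`, and `apply` are hypothetical names made up for illustration, and the "server" is just a dict of balances.

```python
from dataclasses import dataclass


@dataclass
class SetBalance:
    """Generic operation: the server only knows that a field changed."""
    account: str
    value: int


@dataclass
class Transfer:
    """Domain operation: the server knows the intent and can validate it."""
    source: str
    target: str
    amount: int


def apply(balances, op):
    """Apply one synced operation to server state; return True if accepted."""
    if isinstance(op, SetBalance):
        # No business context: the later write simply overwrites the field.
        balances[op.account] = op.value
        return True
    if isinstance(op, Transfer):
        # Business rule checked against the server's current state.
        if balances[op.source] < op.amount:
            return False
        balances[op.source] -= op.amount
        balances[op.target] += op.amount
        return True
    raise TypeError(f"unknown operation: {op!r}")


# Two offline clients each queued "move 80 from a to b" while a held 100.
balances = {"a": 100, "b": 0}
results = [apply(balances, Transfer("a", "b", 80)),
           apply(balances, Transfer("a", "b", 80))]
```

Here the second transfer is rejected because the server can see that the account no longer covers it. Had both clients synced `SetBalance` writes computed from their stale local copies, the server would have had no way to notice the conflict.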

~~~
astigsen
(Alexander from Realm here) Yes, where this approach fits is very dependent on
the use case. In our experience the key is to have a data model that is
flexible enough to express many different use cases. There might be cases
where updating a single field is enough, or you might need it to work as a
counter, or maybe your data will be better modeled by ordered insertions into
a list.

Having the flexibility to use the exact combination of data types that fits
your specific requirements and allows you to express your intent in a way that
makes it clear how you want the conflicts resolved is essential to make it
work.

You can find a bit more info about our approach to conflict resolution here:
[https://realm.io/docs/realm-object-server/#conflict-
resoluti...](https://realm.io/docs/realm-object-server/#conflict-resolution)
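
As a rough illustration of why the choice of data type matters, here is a sketch of two merge rules for the same concurrent edits. The function names are hypothetical and this is not Realm's implementation. Ordered list insertion is left out to keep it short.

```python
def merge_register(a, b):
    """Last-write-wins field: each side is (timestamp, value); keep the newer."""
    return max(a, b, key=lambda side: side[0])


def merge_counter(base, a, b):
    """Counter: keep both sides' increments relative to the shared base."""
    return base + (a - base) + (b - base)


# Two devices start from likes == 10 while offline.
# Device A adds one like at t=1, device B adds two likes at t=2.
as_register = merge_register((1, 11), (2, 12))   # (2, 12): A's like is lost
as_counter = merge_counter(10, 11, 12)           # 13: both edits survive
```

The same pair of writes gives 12 when the field is modeled as a plain value and 13 when it is modeled as a counter, so the data type is what carries the intent.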

~~~
btown
That design doc seems to mix loose ideas with the current state of the system.
For instance, are strings last-write-wins currently, or do they have
character-level OT? Congrats on the release - I'm a huge fan of realtime, and
I'd like nothing more than to build apps that depend on such a system - but
seeing documentation with those kinds of holes gives a really bad first
impression.

~~~
bigfish24
Sorry the docs are not clear. Right now you can only apply set operations on
the entire string. Internally in the database core, we support substring
operations, but we haven't exposed them yet mainly because we need to add the
complement to listen for these changes. We have a working collaborative text
editor demo/prototype so hope to have more to say soon!

------
tlarkworthy
In Firebase we aim for causal consistency, which is between eventual and
strong consistency. Remote clients obviously don't see the same global
ordering of events, they see their local mutations before peers (0 latency
local writes). However, they do see remote mutations in the same order they
occurred. So you never see "foo left the room" "foo said x", the remote
mutations are always in the correct temporal ordering.
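
One textbook way to get that ordering guarantee is causal delivery with vector clocks. The sketch below is generic and makes no claim about how Firebase does it. `CausalInbox` and its fields are hypothetical names.

```python
class CausalInbox:
    """Buffers remote messages until everything they depend on was delivered."""

    def __init__(self, peers):
        self.seen = {p: 0 for p in peers}   # messages delivered per sender
        self.pending = []                   # buffered (sender, clock, payload)
        self.delivered = []                 # payloads in delivery order

    def _ready(self, sender, clock):
        # Must be the next message from this sender, and must not depend on
        # any message from other peers that has not been delivered yet.
        return (clock[sender] == self.seen[sender] + 1 and
                all(clock[p] <= self.seen[p] for p in clock if p != sender))

    def receive(self, sender, clock, payload):
        self.pending.append((sender, clock, payload))
        progress = True
        while progress:                     # one delivery may unblock others
            progress = False
            for msg in list(self.pending):
                s, c, body = msg
                if self._ready(s, c):
                    self.pending.remove(msg)
                    self.seen[s] += 1
                    self.delivered.append(body)
                    progress = True


inbox = CausalInbox(["foo", "bar"])
# The network reorders foo's two messages: the later one arrives first.
inbox.receive("foo", {"foo": 2, "bar": 0}, "foo left the room")
held_back = list(inbox.delivered)           # [] because message 1 is missing
inbox.receive("foo", {"foo": 1, "bar": 0}, "foo said x")
```

After the second call, `inbox.delivered` is `["foo said x", "foo left the room"]`, so the reader never sees the departure before the message.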

~~~
ex3ndr
How do you solve the problem that when a client is offline for too long, the
queue of events becomes too long? For example, a user can change their name
hundreds of times (just an example), and you will have to either deliver all
the changes or combine them on the server side, which can be much harder as
you will need to read too many events at once.

~~~
tlarkworthy
Write coalescing doesn't work with our security model, so we don't. The client
has to replay all the writes in disk cache when it comes online. The writes
are pipelined fairly efficiently, but I acknowledge that for some workloads it
might not be ideal. But most apps do not require rapid streams of writes
while offline.
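
Replaying queued writes without coalescing can be sketched like this. It is a toy model under stated assumptions, not the Firebase client: `OfflineClient` is a hypothetical class, the "server" is a plain dict, and the disk cache is an in-memory deque.

```python
from collections import deque


class OfflineClient:
    def __init__(self, server):
        self.server = server      # hypothetical stand-in for the backend
        self.local = {}           # local view, updated immediately
        self.queue = deque()      # writes not yet sent to the server
        self.online = False

    def write(self, key, value):
        self.local[key] = value   # zero-latency local write
        self.queue.append((key, value))
        if self.online:
            self.flush()

    def flush(self):
        """Replay every queued write in order; return how many were sent."""
        sent = 0
        while self.queue:
            key, value = self.queue.popleft()
            self.server[key] = value
            sent += 1
        return sent

    def go_online(self):
        self.online = True
        return self.flush()


server = {}
client = OfflineClient(server)
for name in ["Ann", "Anne", "Anna"]:
    client.write("name", name)    # three renames while offline
replayed = client.go_online()     # all three are sent, not just the last
```

A coalescing client would send only the final `"Anna"`. This one sends all three so the server can check each write on its own, at the cost of a longer replay.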

------
smnplk
I found two great explanations of the CAP theorem on YouTube.

[https://www.youtube.com/watch?v=gLtO0vY_M78](https://www.youtube.com/watch?v=gLtO0vY_M78)

[https://www.youtube.com/watch?v=Jw1iFr4v58M](https://www.youtube.com/watch?v=Jw1iFr4v58M)

And now I hope my understanding is correct:

The way I see it, the most important thing to understand first is what the
word partition means in the context of distributed systems. If the system is
partitioned (being in partitioned state), that just means that some nodes can
not talk to each other. There is an outage of some sort and when building
distributed systems, you always have to expect that, so the P (partition-
tolerant) from CAP is always there, you always have to choose it. After that,
you just have to decide if you will be more C (consistent) or more A
(available). If you pick C, then this means some nodes will have to wait until
the system is back in connected state to get the latest state, but if you pick
A, then you always get the answer, the last one the node had before it got
disconnected.

~~~
simonask
That's correct. I also believe you could in theory forgo partition tolerance,
it just wouldn't be a very useful system. Maybe there are use cases out there
for it.

~~~
smnplk
Yeah, but I think the point of distributed systems is to be partition
tolerant; the system that you are describing is probably not a distributed
system by definition. Or is it? I don't know; I plan to read a book on
DS this year.

From wiki, it looks like there are 3 main pillars to DS [1]:

- concurrency of components

- lack of global clock

- independent failure of components

So you need the failure tolerance part.

[1]
[https://en.wikipedia.org/wiki/Distributed_computing](https://en.wikipedia.org/wiki/Distributed_computing)

------
mjewkes
The C# documentation is quite clear:
[https://realm.io/docs/xamarin/latest/](https://realm.io/docs/xamarin/latest/)

    
    
      var puppies = realm.All<Dog>().Where(d => d.Age < 2);
    

The LINQ integration is appreciated.

    
    
      // Update and persist objects with a thread-safe transaction
      realm.Write(() => 
      {
          realm.Add(new Dog { Name = "Rex", Age = 1 });
      });
    
      // Queries are updated in realtime
      puppies.Count(); // => 1
    

I'm assuming that "updated in realtime" means "realm.Write is a blocking
operation", correct?

~~~
brmunk
Absolutely right. Only one writer at a time (locally), but writers don't block
readers, of which you can have multiple, thanks to the MVCC design.
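
The single-writer, non-blocking-reader behaviour can be sketched with a copy-on-write store. This is a simplified illustration of the MVCC idea, not Realm's storage engine. `SnapshotStore` and its methods are hypothetical names.

```python
import threading
from types import MappingProxyType


class SnapshotStore:
    def __init__(self):
        self._write_lock = threading.Lock()       # one writer at a time
        self._current = MappingProxyType({})      # latest committed version

    def snapshot(self):
        """Readers take the current version without acquiring any lock."""
        return self._current

    def write(self, mutate):
        """Run mutate(draft) on a private copy, then publish it atomically."""
        with self._write_lock:
            draft = dict(self._current)
            mutate(draft)
            self._current = MappingProxyType(draft)


store = SnapshotStore()
store.write(lambda dogs: dogs.update(Rex=1))
before = store.snapshot()                         # reader pins this version
store.write(lambda dogs: dogs.update(Fido=3))     # does not disturb `before`
after = store.snapshot()
```

`before` still shows only Rex while `after` shows both dogs, so a reader holding an old snapshot is never blocked or changed underneath by a later write.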

------
erichocean
We shipped an app last year with Realm and are in the process of migrating off
of it—it just wasn't reliable enough. We regularly saw crashes deep in their
library with highly concurrent usage. Extremely frustrating.

Adding something complex like OT on top of Realm's existing foundation would
make me even less likely to use it in the future.

That said, it's nice to see them tackling the problem and I wish them the
best. Obviously, bugs can be fixed given enough time and effort and in
principle, I like their product.

~~~
brmunk
Really sorry to hear you had a bad experience! I assume you created an issue
for it? If not I would love to hear more about it (bm at realm dot io), so we
can get it fixed and hopefully get you back as a happy camper again at some
point.

Obviously one of the main points of building the new sync solution (or any
library really) is to give you much more time for building differentiating
features rather than building basics yourself. Now granted there are other
solutions for the pure local persistence, but not many - if any - that will do
what the new Realm Mobile Platform will for solving sync.

But obviously the basics must be right and solid - so hope you will provide us
with sufficient info so we can get any of your issues resolved.

------
jonathan_mace
Similar work from the research world: Diamond: Automating Data Management and
Storage for Wide-Area, Reactive Applications

[https://www.usenix.org/conference/osdi16/technical-
sessions/...](https://www.usenix.org/conference/osdi16/technical-
sessions/presentation/zhang-irene)

------
rjdlee
Does anyone know how this compares to Parse (I'm new to app development)?

~~~
brmunk
Being from Realm, I might not be the right one to give you a balanced view,
but I can at least try a bit:

One huge difference is that Parse is no longer developed by a company after
Facebook abandoned it. It's been open sourced and its future is up to the
community.

Another is very related to this article. Realm resolves all your conflicts
automatically, so the developer doesn't have to. That's even done at a
fine-grained level, per property.

Realm's focus on being an offline first database means that your data is always
available locally.

Then there are subjective differences. Until the launch of the Realm Mobile
Platform, Realm has not had a solution for syncing to a backend. We have a lot
of users who have chosen to use Realm locally and Parse as the backend. I
guess that's a testament to the local database features from Realm. But with
the new syncing solution, that's no longer needed, and all the networking code
can be forgotten.

I'm sure others can come up with pros for Parse, but if you are not already
invested in Parse, I think most would recommend looking elsewhere due to the
first point.

------
tezza
The text is very clear and easy to follow; however, maybe a few concrete worked
examples could help.

~~~
bigfish24
Thanks for the feedback! Agree it can be a little hard to visualize with just
the text alone. We were planning on follow up posts, so we will make sure to
include diagrams.

