
Show HN: Operational transform for realtime collaborative editing in JS/Flow - c0da
http://cricklet.github.io/sites/blue/index.html
======
tmail21
The basic problem with OT (and current "real-time" collaborative editing
approaches) is that they can only achieve eventual consistency.

While this sounds great, eventual consistency DOES NOT mean semantic
consistency. This rules it out for many applications where semantic
correctness is important.

Even for simple text documents you can get eventually correct but semantically
incorrect results.

For example, consider the sentence

"The car run well"

This has an obvious grammatical error.

Now imagine two collaborative editors.

Editor 1: Fixes this to

"The car runs well"

Editor 2: Fixes this to

"The car will run well"

Depending on the specific ordering of character inserts and deletes this could
easily converge to

"The car will runs well"

Obviously this statement is both grammatically incorrect and semantically
ambiguous. (However, both editors see the same result, so it is eventually
consistent.) Worse, OT collaborative editing will silently do this and carry
on.
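
A minimal sketch of how that convergence can happen, assuming an insert-only operation format `{ pos, text }` and a simple position-shifting transform. The function names are illustrative and are not taken from blue-ot.js.

```javascript
// Insert-only OT sketch (hypothetical helpers, not the blue-ot.js API).
// An operation is { pos, text }: insert `text` at index `pos`.

function applyInsert(doc, op) {
  return doc.slice(0, op.pos) + op.text + doc.slice(op.pos);
}

// Rewrite `op` so it can be applied after `other` has already been applied.
// `otherWinsTies` decides who goes first when both inserts target one index.
function transformInsert(op, other, otherWinsTies) {
  const otherIsBefore =
    other.pos < op.pos || (other.pos === op.pos && otherWinsTies);
  return otherIsBefore
    ? { pos: op.pos + other.text.length, text: op.text }
    : op;
}

const base = "The car run well";
const edit1 = { pos: 11, text: "s" };     // Editor 1: "run" -> "runs"
const edit2 = { pos: 8, text: "will " };  // Editor 2: "run" -> "will run"

// Editor 1 applies its own edit, then editor 2's edit transformed against it.
const site1 = applyInsert(
  applyInsert(base, edit1),
  transformInsert(edit2, edit1, true)
);

// Editor 2 applies its own edit, then editor 1's edit transformed against it.
const site2 = applyInsert(
  applyInsert(base, edit2),
  transformInsert(edit1, edit2, false)
);

console.log(site1); // The car will runs well
console.log(site2); // The car will runs well
```

Both sites end up with the same string, which is all the transform guarantees; nothing in it knows that the two edits were alternative fixes for the same mistake.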

Now, for non-critical text where errors like this are ok, this may not be a
big problem. But imagine contracts or white papers, or trying to use this on
something like a spreadsheet where semantic correctness is critical and one
can see why the current scope of collaborative "real-time" editing is very
limited.

In general current "real-time" editing approaches like OT are outright
dangerous.

~~~
theaustinseven
I would expect that if an editor makes a change, their change should be
preserved. There is no general rule for deciding which edit to keep. In some
cases (like the one you presented) people are editing the exact same
sentence, but far more often people will not edit the same small part at the
same time (at least in the real world). This makes OT very practical:
eventual consistency can generally be reached quickly, and because there is
consistency, the results for given inputs are predictable.

~~~
tmail21
It is fairly trivial to construct an example where editors are editing
_different_ sentences and OT takes two locally semantically correct states and
converges to a semantically incorrect (but grammatically correct) state.

I think OT and other "real-time" collaborative editors are practical if you
are willing to (or your use case can) live with "silent semantic errors".

The greater the document "interconnectivity" (eg, paragraph A is semantically
related to paragraph C), the greater the likelihood of having far-flung silent
semantic errors.

For documents like spreadsheets this is very obvious because you start getting
nonsensical results and (hopefully) errors very quickly. For Word-like
documents, the errors are "silent" and thus much more insidious.

My point was that this is an aspect of OT that many users don't realize.

With regard to predictability, I would not call the results of OT predictable
from a user's perspective. It is predictable only in the narrow sense that,
given the order in which operations arrive AT THE SERVER, the result is
determined.

However, it is impossible for a user to predict how their local operations
will interleave at the server with other users' local operations. For all
practical purposes the converged result is unpredictable from the user's
perspective.
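
A small sketch of that dependence on arrival order, assuming a server that commits concurrent inserts in the order it receives them and shifts later arrivals past earlier ones. The `serverResult` helper and the `{ pos, text }` operation shape are assumptions made for illustration.

```javascript
// Hypothetical server-side sketch: every op was made against the same base
// document, and the server commits them in arrival order.

function insertAt(doc, op) {
  return doc.slice(0, op.pos) + op.text + doc.slice(op.pos);
}

// Shift `op` right if an already committed op landed at or before its index.
function transformAgainstEarlier(op, earlier) {
  return earlier.pos <= op.pos
    ? { pos: op.pos + earlier.text.length, text: op.text }
    : op;
}

function serverResult(base, arrivals) {
  let doc = base;
  const committed = [];
  for (const op of arrivals) {
    let transformed = op;
    for (const earlier of committed) {
      transformed = transformAgainstEarlier(transformed, earlier);
    }
    committed.push(transformed);
    doc = insertAt(doc, transformed);
  }
  return doc;
}

const base = "The car";
const alice = { pos: 4, text: "red " };
const bob = { pos: 4, text: "old " };

console.log(serverResult(base, [alice, bob])); // The red old car
console.log(serverResult(base, [bob, alice])); // The old red car
```

Each arrival order gives a single, fully determined document, but neither user can know in advance which order the network will produce.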

The only property which one can confidently assert with OT is eventual
consistency.

~~~
theaustinseven
Yeah, I guess I see what you are trying to say. I just want to clarify: when
I say predictable, I mean that given a set of operations, the results will be
the same no matter the order they come in. This makes OT powerful in that
everyone just needs to receive the operations eventually in order to have a
consistent document. The only middle ground I can see that would allow
predictability in the document and help mitigate these silent errors would be
to notify users when they have both edited the same range before consistency
was reached. This would catch almost any case that I think you are talking
about, although it would of course miss situations in which semantic errors
arise from changes in very different parts of the document (e.g. referencing
a figure 2.1 while someone changes that figure to 2.2). But those errors can
just as easily arise with a single editor, so they are not really unique to
OT. I do think that it would be nice to have a solution to that problem
though...
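
The notification idea could be sketched roughly as follows, assuming each concurrent edit is summarized by the range of the shared base document it touched. The `slack` parameter (how far apart two edits may be and still count as "the same range") and all names here are assumptions, not part of any existing OT library.

```javascript
// Hypothetical conflict detector. An edit is { author, start, end }, where
// start/end are indices into the base document both users started from.

// True when the two ranges overlap or lie within `slack` characters.
function editsAreClose(a, b, slack = 0) {
  return a.start <= b.end + slack && b.start <= a.end + slack;
}

// Return the author pairs whose concurrent edits should trigger a warning.
function findPairsToFlag(edits, slack) {
  const pairs = [];
  for (let i = 0; i < edits.length; i++) {
    for (let j = i + 1; j < edits.length; j++) {
      const sameAuthor = edits[i].author === edits[j].author;
      if (!sameAuthor && editsAreClose(edits[i], edits[j], slack)) {
        pairs.push([edits[i].author, edits[j].author]);
      }
    }
  }
  return pairs;
}

const concurrentEdits = [
  { author: "editor1", start: 11, end: 11 },   // inserted "s" after "run"
  { author: "editor2", start: 8, end: 8 },     // inserted "will " before "run"
  { author: "editor3", start: 400, end: 420 }, // rewrote a distant paragraph
];

console.log(findPairsToFlag(concurrentEdits, 20)); // [ [ 'editor1', 'editor2' ] ]
console.log(findPairsToFlag(concurrentEdits, 0));  // []
```

With no slack, the two edits from the "The car run well" example go unflagged because they touch different indices, so the choice of window matters. As noted above, no window catches a far-away dependency such as a renumbered figure.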

------
juliendorra
Great! I like the format of the post a lot: demo, code example and super clear
explanation all combined in a good starting point on the subject.

(You could also have cited Etherpad as a common implementation in addition to
Docs. Etherpad was a direct predecessor to character-by-character OT in Docs;
the team was acqui-hired, and Etherpad is still widely used by many
organizations. But then there are so many examples and libs, I understand
that you wanted to just give context!)

------
Leftium
Wow! Front-end only is actually a great advantage for me. I actually tried to
modify [ShareDB] to be front-end only[1]. (ShareDB uses WebSockets or any
other full duplex stream; it's a great reference if you want to implement true
client/server.)

I guess my use case is quite unique: [Todo.taskpaper] needs to sync multiple
"views" of a single document in the same web app. Right now it uses very naive
syncing; I'm going to try to upgrade it to blue-ot.js.

[ShareDB]:
[https://github.com/share/sharedb](https://github.com/share/sharedb)

[1]:
[http://stackoverflow.com/q/40616650/117030](http://stackoverflow.com/q/40616650/117030)

[Todo.taskpaper]:
[https://todo-taskpaper.leftium.com](https://todo-taskpaper.leftium.com)

------
theaustinseven
This is cool! I've been working on something similar in Go, but I haven't
spent much time on it recently. Is this a purely front-end application, or is
there a server associated with it?

~~~
c0da
Thanks! My implementation is currently front-end only.

Here's the code that simulates all the client/server communication:
[https://github.com/cricklet/blue-ot.js/blob/master/js/ot/orc...](https://github.com/cricklet/blue-ot.js/blob/master/js/ot/orchestrator.js#L533)

It shouldn't be too hard to take that and put it in an actual client/server
architecture. The client needs to have a way to send local operations to the
server (this can just be an endpoint on the server) and the server needs a way
to broadcast operations to all clients (probably WebRTC?).
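
A rough in-memory sketch of those two pieces, assuming a hypothetical `OperationRelay` class: `submit` stands in for the endpoint clients post to, and `subscribe` stands in for whatever push channel carries the broadcast. No real transport is shown, and the OT transform step is left out.

```javascript
// Hypothetical in-memory stand-in for the server side. It only stamps each
// operation with a revision number and fans it out to every subscriber.
class OperationRelay {
  constructor() {
    this.revision = 0;
    this.listeners = new Set();
    this.log = [];
  }

  // Register a client callback; returns a function that unsubscribes it.
  subscribe(listener) {
    this.listeners.add(listener);
    return () => this.listeners.delete(listener);
  }

  // Accept a local operation from a client and broadcast it to everyone.
  submit(clientId, op) {
    this.revision += 1;
    const message = { revision: this.revision, clientId, op };
    this.log.push(message);
    for (const listener of this.listeners) listener(message);
    return message.revision;
  }
}

const relay = new OperationRelay();
const seenByA = [];
const seenByB = [];
relay.subscribe((message) => seenByA.push(message));
const unsubscribeB = relay.subscribe((message) => seenByB.push(message));

relay.submit("A", { pos: 11, text: "s" });
unsubscribeB(); // client B disconnects
relay.submit("B", { pos: 8, text: "will " });

console.log(seenByA.map((m) => m.revision)); // [ 1, 2 ]
console.log(seenByB.map((m) => m.revision)); // [ 1 ]
```

The revision stamp gives clients a shared ordering to transform against; the `log` would let a reconnecting client catch up on what it missed.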

~~~
grizzles
Hey Kenrick, I agree - WebRTC would be a good choice since it allows
out-of-order packets, which can speed things up quite a bit.
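
If operations can arrive out of order, the receiver still needs some way to apply them in sequence. A minimal sketch of one option, assuming each message carries a server-assigned `revision` number; the `ReorderBuffer` class is hypothetical.

```javascript
// Hypothetical client-side buffer: releases messages strictly in revision
// order and holds back any that arrive early.
class ReorderBuffer {
  constructor(onDeliver, firstRevision = 1) {
    this.onDeliver = onDeliver;
    this.next = firstRevision;
    this.pending = new Map();
  }

  receive(message) {
    if (message.revision < this.next) return; // duplicate, already delivered
    this.pending.set(message.revision, message);
    // Drain every message that is now contiguous with what was delivered.
    while (this.pending.has(this.next)) {
      const ready = this.pending.get(this.next);
      this.pending.delete(this.next);
      this.next += 1;
      this.onDeliver(ready);
    }
  }
}

const delivered = [];
const buffer = new ReorderBuffer((message) => delivered.push(message.revision));

buffer.receive({ revision: 2 }); // early, held back
buffer.receive({ revision: 3 }); // early, held back
buffer.receive({ revision: 1 }); // unblocks 1, 2 and 3
buffer.receive({ revision: 1 }); // late duplicate, ignored

console.log(delivered); // [ 1, 2, 3 ]
```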

Did you see this article? It was posted on HN yesterday. It links to a gh
project that might be useful for your project.
[https://getkey.eu/blog/5862b0cf/webrtc:-the-future-of-web-ga...](https://getkey.eu/blog/5862b0cf/webrtc:-the-future-of-web-games)

