
Wow! I really, really like this because the goal appears to be to expose CRDTs in a developer-friendly way. I truly believe that well-designed CRDTs are the future for a lot of distributed systems problems, but they've been a bit hard to get into for people who just want to solve problems and not read 10 papers first.

The only thing I miss here is a clear spec of the CRDT itself so that people could implement compatible versions in other languages.

Conflict-free Replicated Data Type[0], for anyone else who didn't know what CRDT means.

[0]: https://en.wikipedia.org/wiki/Conflict-free_replicated_data_...

I've spent almost a decade working on CRDTs, largely from an industry perspective but also partly from the academic side.

They're great, and not too hard to understand if you take time to think about them.

Here are some animated/interactive cartoon explainers we've made to help teach the concepts:

- How a CRDT version of Operational Transformation (Google Docs) works: http://gun.js.org/explainers/school/class.html

- Why CRDTs are better than centralized alternatives like Paxos/Raft: http://gun.js.org/distributed/matters.html

Martin Kleppmann (author of Automerge) does outstanding work; especially check out: https://youtu.be/yCcWpzY8dIA?t=29m36s

I have a rather specific question, in case you know. What would be the best CRDT algorithm for editing a document consisting of (i) nested lists, (ii) strings, and (iii) some primitive values?

I'm looking at Kleppmann's JSON work. But it doesn't look like it handles collaborative editing of strings very well: they have to be treated either as an immutable value in a register, which doesn't allow for collaborative editing of the string, or as a list of characters, which would have to represent large edits (e.g. deleting a word) as a sequence of character edits, which sounds inefficient. Kleppmann's work also handles maps, which I don't need, but that's fine.

For context, I plan to make a tree/structure editor, and am considering representing the document as a CRDT.

(not an expert)

Treating the long string like a list of substrings should work. I guess the optimum substring length could be determined experimentally based on real-world interaction patterns.
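To make the granularity trade-off concrete, here is a rough sketch of the substring idea. It is not a CRDT by itself (there is no replica identity or merge logic), and `Chunk`, `split_into_chunks` and `delete_range` are hypothetical names made up for this illustration. It only shows that deleting a word touches a couple of list elements instead of one element per character.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Chunk:
    """One list element: a substring with a stable (hypothetical) id."""
    chunk_id: int
    text: str


def split_into_chunks(text: str, size: int) -> List[Chunk]:
    """Split text into consecutive substrings of at most `size` characters."""
    return [
        Chunk(i, text[start:start + size])
        for i, start in enumerate(range(0, len(text), size))
    ]


def delete_range(chunks: List[Chunk], start: int, end: int) -> Tuple[List[Chunk], List[int]]:
    """Delete characters [start, end); return the new chunks and the ids touched."""
    result: List[Chunk] = []
    touched: List[int] = []
    offset = 0  # position of the current chunk's first character in the full string
    for chunk in chunks:
        lo = max(start - offset, 0)
        hi = min(end - offset, len(chunk.text))
        if lo < hi:
            # The deleted range overlaps this chunk: rewrite only this element.
            touched.append(chunk.chunk_id)
            remaining = chunk.text[:lo] + chunk.text[hi:]
            if remaining:
                result.append(Chunk(chunk.chunk_id, remaining))
        else:
            result.append(chunk)
        offset += len(chunk.text)
    return result, touched


chunks = split_into_chunks("the quick brown fox", size=8)
new_chunks, touched = delete_range(chunks, 4, 10)  # delete "quick "
print("".join(c.text for c in new_chunks), touched)
```

With a chunk size of 8, deleting the six characters of "quick " rewrites two list elements rather than issuing six character deletions. Smaller chunks mean more elements per edit, larger chunks mean more concurrent edits landing on the same element, which is why the length would have to be tuned experimentally.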

Thanks, I didn't and wondered if it had anything to do with Computer-supported Collaborative Work or Computer-supported Collaborative Learning (CSCW and CSCL, respectively[0][1]).

Not really, obviously, but given that one of the big sub-topics of CSCW and CSCL is about designing for remote collaborative work, CRDT looks like a technology that naturally fits in there.

[0] https://en.wikipedia.org/wiki/Computer-supported_collaborati...

[1] https://en.wikipedia.org/wiki/Computer-supported_cooperative...

Yes, thanks!

Is a CRDT a good way to add sync to a classic invoice app that needs to track changes across multiple tables (invoice header, items, payments, shipments)?

I agree that this effort is in the right place. Having the opportunity to change Realtime networks like Pusher, PubNub and Firebase would be great for apps.

I'm surprised those apps haven't made a bigger effort to implement this logic for production. It would certainly make them unique and hard to leave.

I imagine it is based on the author's earlier paper? https://arxiv.org/abs/1608.03960

I haven't compared them though.

It’s not the same as the paper, for performance reasons. This is noted in the GitHub README. Martin said on Software Engineering Daily late last year that Automerge was three orders of magnitude faster than his published version of the JSON CRDT.

This is my second time highly recommending a Software Engineering Daily podcast with Martin Kleppmann on it. He is really smart and able to break down really complex topics in a way that's easy to understand. If you've never heard of CRDTs before, have a listen - https://softwareengineeringdaily.com/2017/12/08/decentralize...

Funny thing is, we have had an easy-to-understand CRDT data structure and ecosystem for years. The semantic web. If you ignore the closed-world stuff and only use open-world semantics, RDF becomes a GSet representing a graph.

A pretty neat workable representation. Trivial to implement.

I don't see how. Care to elaborate what is conflict-free and replicated in the semantic web?

Nothing is, he's just trolling.

I'm not, actually. Open-world semantics means that you can never assume that a fact that hasn't been explicitly given doesn't happen to be on some server that is just temporarily unreachable. If you combine this with a monotonic query language and inference engine, which never retracts facts, you get a GSet with pretty powerful operations. You can even arbitrarily cache facts without worrying about conflicts.
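A minimal sketch of that argument, assuming facts are plain (subject, predicate, object) tuples: a grow-only set whose merge is set union. The `GSet` class and its `query` method are hypothetical names for illustration, not any RDF library's API.

```python
from typing import FrozenSet, Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class GSet:
    """Grow-only set of triples: facts can be added but never retracted."""

    def __init__(self, items: Iterable[Triple] = ()) -> None:
        self._items: FrozenSet[Triple] = frozenset(items)

    def add(self, triple: Triple) -> "GSet":
        """Return a new replica state that also contains `triple`."""
        return GSet(self._items | {triple})

    def merge(self, other: "GSet") -> "GSet":
        """Merge is set union: commutative, associative and idempotent."""
        return GSet(self._items | other._items)

    def query(self, predicate: str) -> Set[Triple]:
        """A monotonic query: learning more facts can only add results."""
        return {t for t in self._items if t[1] == predicate}

    def __eq__(self, other: object) -> bool:
        return isinstance(other, GSet) and self._items == other._items

    def __hash__(self) -> int:
        return hash(self._items)


# Two servers that each hold part of the graph.
server_a = GSet().add(("alice", "knows", "bob"))
server_b = GSet().add(("bob", "knows", "carol"))

# Merge order does not matter, and re-merging cached facts changes nothing.
assert server_a.merge(server_b) == server_b.merge(server_a)
assert server_a.merge(server_a) == server_a

# Answers computed from a partial view stay valid after more facts arrive.
assert server_a.query("knows") <= server_a.merge(server_b).query("knows")
```

Because nothing is ever removed, any replica can cache and re-merge facts in any order and still converge to the same set. Supporting retraction would need a different CRDT than a plain grow-only set.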
