Hacker News new | past | comments | ask | show | jobs | submit login
Chronofold: A data structure for versioned text (2020) (arxiv.org)
207 points by tempodox 47 days ago | hide | past | favorite | 23 comments

This is from April 2020.

You can find a few implementations (1, 2) to play with now if you’re interested…

I found the video series [3] in the latter particularly interesting in trying to understand it, if you’re game to sit through a few hours of live stream.

[1] - (rust) https://github.com/dkellner/chronofold

[2] - (js) https://github.com/jmatsushita/purescript-chronofold

[3] - https://m.youtube.com/watch?v=gg5q_V7tBjE&pp=sAQA

unfortunately, [1] has incorrect behavior on some inputs. The hard part of the algo - integration - is implemented incorrectly there :(

and for [2] it is still an "open question", as per their Readme.

then would you mind opening an issue in the first repository so that it can have a chance to be fixed?

The repo seems to be abandoned. Author of chronofold (@gritzko) pointed me to another impl ( https://github.com/decentralized-hse/collab-edit ) but not checked it, yet...

Edit: I’m apparently wrong on the “abandoned” part. The author of this repo wants to maintain it and accepts bugs and issues on github, https://news.ycombinator.com/item?id=28025692

that's wonderful news! I'm hoping to be able to use this data structure in the near future using rust, so this helps a lot. thank you so much

how about opening an issue instead of complaining?

Holy moly your bar for "complaining" is low. It'd be like if I were hiking in the woods, overheard someone talking about crossing a bridge, and said "careful, that bridge isn't very sturdy!" and you respond "How about contacting your local ranger station, submitting a form BQ-447r with your real name and contact info, including a structural analysis of exactly where you see the issue, and sitting through multiple council meetings about the issue, instead of complaining!?"

GP's not really complaining, they're just warning people. And the reality is that submitting an issue can take a lot of time. I don't even bother submitting bug reports to Microsoft anymore because their burden for reproducibility is so high that they're basically asking me to reverse-engineer their software to find their bug for them.

what is GP?

GP is not the parent (P) reply, but the grandparent reply, i.e. the one written by "omgtehlion".

When you write a reply, P would be the one you're replying to. GP is the reply the person you reply to was replying to

So, this is essentially cache invalidation, mixed with relativity/causality... the hardest aspects of two different fields, all rolled into one data structure/algorithm.

I found it interesting, and something I will have to read a few more times before all the details make sense.

The idea of collapsing the structure when individual changes are no longer needed is appealing in terms of space saving.

I look forward to seeing this implemented.

> this is essentially cache invalidation, mixed with relativity/causality

In a sense these are quite closely related, the challenge with caches is when they go out of sync with their source data, as then reads/writes against the cache/source could appear to violate causality.

I suspect this is a follow-up to this recent article: https://news.ycombinator.com/item?id=28017204

I submitted it right after reading the paper from gritzko’s post but the story didn’t get any traction :( https://news.ycombinator.com/item?id=28020792

Here’s a presentation by Grischenko on Chronofold https://m.youtube.com/watch?v=dKzMZsg5EVA

Would appreciate anyone knowledgeable in the field who could provide a "layman's" breakdown of the relative merits of this approach vs OT and CRDTs. I'm afraid it's a bit above my head.

This is a CRDT.

I think the first page or so of the paper is very approachable, explaining well the history of these kinds of ideas and the advantages of Chronofold, did you try reading it?

I read the abstract and was a bit lost, I guess.

I took from it that this was a novel "locality" approach that was different than CRDT'S.

This is far from my area of expertise, so I guess 'approachable' is relative? Was hoping that (in the fine tradition of HN) someone who knows a lot more about this stuff could chime in.

I'd really like to learn some of this for a number of projects I'm working on, and I'm not sure where the best place to direct my focus is. ShareJS is OT based, iirc, and but quill js has some basic support for it too. CRDTs are a bit voodoo to me. This is something else?

Sorry if it's a remedial question, would appreciate any advice or insight.

Don’t feel bad for asking questions! IMHO that’s crucial for getting forward and gaining deeper knowledge.

If I could answer your question I would but I don’t know this domain well enough (^_^)

You say that like they have heard of the acronym CRDT before.

Don't let the curse of knowledge be an excuse to be rude to people.

They used the acronym CRDT and explicitly asked for comparisons to it.

Don’t let the curse of not reading be an excuse to be rude to people.

I like this one out of a series of similar papers: https://arxiv.org/abs/1905.01518

Maybe not light reading?, but accessible still.

CRDTs are fascinating data structures to me. One question: how would you handle something like cursor position with a DS like this? In the structure itself or a parallel structure with the site-id, insertion-point and expiration date?

Cursor positions are nothing but a counter. And counters are CRDTs by default, which means they are associative, idempotent and transitive. Best way is to have them as a separate structure pointing to an index.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact