
Apple’s Notes.app uses CvRDTs, i.e. state-based CRDTs (I know this from figuring out the data format). For attributed text, they’re using something homegrown that they’re calling “topotext”. It’s somewhat similar to RGA/Causal Trees, but they’re maintaining a digraph (unnecessarily?) and I suspect the conflict resolution isn’t quite right (it doesn’t seem to order concurrent inserts newest first). But for the sake of discussion, it’s a list of characters with Lamport timestamps and a tombstone flag.

On top of this, each character has a separate Lamport timestamp for its attributes. This appears to act as an LWW register per character. (So a conflicting attribute mutation would pick one user’s state, and an insert that conflicts with bolding a range would not pick up the bold, but close enough?)
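A minimal Python sketch of the per-character model described above, to make the merge behavior concrete. This is not Apple's code or schema: the names `Char`, `Stamp`, `attr_stamp` and `merge_char` are made up for illustration, and breaking timestamp ties by replica id is an assumption (the usual way to get a total order out of Lamport clocks).

```python
from dataclasses import dataclass, field
from typing import Tuple

# Assumed stamp shape: (lamport clock, replica id). The replica id only breaks ties.
Stamp = Tuple[int, int]

@dataclass
class Char:
    id: Stamp                    # insertion timestamp, identifies the character
    value: str
    deleted: bool = False        # tombstone flag
    attr_stamp: Stamp = (0, 0)   # separate timestamp for the attribute register
    attrs: dict = field(default_factory=dict)

def merge_char(a: Char, b: Char) -> Char:
    """State-based merge of two replicas' copies of the same character."""
    assert a.id == b.id and a.value == b.value
    # LWW register: the attribute set with the larger stamp wins wholesale.
    winner = a if a.attr_stamp >= b.attr_stamp else b
    return Char(
        id=a.id,
        value=a.value,
        deleted=a.deleted or b.deleted,  # a tombstone, once set, stays set
        attr_stamp=winner.attr_stamp,
        attrs=dict(winner.attrs),
    )
```

Because the whole attribute set is replaced by the winner's, a concurrent bold and italic on the same character end up as one or the other, not both, which is the "pick one user's state" behavior mentioned above.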

(On disk, runs of increasing clock are stored as one node, with a length, and the resulting text and attribute runs are stored separately from the CRDT data.)
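The run encoding can be pictured roughly like this. It is only a sketch of the idea: `compress_runs` is a hypothetical helper, it looks at bare clock values, and it ignores which replica produced them, which the real format presumably has to track.

```python
def compress_runs(clocks):
    """Collapse consecutive clocks (n, n+1, n+2, ...) into (start, length) pairs."""
    runs = []
    for c in clocks:
        if runs and runs[-1][0] + runs[-1][1] == c:
            # c continues the previous run, so just extend its length.
            start, length = runs[-1]
            runs[-1] = (start, length + 1)
        else:
            runs.append((c, 1))
    return runs
```

Text typed in one sitting gets consecutive clocks, so a whole sentence collapses to a single node, and only edits made elsewhere or later split a run.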

A table is treated as an attachment in the main text (attributes on a placeholder character point to it). It is also encoded as a CRDT. (They compose CRDTs for maps, registers, etc. here.) It is modeled as an ordered set of row ids, an ordered set of column ids, and a map of column id -> row id -> topotext. This is needed to preserve the semantics of adding, removing, and reordering columns and rows in the face of conflicts.
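A rough sketch of that table shape, with plain Python lists and dicts standing in for the ordered-set and map CRDTs and plain strings standing in for topotext. The `Table` class and its methods are hypothetical. It only shows why keying cells by id rather than by position helps, and does not attempt the actual merge logic.

```python
from dataclasses import dataclass, field

@dataclass
class Table:
    rows: list = field(default_factory=list)     # stand-in for an ordered-set CRDT of row ids
    columns: list = field(default_factory=list)  # stand-in for an ordered-set CRDT of column ids
    cells: dict = field(default_factory=dict)    # column id -> row id -> cell text

    def set_cell(self, col, row, text):
        self.cells.setdefault(col, {})[row] = text

    def render(self):
        """Grid in current row/column order; cells of removed ids are simply not shown."""
        return [[self.cells.get(c, {}).get(r, "") for c in self.columns]
                for r in self.rows]
```

Since a cell is addressed by (column id, row id), one user can reorder columns while another edits a cell, and the edit still lands in the right place after merging.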

Drawings are also stored as a CRDT.

(Sorry if this is terse or confusing, I’m typing on mobile.)




Fascinating! I've always wondered about this. Out of curiosity, how did you work this out (and why)?


Initially I wanted to figure out how to export my notes. I don't like to use software if I can't export my data. Then I saw strings in the table data mentioning CRDT and was curious. (I like to take things apart and see how they work.)

The data is protobuf encoded, gzipped, and stored in a SQLite database. I wrote some Python code to help me work out the protobuf schema. Then I observed how the data changed as I tweaked the notes, which let me assign names to the fields. (After I figured it out, I learned that the protobuf schema was being sent to the client in the web app, but it was a good exercise anyway.)
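For anyone wanting to try the same kind of exploration, here is a small sketch of schema-less protobuf decoding in Python. It is not the code mentioned above, just the general technique: gunzip the blob, then walk the protobuf wire format to list field numbers, wire types, and raw values. The function names are hypothetical, and pulling the blob out of the database is left out because the table and column names aren't given here.

```python
import gzip

def read_varint(buf, pos):
    """Decode one base-128 varint starting at pos; return (value, new_pos)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7

def decode_fields(buf):
    """Yield (field_number, wire_type, value) for one protobuf message."""
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        number, wire_type = key >> 3, key & 7
        if wire_type == 0:      # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:    # fixed 64-bit
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:    # length-delimited: bytes, string, or nested message
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:    # fixed 32-bit
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        yield number, wire_type, value

def decode_blob(blob):
    """Gunzip a stored blob and list its top-level protobuf fields."""
    return list(decode_fields(gzip.decompress(blob)))
```

Length-delimited values can be fed back into `decode_fields` to see whether they parse as nested messages. Diffing the output before and after a small edit to a note is then one way to guess what each field number means.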

To fill in a couple of details, I took a look at some structures via class-dump and the disassembled code in Hopper (e.g. the point representation for the drawings).


That's very interesting. Thanks!



