Hacker News new | past | comments | ask | show | jobs | submit login

Syntax trees are lists and structs that have semantics.

Conflict-free means that there is no conflict in the semantics, not just that the operations on a syntax tree shape result in a valid syntax tree shape.

If we pass a syntax tree node with 12 children to two different servers which each add a node, we can easily resolve the structural conflict by adding both nodes, so now we have 14, and that may be a valid syntax tree.

However, that 12-child node could be, say, a call to a function that takes 12 arguments. (So 11 are being passed, and one is defaulted). Adding one argument is semantically valid. Adding two means the call is bad.




> Conflict-free means that there is no conflict in the semantics,

If you start with an incorrect premise you will wind up in a strange place. Conflict free has a specific meaning, and "semantics" can have multiple layers. If your threshold is perfect conflict free semantics at all levels, then you are indeed doomed. But a CRDT can provide a distributed, conflict-free canonical source substrate, with a specific semantic meaning at a minimally useful lowest semantic level. The challenge is to find transformations that are "good enough" to minimize conflicts at the higher semantic levels that you care about.


Practically, I don't think we're there yet. Yes, you can probably make a lisp-like that evals a currently existing crdt tree. The problem is, half the point (and the difficulty) of a crdt is constructing it so incomplete combinations of state updates kinda still make sense. I don't want a partial update causing the execution of gods knows what. And if we get real strictly transactional with the code tree updates, what's the point of the crdt in the first place?

I'm not sure these issues can't be overcome, but it's admittedly still an open problem.


> The problem is, half the point (and the difficulty) of a crdt is constructing it so incomplete combinations of state updates kinda still make sense.

The real problem is that we're trying to solve this problem at the wrong level of abstraction.

I think CRDTs can be reformulated as a subset of a larger replicated data type model that explicitly includes conflicts as part of the model, that any chosen set of transformations can be programmatically projected into a corresponding set of conflicts, that the choices of resolutions to those conflicts can be pre-packaged and composable, that some of those resolutions may need to be 'add a new transformation that choses between conflicts created by two other conflicting transformations' (basically exposing the problem up into the model of the data type), and that a composition of chosen resolutions can be proved to be fully conflict-free.

The biggest benefits of this approach are: 1. It enables modeling domains where conflicts are an unavoidable part of the domain itself. 2. It allows tuning an initially pessimistic model where all conflicts are exposed to the user by gradually adjusting the conflict resolution choices so that it becomes more conflict-free over time; especially by composing known conflict-free subsets of the model into larger conflict-free subsets.


Author here. This looks like a very useful perspective to have! i.e CRDTs with unresolved conflicts as part of the model with composable properties.

Can you talk a bit more on this stuff? Like with a basic example maybe? Thanks!


Thanks for considering! I've been watching the CRDT space for the past couple years and every article, talk, and thread makes me more convinced that we need a broader model and a layer of abstraction to make this work accessible to a larger pool of programmers. I think the comment above is the most precise version of this claim that I've made so far.

One nice example is from a prior thread where I aim at the same idea ( https://news.ycombinator.com/item?id=28717848#28724977 ):

> edit A sets an invoice status to paid with a payment of 100, edit B changes the invoice amount from 100 to 120

Here, the merged result should be a conflict state that needs additional attention to resolve. Maybe this happens once and the resolution is to call the customer to work it out, or maybe this happens a lot and there's a whole conflict resolution tree that applies based on the details. This kind of conflict is not possible to "eliminate by model"; its presence is integral to the problem domain itself.

But even if there is a conflict we should still be able to represent it in our model, and be able to apply one of a number of pre-packaged resolutions. For example: if the invoice amount changed due to adding a new item to the order, then the resolution could branch with a choice: 1. split the item into a separate order, 2. remove the item and notify the person that made the change to the order, 3. set the invoice status to partially-paid and notify the customer of the remaining balance and reason why it changed. If instead the price changed then the resolutions could branch differently: 1. backdate the order to before the price change, 2. give the customer a discount, 3. set the invoice to partially-paid and notify the customer of the remaining balance, etc.

Note that these choices are concerned with business processes and customer relationship which have no "right" answer and could change business to business, customer to customer, or even order to order. We can still get desirable CRDT-like high-level guarantees out of this system (like "every invoice is paid in full or billed for the remaining balance") while modeling conflicts in it. Another benefit: two customers could request different default conflict resolutions, maybe one wants to always split into a new order and the other always wants to bill the added balance; with the conflict included as part of the model interactions with both of these customers could be represented in the same system with different custom resolution strategies.

I'm not sure how to continue advancing this idea, or if I've missed something important that invalidates it. I have a feeling that this could be layered on top of existing CRDT work with a rich enough data model. My email is in my profile if you want to talk more. :)




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: