I'm not familiar with CRDTs. Can you summarize how to "remember" a whole history of states with them, and why vector clocks are important? The introduction on Wikipedia reads as if they are useful for (geographically) distributed processing. I can't really relate or contrast this to monads.
That is what Consistency in the CAP theorem hints at.
The whole idea is that if you define things not only by their state but also by the order in which that state evolved, you can tell which state is the most recent and also go back in time. This is especially useful when someone arrives late, after you have updated the state, and you discover that their update should have happened before the most recent change.
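To make that concrete, here is a rough sketch in Python. The History class and its method names are made up for illustration, and it assumes a single integer timestamp that everyone agrees on. Real distributed systems cannot rely on a shared clock like that, which is exactly the gap logical and vector clocks are meant to fill.

```python
import bisect


class History:
    """Hypothetical store that keeps every (time, state) pair, ordered by time."""

    def __init__(self):
        self._times = []   # sorted timestamps
        self._states = []  # states, kept parallel to _times

    def record(self, time, state):
        # Insert at the position that keeps the history sorted,
        # so an update that arrives late still lands where it belongs.
        i = bisect.bisect_right(self._times, time)
        self._times.insert(i, time)
        self._states.insert(i, state)

    def latest(self):
        # The most recent state by timestamp, not by arrival order.
        return self._states[-1]

    def as_of(self, time):
        # "Go back in time": the state that was current at `time`.
        i = bisect.bisect_right(self._times, time)
        if i == 0:
            raise KeyError(f"no state recorded at or before {time}")
        return self._states[i - 1]


h = History()
h.record(10, "a")
h.record(20, "b")
h.record(15, "late")  # arrives last, but belongs between 10 and 20
```

After the late update, h.latest() is still the state recorded at 20, while h.as_of(17) returns the late one, because it was slotted in by its timestamp rather than by when it arrived.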
Forget the idea of data being hidden behind a procedural interface.
Another way to put it:
You have a Dog that is dirty at 1600. So you decide to clean him and put him in a bath. He is now clean at 1610. Now another person (thread? computer? no idea) comes along at 1620 with an order to clean the Dog, issued at 1550 because someone saw he was dirty. That person does not ask whether the dog is dirty or not. They got an order and they carry it out. You now have a dog that is being cleaned again.
From Alan Kay's point of view, there would not be a single dog with a single name, but a dog identified by his name and the moment you named him. So when the person acting on the 1550 sighting comes to clean the dog at 1620, after he is already clean, they would take dog1550 and not dog1620. They would clean a dirty dog and not a clean one.
They would be in another leg of the Trousers of Time.
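The dog story can be sketched in Python along these lines. DogVersion and clean are names I made up, and this is only one way to model "name plus moment": every version is immutable, and cleaning produces a new version that points back at the one it was applied to.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DogVersion:
    """Hypothetical immutable snapshot: a dog is its name plus a moment."""
    name: str
    time: int
    dirty: bool
    parent: Optional["DogVersion"] = None  # the version this one evolved from


def clean(version, at):
    # Cleaning never mutates a dog; it creates a new version
    # derived from the exact version the order referred to.
    return DogVersion(version.name, at, dirty=False, parent=version)


dog_1550 = DogVersion("Dog", 1550, dirty=True)
dog_1600 = DogVersion("Dog", 1600, dirty=True, parent=dog_1550)

# First cleaning: acts on the dog as seen at 1600.
dog_1610 = clean(dog_1600, at=1610)

# Late order, issued against the 1550 sighting but executed at 1620.
late_clean = clean(dog_1550, at=1620)
```

The late cleaning hangs off dog_1550, a dirty dog, and dog_1610 is left untouched on its own branch. Nobody ends up washing an already clean dog.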
A couple of papers: the seminal one by Leslie Lamport on logical clocks, which vector clocks build on:
And here is McCarthy's paper that Kay talks about: