In fact I think patch theory may even be stronger than the algorithms described here. This link says "we need to impose a restriction on the client: we can never send more than one operation at a time to the server.", but patch theory can handle that case just fine and still "recover" later. The only thing you need is some sort of merge policy, and my plan was just to arbitrarily (but consistently) pick a side, because unlike a code merge where you could literally have kilobytes of conflict and discarding either side could be bad, in the collaborative case you're looking at a few small commands. (Even a large copy/paste is just one operation, from the human perspective. No matter how collaborative your editor, two people editing the same character is always going to be a problem; there's really no point trying to get too clever.)
If you are interested in this topic, you might want to read http://en.wikibooks.org/wiki/Understanding_darcs/Patch_theor... as well. I think patch theory may be stronger than what was described because it can use patch inverses to do a few things I'm not sure operational transformation" can, and coming up with inverses for the collaborative editing case is quite trivial. Note that darcs had a well-known exponential case, I think in the collaborative editor case it could never come up as everybody is always within a few seconds of "HEAD".
(Incidentally, the patch approach would also potentially allow offline interaction as well, though then you can't ignore the merge issue quite so thoroughly and you could conceivably have exponential merges again, never got far enough to tell. Sounds like this approach would not allow that. Does Wave allow offline interaction which it then propagates up? If so then it may be using something more like patch theory.)
Wave does allow offline interaction, and their OT algorithm is optimized for speed (for example, for the execution of possibly hundreds of buffered operations). The tradeoff is the requirement for an acknowledgement from the server after sending each operation or batch of operations.
I'm currently working on an OT implementation that doesn't require ACKs from the server in my spare time, and relies on a vector clock as opposed to a single timestamp. It has higher CPU processing power needs, which is why Wave's algorithm is also more suited to mobile devices
 http://portal.acm.org/citation.cfm?id=1718949 I believe one of the people involved in this paper is also involved in Wave's OT implementation paper.
Of course one of the issues is that a lot of stuff comes down to details. Working in a browser is a hell of a constraint, I wouldn't dream of writing a collaborative editor in a browser. (Not because it's impossible, but because that's an amount of stress I don't need in my life. Obviously it's possible.) It would be interesting to talk to the people involved about the issues, but I suppose that's almost always true.
(BTW, this is me interested in someone who has actually done the stuff I've only designed and curious about the exact details to a fairly deep level about the actual issues they encountered and stuff, as partially indicated in my "to the extent it exists at all" above. I am not trying to tell anyone what they should have done, I'm just chewing the fat as an interested party who stands to learn some interesting-to-me-things. I'm not in a position to tell implementors anything. As usual, tone can be hard to convey in text.)
OT is a dead end. The problems they struggle with could be trivially bruteforced by employing unique symbol identifiers.
What I will concede is that their OT implementation is probably not perfect yet. If you read the comments on some transformation functions on their Java code, they shown concern and do extra logging in hopes to track unexpected issues they haven't solved yet.
"The authors even had to enrich XML with additional entities named “annotations” to make it suitable for their needs"
This is what Etherpad did for adding support for lists, bold, italics, etc. Did anyone complain? Does it not work?
Granted, Wave's approach is way more ambitious. They support any sort of XML groves, and not just simple 1-level deep tags like Etherpad.
I do not attack annotations or XML per se. The general approach "neither thing is precisely OK, so let's use both" seems questionable to me.
I believe, XML is complex enough, so why do we need XXML?
Finally, by using annotations they recognize that XML nesting is a pain in the ass, but they did not solve the problem, just sort of relieved it.
Actually, I doubt a lot, whether XML is actually carrying some useful load in the Wave. But the last one is just a feeling.
It seems to solve the issues that concern you (validating the correctness of OT).
In my more evil moments I considered specifying the save file as an XML document that deliberately left off the final close tag. Mu-hu-ha-ha-ha!
Pretty much the same as that doc, yes, though I never saw it. (I did see and was inspired by the Darcs patch theory doc I linked.) My plan was for people to "share" documents, so a server that had a full copy of the document was not a problem, it did by definition. For that matter, in the Google Wave model, I'd rather run my own Wave server so having a full copy of the document there isn't a problem either. I, personally, actively don't want my cloud to have the canonical version of the document.
"You can't constantly be diffing and sending patches using the synch-diff approach - rather you have synchronization cycles that happen every few seconds."
Why not? Assuming just a handful of users (not hundreds), and given that patches have a certain flexibility about when they get applied, why not just send things as they occur? (Within the topologies shown, that is, not just everyone sending to anybody as they feel like it. You could probably get by with full P2P without too much work, but having the central server simplifies things a lot and I don't think you lose much. I was never planning on 100 people "in" the same document.) Clients should probably chunk things up a bit during periods of rapid typing. (I was planning on going to raw sockets if I needed to; most higher level protocols would have indeed choked this from what I saw.)
Anyway, it's pretty theoretical now. Wave is probably good enough and this wasn't even the focus of my project, just an incidental sideline to what I really wanted.
And I didn't just mean saving a copy of the document on the server - you need to do this for most collab. algos. The differential synchronization algorithm requires saving a copy of the document _per client_ on the server.
Otherwise - sounds cool - you should definitely try to implement this and see how it goes.
(Careful design of the patch commutation algorithms could further reduce the problems you face, this and "pasting in a huge swathe of text" being the two basic "huge" operations available to a user in realtime. Imagine one user highlights a huge swathe of text and deletes it while another is trying to add a character or two. Normally this could be a conflict, but depending on how you represent the delete, you might be able to make it so the delete simply eats all further edits to the deleted text. In fact in my system this was actually a major problem due to some other issues (in my system you could limit your visible scope down to, say, a specific "chapter" and then I have a UI problem if that entire chapter gets deleted), but this wouldn't affect Google Wave anywhere near as badly, unless it also has narrowing available.)
(...and the way they dealt with XML is a very special song. OT is pretty much a complexity explosion itself, but they also multiplied it by the complexity of XML; got 15 kinds of mutation operations and all the stuff. The fact they invented "annotations" to skip XML nesting speaks for itself.)
Conclusion. OT is a big messy mess.
They developed a lot of systems using OT, even going so far as doing collaborative autocad. There are papers on that if you search for CoMaya and CoAutoCAD. The complexity, especially of the last one, is mind boggling though.
Is Google's documentation crap?
There is certainly space for someone to document the specifics of the Wave Protocol (and, along with that, operational transforms) and this blog post goes a long way toward that.
Edit: I should also note that the OT white paper, like most white papers, builds on prior knowledge and expects some background in the area of research. You have to read a few white papers in order to really grok it.
(full disclosure: I helped edit these)