
Braid: Synchronization for HTTP - tobr
https://braid.news/
======
BiteCode_dev
Reading the RFC, this seems to make HTTP stateful:

"A subscription is different from a GET connection (e.g. a TCP connection, or
HTTP/2 stream). If a client requests "Subscribe: keep-alive", then the
subscription will be remembered even after the GET connection closes."

It also uses PUT to patch things, but then why not use PATCH ? You could say
it's non standard, but at this point, everybody uses it. Plus their stuff
apparently deviates from the HTTP standard as well by allowing headers after
the first line of the payload and introducing the forGET command.

Or am I missing something ?

Interesting anyway, especially the way they create a graph of unique IDs to
solve the ordering problem. Feels like Git. The flow of patches reminds me of
react-redux.
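
A rough illustration of the ordering idea, not code from the Braid draft: if every version carries a unique ID and names its parents, any peer can rebuild an order in which parents always come before their children. The function name `linearize`, the dict layout, and the alphabetical tie-break between concurrent versions are all assumptions made for this sketch.

```python
import heapq


def linearize(parents):
    """Order version IDs so that every parent precedes its children.

    `parents` maps a version ID to the list of its parent IDs (hypothetical
    layout). Concurrent versions are ordered alphabetically by ID so that
    every peer computes the same sequence.
    """
    children = {version: [] for version in parents}
    missing = {version: len(ps) for version, ps in parents.items()}
    for version, ps in parents.items():
        for parent in ps:
            if parent not in parents:
                raise ValueError(f"unknown parent {parent!r}")
            children[parent].append(version)

    # Versions with no unprocessed parents are ready to be emitted.
    ready = [version for version, count in missing.items() if count == 0]
    heapq.heapify(ready)
    order = []
    while ready:
        version = heapq.heappop(ready)
        order.append(version)
        for child in children[version]:
            missing[child] -= 1
            if missing[child] == 0:
                heapq.heappush(ready, child)

    if len(order) != len(parents):
        raise ValueError("version graph contains a cycle")
    return order


# Two concurrent edits "b" and "c" on top of "a", later merged by "d".
history = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
print(linearize(history))
```

Like Git commits, "b" and "c" have no order relative to each other; only the agreed tie-break makes the result identical everywhere.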

Plus it's great to have more geeky technical things on HN. I miss that
among all the news and startup things.

They also mention a really cool RFC on a JSON patch format:
[https://datatracker.ietf.org/doc/html/rfc6902](https://datatracker.ietf.org/doc/html/rfc6902)
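
For readers who have not seen RFC 6902, here is a minimal sketch of what applying a JSON Patch looks like. It handles only the `add`, `replace`, and `remove` operations on non-root paths and skips most of the RFC's error checks; the helper names `_resolve` and `apply_json_patch` are made up for this example and do not come from any library.

```python
import copy


def _resolve(doc, pointer):
    """Walk a JSON Pointer and return (parent container, final token)."""
    if not pointer.startswith("/"):
        raise ValueError("this sketch only supports non-root pointers")
    # JSON Pointer escapes: "~1" means "/", "~0" means "~".
    tokens = [t.replace("~1", "/").replace("~0", "~")
              for t in pointer[1:].split("/")]
    node = doc
    for token in tokens[:-1]:
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node, tokens[-1]


def apply_json_patch(doc, operations):
    """Return a patched copy of `doc`; the input is left untouched."""
    doc = copy.deepcopy(doc)
    for op in operations:
        parent, key = _resolve(doc, op["path"])
        kind = op["op"]
        if isinstance(parent, list):
            if kind == "add":
                # "-" appends; a number inserts before that index.
                index = len(parent) if key == "-" else int(key)
                parent.insert(index, op["value"])
            elif kind == "replace":
                parent[int(key)] = op["value"]
            elif kind == "remove":
                del parent[int(key)]
            else:
                raise ValueError(f"unsupported op {kind!r}")
        else:
            if kind == "add":
                parent[key] = op["value"]
            elif kind == "replace":
                if key not in parent:
                    raise KeyError(key)
                parent[key] = op["value"]
            elif kind == "remove":
                del parent[key]
            else:
                raise ValueError(f"unsupported op {kind!r}")
    return doc


original = {"title": "Braid", "tags": ["http"]}
patch = [
    {"op": "replace", "path": "/title", "value": "Braid-HTTP"},
    {"op": "add", "path": "/tags/-", "value": "sync"},
    {"op": "remove", "path": "/tags/0"},
]
print(apply_json_patch(original, patch))
```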

I'm reading an RFC and getting excited about it. Damn, that must be what they
call growing old.

~~~
Karrot_Kream
I agree, this RFC seems to propose a great deal of different additions/changes
to the HTTP protocol without a unifying reason why. But the committee/process
is all in the open, through the IETF, so now is the time to voice your
concerns!

> Reading the RFC, this seems to make HTTP stateful

The reasoning behind this seems to be that the alternatives right now are
either polling or a homegrown websocket-based protocol. The authors argue that,
rather than continuing to let different implementors build bespoke or otherwise
non-standard polling/subscription-based websocket implementations, we should
standardize on a RESTful way to subscribe to a stream of updates.

> It also uses PUT to patch things, but then why not use PATCH ? You could say
> it's non standard, but at this point, everybody uses it. Plus their stuff
> apparently deviates from the HTTP standard as well by allowing headers after
> the first line of the payload and introducing the forGET command.

I've personally seen both PUT and PATCH used for updates, so this didn't seem
particularly odd to me, and seemed more like a nod to one particular method of
updates.

But yes, the overall RFC is proposing to modify HTTP from a state transfer
protocol to a state synchronization protocol, so that clients and servers have
to hold onto state and send diffs between their internal states, and then
implement some reconciliation algorithm (e.g. OTs or CRDTs) to merge state.
This seems like it is in response to a myriad of bespoke long-polling or
websocket based update mechanisms.

> They also mention a really cool RFC on a JSON patch format:
> [https://datatracker.ietf.org/doc/html/rfc6902](https://datatracker.ietf.org/doc/html/rfc6902)

Yup! Super cool right?

> I'm reading RFC and getting excited about it. Damn, that must be what they
> call growing old.

Well, you could do that, or you could read yet another article about startup
financials, which is all that seems to trend on HN these days ;)

~~~
toomim
> this RFC seems to propose a great deal of different additions/changes to the
> HTTP protocol without a unifying reason why

To read more of the unifying _why_, I suggest checking out the original Braid
draft from last July:
[https://datatracker.ietf.org/doc/html/draft-toomim-braid-00](https://datatracker.ietf.org/doc/html/draft-toomim-braid-00).
We had a much longer introduction in that version, but shortened it in
braid-http-01 so that we could cut to the meat.

If the _why_ still isn't clear after reading braid-00, please let us know on
the mailing list:
[https://groups.google.com/forum/#!forum/braid-http](https://groups.google.com/forum/#!forum/braid-http).
And on the other hand, if something in braid-00 helped, we'd also love to hear
_what_, so that we can add that back into the braid-http-01 draft.

~~~
BiteCode_dev
I really like the RFC.

Two questions:

\- do you think it would make sense to allow the "Patches" header only in
PATCH and not PUT?

\- how do you feel about a generic "subscribe" mechanism that is not specific
to sync but can just say "I'm interested in this topic, with this params" ?
Then this could be specialized with the version+parents params to get sync.
This way we get a generic HTTP standard for PUB/SUB, which is badly needed,
and is a very basic primitive you can build many things on. It would be a
shame to have to add one after the fact.

~~~
toomim
Thank you very much!

Does EventSource meet your need for a generic subscribe mechanism?
[https://developer.mozilla.org/en-US/docs/Web/API/EventSource](https://developer.mozilla.org/en-US/docs/Web/API/EventSource)

I'm also curious what use-cases you have for a generic pub/sub beyond
synchronization. In my experience, 95% of pub/sub implementations are used as
a substrate for synchronization.

As for using PATCH vs PUT, this is an open question. I'd love to see more
discussion of it on the mailing list: ietf-http-wg@w3.org. There are a number
of pros and cons on both sides.

~~~
BiteCode_dev
EventSource is only from the server to the client. Pub/sub goes in any
direction: client to client, client to server, etc.

Crossbar.io provides that with websocket.

EventSource + the sync RFC would mean everybody would build a non-standard
bridge to link a publication, a sync from the server and an event
source. And because we only need 2, and they are overlapping, several
completely incompatible de facto solutions would emerge.

Pub/sub is useful for any kind of communication that doesn't involve a
resource: notifications of events, communication between microservices, etc.
Basically anything that doesn't need a history.

Of course, you can always create abstract conceptual resources you sync with
to obtain this effect. E.g., the "streaming service is down" event could be a
sub to a "service/streaming/heartbeat" resource that you sync, ignoring
versions and giving no parents. It's just a bit twisted.

It feels more natural to have a pub/sub primitive that goes back and forth in
any direction, and build the specific case for sync with that. Even if sync is
90% of the time what you want.

------
svnpenn
I wrote 2 implementations for ETag, one using plain text files and one using a
database file:

[https://cup.github.io/autumn/talk-conditional-request](https://cup.github.io/autumn/talk-conditional-request)

My implementations are in PHP, but they could easily be adapted to other
languages. I was surprised when I found that cURL really doesn't have support
for this. Yeah, you can do something like this:

    
    
        $ curl -I -H 'If-None-Match: "109-55035a2e5a100"' \
        > speedtest.lax.hivelocity.net
        HTTP/1.1 304 Not Modified
    

but it's of limited usefulness. What you need is a cache storing all the
requests you've made, so that the next time you make a request the cache can be
checked. Without the cache it's pointless. Another problem is that some sites
only return Last-Modified, not ETag:

    
    
        $ curl -I https://en.wikipedia.org/wiki/Main_Page
        last-modified: Sun, 03 Nov 2019 20:12:16 GMT
    

and some don't return either:

[https://www.google.com](https://www.google.com)
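
The kind of cache described above might look roughly like this in Python. This is a hedged sketch, not a port of the linked PHP code: the class name `ConditionalCache`, the `fetch(url, headers)` callback returning `(status, headers, body)`, and the `fake_origin` stand-in server are all assumptions, chosen so that the example runs without a network.

```python
class ConditionalCache:
    """Remember past responses and revalidate them with conditional requests."""

    def __init__(self, fetch):
        # fetch(url, headers) -> (status, response_headers, body); hypothetical
        # callback so any HTTP client can be plugged in.
        self._fetch = fetch
        self._entries = {}

    def get(self, url):
        entry = self._entries.get(url)
        headers = {}
        if entry:
            # Send whichever validators the server gave us last time.
            if entry["etag"]:
                headers["If-None-Match"] = entry["etag"]
            if entry["last_modified"]:
                headers["If-Modified-Since"] = entry["last_modified"]

        status, response_headers, body = self._fetch(url, headers)
        if status == 304 and entry:
            return entry["body"]  # unchanged: reuse the stored body

        lowered = {k.lower(): v for k, v in response_headers.items()}
        self._entries[url] = {
            "etag": lowered.get("etag"),
            "last_modified": lowered.get("last-modified"),
            "body": body,
        }
        return body


def fake_origin(url, headers):
    """Stand-in server that only understands a single ETag."""
    if headers.get("If-None-Match") == '"109-55035a2e5a100"':
        return 304, {}, b""
    return 200, {"ETag": '"109-55035a2e5a100"'}, b"payload"


cache = ConditionalCache(fake_origin)
first = cache.get("http://example.invalid/file")   # full 200 response
second = cache.get("http://example.invalid/file")  # revalidated with a 304
print(first == second)
```

A site that returns neither validator still works with this sketch, but every request is a full download, which is the limitation noted above.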

------
pm90
Can someone with a mathematical background explain what the practical uses of
this protocol are? It talks about time travel and state synchronization vs.
state transfer. Is this an effort to create a protocol that, e.g., allows for
distributed systems over HTTP instead of relying on "custom" implementations
such as Paxos? (I'm pretty sure I'm misunderstanding this, please feel free to
correct me.)

~~~
toomim
Yes, the braid protocol could let you run a distributed system over HTTP. It
lets you put a CRDT or OT behind any HTTP resource, which lets you distribute
the resource, with multiple writers, and guarantee consistency after arbitrary
edits. Each resource will still have a URL, with one particular hostname, but
the actual state can live and be modified simultaneously on multiple hosts.

The first thing you might use this for is collaborative editing. Braid can
give you the power of Google Docs at any HTTP URL, without writing additional
code.

This also improves performance. Instead of using heuristics to determine when
a cache needs to be reloaded (cache-control max-age, last-modified, etags),
Braid will automatically push all updates to caches -- guaranteed. That means
you never need to force-reload a page, or force-clear a cache. Also, updates
are sent as minimal diffs, rather than re-sending the entire resource whenever
it changes. This saves a lot of bandwidth and a lot of round-trip latency when
loading a page.
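
To make the "diffs instead of the whole resource" point concrete, here is a small sketch using Python's standard `difflib`. It is not Braid's patch format: the `(start, end, replacement)` tuples and the names `diff_text` and `apply_text_patch` are assumptions for illustration, and `SequenceMatcher` does not promise a truly minimal diff.

```python
from difflib import SequenceMatcher


def diff_text(old, new):
    """Describe `new` as a list of (start, end, replacement) edits to `old`."""
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    return [
        (i1, i2, new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def apply_text_patch(old, edits):
    """Apply edits back to front so earlier offsets stay valid."""
    result = old
    for start, end, replacement in reversed(edits):
        result = result[:start] + replacement + result[end:]
    return result


old_source = "let x = 1;\nlet y = 2;\n"
new_source = "let x = 1;\nlet y = 3;\n"
edits = diff_text(old_source, new_source)

# Only the changed characters travel over the wire, not the whole file.
sent = sum(len(replacement) for _, _, replacement in edits)
print(edits, sent, len(new_source))
assert apply_text_patch(old_source, edits) == new_source
```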

As a third practical example, Braid makes it very simple to read and write
data from _multiple_ web sites. By collapsing time, braid implementations let
you write code that manipulates state at any URL as easily as a local
variable. It doesn't matter whether state is located on your server, or
someone else's server, or distributed on everyone's servers and clients. It's
all equally easy to interact with.

This also makes it easy to write a new UI for an existing site.

As for Paxos, yes, CRDTs are an alternative to Paxos, and CRDTs can be used in
Braid. CRDTs have some performance improvements over Paxos -- Paxos chooses a
leader, and whenever the leader is unreachable, a new leader election takes
place which requires a couple network round trips before any new edits can be
broadcast. CRDTs are always editable, and edits can always be broadcast.

~~~
sagichmal
> Braid can give you the power of Google Docs at any HTTP URL, without writing
> additional code.

What implements the OT or CRDT logic? The webserver itself?

~~~
toomim
Yes, the server and client implement the logic. Each URL specifies its
"Merge-Type" in a header, which defines the spec for the OT or CRDT that they
implement.

~~~
sagichmal
Why should this be implemented in HTTP itself, rather than as an application
using HTTP as a transport?

~~~
toomim
I suggest reading the introduction to either of these specs:

\- [https://datatracker.ietf.org/doc/html/draft-toomim-braid](https://datatracker.ietf.org/doc/html/draft-toomim-braid)

\- [https://datatracker.ietf.org/doc/html/draft-toomim-httpbis-b...](https://datatracker.ietf.org/doc/html/draft-toomim-httpbis-braid-http-01)

People _do_ synchronize at the application level today. But every application
invents its own non-standard synchronization method, which is incompatible
with every other application. This results in each website only accessing its
_own_ state, rather than sharing state with _other_ websites, and the web
becomes a bunch of walled gardens.

In order to decentralize the web, we need a standard for the internal state of
websites, that makes it easy for websites to re-use the state of other
websites. Where the original web allows any website's _pages_ to _link_ to any
other site's _pages_ , Braid allows any site's _internal state_ to
_synchronize_ with the internal state from any other site.

Now, you might ask why we should implement a standard at the HTTP level,
rather than make a standard on top?

Well, it turns out that HTTP and REST are already _designed_ for sharing
state-- but they are just limited to state _transfer_ rather than state
_synchronization_. It is very natural to extend it to _synchronization_ \-- we
can do it with just 5 new headers, 1 new response code, 2 range units, and 1
new registry.

And if you try to build something on top, you'll have to re-implement all the
great things that HTTP has invented that we now take for granted: caching,
CDNs, idempotency, media types, etc. By putting synchronization _into_ HTTP,
we can add these features to the existing web. Caches (like CDNs) can suddenly
support dynamic content -- not just static content. If you change a line of
code in your Javascript, all clients will update with just a diff, rather than
re-requesting the entire file. The reload button becomes obsolete. Existing
HTTP network traffic becomes more efficient. Every TEXTAREA can become a
collaborative editor.

~~~
sagichmal
> In order to decentralize the web, we need a standard for the internal state
> of websites, that makes it easy for websites to re-use the state of other
> websites.

This is not coherent.

~~~
sagichmal
To say more, state that can be modeled as, e.g., a CRDT is always domain
bounded, and almost always subject to domain constraints like, at a minimum,
authn/authz. Even if you can come up with a domain whose state makes sense to
share between website boundaries, making agnostic intermediaries like HTTP
servers state-aware enough to perform semantic merges necessarily strips that
state of any notion of privacy. This logic doesn't make sense to have at the
transport or data layer of a public network.

~~~
toomim
Your concern is that adding CRDT merge semantics somehow prevents a server
from implementing access control? That's not the case.

The Braid spec does not impede access control — that works just like it always
has on the web. A client logs into a server. If a client does a GET request,
the server decides whether the client can see the result. If a client does a
PUT request, the server decides whether to allow it. The only difference is
that these GET and PUT requests can now be broken into granular patches with a
version history.

And if you want to build a peer-to-peer network, then you will replace the
server with a validation function running on each peer, and authentication
with a crypto scheme. But we aren't at the point of trying to standardize that
stuff yet.

~~~
sagichmal
> Your concern is that adding CRDT merge semantics somehow prevents a server
> from implementing access control?

My concern is that this requirement means the state can't be encrypted.

> if you want to build a peer-to-peer network, then you will replace the
> server with a validation function running on each peer, and authentication
> with a crypto scheme

How do you break a GET request of some state blob into "granular patches" if
the state is encrypted?

------
codetrotter
> Why "Braid"?

> 1\. It adds versioning—and time travel—to the web, just like the videogame
> Braid.

That video game was the first thing I thought of when I read the name. Cool
that it is one of the actual reasons they chose to name this that.

------
chrisweekly
" _Braid is an effort to incorporate new distributed technologies into the
existing World-Wide Web. We find consensus on extensions to today's web
standards that support distributed web technologies. We work in the IETF's
HTTP Working Group. You can join the effort._

 _The Braid Protocol is a set of extensions to HTTP, which transform it from a
state transfer protocol into a state synchronization protocol. When a resource
is changed by one client or server, all other clients and servers update.
Braid supports Operational Transform and CRDTs at web URLs, enabling peer-to-
peer, offline-capable web applications._ "

------
jkarneges
> When a resource is changed by one client or server, all other clients and
> servers update.

This reminds me a little bit of a project I started some years ago:
[http://liveresource.org/](http://liveresource.org/)

It wasn't nearly as fancy as Braid. Just a way to get a URL's current content
and then listen for changes. Kinda like Firebase, but for the web. Didn't get
much traction though.

~~~
Arathorn
The team behind matrix.org had one prior to Matrix very similar to this too
(albeit longpolling) called Glow. It was basically simple & stupid pubsub on
top of HTTP: you could GET an arbitrary url, and whenever anyone PUT stuff
within that url tree your GET would return. It worked well enough to build a
pretty massive instant messaging platform on top of it, but the lack of schema
and lack of intelligent query language got a bit frustrating. Some of the
ideas made it into Matrix though.

Braid looks cool; we've hoped someone would layer OT or CRDT semantics on top
of Matrix but it hasn't really happened yet (unless you count Matrix itself as
a set of add-only monotonic DAG CRDTs, which I guess it is). Either way,
perhaps going in at a lower level like Braid has legs; time will tell :)

------
sansnomme
Would it be correct to say that this is a CRDT protocol on the HTTP level,
similar abstraction level to e.g. REST?

~~~
toomim
Yes. One way to look at this is that HTTP is already very close to a CRDT or
OT protocol -- it just needs a few new features.

By adding those features into HTTP, we generalize HTTP and REST from being
able to simply _transfer_ state to being able to _synchronize_ it, across
arbitrary edits, from multiple writers.

    
    
        HTTP: HyperText *Transfer* Protocol
        REST: REpresentational State *Transfer*
    
        HTSP: HyperText *Synchronization* Protocol
        RESS: REpresentational State *Synchronization*

~~~
sansnomme
So stuff like PouchDB, Gunjs will now be trivial?

~~~
toomim
These are databases that support synchronization. They have to design their
own custom protocol, because HTTP (without Braid) does not support
synchronization.

We want to add Braid support to PouchDB and Gunjs. Then they can interoperate,
with one another, and with the rest of the web. You'll be able to build a
distributed app that stores some data in Gunjs, and some in PouchDB, on
different servers, on different websites.

The differences between different synchronizing databases are captured in
"Merge Types":
[https://raw.githubusercontent.com/braid-work/braid-spec/mast...](https://raw.githubusercontent.com/braid-work/braid-spec/master/draft-toomim-httpbis-merge-types-00.txt)

Over time, I imagine that these databases will add support for each other's
merge types, and then -- yes -- their abilities will be "trivial", and baked
into most URLs of the web.

------
ChrisRus
This looks very interesting indeed. I am currently working on a system that
allows hierarchical system modeling and evaluation of so-called "observable
process models". I can easily understand leveraging something like this over
HTTP to reduce implementation details of model and state transfers.

~~~
heavenlyblue
What's an "observable process model"?

------
devj
How does it compare with CouchDB Replication protocol -
[https://docs.couchdb.org/en/stable/replication/protocol.html](https://docs.couchdb.org/en/stable/replication/protocol.html)?

------
User23
Looks interesting, but where is the proof? Distributed consensus is HARD.
Without a formal proof why should I trust this? TLA+ would be great, but I'd
be happy with anything that demonstrated formal correctness.

~~~
toomim
Braid itself is just a neutral protocol— the proof you want applies to the
particular CRDT or OT algorithm that you use with it.

For instance, you can use it with ShareDB, or Automerge. Both of these
synchronizers are quite robust, and prove correctness with fuzz testing.

Links:
[https://github.com/automerge/automerge](https://github.com/automerge/automerge)
[https://github.com/share/sharedb](https://github.com/share/sharedb)

------
skybrian
What happens when you have OT and CRDT interoperating? They are different
algorithms, so wouldn't they see different results?

~~~
toomim
The trick is that they only need to agree on how multiple simultaneous edits
merge.

Each URL specifies a Merge-Type. If the algorithms implement it, they can
merge consistently.
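
As a toy illustration of "agreeing on how simultaneous edits merge", here is a last-writer-wins register. It is not one of Braid's merge types; the `Register` fields and the rule "higher clock wins, ties go to the larger peer name" are assumptions for this sketch. What matters is that any two implementations following the same rule reach the same state regardless of the order in which edits arrive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Register:
    value: str
    clock: int  # logical timestamp of the write (assumed to be available)
    peer: str   # writer's ID, used only to break ties


def merge(a, b):
    """Keep the write with the higher (clock, peer) pair."""
    return a if (a.clock, a.peer) >= (b.clock, b.peer) else b


# Two peers edit at the same logical time.
alice = Register("cat", 3, "alice")
bob = Register("dog", 3, "bob")

# Merge order does not matter, and merging twice changes nothing.
assert merge(alice, bob) == merge(bob, alice)
assert merge(alice, merge(alice, bob)) == merge(alice, bob)
print(merge(alice, bob).value)
```

If the two sides used different tie-breaks, they would diverge on exactly these simultaneous edits.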

~~~
skybrian
It seems like that's saying each client has to implement all the algorithms in
use? So, no magic here, but a choice among standardized algorithms?

~~~
toomim
Almost -- you can actually have _different_ algorithms that still _merge the
same way_. See our interoperability demo here:

[https://braid.news/demo/interoperate](https://braid.news/demo/interoperate)

This demo shows a CRDT and OT system interoperating. They use different
algorithms, but merge (almost) the same way!

(I say "almost" because we aren't using the same sorting function to break
ties when two people edit in the same location. But this could be fixed.)

In practice, you can certainly specify a merge-type _in terms of_ an
algorithm, by saying "this resource merges in the way that the Automerge
algorithm merges." But you can also state it abstractly -- for instance, as we
do here:
[https://braid.news/demo/interact#a-merge-type-defines-how-to...](https://braid.news/demo/interact#a-merge-type-defines-how-to-flatten-bubbles-in-space)

