
Gun – Distributed, embedded graph engine - marknadal
http://db.marknadal.com/gun/index.html
======
adambard
This actually sounds like a pretty cool and useful project, but I had to force
myself to overcome the site, which I think does a poor job of marketing.

The headline is a bit confusing -- it's not obvious that something can be
"embedded" and "distributed", let alone "massively." You might rewrite it to
tell me what the project _does_ , not what it _is_.

Besides that, as a developer, what I want to know is:

\- Why should I use it? (Instant in-browser storage synced to a distributed
server network)

\- How is it implemented? (it seems to be a javascript project, but that's
only mentioned once)

\- How do I run it? (It runs daemons on servers and stores the data in S3)

\- What do I have to do to configure it? (Seems to me like "The granularity
and frequency of these snapshots can be tweaked by you" means "Prepare to get
elbow-deep into telling gun how to shard")

If you just take the answers to these questions and make them either bullet
points up above the video, or headers for your marketing essay, it would help
people like me digest the project enough to be interested and read more.

Finally, while I'm being a grinch, I'd adjust your styles just slightly to
make the site more readable. At the very least:

\- Make `#main` narrower; perhaps 700px or even less (long lines are hard to
read)

\- In `body`, apply a line-height of maybe 20pt (the lines seem too close
together for the font size)

\- Also in `body`, drop the text-shadow. Shadow on white tends to just make
things look blurry.

All that said, I _am_ interested in learning more. A self-hosted Firebase
would be a real boon for many projects.

~~~
marknadal
Great suggestions! Already fixed one of the things, and working on updating
the others! (Although content will take a while longer). Thank you so much.

------
_dark_matter_
This title is bad. It's not an embedded database engine. It's not a database
at all. The author mentions this directly in the article (How are you
different from every other database that is trying to reinvent the wheel? 1.
Because gun is not a database (NoDB), it is a persisted distributed cache.)

There are other possible issues, but this short summary doesn't give nearly
enough information about the system to make a possible determination on its
efficacy. I will be interested to see further work.

~~~
marknadal
Hey _dark_matter_,

Thanks for the comment! I'll be doing follow up posts soon on the technical
details of my conflict resolution algorithms, and will be preparing Jepsen
tests for gun. I just wanted to formally announce this now, though, as I have
a tendency to overly-perfect stuff, and getting feedback early on is great.
Mind listing things you would like me to write in-depth posts on? Thanks for the
feedback!

~~~
_dark_matter_
Hey Mark, I'm interested in several different features.

First off, you make a great point about "Hosting your own database is a pain"
and/or expensive. I'm sure you know about the raging debate that has been
happening between MapReduce and Parallel DBs, and a big part of the reason
that MapReduce (Hadoop) style has had so much support is because it's easy and
cheap. In lots of instances it may run slower, but why pay for the hassle when
you can just run Hadoop for free (not including server costs)? You're making
the same point here. What's the threshold on money savings? How much does it
cost to use a DaaS (SQL or NoSQL) compared to what you want to do with
Gun/Redis? (Also, is this going to be open source?)

Next, the conflict resolution should be interesting. What kind of eventual
consistency guarantees are you hoping to have? I see you're planning on
addressing this.

I'm also interested in the querying. That is, are you using Javascript? My
problem with what you're saying about current query languages is that it's not
really _that_ hard to get the bare usage out of them. Simple queries in SQL
are extremely straightforward. In addition, declarative languages really make
things easier for the programmer, not harder. Who wants to have to deal with
all the nuances of a join? Certainly not me.

~~~
marknadal
Realistically I won't have cost-comparisons for a while, as this project is
still in its infancy and obviously needs to stabilize/mature first. Although as
a quick summary, your costs should approximately be server(s) that are
expensive enough to fit your smallest set of per-user active-data, plus your
S3 storage amount, plus S3 API calls (I've made this part a simple options
parameter, so you control how often these calls are being made - the more
frequent, the more integrity from worst-case disruptions but also more
expensive, less frequent and your costs go down).
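
A rough sketch of that cost breakdown, for illustration only. The function name, its parameters, and the idea of one S3 call per snapshot interval are assumptions made here, not gun's actual options, and no real prices are implied:

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60


def monthly_cost(server_monthly: float, server_count: int,
                 stored_gb: float, price_per_gb_month: float,
                 snapshot_interval_s: float,
                 price_per_1000_calls: float) -> float:
    """Servers + S3 storage + S3 API calls, as described above.

    Every argument is a hypothetical input you would supply yourself.
    Assumes one S3 call per snapshot interval, which is a simplification.
    """
    snapshots = SECONDS_PER_MONTH / snapshot_interval_s
    api_cost = snapshots / 1000 * price_per_1000_calls
    storage_cost = stored_gb * price_per_gb_month
    return server_monthly * server_count + storage_cost + api_cost
```

With everything else held fixed, a shorter `snapshot_interval_s` raises only the API-call term, which is the trade-off between integrity and expense mentioned above.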

What is worst-case disruption? Everything goes offline simultaneously. Users
don't have localstorage fallbacks in their phone/browser so retries rely on
the server cache. The server cache is running the default method, not Redis -
and then your server crashes or the machine goes under. And finally, of
course, there is an S3 outage and you only persisted to 1 region, not multiple
regions.

I'll get into hairy situations like that in some of my following posts, but
too much for here.

Yes, you have another good point about query languages. I agree with you very
much - somebody else commented about this, check out my reply to muaddirac.
Please keep in touch with more questions/comments, or even email me!

------
jobeirne
I found the header image a bit disconcerting, and immediately hit Back (I'm at
work).

~~~
chadillac
Pictures of guns are offensive and dangerous at work now?

Really?

~~~
TallGuyShort
Heck, I used to work in an office where most people had similar weapons on
them at all times.

------
muaddirac
Looks interesting!

> you never have to learn some silly separate query language again. A query
> language which just attempts to be some DSL to RPC another machine into
> doing the same query you could have already written in half the time it took
> to learn the query language.

Not sure I agree with this sentiment - most programming languages aren't
declarative like query languages are, and that seems especially useful for,
well, querying.

~~~
marknadal
This is true. The neat thing about the modular design I have for gun is that
people can always write a plugin that receives some/any query language, and
then translates it directly into the appropriate algorithms. So you should be
able to write your own abstractions on top - but this is only possible because
you are able to write the direct queries underneath.
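
To make the plugin idea concrete, here is a minimal sketch of a declarative clause being translated into a direct query. The clause syntax and the names `compile_where` and `run_query` are invented for this example and are not part of gun:

```python
import operator
from typing import Any, Callable, Dict, Iterable, List

# Hypothetical mini query language: "<field> <op> <value>"
OPS = {"=": operator.eq, ">": operator.gt, "<": operator.lt}


def compile_where(clause: str) -> Callable[[Dict[str, Any]], bool]:
    """Translate 'field op value' into a plain predicate function."""
    field, op, raw = clause.split(maxsplit=2)
    # Integers are compared as numbers; anything else as an unquoted string.
    value: Any = int(raw) if raw.lstrip("-").isdigit() else raw.strip("'\"")
    compare = OPS[op]
    return lambda record: field in record and compare(record[field], value)


def run_query(records: Iterable[Dict[str, Any]], clause: str) -> List[Dict[str, Any]]:
    """The 'direct query' underneath: an ordinary filter over records."""
    predicate = compile_where(clause)
    return [r for r in records if predicate(r)]
```

For example, `run_query(records, "age > 18")` keeps only the records whose `age` field is greater than 18. The declarative layer is just a thin translator over a filter you could have written by hand.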

------
reillyse
I'm withholding my judgment on the tech - it all seems a little too good to be
true - but the copy on your page is great, I actually read every line of it
which rarely ever happens when I visit a technical page.

------
asimpletune
Hey, I'm confused by the statement, "No amount of leader election and
consensus algorithms can patch this without facing an unjustified amount of
complexity. Gun resolves all this by biting the bullet - it solves the hard
problems first, not last. It gets data synchronization and conflict resolution
right from the beginning, so it never has to rely on vulnerable leader
election or consensus locking".

My question is, how do you solve data synchronization and conflict resolution,
without using the techniques that do that, i.e. leader election, or some other
type of consensus algorithm?

~~~
marknadal
Gun's conflict resolution algorithm is deterministic, meaning that it will
choose the same answer on every peer without having to communicate with others
which value it chose. This even works when the ordering of incoming updates is
switched depending upon which server was the "I" sending updates out to "you"
(pronouns, analogously, are inverse of each other depending upon which person
is doing the talking). Uh, this is not clear/sounds confusing, I'll explain it
better in my post that goes over how the resolution algorithm works.
(Basically the gist is that sometimes for two servers to agree on the "same"
answer they both have to have an algorithm which results in an inverse
condition from what the other would answer - similar to how quantum-entangled
particles have opposite spins of each other to "balance out")

Point being, this makes gun truly peer to peer, because it behaves correctly
if it's the only one running, or if there are numerous guns interconnected with
each other. No leader election, no consensus algorithm - you don't need those,
because the system agrees as soon as the updates are received because it
resolves them immediately with an idempotent algorithm. Make sense? More
details on this soon.
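
As an illustration of what "deterministic" and "idempotent" buy you, here is a minimal sketch. It is not gun's algorithm, which has not been published in this thread; it swaps in a simple last-writer-wins rule with a fixed tiebreak, and the names `Update`, `resolve` and `converge` are made up for the example:

```python
from functools import reduce
from typing import Iterable, NamedTuple


class Update(NamedTuple):
    timestamp: float
    value: str


def resolve(a: Update, b: Update) -> Update:
    """Pick a winner using only the two updates themselves."""
    # Tuples compare by timestamp first, then by value, so a timestamp tie
    # is broken the same way on every peer without any communication.
    return max(a, b)


def converge(updates: Iterable[Update]) -> Update:
    """Fold updates in whatever order this peer happened to receive them."""
    return reduce(resolve, updates)
```

Because `resolve` is commutative, associative and idempotent, two peers that receive the same updates in different orders, or receive some of them twice, still end up with the same value.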

~~~
asimpletune
Then I look forward to your next post. Can you give me a preview on what type
of failure model Gun can support? Consensus is easy assuming there are no
faults, but if faults are possible, then leadership election, paxos, whatever,
something is needed.

~~~
marknadal
First let me go over the levels of redundancy:

1\. In memory in the browser tab's process.

2\. If available, in the browser's localstorage or fallback.

3\. In the server process's memory.

4\. If available, in Redis on the server.

5\. If in a multi-machine setup, any other connected server that is subscribed
to that data set, being in memory (3) or in Redis (4) if available.

6\. If configured, in a machine log on S3.

7\. Persisted to S3, which replicates and shards it for you internally.

8\. If configured, in a revision file on S3.

9\. If configured, in a multi-region S3 setup, redundantly in many places.

(2) is not cleared till an acknowledgment that (7) is confirmed. (1) is not
cleared until an acknowledgement that (7) is confirmed or if the tab is
exited. In the case of (7) it is no longer the delta/diff, but a snapshot of
that current data set with that delta/diff's update. Retries from (1) ~ (5)
will happen at various events, if the confirmations are not satisfied. If a
conflict has already occurred by (3) the acknowledgement from (5) will include
a notification that the value has already been updated, along with the
standard delta/diff of that conflicting update being sent down. Meaning (5)
does not guarantee that your delta/diff has "won", only that it has been saved
or is already outdated.
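
The "not cleared until acknowledged" rule can be sketched as below. This is a simplified illustration under assumptions: the class name `PendingBuffer`, its methods, and the `send` callback are hypothetical and do not reflect gun's API.

```python
from typing import Callable, Dict, List


class PendingBuffer:
    """Holds deltas locally until persistence has been confirmed."""

    def __init__(self, send: Callable[[str, dict], None]) -> None:
        self._send = send                      # hypothetical transport hook
        self._pending: Dict[str, dict] = {}

    def write(self, delta_id: str, delta: dict) -> None:
        # Keep a local copy first, then try to send it upstream.
        self._pending[delta_id] = delta
        self._send(delta_id, delta)

    def on_persisted(self, delta_id: str) -> None:
        # Only a persistence acknowledgement clears the local copy.
        # Clearing an unknown or already-cleared id is a harmless no-op.
        self._pending.pop(delta_id, None)

    def retry(self) -> int:
        """Resend everything still unconfirmed; return how many remain."""
        for delta_id, delta in list(self._pending.items()):
            self._send(delta_id, delta)
        return len(self._pending)

    def pending_ids(self) -> List[str]:
        return sorted(self._pending)
```

Calling `retry()` on whatever events you choose (reconnect, timer, page focus) resends only the deltas that were never confirmed, so a lost message is eventually redelivered unless the origin itself disappears.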

Worst case condition is that (2, 4, 5, 6, 8, 9) are turned off, in which case
your users' data is as volatile as them prematurely leaving the page (although I
suppose you could use an onbeforeunload to warn them) - however this behavior
is the current norm for most http post based forms and apps. Actually, pardon
me, worst case condition is that everything is offline simultaneously, however
this is not really interesting because then users won't even be able to access
your app in the first place.

Please correct me if I have my terms wrong:

Fail stop - since gun runs in your application process, any bugs or errors in
gun should result in a standard error being thrown. In the case of javascript,
your process will crash. Generally speaking you are responsible for the
liveness of your app, however if you use some of my other existing
open source libraries, they will respawn the process for you. When your app
restarts, so will gun.

Receive omission - as mentioned before, the peer that originated the message
will attempt to retry messages until a confirmation is given. So even if a
process somehow does not receive a message, it eventually will, unless the
origin gives up.

Send omission - this is a bit trickier, gun tries its best to keep user
changes by all means possible until a confirmation is received. If for some
reason gun is unable to do this, or the user's browser starts going haywire
and doing weird things... you have no way to know and neither does gun. At
this level, something is fundamentally wrong with the runtime or the OS, and
gun has no way to check this.

Arbitrary - malicious attacks are best done from server peers, however this
would require the malicious node to know your key. Currently the event of this
happening may be decently high, as all it would require is for an attacker to
take over an IP that another one of your gun nodes might connect to - the
connecting node will issue the key as proof of being not malicious. I
definitely could use some help on figuring out a more secure, yet still fully
decentralized means of authentication and trustworthiness of server nodes. In
the instance of byzantine mistakes, this will only affect the system if the
message actually complies with the spec - in which case the message
is indistinguishable from a real request, and will therefore be treated as one.
And finally, what about an intentionally malicious user? The way gun's
conflict resolution algorithms work makes it very difficult to be abused in
any way that is actually meaningful to the malicious user, but I'm saving the
details of this for a real post. That being said, such abuse will happen and I
can already describe the results for you - "annoying" and "spammy" - luckily
gun has filters built into the piping channels, so if you as the developer
notice any particularly fishy behavior, you can always block that user out.

Recovery - when gun starts (or restarts) it goes about its usual business of
checking its cache (if Redis is available) and retrying whatever is there that
had failed to be cleared, listening to messages which may be retries of things
that had failed to be cached (especially in the case Redis isn't enabled), and
intelligently pulling things back in from S3. When any data set is loaded into
gun it will also subscribe to that data set so it can be notified of updates
from any other node, gun will also merge (using the same conflict resolution
algorithms) the data set it got from S3 with any other node that also has that
data set, and even replay some persisted logs if available. This way it boots
itself back up into the most valid and live operating status that it can.

Partitions - there are 9 different layers, of which communication and
networking can all fail or go out. That is about 9 factorial amount of
combinations that can, could, and probably will go wrong (although this is an
over estimation since not all the layers require networking). I simply can't
cover them in this comment, but will go over some of the most concerning and
common cases in a separate post - I also plan on building a partition
simulator, so that way people can play with causing these issues, and so that
I can do testing against them. That being said, gun has a strong commitment to
handling these things, since it can afford to (since gun is eventually
consistent, we have time to make up for networking problems). I hope this
comment was helpful in what I did go over - I'm not sure if you'll even wind
up seeing it (please notify me that you did! Even if that is just an upvote)
I'll probably reuse a lot of this in one of my actual posts.
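
One way to put a number on the partition cases, as a sketch under a simplifying assumption: each of the 9 layers is either up or down, and the order in which they fail does not matter. The function name is made up for the example.

```python
import math
from itertools import combinations

LAYERS = 9


def failure_combinations(layers: int) -> int:
    """Count the non-empty sets of layers that can be down at the same time."""
    return sum(1
               for k in range(1, layers + 1)
               for _ in combinations(range(layers), k))


unordered = failure_combinations(LAYERS)   # 2**9 - 1 == 511
ordered = math.factorial(LAYERS)           # 9! == 362880 orderings of all 9 layers
```

Under that assumption there are 511 distinct failure sets. The 9 factorial figure instead counts the orders in which all nine layers could fail, so it is an upper bound of a different kind.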

Thanks again! Anything else?

------
noelwelsh
Nice idea which mirrors a lot of my current thinking; far more details are
needed.

For example: "All conflict resolution happens locally in each peer using a
deterministic algorithm." Hrrrmm.... if the model is to use CRDTs for conflict
free resolution I can believe it; if it's timestamps, for instance, I'm much
more doubtful.
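
For readers unfamiliar with CRDTs, a grow-only counter is about the smallest example. This is a generic textbook sketch and says nothing about what gun actually does; the function names are chosen for the example.

```python
from typing import Dict

State = Dict[str, int]   # per-peer increment totals


def increment(state: State, peer: str, amount: int = 1) -> State:
    """Return a new state where `peer` has added `amount` (must be positive)."""
    new_state = dict(state)
    new_state[peer] = new_state.get(peer, 0) + amount
    return new_state


def merge(a: State, b: State) -> State:
    """Element-wise max: commutative, associative and idempotent."""
    return {peer: max(a.get(peer, 0), b.get(peer, 0))
            for peer in a.keys() | b.keys()}


def value(state: State) -> int:
    """The counter's value is the sum of every peer's total."""
    return sum(state.values())
```

Each peer only ever raises its own entry, so merging in any order, any number of times, gives the same result without timestamps or coordination.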

~~~
marknadal
noelwelsh, thanks for the comment! Timestamps do fail pretty miserably for
handling conflict resolution - while my algorithm does use timestamps, the
timestamp is hardly responsible for the actual synchronization. It just
provides a basis to initially sort things off of, and then the conflict
resolution kicks in. Obviously I'm going to need to do a pretty detailed post
on how this works, cause unless it can be battle-tested and skeptically
investigated by others, it is only as good as snake oil. So more on this soon!
Thanks for bringing it up.

~~~
avodonosov
Conflict resolution is the main question.

------
johne20
How would gun handle user auth and data that can only be seen by the
authenticated user?

~~~
marknadal
So I'm making sure that the concepts of piping and transforms are built right
into the API, so you'll be able to easily filter out data by whitelists or
blacklists as it is getting pushed down to your users. I'll also be writing
plugins to gun that will allow people who like ORMs to just attach one on, and
the filtering and validation will be handled for them. Does this answer your
question?
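
A minimal sketch of what whitelist filtering through a pipe of transforms could look like. The names `pipe`, `whitelist` and `owner_only`, and the `owner` field, are assumptions for illustration; gun's real piping API is not shown in this thread.

```python
from typing import Callable, Iterable, Optional

# A transform gets the record and the requesting user, and returns the
# (possibly reduced) record, or None to drop it entirely.
Transform = Callable[[dict, str], Optional[dict]]


def whitelist(fields: Iterable[str]) -> Transform:
    allowed = set(fields)

    def apply(record: dict, user: str) -> Optional[dict]:
        return {k: v for k, v in record.items() if k in allowed}

    return apply


def owner_only(record: dict, user: str) -> Optional[dict]:
    """Drop records that do not belong to the authenticated user."""
    return record if record.get("owner") == user else None


def pipe(record: dict, user: str, transforms: Iterable[Transform]) -> Optional[dict]:
    """Run the record through each transform before it is pushed down."""
    current: Optional[dict] = record
    for transform in transforms:
        current = transform(current, user)
        if current is None:
            return None
    return current
```

Putting `owner_only` before `whitelist` means the ownership check still sees the `owner` field even if the whitelist would later strip it.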

~~~
johne20
Yes conceptually it does, I would be interested to see how it is implemented.
Thanks!

------
ilaksh
This reminds me a little of Meteor and a bit more of ShareJS. And it reminds
me of a few other things.

As far as transformations and conflict resolution, I am wondering, are you
using something like operational transformations?

~~~
marknadal
I had to develop my own synchronization algorithms after doing a lot of
research on OT for a couple of reasons. One is that Google sometimes relies on
being a centralized authority (Google's server) to resolve some conflicts,
especially for collaborative rich text editing (Neil Fraser, a sync genius
that Google hired, has a great talk on his own type of implementation,
"diff-match-patch", although Google uses different algorithms now I believe).
Another is that OT sometimes requires you writing your own transformation
commands, and I wanted something that was generally applicable to most data
types without forcing developers to write their own. (It has been a while
since I did a lot of this reading, so this comment may not necessarily be
accurate/true, please somebody correct me!)

Instead, I developed my own method, called "Analytical Fluctuation" - I
haven't written papers on it yet, but need to soon! One requirement is that it
has to be truly peer to peer and cannot rely (even occasionally) on some
centralized server or leader election setup. Another requirement is that it
has to be latency-proof, which for me means "preparing for the future" when
people need to collaborate on documents not only from the other side of the
world but also from Mars. This is where Neil Fraser says his system breaks
down, because if there is too much latency, the patches stop applying
idempotently. Again, more on this in upcoming posts.

------
curveship
So CAP: which does gun throw off the island? Based on this statement ...

> It bridges the distance with a realtime connection, so updates propagate at
> the speed of the raw pipes linking them.

... I'm assuming it's the P.

~~~
dantiberian
CA systems don't exist so it must be CP or AP.

    
    
      However, further consideration shows that CA is not really a 
      coherent option because a system that is not Partition-
      tolerant will, by definition, be forced to give up 
      Consistency or Availability during a partition.
    

[https://foundationdb.com/white-papers/the-cap-theorem](https://foundationdb.com/white-papers/the-cap-theorem)

~~~
GregorStocks
It could also be neither, which is a popular choice these days.

------
mallyvai
"Because face it, any sufficiently capable query language has to be Turing
complete" \- No, no it doesn't. Stock SQL is not Turing complete. There are
extensions that support recursion to make it so, but it is perfectly possible
and capable of doing everything you could reasonably want without it.

Caching is hard. Consistency is hard. Peer to peer is hard. I hope the author
is addressing these in a sane, verifiable way. I'm really curious to see the
demo apps that come out of this.

~~~
marknadal
I apologize, I wasn't saying every query language /is/ Turing complete, just
that to do exponentially more complicated tasks in a single atomic query...
the lines start to blur between the two.

Yes, these are hard subjects - especially things like cache invalidation. You
raise good points, and I hope to answer them in the follow up posts I write -
I expect your and others' good eyes to check my work. I'll let people know
when I've written them. Demos are coming too!

------
beshrkayali
Regardless of the title being confusing or not, this seems pretty interesting!
Would like to see how it performs under some heavy load / multiple sources of
data.

------
xkarga00
Not everyone is familiar with the concept of NoDB so this paper [1] is a good
read to start with.

[1] [http://bit.ly/RjBl9S](http://bit.ly/RjBl9S)

------
stewars
What license is the code distributed under? It is not clear from the site or
the code that is available through npm.

------
Allower
I have been looking for something just like this! Fantastic

------
jellicle
The name "gun": a) has negative connotations, b) is ungoogleable, c) can't
even write about it without being ungrammatical and d) will be filtered out at
some workplaces.

Dumb, de-dumb dumb dumb.

