
Dragon: A distributed graph query engine - dineshp2
https://code.facebook.com/posts/1737605303120405/dragon-a-distributed-graph-query-engine/
======
mark_l_watson
The query examples looked like Clojure code; is that correct?

The technology of Facebook and Google (where I worked as a contractor)
impresses! That said, I like a simpler Internet with smaller-scale services
like GNU social, and simply using email to stay in touch with family and
friends. I have this preference both for privacy reasons and because I prefer
more one-on-one communication.

~~~
hellofunk
It looks a lot like Clojure but I'm not sure it is. For example, this
expression:

(filter (> age 20))

That second form to filter should be a function, but in this case it is an
expression that returns a boolean and is not itself a function.
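
To make the function-versus-boolean distinction concrete, here is a small
sketch in Python rather than Clojure or Dragon's own syntax. The `people` data
and the `older_than` helper are made up for illustration; nothing here is taken
from Dragon itself.

```python
# Hypothetical sample data, not from the blog post.
people = [
    {"name": "bob", "age": 25},
    {"name": "carol", "age": 19},
    {"name": "dave", "age": 41},
]

def older_than(limit):
    """Build a predicate: a function from one person to True/False."""
    return lambda person: person["age"] > limit

# filter() calls the predicate once per element.
matches = [p["name"] for p in filter(older_than(20), people)]

# Passing an already-evaluated boolean instead of a function fails,
# because filter() has nothing it can call on each element.
try:
    list(filter(25 > 20, people))
    boolean_worked = True
except TypeError:
    boolean_worked = False
```

Here `matches` ends up as `["bob", "dave"]` and `boolean_worked` is `False`. A
query language is free to read `(> age 20)` as shorthand for a per-vertex
predicate like `older_than(20)`, which is one way the expression in the post
could make sense without being plain Clojure.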

~~~
adsharma
Right, it's not Clojure, but we tried to borrow some syntactical elements for
readability. More discussion here:

[https://groups.google.com/forum/#!topic/clojure/P3eW6Vi2QcU](https://groups.google.com/forum/#!topic/clojure/P3eW6Vi2QcU)

~~~
okram
It has a very similar look and feel to Gremlin
([http://tinkerpop.apache.org](http://tinkerpop.apache.org)).

    
    
       (->> ($alice) (assoc $friends) (assoc $friends) (filter (> age 20)) (count))
    

...in Gremlin is:

    
    
       g.V(alice).out("friends").out("friends").has("age",gt(20)).count()
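
For anyone who reads neither syntax, here is a rough Python sketch of what both
queries compute, over a toy in-memory graph. The names, ages, and helper
functions are invented for illustration, and the sketch keeps duplicate paths
the way Gremlin's `out()` does; whether Dragon deduplicates is not something the
post states.

```python
# Toy adjacency list and ages; all values are made up for illustration.
friends = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "dave"],
    "carol": ["alice", "dave", "erin"],
    "dave": ["bob", "carol"],
    "erin": ["carol"],
}
age = {"alice": 30, "bob": 25, "carol": 19, "dave": 41, "erin": 18}

def out(vertices, adjacency):
    """Follow one hop from every vertex, keeping duplicates."""
    return [neighbor for v in vertices for neighbor in adjacency.get(v, [])]

def friends_of_friends_over(start, min_age):
    """Count two-hop neighbors of `start` whose age exceeds `min_age`."""
    two_hops = out(out([start], friends), friends)
    return sum(1 for v in two_hops if age[v] > min_age)

result = friends_of_friends_over("alice", 20)
```

On this toy graph `result` is 4: alice and dave are each reached twice (via bob
and via carol), and erin is filtered out. Note that the start vertex counts as
her own friend of a friend unless the query excludes her.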

------
eranation
Nice! What are the benefits over TitanDB, Apache Spark/GraphX, GraphLab,
Giraph, Neo4j, openCypher, TinkerPop, etc.?

~~~
jimbokun
Seems like the big improvement is heuristically keeping related data on the
same hosts as much as possible, to improve query performance.

------
jimbokun
This sounds really cool.

For distributed relational or graph databases, it seems like the key trick for
making queries efficient is to get related data on the same host whenever
possible.

So it would be cool to dig into the specifics of the algorithms they are using,
to see exactly how they are optimizing where to store the data. With a graph
database, it's impossible to guarantee having all related data together
(eventually, a friend of a friend of a friend... will be on a different host),
so placement needs to be heuristics-based.
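
As a rough illustration of what "heuristics based" could mean, here is a Python
sketch of a simple greedy placement: each vertex goes to the host that already
holds most of its neighbors, subject to a capacity limit. This is a generic
textbook-style heuristic chosen for the example, not the algorithm Dragon uses,
and every name in it is hypothetical.

```python
from collections import defaultdict

def edge_cut(edges, placement):
    """Count edges whose endpoints live on different hosts."""
    return sum(1 for u, v in edges if placement[u] != placement[v])

def modulo_placement(vertices, num_hosts):
    """Baseline: place by vertex id, ignoring graph structure."""
    return {v: v % num_hosts for v in vertices}

def greedy_placement(arrival_order, edges, num_hosts, capacity):
    """Place each vertex on the open host holding most of its placed neighbors.

    Assumes num_hosts * capacity is at least the number of vertices.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    placement = {}
    load = [0] * num_hosts
    for v in arrival_order:
        def score(host):
            colocated = sum(1 for n in neighbors[v] if placement.get(n) == host)
            # Ties go to the less loaded host, then to the lower host id.
            return (colocated, -load[host], -host)
        open_hosts = [h for h in range(num_hosts) if load[h] < capacity]
        best = max(open_hosts, key=score)
        placement[v] = best
        load[best] += 1
    return placement

def clique(members):
    """All pairwise edges among `members`."""
    return [(a, b) for i, a in enumerate(members) for b in members[i + 1:]]

# Two tight friend groups joined by a single edge (made-up data).
edges = clique([0, 1, 2, 3]) + clique([4, 5, 6, 7]) + [(3, 4)]
arrival_order = [0, 4, 1, 5, 2, 6, 3, 7]

cut_modulo = edge_cut(edges, modulo_placement(arrival_order, 2))
cut_greedy = edge_cut(edges, greedy_placement(arrival_order, edges, 2, capacity=4))
```

On this toy graph, placing by id splits both friend groups and cuts 9 of the 13
edges, while the greedy placement cuts only the single edge between the groups.
A real system also has to deal with vertices that arrive before their
neighbors, skewed degrees, and rebalancing, which this sketch ignores.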

For tree-shaped data, on the other hand, it is possible to have the root of
each tree and all of its related data on the same host (assuming each tree is
of a "reasonable" size). Google's F1 project took this approach.
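
A minimal Python sketch of the tree case, assuming every row's key starts with
the id of its root and that routing looks only at that first component. The
row contents, key layout, and function names are hypothetical and are not
taken from F1.

```python
def host_for(key, num_hosts):
    """Route by the first key component (the root id) only."""
    root_id = key[0]
    return root_id % num_hosts

# Keys are tuples: (root,), (root, child), (root, child, grandchild).
rows = [
    ((1,), "customer 1"),
    ((1, 10), "campaign 10 of customer 1"),
    ((1, 10, 100), "ad group 100 of campaign 10"),
    ((2,), "customer 2"),
    ((2, 20), "campaign 20 of customer 2"),
]

def hosts_touched(root_id, rows, num_hosts):
    """Set of hosts that hold any row belonging to the given root."""
    return {host_for(key, num_hosts) for key, _ in rows if key[0] == root_id}

hosts_for_customer_1 = hosts_touched(1, rows, num_hosts=4)
```

Because every descendant shares its root's key prefix, a query over one tree
touches exactly one host. A graph has no such single root to route by, which is
why the graph case falls back to heuristics.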

~~~
adsharma
Hello, I wrote the blog post. There is a link to a video in the blog and we
published something at NSDI last week:

[https://www.usenix.org/conference/nsdi16/technical-sessions/...](https://www.usenix.org/conference/nsdi16/technical-sessions/presentation/shalita)

~~~
jimbokun
Thanks for the link! Looks like an interesting read.

------
deforciant
Looks interesting. Where can I download Dragon? :D

~~~
lynxaegon
I don't think it's open source yet. Maybe in the near future.

------
rubyfan
Slight tangent here but is it me or has the experience in social networks
taken a hit? Feels like it was better with low complexity hack technology than
with technology solutions focused on scaling the experience. I sometimes
wonder if we are changing experience too much for technology optimization.

- LinkedIn has been terrible for about 2-3 years, since things stopped
updating in real time and the timeline started getting random

- Facebook randomly suggesting things from your graph based on what it thinks
you like

- Twitter, I heard, will soon no longer be in chronological order?!?

I think there is something to be said for low complexity MySQL and memcached.

~~~
bananaoomarang
I don't think that's to do with stack complexity, more like:

"Crap how do we make money"

"Crap how do we retain users/keep them here longer"

The stack is about providing fast performance to vast numbers of users, plus
making it easier for engineers to work on than it otherwise might be at that
complexity.

~~~
rubyfan
Yeah, I hear that, and to be clear I'm not blaming all their experience issues
on the stack.

But there are definitely stack complexity trade-offs when you get to the scale
of these guys.

It could also be more correlation than causation: when you get to a certain
scale and more stack complexity, you also start hiring specialized staff who
end up diluting the culture and spirit of a startup.

