
The Gremlin Graph Traversal Language - okram
http://www.slideshare.net/slidarko/the-gremlin-traversal-language
======
liorn
Python developer here. Here's my rant.

I've been using OrientDB (one of the leading Graph databases) for the last few
months; it's been a horrible experience to get it working with Python, as the
official Python OrientDB driver is essentially a very thin wrapper around the
binary protocol.

Using Gremlin would actually be nice, and save me a lot of nasty queries, but
it seems like there is no Python ecosystem for it. The presentation mentions
"Gremlin-Python". A quick Google search brings up these results:

1\. "Bulbs" ([http://bulbflow.com/download/](http://bulbflow.com/download/)) -
it's a dead project, last commit being 10 months ago. Look at
[https://github.com/espeed/bulbs](https://github.com/espeed/bulbs)

2\. "Gremlin-Python" \-
[https://github.com/pokitdok/gremlin-python](https://github.com/pokitdok/gremlin-python).
Dead project (last commit 6 months ago), and requires one to install Jython.

I would love to know if I've missed something - did anyone get Python to
nicely work with Gremlin?

~~~
jlarocco
I don't think going 10 and 6 months without commits is a very good indication
that the projects are dead.

Assuming they just wrap other libraries, there's usually not much to screw up,
so I wouldn't expect them to be heavily committed to, except for after a
release of the underlying library.

------
fxbois
Could anyone mention some benefits of Gremlin over Cypher?

~~~
okram
Gremlin is an Apache Software Foundation query language and as such, can be
used by any graph system (Titan, Neo4j, OrientDB, etc.). It is not bound to a
particular vendor.

Gremlin has a natural compilation to the common distributed vertex-centric
computing model (bulk synchronous parallel for graphs). Thus, Gremlin works
for both OLTP (graph databases) and OLAP (graph processors). The Apache
distribution provides OLAP connectivity to Apache Hadoop, Spark, and Giraph.

Gremlin supports both imperative path expressions and declarative pattern
matching.

Gremlin can be embedded in any host language. No "fat string" with result set.
The user's database query code and data manipulation code are in the same
language. There exist Gremlin-Java8, Gremlin-Groovy, Gremlin-Scala,
Gremlin-Clojure, Gremlin-PHP, etc.
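
To illustrate the "no fat string" point, here is a minimal Python sketch of a chainable traversal over a toy in-memory graph. The `ToyGraph` and `ToyTraversal` classes, their methods, and the sample data are all hypothetical stand-ins and not the actual Gremlin API; the only point is that the query is composed from host-language method calls and returns ordinary host-language values.

```python
# Toy illustration only: a tiny in-memory graph with a fluent, chainable
# traversal. All class and method names here are hypothetical and are NOT
# the TinkerPop/Gremlin API.

class ToyGraph:
    def __init__(self):
        self.props = {}   # vertex id -> property dict
        self.edges = []   # (out_vertex, label, in_vertex) triples

    def add_vertex(self, vid, **props):
        self.props[vid] = props

    def add_edge(self, out_v, label, in_v):
        self.edges.append((out_v, label, in_v))

    def V(self, *ids):
        # Start a traversal at the given vertices (all vertices if none given).
        return ToyTraversal(self, list(ids) if ids else list(self.props))


class ToyTraversal:
    def __init__(self, graph, current):
        self.graph = graph
        self.current = current  # vertex ids, or property values after values()

    def out(self, label):
        # Follow outgoing edges carrying the given label.
        nxt = [i for v in self.current
               for (o, l, i) in self.graph.edges if o == v and l == label]
        return ToyTraversal(self.graph, nxt)

    def has(self, key, predicate):
        # Keep vertices whose property `key` satisfies a host-language predicate.
        kept = [v for v in self.current
                if key in self.graph.props[v]
                and predicate(self.graph.props[v][key])]
        return ToyTraversal(self.graph, kept)

    def values(self, key):
        # Replace each vertex with the value of one of its properties.
        return ToyTraversal(self.graph,
                            [self.graph.props[v][key] for v in self.current])

    def to_list(self):
        return list(self.current)


g = ToyGraph()
g.add_vertex(1, name="marko", age=29)
g.add_vertex(2, name="vadas", age=27)
g.add_vertex(3, name="josh", age=32)
g.add_edge(1, "knows", 2)
g.add_edge(1, "knows", 3)

# The query is plain method chaining; the result is an ordinary Python list.
older_friends = (g.V(1).out("knows")
                  .has("age", lambda a: a > 30)
                  .values("name")
                  .to_list())
```

Because the filter is an ordinary lambda and the result is an ordinary list, there is no query string to build and no result set to unpack.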

Gremlin is Turing complete. Almost any sufficiently complex language is.
However, Gremlin is related to a Turing machine by a very simple mapping.

See [http://arxiv.org/abs/1508.03843](http://arxiv.org/abs/1508.03843) for
details on the aforementioned benefits.

~~~
eranation
Thanks for the post! You mentioned Spark via OLAP connectivity. Can you please
elaborate a little on how Gremlin works with Spark? Does it use the GraphX API
behind the scenes or is it just Spark? Are there any sources on how well it
works?

~~~
okram
Gremlin (over Spark) does not use GraphX. It simply represents the graph as a
tensor RDD (i.e. a multi-layered matrix) and with the Spark functional
library, it implements BSP-based vertex-centric computing (i.e. message
passing). You can see examples and a diagram explaining how it works at this
location:

[http://tinkerpop.incubator.apache.org/docs/3.0.0-incubating/...](http://tinkerpop.incubator.apache.org/docs/3.0.0-incubating/#sparkgraphcomputer)
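
For readers unfamiliar with BSP-based vertex-centric computing, the general idea can be sketched in plain Python. This is not Spark or TinkerPop code: the function name `bsp_hop_distances`, the adjacency-dict representation, and the choice of hop distance as the computation are all assumptions made for illustration. Each loop iteration is one superstep. Vertices consume their incoming messages, update local state, and emit messages to their neighbors, which are delivered only after the barrier at the end of the superstep.

```python
# Minimal sketch of BSP-style vertex-centric message passing in plain Python.
# Hypothetical names throughout; this is not how TinkerPop or Spark is coded.

def bsp_hop_distances(adjacency, source):
    """Return the hop distance from `source` to each vertex (None if unreachable)."""
    state = {v: None for v in adjacency}   # per-vertex local state
    inbox = {source: [0]}                  # messages delivered for superstep 1
    while inbox:                           # halt when no messages are in flight
        outbox = {}
        for vertex, messages in inbox.items():
            best = min(messages)
            # A vertex only acts (and sends) when a message improves its state.
            if state[vertex] is None or best < state[vertex]:
                state[vertex] = best
                for neighbor in adjacency[vertex]:
                    outbox.setdefault(neighbor, []).append(best + 1)
        inbox = outbox                     # barrier: deliver messages together
    return state


adjacency = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": [], "e": ["a"]}
distances = bsp_hop_distances(adjacency, "a")
```

In a distributed setting the per-vertex loop body would run in parallel across partitions, with the message exchange between supersteps handled by the underlying engine.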

