
Ask HN: Which graph database would you advise? - jennoo
In terms of scalability, reliability and performance? And what are your thoughts about implementing a graph inside e.g. PostgreSQL or Cassandra?

The graph will contain roughly 150,000 vertices; around 50,000 of those are highly connected to the other 100,000, with around 2,500,000 edges expected between them. Almost all of the 100,000 vertices will be updated daily. The graph will be used for OLTP workloads, with around 10 q/s expected. Queries will resolve similarity (one-to-many).
======
mindcrime
If I were doing a lot of graph specific stuff, where it was crucial to store
the graph itself, I'd definitely go with a pure graphdb over trying to
shoehorn it into a relational db, or even Cassandra. It's hard to give a real
precise answer without more info, but Neo4j is probably a good starting point
unless your requirements are real specific in some way that isn't compatible
with Neo4j.
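
For comparison, here is a rough sketch of what the relational route could look
like: a single edge table plus a self-join for a one-to-many similarity
lookup. In-memory SQLite stands in for PostgreSQL, and the table name, the
toy data and the "count of shared neighbours" similarity measure are all
assumptions for illustration; the question does not say how similarity is
defined.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE edge ("
    " src INTEGER NOT NULL,"
    " dst INTEGER NOT NULL,"
    " PRIMARY KEY (src, dst))"
)
# Reverse index so lookups by destination vertex are also cheap.
conn.execute("CREATE INDEX edge_dst ON edge (dst, src)")

# Toy data: vertices 1-3 linked to hub vertices 100-102.
conn.executemany(
    "INSERT INTO edge (src, dst) VALUES (?, ?)",
    [(1, 100), (1, 101), (2, 100), (2, 101), (2, 102), (3, 102)],
)


def similar_to(conn, vertex, limit=10):
    """Rank other vertices by how many neighbours they share with `vertex`."""
    return conn.execute(
        """
        SELECT b.src AS other, COUNT(*) AS shared
        FROM edge AS a
        JOIN edge AS b ON b.dst = a.dst AND b.src <> a.src
        WHERE a.src = ?
        GROUP BY b.src
        ORDER BY shared DESC, other ASC
        LIMIT ?
        """,
        (vertex, limit),
    ).fetchall()


print(similar_to(conn, 2))
```

This works for one-hop questions, but every extra hop is another join, which
is where the shoehorning starts to hurt.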

You should probably also ask if you really need a graph _database_ or if you
just need to use a graph _processing engine_ (like Giraph) to perform graph
operations on data that can be extracted from elsewhere.

------
PaulHoule
The main concern I would have is over the OLTP requirement.

A well-built analytics system should be able to start from the raw data and
rebuild if you trash it, so it would not need to be so concerned about
transactions, consistency, etc.

For online transactions at the volume you are describing, however, you don't
want a glitch to break the app, so the first question in my mind is what the
story is for concurrent access and updates to the DB.

I build systems big enough to break Neo4j, but as others mention, it works
just fine for graphs your size.

------
fratlas
I'm only speaking from experience, but I quite liked Neo4j. I had ~50M edges
and as long as you get the Cypher query right, it's amazingly fast. The web
GUI is slick and very helpful, and it's got a community edition!

~~~
emrgx
+1 for Neo4j and the web GUI. I use it on my side project. Cypher is very easy
to learn. If your data is highly relational, it's a good choice.

------
janemanos
You could also have a look at ArangoDB. It's open source under the Apache 2
license, C++-based, and has some neat features like the Foxx framework. I
have to say their support is pretty impressive. Worth giving it a spin.

------
erichocean
It's straightforward to build a graph database using LMDB, the performance is
excellent, and it supports transactions, online backups, etc.
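
A minimal sketch of the usual key layout for this kind of thing, assuming
edges are stored as sorted `(src, dst)` keys so that a vertex's neighbours
are one prefix scan. The `SortedKV` class below is a hypothetical in-memory
stand-in for an ordered key-value store such as LMDB, not LMDB's API; in a
real store a cursor range seek would play the role of `scan_prefix`.

```python
import bisect
import struct


class SortedKV:
    """Tiny in-memory ordered key set, standing in for an ordered KV store."""

    def __init__(self):
        self._keys = []  # sorted list of byte-string keys

    def put(self, key):
        # Insert the key at its sorted position, ignoring duplicates.
        i = bisect.bisect_left(self._keys, key)
        if i == len(self._keys) or self._keys[i] != key:
            self._keys.insert(i, key)

    def scan_prefix(self, prefix):
        # Seek to the first key >= prefix, then walk while the prefix matches.
        i = bisect.bisect_left(self._keys, prefix)
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            yield self._keys[i]
            i += 1


def edge_key(src, dst):
    # Big-endian fixed-width ints, so byte order matches numeric order.
    return struct.pack(">II", src, dst)


def add_edge(kv, src, dst):
    kv.put(edge_key(src, dst))


def neighbours(kv, src):
    # All keys sharing the 4-byte src prefix are the outgoing edges of src.
    prefix = struct.pack(">I", src)
    return [struct.unpack(">II", key)[1] for key in kv.scan_prefix(prefix)]


kv = SortedKV()
for s, d in [(1, 100), (1, 7), (2, 100), (1, 101)]:
    add_edge(kv, s, d)

print(neighbours(kv, 1))
```

Storing a second copy of each edge under a `(dst, src)` key would give the
reverse direction the same way.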

If you don't mind Java, Neo4j is another good choice.

------
PaulHoule
It would help if you'd share something about: (i) data shape and size, (ii)
expected query load, (iii) expected update schedule, (iv) transactional vs
analytic nature, etc.

~~~
jennoo
Yes, you are right, please have a look. I have updated the question text.

------
DeShadow
ArangoDB is very suitable for you. It's a multi-model database (key-value,
document, graph).

Your data set is not big and can be handled very quickly on one machine.
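
A back-of-the-envelope estimate using the numbers from the question, assuming
32-bit vertex ids and a plain adjacency-array layout stored in both
directions. Those layout choices are assumptions, and properties, indexes and
per-record overhead of any real database are left out, so treat it as an
order of magnitude only.

```python
VERTICES = 150_000
EDGES = 2_500_000
ID_BYTES = 4      # assumption: 32-bit vertex ids
OFFSET_BYTES = 8  # assumption: one 64-bit offset per vertex per direction

# One neighbour id per edge, stored once for each direction.
adjacency_bytes = EDGES * 2 * ID_BYTES
# One offset per vertex per direction to locate its neighbour list.
offset_bytes = VERTICES * 2 * OFFSET_BYTES

total_mb = (adjacency_bytes + offset_bytes) / 1_000_000
print(f"{total_mb:.1f} MB")
```

Even with generous overhead on top, the raw structure is far below what a
single machine holds in RAM.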

------
emocin
As long as what you are doing is data that makes sense for a graph, I'd go
with Neo4j. We use it at work with great success.

------
solisoft
+1 for ArangoDB

------
madgumby
DataStax

------
adamb_
Neo4j

------
doozy
OrientDB

