
Ask HN: Graph Database or RDBMS? - thebillkidy
Hi everyone,

I am currently working on a new project in which 3 types of users are heavily
connected with each other and with languages, tasks and other entities. While
working on my database schema I ended up with 4 many-to-many relations (across
11 entities), each of which had to be split out into a join table (and I will
probably have to add more many-to-many join tables later on).

This, together with the fact that I will have to build a "matching algorithm"
that needs to take location and other attributes into account, makes me wonder
whether I should use a graph database for the whole project, use an RDBMS, or
go for a hybrid solution that starts with an RDBMS and uses a graph database
for fact extraction.

Does anyone here have experience with this and can recommend a route to
follow?
======
spotman
In my experience, a properly designed schema in an RDBMS is going to
outperform a graph database for OLTP-style queries _if_ your graph traversal
is light.

So, if you just need to build a system that finds similarities and you only
need to look at a level or two of relationships, it's probably going to be
simpler to use an RDBMS.
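
To make the "level or two of relationships" case concrete, here is a minimal
sketch using Python's built-in sqlite3 module. The tables (users, languages,
user_languages) and the sample rows are assumptions made up for illustration,
not the poster's actual schema. It ranks other users by how many languages they
share with a given user, which needs only a self-join on the join table:

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database with a hypothetical users/languages schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE languages (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        -- Join table for the many-to-many relation between users and languages.
        CREATE TABLE user_languages (
            user_id INTEGER NOT NULL REFERENCES users(id),
            language_id INTEGER NOT NULL REFERENCES languages(id),
            PRIMARY KEY (user_id, language_id)
        );
        INSERT INTO users VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
        INSERT INTO languages VALUES (1, 'English'), (2, 'Dutch'), (3, 'French');
        INSERT INTO user_languages VALUES (1, 1), (1, 2), (2, 1), (2, 2), (3, 3);
    """)
    return conn

def users_sharing_languages(conn, user_id):
    """Return (name, shared_count) for other users, most shared languages first."""
    rows = conn.execute("""
        SELECT u.name, COUNT(*) AS shared
        FROM user_languages AS mine
        JOIN user_languages AS theirs
          ON theirs.language_id = mine.language_id
         AND theirs.user_id <> mine.user_id
        JOIN users AS u ON u.id = theirs.user_id
        WHERE mine.user_id = ?
        GROUP BY u.id, u.name
        ORDER BY shared DESC, u.name
    """, (user_id,))
    return rows.fetchall()

conn = build_demo_db()
print(users_sharing_languages(conn, 1))  # [('bob', 2)]
```

One hop through a join table is a single self-join; each extra level adds
another pair of joins, which is roughly where the query starts to get unwieldy.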

But if you need to do a lot of graph traversal, this is where an RDBMS is
going to get tricky.
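
As a rough illustration of what deeper traversal looks like in SQL, the sketch
below walks a hypothetical connections(src, dst) edge table with a recursive
common table expression, again using Python's sqlite3 module. The table, the
sample edges and the function name are assumptions for the example. The
explicit depth bound is what keeps the walk from looping forever on cycles:

```python
import sqlite3

def build_graph_db():
    """Create an in-memory database holding a small directed graph with a cycle."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Hypothetical edge table: one row per directed connection.
        CREATE TABLE connections (
            src INTEGER NOT NULL,
            dst INTEGER NOT NULL,
            PRIMARY KEY (src, dst)
        );
        INSERT INTO connections VALUES (1, 2), (2, 3), (3, 4), (4, 1), (4, 5);
    """)
    return conn

def reachable_within(conn, start, max_depth):
    """Return {node: minimum hop count} for nodes reachable from start."""
    rows = conn.execute("""
        WITH RECURSIVE walk(node, depth) AS (
            SELECT ?, 0
            UNION
            SELECT c.dst, w.depth + 1
            FROM walk AS w
            JOIN connections AS c ON c.src = w.node
            WHERE w.depth < ?   -- depth bound; also guarantees termination
        )
        SELECT node, MIN(depth)
        FROM walk
        WHERE node <> ?
        GROUP BY node
        ORDER BY node
    """, (start, max_depth, start))
    return dict(rows.fetchall())

conn = build_graph_db()
print(reachable_within(conn, 1, 3))  # {2: 1, 3: 2, 4: 3}
```

This works, but filtering, weighting or pruning paths during the walk quickly
complicates the query, which is the kind of thing graph query languages are
designed to express more directly.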

One caveat with graph databases is that there are different kinds with
different strengths. Some are fast and in-memory, some are distributed and
meant more for OLAP. Some are distributed for reads and meant for OLTP, but
are slow for writes.

While there are similar tradeoffs between different RDBMSs, if your goal is to
get something up and running, I would start with an RDBMS and keep it simple
until you can't.

Finally, your final paragraph may be one path forward: start with an RDBMS,
and if you can't get it to do what you need quickly, but it's working for most
of your use case, you can extract the relationships to a graph database and
use it alongside.
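
A minimal sketch of that extraction step, assuming a hypothetical user_tasks
join table: the rows are read out of the RDBMS and turned into edges. A plain
in-memory adjacency map stands in here for a real graph database, and the
breadth-first search stands in for whatever traversal the graph store would
run; none of these names come from the original project.

```python
import sqlite3
from collections import defaultdict, deque

def load_edges(conn):
    """Build an undirected adjacency map from a join table (hypothetical schema)."""
    graph = defaultdict(set)
    for user_id, task_id in conn.execute("SELECT user_id, task_id FROM user_tasks"):
        # Tag each node with its entity type so user 1 and task 1 stay distinct.
        u, t = ("user", user_id), ("task", task_id)
        graph[u].add(t)
        graph[t].add(u)
    return graph

def hops_between(graph, start, goal):
    """Breadth-first search; returns the hop count, or None if unreachable."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_tasks (user_id INTEGER, task_id INTEGER);
    INSERT INTO user_tasks VALUES (1, 10), (2, 10), (2, 11), (3, 11), (4, 12);
""")
graph = load_edges(conn)
print(hops_between(graph, ("user", 1), ("user", 3)))  # 4
```

The RDBMS stays the source of truth in this arrangement; the graph side is a
derived view that can be rebuilt from the join tables whenever needed.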

~~~
thebillkidy
Great answer, thanks! I have decided to start with an RDBMS. When I get to the
point of creating the recommendation / matching engine, I will look into
hooking up a graph database if I cannot solve it through the RDBMS.

------
craig_taverner
I was faced with a similar situation about 8 years ago, increasing modeling
complexity, more join tables and more attributes to take into account. I did a
prototype on Neo4j just to see if it would make things simpler for my case,
and it did. Subsequently, I ported the entire data model to Neo4j, as I think
the complexity of handling multiple databases is unnecessary if it can be
avoided. After taking two products to market using Neo4j as the database, my
interest in the technology was such that I now work full-time in the Neo4j
development team, so obviously I have a bias (although I did not back when I
was in your situation).

But I still think the best way for you to make your own mind up here is to
create a prototype project. This is usually the best route for any technology
choice. You already know how to model your domain in an RDBMS. Make a
prototype in Neo4j to get the knowledge you need to make an informed decision.

------
tom_b
I have not found many-to-many relationships to be particularly problematic in
RDBMSs from a modeling and usage perspective.

Just curious, are you asking about a multi-way relationship between more than
two entities? A many-to-many between Entity1 and Entity2 is a pretty normal
model. A many-to-many relationship between Entity1, Entity2, and Entity3 (or
even more entities) has been rare in my experience. When I taught RDBMSs, I
made a point of emphasizing that these types of multi-way relational models
should be carefully examined just to ensure the 3-way, 4-way, etc relationship
was absolutely necessary.

Honestly though, I think much of this depends on what your experience with
both techs is. If you have spent a ton of time using a graph db and are
comfortable using that as a backing store for your project, roll with it.

~~~
thebillkidy
Yes, it is limited to two entities. My experience also leans more towards an
RDBMS.

------
codr4life
I would recommend getting a plain old code solution up and running first to
get the logic right. Once you have that staring you in the face, you will see
what is needed where. Starting with the database paradigm will never lead to
an optimal solution. Good luck!

~~~
thebillkidy
The logic and goals of the application have already been defined and tested.
Right now it is a matter of choosing the most extensible and best-performing
storage model. We also want to be able to ask further questions later that let
us extract more domain knowledge, so we want to keep that in mind.

