
Ask HN: Implementing a graph database using Postgres tables for nodes and edges? - jordanchan
After hearing/reading about the negative experiences many people have had with upcoming graph databases, I am wary of jumping into a graph db. But almost any modern application has things that are well suited to a graph representation.

So I am considering a workaround: use a graph structure, but represent the underlying data in a reliable RDBMS, in particular Postgres. The approach is:

1. Make two parent tables: node and edge.

2. Make separate tables for all "objects" (like people, places, things, etc.) inherited from node, so they all have a unique node ID.

3. Make separate tables inherited from edge to represent the many-to-many relations. Each relation has an edge ID, and each table inherited from edge can be thought of as representing a specific kind of relation (like person lives in place, or person is friends with person).

One thing I have observed so far is a large number of tables with few columns, but I think that lends itself to easy indexing. There can be a large number of individual queries from the front end, but I believe I can use views to present tables with more comprehensive info and reduce the number of queries.

What do you guys think of an approach like this? What am I missing, and what is wrong with it? I haven't come across this previously, so I am a bit nervous about the ramifications. Is someone else also doing this?
======
buro9
Storing the data isn't an issue. It's fairly trivial to come up with a number
of very good solutions to describing a graph.

The issue is querying the data, specifically when that involves walking the
graph.

The real question then, is how do you want to query the data? If you are
willing to make an assumption now that adds a tight constraint on your future
work... i.e. "We will only ever perform queries that are at most 1 or 2 hops
from the nodes/edges being queried" then perhaps PostgreSQL will work for you.

~~~
devit
You can do recursive graph queries with PostgreSQL:
[http://www.postgresql.org/docs/9.5/static/queries-with.html](http://www.postgresql.org/docs/9.5/static/queries-with.html)
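A minimal sketch of such a recursive query (table name and sample data invented here), run against sqlite3 as a stand-in, since the `WITH RECURSIVE` syntax is essentially the same in PostgreSQL:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE edge (src INTEGER, dst INTEGER);
    INSERT INTO edge VALUES (1, 2), (2, 3), (3, 4), (2, 5);
""")

# All nodes reachable from node 1, with their hop distance.
rows = con.execute("""
    WITH RECURSIVE reachable(node, depth) AS (
        SELECT dst, 1 FROM edge WHERE src = 1
        UNION
        SELECT e.dst, r.depth + 1
        FROM edge e JOIN reachable r ON e.src = r.node
    )
    SELECT node, depth FROM reachable ORDER BY node
""").fetchall()
print(rows)  # [(2, 1), (3, 2), (4, 3), (5, 2)]
```

Note that `UNION` (rather than `UNION ALL`) discards duplicate rows, but because depth is part of each row, a cyclic graph would still need explicit cycle handling.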

~~~
buro9
Agreed.

For shallow or narrow queries of the graph this is the way to go.

But for deep and broad queries, if you want to perform some valuable analysis
that involves traversing the graph some non-trivial distance from the point of
origin, that's where PostgreSQL is going to let you down and you would
probably be better off picking a more appropriate tool.

Same question though: How do you want to query your graph, and are you willing
to limit yourself to not querying it in a certain way?

~~~
andrewstuart2
I'm pretty sure you can get comparable performance from a graph in SQL if you
make sure you have covering indexes for the Node ids and the edges/adjacency
list table.

This may be implementation-dependent but in theory, it's solid. I'm not sure
what else a graph database would be able to do to beat an indexed adjacency
list.
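The covering-index idea can be demonstrated directly. A sketch (schema invented) on sqlite3; in Postgres the equivalent is a btree on (src, dst) plus index-only scans (and from version 11, extra payload columns via INCLUDE):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE edge (src INTEGER, dst INTEGER);
    CREATE INDEX edge_src_dst ON edge (src, dst);
    INSERT INTO edge VALUES (1, 2), (1, 3), (2, 3);
""")

# Both columns live in the index, so neighbour lookups
# never have to touch the base table at all.
plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT dst FROM edge WHERE src = 1"
).fetchall()
print(plan)  # plan mentions "USING COVERING INDEX edge_src_dst"
```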

------
duncanawoods
The boring answer is that it depends on your data and what you need to do with
it. Insights like "all data structures are graphs!" are dangerous - it can
actually add vagueness that might prevent more useful understanding of your
problem. Excessively generic handling of data often ends in pain when it comes
to the real world (performance and complexity) compared to domain specific
approaches.

If you have multi GB graphs and need to run intensive classic graph analysis
algorithms then a lot of that can come optimised out of the box with a
dedicated graph db. They will still leave you with plenty of hard problems.

However, if you have small graphs or only need conventional crud operations
then using a new and unfamiliar graph db would probably be insane.

If your case is the latter and postgres is your comfort zone and you can get a
working solution quickly, I would be tempted to prototype along the lines you
suggest. When you start getting performance problems or need graph analysis,
you can then make a better decision whether a graphdb or restructuring your
relational db will solve it the best.

~~~
jordanchan
Thanks, that was almost the same reasoning I used to settle on this approach.

I definitely need ACID transactions, and also some graph capability - which
may or may not grow in the future. More pertinently, the kind of
representation I was attempting led to a very normalized design anyway, and I
think it also lends itself to all but the most sophisticated graph
functionalities! But I am only just starting, so hopefully it works out well.

------
pella
The PostgreSQL Conference 2011 :"Trees and Graphs in the database"
[https://www.pgcon.org/2011/schedule/events/357.en.html](https://www.pgcon.org/2011/schedule/events/357.en.html)

* Trees in the database - Slides: [http://www.slideshare.net/quipo/trees-in-the-database-advanc...](http://www.slideshare.net/quipo/trees-in-the-database-advanced-data-structures)

* Graphs in the database - Slides : [http://www.slideshare.net/quipo/rdbms-in-the-social-networks...](http://www.slideshare.net/quipo/rdbms-in-the-social-networks-age)

~~~
jordanchan
Thanks for the links. Glad to know there are more experienced people pursuing
this seriously!

------
emehrkay
I am a huge fan of the Tinkerpop Stack
([http://tinkerpop.incubator.apache.org](http://tinkerpop.incubator.apache.org))
-- they have written a common set of interfaces for graph databases. There is
an implementation that can use Postgres as its storage engine
([https://github.com/pietermartin/sqlg](https://github.com/pietermartin/sqlg)),
there is one that can use ElasticSearch, and I've even seen one that uses
MongoDB. Check it out!

~~~
jordanchan
This made my day! Thanks, I will check it!

------
twic
This sounds broadly sensible. It's basically a classic normalised relational
design, which is as perfect as alligators or sewing needles, but with
extracted supertables for nodes and edges.

I would challenge the idea that you need these supertables, though. What those
allow you to do is to write graph queries which are generic (polymorphic?)
over multiple kinds of node and edge. So as well as "find me all the people
who are friends with my friends", you can write "find me all the people who
are friends with my friends, or who have been to a place i've been, or who own
a thing i own" without using a union. Do you actually need to write queries
like that?

There are two downsides to the supertables. Firstly, more complexity, although
it's a minor, or at least constant-factor, amount. Secondly, a loss of type
safety. If your edges are defined in a supertable, then the columns which
point to the ends of the edge have to be foreign keys to the node supertable.
That means they can be any type of node; there's no way to constrain
particular kinds of edge to connecting particular kinds of node. That seems
like a considerable drawback to me.

~~~
jordanchan
True, as I was designing the tables, I was surprised to notice how normalized
it looked.

The generic kind of queries you mention: actually, yes, I think I will need
them.

Your second point on downsides is something I have/had to think about, thanks
for raising it. I had some vague premonitions on those lines, but you helped
make it concrete. The problem can be avoided by not making meaningless
connections, but that's not a real solution. It won't be preventive, but it
could help to audit relationships - by having a script list out which tables
the FKs originate from.

Since I started writing this post, I have had two ideas on how to address
this, it'd be great to have your opinion on them - I'm thinking out loud here.

1. Use referential integrity. Since each relationship/edge will be in its own
inherited table, one could impose a constraint that each
column_in_the_specific_edge_table REFERENCES
another_column_in_a_specific_node_table. Like the columns in the
person_lives_in_place relationship table must reference columns in the person
table and the place table - each of which is also its own inherited node
table.

2. More convoluted and a bit cumbersome, but if the previous/simpler approach
is insufficient, one could create data types corresponding to each node type.
And impose that as a type constraint on the edge tables.

But maybe the 1st option could actually work...what do you think?
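Option 1 can be sketched in a few lines. A minimal example (table and column names invented), using sqlite3 with `PRAGMA foreign_keys` standing in for Postgres's always-on FK enforcement:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE place  (id INTEGER PRIMARY KEY, name TEXT);
    -- each edge table REFERENCES only the node tables it may connect
    CREATE TABLE person_lives_in_place (
        person_id INTEGER NOT NULL REFERENCES person(id),
        place_id  INTEGER NOT NULL REFERENCES place(id)
    );
    INSERT INTO person VALUES (1, 'Ada');
    INSERT INTO place  VALUES (1, 'London');
    INSERT INTO person_lives_in_place VALUES (1, 1);
""")

# An edge pointing at a nonexistent place is now rejected:
try:
    con.execute("INSERT INTO person_lives_in_place VALUES (1, 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # True
```

One Postgres-specific caveat worth checking: indexes and foreign-key constraints are not inherited by child tables, so each inherited edge table would need its own constraints declared anyway.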

------
t-rav
Why re-invent the wheel! All you need to do is properly understand your tool
of choice. I have used Neo4j extensively for years and have had great results.
Our biggest hurdle was understanding how to model the data for a graph
properly. Get this right and everything falls into place. The one thing you
are going to need is a good visualisation tool, so be prepared to roll one of
these too :)

------
berdario
I was also wondering about the same thing:

Implementing a graph on top of an RDBMS is trivial, and if the semantics are
correct (the same as the ones exposed by a graph db?), then I'm not sure why
people would want to use a proper graph db.

I thought it was probably an issue of performance: the "right tool for
the job" that "does one thing and does it well" is probably leaner and more
efficient. After all, unlike a trivial implementation, getting a graph on an
RDBMS to perform well might not be so simple (still, your idea of
inheriting tables might make things more flexible and maybe more efficient).

But then, when looking up some Neo4j benchmarks, the numbers seem not to be
good at all:

[http://baach.de/Members/jhb/neo4j-performance-compared-to-my...](http://baach.de/Members/jhb/neo4j-performance-compared-to-mysql)

I'd like to hear from someone that used Neo4j (also, other graph databases are
interesting) in production, and benchmarked it against a RDBMS prototype,
finding the former as the better solution of the two.

~~~
pjonesdotca
While the author admits that the documentation advises using the traversal
API (which gives the best performance), he goes with Cypher over the REST
interface, which would never match a direct db connection. It's a fundamental
flaw in his comparison.

~~~
berdario
I noticed that, but he's doing a single REST query for every timing:

[https://github.com/jhb/neo4j-experiements/blob/master/query_...](https://github.com/jhb/neo4j-experiements/blob/master/query_friends_neo4j_rest.py#L24)

There's no way that a single HTTP request is responsible for a difference of
hundreds of seconds.

(his benchmark code isn't really sound, but I don't see how this can affect a
difference so striking)

------
joshberkus
I've done this a few times. First, see the list of PostgreSQL-based extensions
and tools listed by other folks below; they're all good things to look at
before you implement something from scratch. Regardless, this is probably a
good approach if you are using Postgres anyway and just need a graph for part
of your application.

For short graph traversals, recursive queries and joins perform surprisingly
well, even when compared against dedicated graph databases -- at least, the
open source ones I've been able to try, including neo4j. The ones I've tried
perform better than Postgres mostly when you get more than two hops away --
but then at least Neo and the MySQL thing tend to suffer if the data set
doesn't fit in memory.

I had at least one project which reverted from a dedicated graph database to
Postgres specifically because it needed decent on-disk performance (2TB per
shard).

Anyway, to your question: you could build a PostgreSQL database like that, but
why would you? If you have distinct types of entities (person vs. furniture
vs. town), why would you denormalize them into generic node interitance?
Having a completely abstracted database with one big table called "things" and
another big table called "relationships" seems really attractive before you
actually do it. Then it starts to suck.

Relational databases are designed to perform well, and to be easy to query,
with "normalized" databases in which different types of entities are
represented by different tables and relationships are defined by foreign keys.
The further you get away from that model, the harder the database gets to use.

This is one of the problems with using graph concepts for questions which are
not, fundamentally, graphs. Yes, you can define people's nationality as a
graph from location to nation, but that makes it pretty hard to do voter
registration stats. Graphs are a great tool for answering many important
questions, but they are not a great tool for answering _all_ questions.

Now, if you are answering an actual graphing question (i.e. show me everyone
who shares something in common with me via at least two links within 3 hops),
then proceed. But figure out what kinds of questions you specifically want to
answer before you get too far.

~~~
jordanchan
I noticed the same - the "explosion of tables" when I was doing the actual
designing. But, almost accidentally, it also turned out that the tables were
almost all either 5NF or 6NF. So I thought this is just a highly normalized
setup anyway - something traditional databases are supposed to be designed
for.

The list of Postgres-based extensions mentioned in this thread turned out to
be quite relevant actually! I'm still going through them all, seeing if I can
reuse something...

Many thanks for the inputs, quite helpful.

------
pella
some alternatives to check :

* RDFLib Store for PostgreSQL ( [https://github.com/RDFLib/rdflib-sqlalchemy](https://github.com/RDFLib/rdflib-sqlalchemy) )

* Sparqlify is a SPARQL-SQL rewriter that enables one to define RDF views on relational databases and query them with SPARQL. ( PostgreSQL supported [http://aksw.org/Projects/Sparqlify.html](http://aksw.org/Projects/Sparqlify.html) )

* PostgreSQL-Neo4j Foreign Data Wrappers [https://github.com/nuko-yokohama/neo4j_fdw](https://github.com/nuko-yokohama/neo4j_fdw)

* PgRouting (network ) [http://docs.pgrouting.org/dev/doc/src/tutorial/analytics.htm...](http://docs.pgrouting.org/dev/doc/src/tutorial/analytics.html#graph-analytics)

~~~
jordanchan
Thank you so much for this list. I am looking into these... Maybe there's
something I can reuse, but at the very least I can get some inspiration from
checking them out.

------
kitd
I think you want to google "triple store" because that is effectively what you
are trying to do AIUI. Triple stores are the persistence strategy of choice
for RDF and other Semantic-Web-type applications.

Each row in a triple store stores a relationship between 2 objects, with
unique IDs for the subject, the relationship type, and the object. The IDs
could refer to other tables that store entity and relationship details.

There are specialised triple store DBMSs, but implementing one in PostgreSQL
should be pretty easy.
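A toy triple store along these lines (sqlite3 standing in for Postgres; the sample facts and index choices are invented): one row per (subject, predicate, object) statement, with a couple of permuted indexes to serve the common access patterns.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE triple (s TEXT, p TEXT, o TEXT);
    CREATE INDEX triple_spo ON triple (s, p, o);
    CREATE INDEX triple_pos ON triple (p, o, s);
    INSERT INTO triple VALUES
        ('alice', 'knows',    'bob'),
        ('bob',   'lives_in', 'london'),
        ('alice', 'lives_in', 'london');
""")

# Who lives in london? Served by the (p, o, s) index.
who = [r[0] for r in con.execute(
    "SELECT s FROM triple WHERE p = 'lives_in' AND o = 'london' ORDER BY s"
)]
print(who)  # ['alice', 'bob']
```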

------
rikkus
If directed acyclic is sufficient for your use case, you can use this
approach, which has worked perfectly for me in the past:
[http://www.codeproject.com/Articles/22824/A-Model-to-Represe...](http://www.codeproject.com/Articles/22824/A-Model-to-Represent-Directed-Acyclic-Graphs-DAG-o)
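One common relational DAG model in this spirit is a transitive-closure table, where every ancestor/descendant pair is stored explicitly. A rough sketch (not necessarily the linked article's exact schema):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE closure (ancestor INTEGER, descendant INTEGER, depth INTEGER);
    -- every node starts as its own ancestor at depth 0
    INSERT INTO closure VALUES (1, 1, 0), (2, 2, 0), (3, 3, 0);
""")

def add_edge(parent, child):
    # Link every ancestor of `parent` to every descendant of `child`.
    con.execute("""
        INSERT INTO closure (ancestor, descendant, depth)
        SELECT a.ancestor, d.descendant, a.depth + d.depth + 1
        FROM closure a, closure d
        WHERE a.descendant = ? AND d.ancestor = ?
    """, (parent, child))

add_edge(1, 2)
add_edge(2, 3)

# All descendants of node 1, with a single non-recursive query:
desc = [r[0] for r in con.execute(
    "SELECT descendant FROM closure WHERE ancestor = 1 AND depth > 0 "
    "ORDER BY descendant"
)]
print(desc)  # [2, 3]
```

The trade-off is classic: reads become single indexable lookups, at the cost of extra rows maintained on every insert.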

~~~
beat
This is a really interesting approach, thanks! I have a relational-to-graph
problem that can be done as a DAG, and I was planning on using Postgres as a
source of truth and mirroring out the data to Neo4J just for queries, but I
think I might try this instead.

------
mikkergp
I use Neo4J and I'm curious what negative experiences you are referring to,
and what benefit this might have over it. The biggest problems I foresee are
querying and possibly data integrity: querying, since any kind of graph
traversal would probably require writing hacks in software, and data integrity
for much the same reason. It may be easy to ensure a basic relationship, but
beyond that I would think it would be more difficult.

~~~
sorenbs
I was never able to use Neo4J as a first level database on a busy website.
Simple traversals were just too slow. I ended up maintaining a simple in-
memory structure on a pair of machines behind a load balancer with all writes
going to sql and then propagated to the graph instances. I realise this
introduces issues around read-your-writes etc, but good enough for my use
case.

------
pbnjay
I've done this a few times. Depending on the size/density of your data, it can
be fine. If you have billions of entities and/or highly dense relations, it's
not very efficient though.

I didn't use table inheritance, and I don't distinguish relationships at the
query stage, so if that's core to your method you should think about it. I
assume with that setup it'd be easier to take advantage of partial indexes
though.

Recursive queries work, but if the intermediate table gets large it explodes
fast. You can't use aggregates in the recursive part either (e.g. to count the
number of leaf nodes in a tree); you have to apply them at the end, in which
case the intermediate table has to be large...

I've had decent experience so far with GIN indexes and @> operators on an
ARRAY[] column adjacency list. In my case I stored "ancestors" so I could
select by object id and get all ancestors, or use `ancestors @>
ARRAY[parent_id]` to select all object ids as descendants.

Of course, if you don't need any "self" relationships, then none of the really
complicated stuff matters...

~~~
jordanchan
Thanks for the feedback.

Indeed, one of the reasons behind inheriting tables was to use partial indexes
- which should help with performance. Another was to ease scaling out, if
needed.

Using the @> operators on array columns is something I also looked into,
mainly for materialized paths - I expect them to remain static, so no
expensive updates to the paths/arrays. But actually, I don't think I will need
many self relationships - that was the third reason for using inherited
tables, so I could split entities into separate groups.

Of course, I have no idea how any of it will work out: initially this was just
a semi-serious line of inquiry, and everything was conceptual until the
feedback from experienced people such as yourself on this thread, which has
led me to pursue it seriously and actually try to build it out. After reading
the responses here, I now think it will be worth the effort.

------
bobosha
I had a similar challenge and ended up using ElasticSearch with a thoroughly
denormalized data format. It allowed me the flexibility of stateful
(contextual) queries etc. Bit of a hack, but it served my purpose. Hope that
helps.

~~~
jordanchan
Yes, it might be a hack [actually I don't think it is], but it should offer
some powerful functionalities...

------
jerven
A lot of the early academic RDF graph databases used this approach. However,
performance was nowhere near what is required. This led the SPARQL/RDF graph
database world to a whole set of independently developed stores: some on top
of existing solutions, e.g. Oracle semnet, Virtuoso and DB2 sparql, and more
on their own solid foundations.

You could test out 3 or 4 different SPARQL solutions in the time that it would
take you to develop something graph like on your own.

On the other hand, cutting-edge approaches actually take a graph
representation of data and lay it out in a relational manner
([http://ercim-news.ercim.eu/en96/special/monetdb-rdf-discover...](http://ercim-news.ercim.eu/en96/special/monetdb-rdf-discovering-and-exploiting-the-emergent-schema-of-rdf-data)),
giving the best of both worlds.

In short, you can build something yourself. But don't expect that it will be
better than something built and supported by someone else.

So investigate the competition: BlazeGraph, Virtuoso, StarDog, Oracle 12c
EE/semnet, DB2, Dydra before deciding to build your own. Building your own
because it's fun is great, but unless it pays your bills it's not a good idea
for a production environment.

PS. The edge table (EAV) is the major problem: it leads to a lot of self-joins
and difficult exercises for the query planner.

You can improve a lot on this if you can put "different" edges into different
tables or partitions.

Triple stores with support for JSON-LD framing such as Dydra can also make it
easier to have a front-end on top of your DB without extensive middle layer
code.

Stores like StarDog and BlazeGraph, on the other hand, give you a lot of
flexibility by supporting both SPARQL and TinkerPop. (Both cluster and scale
out, although BlazeGraph has a GPL option; StarDog is commercial only.)

~~~
amirouche
The edge is not stored as EAV; the table hierarchy is:

> Node <- Entities

> Edge <- M2M

which is strange, since an edge is already a M2M.

Thanks for the monetdb link.

~~~
jerven
Ah ok, so in RDF terms each predicate gets its own table, which makes sense.
Then you have a subject foreign key relationship and an object foreign key
relationship to the node tables.

That would be better than one big table in performance, which was a major
problem in the RDF on SQL databases.

Of course those accepted any graph; if one constrains the number of possible
predicates/relations, then this solution could be more efficient.

------
newobj
[https://www.facebook.com/notes/facebook-engineering/tao-the-...](https://www.facebook.com/notes/facebook-engineering/tao-the-power-of-the-graph/10151525983993920)

~~~
tkinom
I am a fan of this API-based approach.

It is trivial to implement these APIs on top of simple SQL. I did that in a
few days in both Python and Go.

------
ladon86
No-one seems to have yet mentioned Facebook's TAO paper, which describes
scaling basically this approach to 1 billion reads per second.
[https://cs.uwaterloo.ca/~brecht/courses/854-Emerging-2014/re...](https://cs.uwaterloo.ca/~brecht/courses/854-Emerging-2014/readings/data-store/tao-facebook-distributed-datastore-atc-2013.pdf)

The nice thing about your approach is that you can shard your objects by their
id and your edges by their "from" id, and have all lookups go to one box when
you shard. Throw in some caching and it scales to multiple boxes really well.
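That co-location rule fits in a few lines. A sketch, where the shard count and the plain modulo scheme are invented for illustration:

```python
N_SHARDS = 4  # hypothetical shard count

def shard_for_object(object_id: int) -> int:
    # Objects are placed by their own id.
    return object_id % N_SHARDS

def shard_for_edge(from_id: int, to_id: int) -> int:
    # to_id is ignored on purpose: the edge lives with its source
    # node, so "node + all outgoing edges" is a one-shard lookup.
    return from_id % N_SHARDS

print(shard_for_object(10), shard_for_edge(10, 99))  # 2 2
```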

~~~
jordanchan
You said it!

Actually, Facebook's TAO was an early inspiration for me as well, but it is
really customized and complex too, so probably not something for a tiny team.
But the concepts from TAO are applicable to many similar scenarios.

------
russtrotter
One technique you might look into is "Materialized Path":

* usually one single query to fetch an entire tree or subtree

* more work on reconstructing the tree from the query result

* requires a prefix pattern matching operator, e.g. "like" from the RDBMS
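A tiny materialized-path sketch of the points above (path format and data invented; in Postgres you might use `text_pattern_ops` or the `ltree` extension rather than a plain LIKE):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE node (path TEXT PRIMARY KEY, name TEXT);
    INSERT INTO node VALUES
        ('1/',     'root'),
        ('1/2/',   'child'),
        ('1/2/3/', 'grandchild'),
        ('1/4/',   'other child');
""")

# One prefix query fetches an entire subtree in a single round trip.
subtree = [r[1] for r in con.execute(
    "SELECT path, name FROM node WHERE path LIKE '1/2/%' ORDER BY path"
)]
print(subtree)  # ['child', 'grandchild']
```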

~~~
jordanchan
Thanks! That's a useful technique..

------
rawnlq
Dropbox does something similar called Edgestore which is built on top of
MySQL: [https://www.youtube.com/watch?v=VZ-zJEWi-Vo](https://www.youtube.com/watch?v=VZ-zJEWi-Vo)

~~~
sorenbs
Facebook as well, with TAO:
[https://www.facebook.com/notes/facebook-engineering/tao-the-...](https://www.facebook.com/notes/facebook-engineering/tao-the-power-of-the-graph/10151525983993920)
I think they presented it at one of the @scale conferences.

------
amirouche
Building a graph database is time consuming. Why are you putting edge's
children (person, place, friend) in a separate table instead of using a json
field? Do you plan to query the person, place, friend table directly? You need
to benchmark the different queries.

My understanding is that the big5 and others do not use a graphdb for all
their stuff, since graphdbs are not as good as an RDBMS at queries with a
single hop or JOIN. They embrace microservices and maintain a graphdb
(in-memory, persistent or distributed) to answer domain-specific queries. That
approach is similar to your schema, except that graph queries run on a single
node without superfluous network roundtrips.

It would be nice to be able to use a single database for all data-related
stuff, to get atomic writes and a simpler architecture. That's what
multi-model databases are tackling; have a look at OrientDB and ArangoDB [1].

Also, Tinkerpop, already mentioned in the thread, is a ready-to-scale graphdb
with much love. I recommend you have a look at Tales of the Tinkerpop [2].

[1]
[https://news.ycombinator.com/item?id=10180185](https://news.ycombinator.com/item?id=10180185)

[2]
[https://news.ycombinator.com/item?id=10316140](https://news.ycombinator.com/item?id=10316140)

~~~
jordanchan
Of course, JSON is a good way to store ad hoc or new relations which haven't
yet been structured. But for frequently occurring relations, I prefer separate
tables over JSON for ease of querying and joins. If the number of
relationships is large, querying within a long JSON field is not the best
approach.

Multi model db's are a good idea indeed. They were the first things I had
considered, and extensively so - both OrientDB and ArangoDB. I came back to
traditional solutions simply for practical reasons of maturity, and ease of
finding professional support.

Someone here mentioned about a Tinkerpop implementation that uses Postgres as
the storage engine, I think that might offer the best of both worlds, but I am
yet to look into it.

------
3pt14159
I've done this before with MySQL in 2011 after throwing my hands in the air
with two different graph DBs.

By far the most productive thing I did. People will say that you need
recursive queries, etc., but I found that most of the time just keeping the
query set in the application layer was easy enough and super fast. Scaling was
easier too, because there are lots of articles on scaling MySQL/Postgres.
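Keeping the traversal in the application layer can be as simple as one batched query per hop; a sketch (schema and data invented), using sqlite3:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE edge (src INTEGER, dst INTEGER);
    INSERT INTO edge VALUES (1, 2), (2, 3), (3, 4), (2, 5);
""")

def neighbours_within(start, hops):
    """Breadth-first expansion: one batched IN (...) query per hop."""
    seen, frontier = {start}, [start]
    for _ in range(hops):
        if not frontier:
            break
        marks = ",".join("?" * len(frontier))
        rows = con.execute(
            "SELECT DISTINCT dst FROM edge WHERE src IN (%s)" % marks,
            frontier)
        frontier = [d for (d,) in rows if d not in seen]
        seen.update(frontier)
    return sorted(seen - {start})

print(neighbours_within(1, 2))  # [2, 3, 5]
```

For the shallow traversals most applications need, a hop count of 1-3 keeps this to a handful of round trips.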

~~~
jordanchan
Agreed, my reasons for coming back to Postgres are quite similar. In general,
one only needs to go a few layers in the graph. Recursive queries are needed
for specific use cases. Even then, querying from the app, like you said,
and/or building materialized views and materialized paths are very good ways
of working around this, as long as the query is anticipated in advance and the
view/path built before the query. So that reduces the real use of graph db's
to an even smaller subset - complex ad hoc graph queries.

------
swah
I'd do the same. I think this book is still relevant:
[http://www.amazon.com/SQL-Antipatterns-Programming-Pragmatic...](http://www.amazon.com/SQL-Antipatterns-Programming-Pragmatic-Programmers/dp/1934356557)

------
barakm
If you want to help hack on this, I've written a Postgres backend for the
graph database Cayley that's fairly fast and about to merge (for use on a prod
feature):
[https://github.com/google/cayley/pull/289](https://github.com/google/cayley/pull/289)

For traversals, it tries its level best to let Postgres create the appropriate
JOINs, which Postgres does well... most of the time anyway. In doing this, I
learned really fast how many weird semantic corners bog-standard SQL has,
and the optimization choices Postgres makes based on the way you formulate the
query.

Ping me on that PR! And/or join #cayley on Freenode

~~~
jordanchan
Thanks for sharing this! I would love to help with this - once I have gotten
up to speed on various things by building a graph on Postgres myself.

------
JoachimSchipper
What are you actually trying to do? If you just need some queries on a small
amount of read-only data, objects in memory (in a dedicated process, perhaps)
are a perfectly sane approach, and much faster than ping-ponging queries with
any database.

------
neurohax
I believe Redis and Hyperdex have constructs such as bitarrays and sets that
can be easily used for representing and querying graphs/hypergraphs. For Redis
there are some success stories and even ready to use extensions.

------
calpaterson
I don't know much about this but Joe Celko has a whole book on representing
trees in SQL: [http://www.amazon.co.uk/Hierarchies-Smarties-Kaufmann-Manage...](http://www.amazon.co.uk/Hierarchies-Smarties-Kaufmann-Management-Systems/dp/0123877334/)

The obvious first question would be: why are you representing your data as a
graph? Can you represent it better as sets of predicates (ie: relations)?

Using views should not reduce the number of queries: a view is just a query
with a name. If you can do it by combining views you can do it by combining
queries.

------
nrjames
I would love to see a SQLite extension to support graphs...

------
rottyguy
This is fairly timely as I'm investigating possibilities for our data model
and graphdb just popped up on my radar. Are gdb good for modeling multiple
types of products? Let's say v1 of product has attributes a, b, c but v2 now
adds x, y, z and removes a. Would that be 2 separate nodes?

I too am interested in understanding the downsides of a gdb (seeing some refs
to performance but nothing concrete).

~~~
jordanchan
Graph db's are a very good idea. It's just that the solutions in the market
are not very mature yet, and the community is not very large yet. Most of the
downsides stem from these two facts.

So unless the solution you're building has a specific need for graph queries -
in particular, complex, realtime and ad hoc graph queries, a traditional
solution is a safer bet.

You can make v1 and v2 two separate nodes if you need information about both
of them. In this case you just link each version to its attributes. If you
want the "latest" marked separately, you can have three nodes - one for
"current", and the other two for the individual iterations. So initially
"current" would link to a, b, c, and later link to x, y, z and drop a. Or, you
can have two nodes - current and previous iteration - and store the version
number of the latest iteration as a field within the "current" node.

------
madawan
Do you have a gist that implements this approach?

------
marknadal
I met with Kyle Kingsbury (Aphyr of the "Call Me Maybe" Jepsen database
testing series) in Berlin, and I recall him mentioning at one point that
Postgres is a good generic database, especially with the JSON blob storage. So
I am wondering if the extensions you propose might not be worth it, because
they will come with all that extra overhead.

I do however recommend exploring and even building your own graph database, it
is a great learning exercise. But don't fool yourself by doing it on top of
Postgres, actually jump in and become immersed. I built
[http://gunDB.io/](http://gunDB.io/) to scratch a similar itch, I wanted an
easy graph database with realtime updates like Firebase. I would love to hear
your feedback, or even have you contribute.

