Is there any information about this?
I wanted to build a system with tagged content and thought about using a graph DB. (Soft-)realtime queries etc.
Neo4j offers master-slave replication for efficient scaling of reads. Horizontal scaling of graph databases often involves partitioning, which is a hard problem and an active area of research.
I would say this however:
- If your data and query workload are a natural fit for the graph model, then the speedup you get offsets much of the advantage offered by horizontal write scalability in other DBMSs.
- A single Neo4j instance can store and query a very great deal of data indeed (in personal testing I have imported low 100s of millions of nodes, and I am given to understand it can go much further still). For many use cases this is sufficient.
But I will look into Neo4j, thanks :)
For instance (warning: contrived example ahead), if you wanted to say "Give me all people that live in Germany" then Germany would be a node (and Lives_In a relationship) rather than a property on each individual person node.
Graph databases are optimised for thinking about data in this way. So you might start your query at the node with the label Country and the name property Germany, then follow all connected Lives_In relationships and return the Person nodes at the other end. This considers far fewer nodes than looping through every node with the label Person and checking where each one lives.
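To make the "start at Germany and follow the relationships" idea concrete, here is a rough sketch in plain Python. It is only a toy in-memory graph, not how Neo4j stores or queries data, and the class and method names (TinyGraph, people_living_in) as well as the sample people are made up for illustration. The point is just that the lookup touches the Germany node and its Lives_In relationships, never the full set of Person nodes.

```python
from collections import defaultdict


class TinyGraph:
    """Toy property graph for illustration; not Neo4j's actual storage model."""

    def __init__(self):
        self.nodes = {}                    # node id -> (label, properties)
        self.incoming = defaultdict(list)  # target id -> [(rel type, source id)]
        self.by_label_name = {}            # (label, name) -> node id (assumes unique names)

    def add_node(self, node_id, label, **props):
        self.nodes[node_id] = (label, props)
        self.by_label_name[(label, props.get("name"))] = node_id

    def add_rel(self, source_id, rel_type, target_id):
        # Store the relationship on the target so it can be followed backwards.
        self.incoming[target_id].append((rel_type, source_id))

    def people_living_in(self, country_name):
        # Start at the single Country node, then walk only its Lives_In edges.
        country_id = self.by_label_name[("Country", country_name)]
        return sorted(
            self.nodes[source_id][1]["name"]
            for rel_type, source_id in self.incoming[country_id]
            if rel_type == "Lives_In"
        )


graph = TinyGraph()
graph.add_node(1, "Country", name="Germany")
graph.add_node(2, "Country", name="France")
graph.add_node(3, "Person", name="Anna")
graph.add_node(4, "Person", name="Ben")
graph.add_node(5, "Person", name="Chloe")
graph.add_rel(3, "Lives_In", 1)
graph.add_rel(4, "Lives_In", 1)
graph.add_rel(5, "Lives_In", 2)

residents = graph.people_living_in("Germany")
print(residents)  # ['Anna', 'Ben']
```

In Cypher the same question would look roughly like `MATCH (c:Country {name: 'Germany'})<-[:Lives_In]-(p:Person) RETURN p`, assuming the labels and relationship type from the example above.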
This will probably be the most flexible way.