
NoSQL Data Modeling Techniques - victorbojica
https://highlyscalable.wordpress.com/2012/03/01/nosql-data-modeling-techniques/
======
michaelmior
This post provides a great general-purpose overview of common data modeling
techniques. I'm going to take the opportunity to share some of my work in the
area. NoSE[0] is a tool I've been building to automate the data modeling
process for NoSQL systems. The models it produces cover many of the techniques
mentioned in the article.

The high-level idea is that, given a model of the data an application wants to
store and a description of its workload, NoSE will suggest a data model for a
particular NoSQL store. So far I've only tested this with Cassandra, although
most of the work to support MongoDB is in place as well.

Happy to work with anyone who may be interested in trying it out :)

[0] [https://michael.mior.ca/projects/NoSE/](https://michael.mior.ca/projects/NoSE/)

~~~
bogomipz
This is very interesting. What was the impetus for starting this, a particular
data migration?

~~~
michaelmior
The idea is that there are a number of complicated tradeoffs that must be made
when designing a schema. See [0] for an example; note that every rule comes
with an immediate caveat. The result is that you have to be an expert to come
up with a great design for a particular system. Even then, if your workload is
complex, you might miss a non-obvious choice that outperforms your manual
selection.

The goal of NoSE is to automate the process by estimating the cost of
executing a workload against a wide range of schema designs. We can then
select the one that is likely to perform best. As for data migration,
moving from a relational database is a reasonable use case. Some of the work
I'm doing right now looks at how you can transition between different
schemas in a denormalized database. However, our initial use case was to build
something that would allow a non-expert user to design a good schema.
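
To make the cost-based selection idea concrete, here is a minimal sketch in
Python. It is not NoSE's actual implementation or API: the function names,
query names, candidate schemas, and cost numbers are all hypothetical, and a
real tool would derive the per-query costs from a model of the target store
rather than hard-coding them.

```python
# Hypothetical workload: query name -> relative frequency (weights sum to 1).
WORKLOAD = {
    "get_user_by_id": 0.70,
    "list_orders_for_user": 0.25,
    "update_user_email": 0.05,
}

# Hypothetical candidates: schema name -> estimated cost of each query
# (for example, the expected number of partition reads or writes).
CANDIDATES = {
    "normalized": {
        "get_user_by_id": 1,
        "list_orders_for_user": 4,  # needs several lookups to join
        "update_user_email": 1,
    },
    "denormalized": {
        "get_user_by_id": 1,
        "list_orders_for_user": 1,  # orders stored alongside the user
        "update_user_email": 3,     # must update every duplicated copy
    },
}


def workload_cost(weights, costs):
    """Frequency-weighted sum of the estimated per-query costs."""
    return sum(weights[query] * costs[query] for query in weights)


def pick_schema(weights, candidates):
    """Return the name of the candidate with the lowest estimated cost."""
    return min(candidates, key=lambda name: workload_cost(weights, candidates[name]))


if __name__ == "__main__":
    for name, costs in CANDIDATES.items():
        print(f"{name}: {workload_cost(WORKLOAD, costs):.2f}")
    print("selected:", pick_schema(WORKLOAD, CANDIDATES))
```

With these made-up numbers the denormalized candidate wins because the
workload is read-heavy; shifting more weight onto the update query would tip
the choice the other way, which is the kind of tradeoff that is easy to
misjudge by hand.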

[0] [http://www.ebaytechblog.com/2012/07/16/cassandra-data-modeli...](http://www.ebaytechblog.com/2012/07/16/cassandra-data-modeling-best-practices-part-1/)

------
dacm
Should be noted in the title that this post is from 2012

------
bogomipz
This is a great survey of the modern data store landscape. The cartoon at the
top is hilarious as well. Thanks for sharing.

