

Ask HN: Modern OLAP Cube? - mrits

What are startups using as alternatives to Oracle and Microsoft for OLAP data?

The only decent open source project I can find is Mondrian on top of PostgreSQL.

I'm investigating cubes because I believe they fit my current problem. We have a star schema that we currently just run tons of different reports on. We take advantage of PostgreSQL materialized views and abuse the crap out of functions.

One of the larger problems we have is dealing with non-additive metrics. E.g., we might want to see how many distinct IP addresses appear in a time frame. Since "distinct" is not additive, we currently keep large sets around in the rollups so that they can be merged exactly. I've investigated HyperLogLog data structures to solve this problem, but I couldn't get product management on board with probabilistic data structures.

I'm looking forward to hearing what other people are experimenting with.
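
To make the non-additivity problem concrete, here is a minimal Python sketch. The rows, dates, and variable names are made up for illustration and are not taken from our schema. It shows why summing per-day distinct counts overcounts, and why keeping the sets in the rollup gives an exact answer at the cost of storage.

```python
from collections import defaultdict

# Hypothetical fact rows: (day, ip_address). Illustrative data only.
events = [
    ("2015-06-01", "10.0.0.1"),
    ("2015-06-01", "10.0.0.2"),
    ("2015-06-02", "10.0.0.2"),
    ("2015-06-02", "10.0.0.3"),
]

# Daily rollup that keeps the full set of IPs seen each day.
daily_sets = defaultdict(set)
for day, ip in events:
    daily_sets[day].add(ip)

# Daily rollup that keeps only the distinct count.
daily_counts = {day: len(ips) for day, ips in daily_sets.items()}

# Summing the daily counts double counts 10.0.0.2, which appears on both days.
naive_total = sum(daily_counts.values())

# Unioning the daily sets gives the exact distinct count for the whole range.
exact_total = len(set().union(*daily_sets.values()))

print(naive_total, exact_total)
```

The set-based rollup is exact but grows with cardinality, which is the trade-off a HyperLogLog sketch would replace with a fixed-size, approximate structure.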
======
greggyb
Unfortunately for you, my work causes me to focus on the Microsoft stack, so I
can't give any specific product recommendations.

That being said, technology-wise I can suggest a columnstore database to
address some of your needs, particularly the need to use the distinct
operator. The storage format of a columnstore lends itself to aggregation and
distinct operations, with multiple orders of magnitude improvements possible
compared to a traditional row-oriented SQL database.
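
As a rough illustration of why the format helps, here is a small Python sketch of dictionary encoding, one technique that columnstores commonly use. This is an assumption about the general approach rather than a description of any specific product, and the function and variable names are hypothetical.

```python
def dictionary_encode(values):
    """Map each distinct value to a small integer code (illustrative only)."""
    dictionary = {}
    codes = []
    for value in values:
        # Assign the next unused code the first time a value is seen.
        code = dictionary.setdefault(value, len(dictionary))
        codes.append(code)
    return dictionary, codes

# Hypothetical column of IP addresses, stored contiguously as one column.
ip_column = ["10.0.0.1", "10.0.0.2", "10.0.0.2", "10.0.0.3", "10.0.0.1"]
dictionary, codes = dictionary_encode(ip_column)

# Distinct over the whole column is just the dictionary size.
distinct_all = len(dictionary)

# Distinct over a row range only has to compare small integers, not strings.
distinct_slice = len(set(codes[1:3]))

print(distinct_all, distinct_slice)
```

Real engines add compression, segment metadata, and vectorized execution on top of this, so treat the sketch as intuition only.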

Additionally, traditional OLAP does not handle distinct operations all that
well to begin with, so a cube likely wouldn't help you much on that
front.

Further, you've not provided much detail about your dimensional model or your
use case. It's worth exploring different fact table structures. The Kimball
Data Warehouse books are always worth reading.

------
akbar501
Apache Kylin, created by eBay, is a modern cubes-on-HBase solution. However,
the overhead of running HBase is likely not worth it unless you have a data
size problem.

[http://www.kylin.io/](http://www.kylin.io/)

------
skrowl
I'm not sure what language you're targeting, but you might look at
[http://cubes.databrewery.org/](http://cubes.databrewery.org/)

