Data Brewery – Python Framework for Data Processing (databrewery.org)
96 points by dedalus on May 13, 2016 | 11 comments



I would be interested to know:

- whether these projects can easily be used with pandas, and

- whether some of their features are already in pandas.


Anyone?


So is Cubes an interface for Bubbles? I'm having a hard time figuring out the connection between them.


Author here. They are very independent. In short: Cubes is an OLAP framework for abstracting aggregate queries on top of a datastore, mostly through SQL. You define a so-called multi-dimensional model (how the analysts see the world), and Cubes knows how to project it onto the underlying data structures, preferably star or snowflake table schemas.
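To make the idea concrete, here is a minimal sketch of projecting a logical model onto a star schema. It does not use the Cubes API: the `MODEL` dictionary, the table and column names, and the `aggregate` function are all hypothetical, and SQLite from the standard library stands in for the datastore.

```python
import sqlite3

# Hypothetical star schema: one fact table and one dimension table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_date (id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
    CREATE TABLE fact_sales (date_id INTEGER REFERENCES dim_date(id), amount REAL);
    INSERT INTO dim_date VALUES (1, 2015, 12), (2, 2016, 1), (3, 2016, 2);
    INSERT INTO fact_sales VALUES (1, 10.0), (2, 20.0), (2, 5.0), (3, 7.5);
""")

# Hypothetical logical model: analyst-facing names mapped to physical columns.
MODEL = {
    "fact": "fact_sales",
    "measures": {"amount_sum": "SUM(f.amount)", "record_count": "COUNT(*)"},
    "dimensions": {
        "date": {"table": "dim_date", "key": "date_id", "levels": ["year", "month"]},
    },
}


def aggregate(conn, model, drilldown):
    """Aggregate every measure, grouped by the given (dimension, level) pairs."""
    joins, group_cols = [], []
    for dim_name, level in drilldown:
        dim = model["dimensions"][dim_name]
        if level not in dim["levels"]:
            raise ValueError(f"unknown level {level!r} in dimension {dim_name!r}")
        alias = f"d_{dim_name}"
        join = f"JOIN {dim['table']} AS {alias} ON f.{dim['key']} = {alias}.id"
        if join not in joins:  # join each dimension table only once
            joins.append(join)
        group_cols.append(f"{alias}.{level}")

    measures = [f"{expr} AS {name}" for name, expr in model["measures"].items()]
    sql = f"SELECT {', '.join(group_cols + measures)} FROM {model['fact']} AS f"
    if joins:
        sql += " " + " ".join(joins)
    if group_cols:
        cols = ", ".join(group_cols)
        sql += f" GROUP BY {cols} ORDER BY {cols}"
    return conn.execute(sql).fetchall()


by_year = aggregate(conn, MODEL, [("date", "year")])
total = aggregate(conn, MODEL, [])
```

The caller only names a dimension and a level; the join and the `GROUP BY` are derived from the model, which is the general shape of a ROLAP-style query layer.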

Bubbles was meant to be a data integration library (also known as ETL): get data from multiple sources, transform it, and produce tables that are more suitable for further analysis or reporting.
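The extract, transform, load flow described here can be sketched with the standard library alone. This is not the Bubbles API: the two in-memory sources, the `payments` table, and every function name below are assumptions made for illustration.

```python
import csv
import io
import sqlite3
from itertools import chain

# Two hypothetical sources with slightly different field conventions.
CSV_SOURCE = "name,amount\nalice,10\nbob,20\n"
RECORD_SOURCE = [{"Name": " carol ", "Amount": "7.5"}]


def extract_csv(text):
    """Yield one dict per CSV row."""
    yield from csv.DictReader(io.StringIO(text))


def extract_records(records):
    """Yield dicts from an already-parsed source, such as an API response."""
    yield from records


def transform(rows):
    """Normalize field names, tidy the name, and cast the amount to float."""
    for row in rows:
        clean = {key.lower(): value for key, value in row.items()}
        yield (clean["name"].strip().title(), float(clean["amount"]))


def load(conn, rows):
    """Write the transformed rows into a table suited to reporting."""
    conn.execute("CREATE TABLE IF NOT EXISTS payments (name TEXT, amount REAL)")
    conn.executemany("INSERT INTO payments VALUES (?, ?)", rows)
    conn.commit()


def run_pipeline(conn):
    """Chain the sources through transform and load them into one table."""
    sources = chain(extract_csv(CSV_SOURCE), extract_records(RECORD_SOURCE))
    load(conn, transform(sources))


etl_conn = sqlite3.connect(":memory:")
run_pipeline(etl_conn)
rows = etl_conn.execute("SELECT name, amount FROM payments ORDER BY name").fetchall()
```

Because each stage is a generator, rows stream through one at a time instead of being held in memory all at once.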


Cubes sounds like what is often called "ROLAP" in the enterprisey analytics world.

https://en.wikipedia.org/wiki/ROLAP


I work in the BI space and Cubes is exactly ROLAP.


It's been a little while since I've seen this. Is there still much active development? GitHub doesn't show many recent commits.


Author here. I admit it has been more than a year since I made a significant contribution. That roughly coincides with my change of job: since I started working for my current employer, I have had almost no time to work on either of the Data Brewery projects directly. All I have is a notebook full of new ideas and a mailbox of unanswered feature requests and bug reports. It makes me sad, but honestly I don't know how to proceed.

I really enjoyed working on Data Brewery, and I will eventually resume my work, as there is a lot to do. It has just been put on the back burner. In the meantime, I can only hope that someone will volunteer to at least handle bug fixes. I'm open to granting access to the repositories.

Any suggestions on how to keep an open-source project from dying are welcome.


What volumes of data has each of these solutions, Cubes and Bubbles, been tested against?

I work with enterprise tools to create ETL and OLAP solutions daily and really like what I see here, but I am concerned about performance, since it is all done in Python and performance is not Python's strength.

Do you have any more information regarding this?


As an interested user of your project, I would be more than happy to see a note on the GitHub project page saying that the project is temporarily on hold or abandoned. I'm a little annoyed every time I find an interesting project with no commits in the past year and other users asking in the "issues" section whether the project is dead, without getting an answer. I know it's open source and I know you do this in your free time, but still, just let me know.

And if you're looking for a maintainer to take over, state that on your page as well. Unfortunately, the outcome and success of a new project owner will not be in your hands. It's like your child growing up. :)


It doesn't look like it. The most recent blog post is from 2014.



