

Bamboo: An open-source real-time data analysis system - mfalcon
http://bamboo.io/docs/index.html

======
sz4kerto
I don't think the name is a good choice, since Atlassian's CI system has the
same name.

~~~
opinologo
They also have Stash, which is named after the "git stash" command. I guess we
can just call the CI system Atlassian Bamboo.

~~~
GilbertErik
Yes, and while Atlassian's Stash bears the same name as the git command,
bamboo.io has nothing to do with continuous integration or Wacom tablets or
any such related software. I'm going to keep calling Bamboo bamboo and if I
ever need to refer to this project, I'll call it bamboo.io.

------
ollysb
I'm currently really struggling against the limits of my analytics
provider (mixpanel), and this looks like it might be a great alternative (we've
already built our own GUI on top of mixpanel's engine). I don't recall seeing
any similar projects. Have I missed a good one, or is this the first of a new
breed?

~~~
philjr
Currently evaluating mixpanel... curious what you are struggling with?

~~~
ollysb
Admittedly it's a fairly advanced usage for analytics, but the limitation
we've found is that you can only use one value as a unique identifier. For
example, if you use a user's account number as the unique id for a visit and
then want to calculate the conversion rate in a funnel, all visits by that
user have the same id, so there's no way to calculate conversion rates per
visit. It means you have to choose your events and properties incredibly
carefully, and often you have to record the same events multiple times with
different values just so that you can run the calculations in the right way.
This would be fine except that mixpanel isn't exactly cheap per data point.
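
To make the per-user versus per-visit distinction concrete, here is a minimal sketch in plain Python. The event log, the ids, and the `conversion_rate` helper are all hypothetical and have nothing to do with mixpanel's actual API; step order within a funnel is also ignored for brevity.

```python
from collections import defaultdict

# Hypothetical event log: (user_id, visit_id, event_name).
events = [
    ("u1", "v1", "view_product"),
    ("u1", "v1", "checkout"),
    ("u1", "v2", "view_product"),
    ("u2", "v3", "view_product"),
]


def conversion_rate(events, key_index, first_step, last_step):
    """Share of distinct keys that reached last_step, among those that reached first_step.

    key_index selects which tuple field acts as the unique identifier
    (0 = user, 1 = visit). Event ordering is not checked.
    """
    steps_by_key = defaultdict(set)
    for event in events:
        steps_by_key[event[key_index]].add(event[2])
    entered = [k for k, steps in steps_by_key.items() if first_step in steps]
    converted = [k for k in entered if last_step in steps_by_key[k]]
    return len(converted) / len(entered) if entered else 0.0


# Same events, two different unique identifiers, two different answers.
per_user = conversion_rate(events, 0, "view_product", "checkout")
per_visit = conversion_rate(events, 1, "view_product", "checkout")
```

If the store only keeps one identifier per event, only one of these two numbers can be computed from it, which is why the same events end up being recorded more than once.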

Beyond this, the other major sticking point is that you can't update data
points. You have to get them right the first time, otherwise you need to blow
the whole database away and start again. Depending on your requirements this
can be an absolute killer. We've now resorted to using a version id in the
event names so that we can blow the old series away (via hiding) and then
repopulate using fresh data, which is far from ideal.
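
The version-id workaround can be sketched roughly as follows. `EventStore` is a hypothetical in-memory stand-in for an append-only analytics store, not mixpanel's API, and the `_v<N>` naming scheme is only an assumption for illustration.

```python
class EventStore:
    """In-memory stand-in for a store where data points cannot be edited."""

    def __init__(self):
        self.series = {}     # event name -> list of recorded data points
        self.hidden = set()  # event names hidden from reports

    def track(self, name, point):
        # Append only: there is deliberately no update or delete method.
        self.series.setdefault(name, []).append(point)

    def hide(self, name):
        self.hidden.add(name)

    def visible(self):
        return {n: pts for n, pts in self.series.items() if n not in self.hidden}


def versioned(name, version):
    """Embed a version id in the event name (hypothetical naming scheme)."""
    return f"{name}_v{version}"


store = EventStore()
store.track(versioned("signup", 1), {"plan": "fre"})   # bad data point, cannot be fixed
store.hide(versioned("signup", 1))                     # "blow away" the old series by hiding it
store.track(versioned("signup", 2), {"plan": "free"})  # repopulate under a new version
```

The old series is still stored, only hidden, so every correction costs a full re-send of the data.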

It's really frustrating because overall I think mixpanel has a fantastic
product. Not being able to update data points and only being able to use one
property as a unique identifier is proving to be a real PITA though. People
have been asking for these things for years as well and there just doesn't
seem to be any willingness in the team to address these fairly fundamental
issues. I'd absolutely love to keep using them but having seen bamboo.io it's
starting to get more difficult to justify the loyalty.

------
mrcactu5
There is also LIBRE = Libre Information Batch Restructuring Engine
[https://github.com/commonwealth-of-puerto-rico/libre/](https://github.com/commonwealth-of-puerto-rico/libre/)

------
alphakappa
What am I missing on the example [1] page? It claims to be interactive, but
all I see is a static map image.

1\. [http://bamboo.io/docs/examples.html#an-interactive-map-of-th...](http://bamboo.io/docs/examples.html#an-interactive-map-of-the-egyptian-election)

~~~
pldpld
Hi alphakappa, we had used an image because the original did not have a
permanent home, but now it does. The image has been replaced with an iframe:
[http://bamboo.io/docs/examples.html#an-interactive-map-of-th...](http://bamboo.io/docs/examples.html#an-interactive-map-of-the-egyptian-election).
The map site itself is at
[http://sel-columbia.github.io/egypt-2012/](http://sel-columbia.github.io/egypt-2012/)

------
desireco42
I meant to say, this project is probably an excellent idea that is struggling
to get across... now I see it is open source.

It could use a site that explains more quickly what it does and whom it is
targeted at. There is definitely a need for this; it just needs to be
presented better.

------
lerchmo
How is the data stored? Is it just aggregated with mongodb's "aggregate"
command?

~~~
pldpld
Hi lerchmo, writer of bamboo here. The data is stored column-wise in mongo and
aggregates are calculated with pandas, hence the name of the library. This is
a prototype, and using pandas let us quickly perform many common statistical
operations. In a future version we plan to decouple the processing engine so
we can push calculations into the db layer as appropriate.
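
As a rough illustration of "stored column-wise, aggregated in the application layer", here is a sketch in plain Python. It is not bamboo's actual schema or code: the column names and helpers are made up, and a dict of lists stands in for both the mongo documents and the pandas frame.

```python
from collections import defaultdict

# Row-oriented records as they might arrive (hypothetical data).
rows = [
    {"region": "north", "votes": 10},
    {"region": "south", "votes": 7},
    {"region": "north", "votes": 5},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list per column.

    Assumes every row has the same keys, so all columns stay aligned.
    """
    columns = defaultdict(list)
    for row in rows:
        for name, value in row.items():
            columns[name].append(value)
    return dict(columns)


def sum_by(columns, group_col, value_col):
    """Group-by sum computed outside the database, over two aligned columns."""
    totals = defaultdict(int)
    for group, value in zip(columns[group_col], columns[value_col]):
        totals[group] += value
    return dict(totals)


columns = to_columns(rows)
totals = sum_by(columns, "region", "votes")
```

With pandas the aggregate step would be a group-by followed by a sum; pushing it into the db layer instead would mean expressing the same grouping as a database query.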

