I’m part of the Muze team at Charts.com. Over the years I’ve seen lots of people struggle to find the right balance between a low-level visualization kernel (like d3) and black-box configurable charts (HighCharts, FusionCharts).
So we decided to build Muze with a data-first approach: you load your data into an in-browser DataModel, run relational-algebra data operators to get the right subset of data, and then just pass it to the Muze engine, which automatically renders the best visualization for it.
Any changes to the data (including application of data operations) automatically update the visualization, without you having to do anything else.
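To make that flow concrete, here is a minimal sketch of the idea in plain JavaScript. This is a conceptual toy, not Muze's actual DataModel API: an immutable table, relational operators that derive new instances, and a subset you'd then hand to the renderer.

```javascript
// Conceptual sketch only -- not Muze's actual API. It illustrates the
// "data-first" flow: wrap rows in an immutable model, derive subsets with
// relational operators, then pass the result to a rendering engine.
class TinyDataModel {
  constructor(rows) { this.rows = rows; }
  // select: relational selection -- keep rows matching a predicate
  select(pred) { return new TinyDataModel(this.rows.filter(pred)); }
  // project: relational projection -- keep only the named fields
  project(fields) {
    return new TinyDataModel(this.rows.map(r =>
      Object.fromEntries(fields.map(f => [f, r[f]]))));
  }
}

const sales = new TinyDataModel([
  { region: 'East', year: 2017, revenue: 120 },
  { region: 'West', year: 2017, revenue: 90 },
  { region: 'East', year: 2018, revenue: 150 },
]);

// Derive the subset you want, then hand it to the rendering engine.
const east = sales.select(r => r.region === 'East').project(['year', 'revenue']);
console.log(east.rows); // [{ year: 2017, revenue: 120 }, { year: 2018, revenue: 150 }]
```

Each operator returns a new instance rather than mutating in place, which is what makes the "changes to data automatically update the visualization" behavior tractable.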
A couple of added benefits:
- With other libraries, if you have to connect multiple charts (for cross-interactivity, drill-down, etc.), you have to write the ‘glue’ code manually. With Muze, all charts rendered from the same DataModel are automatically connected (enabling cross-filtering).
- Muze supports faceting of data out of the box with a multi-grid layout.
- Composability of visualizations lets you create any kind of Cartesian visualization with Muze, without having to wait for the charting-library vendor to release it as a ‘new chart type’.
- Muze exposes a developer-first API for enabling interactivity and customization. You can use the low-level API to create complex interactions.
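As a rough illustration of the composability bullet (hypothetical object shapes, not Muze's actual API): in a grammar-of-graphics engine, a "chart type" is just a stack of mark layers over shared encodings, so new chart types are composed rather than shipped by a vendor.

```javascript
// Hypothetical sketch of grammar-of-graphics layer composition -- not
// Muze's actual API. A chart is a stack of mark layers; a "bar chart with
// a target line" is composed from primitives, not a built-in chart type.
const barLayer  = { mark: 'bar',  encoding: { x: 'month', y: 'sales' } };
const lineLayer = { mark: 'line', encoding: { x: 'month', y: 'target' } };

function compose(...layers) {
  // A real renderer would draw the layers back-to-front on shared scales.
  return { layers };
}

const barWithTarget = compose(barLayer, lineLayer);
console.log(barWithTarget.layers.map(l => l.mark)); // ['bar', 'line']
```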
We literally just launched this in the last month or so, so I’d love some feedback if you can spare the time.
Thanks for taking a look!
This is the biggest pain point for me with most current solutions. Either development time is super fast (e.g., Tableau, Periscope) but going beyond 80% is difficult, or development time is much longer (e.g., d3 or APIs built on it) but you get full customization and getting to 100% is straightforward. For me, there is certainly a need to develop an 80% solution fast, but I also always want to then redo the whole thing with a lower-level solution. I would prefer to piggyback off the 80% solution to get to 100% in the same software. That's a huge win for me. Thanks for providing a solution to this end; I will definitely play around with this.
In a perfect world, how would the charting in Periscope work? How would you want to go from 80% very fast to 100%, in the same software?
A problem I encountered (granted, over a year ago) was creating grouped bar charts with confidence intervals. Bars were grouped on some discrete x-axis labels. The suggested solution for confidence intervals on grouped bars was to use a scatter plot to draw them, but this clumped them all at the x-label position, not at the center of each bar. matplotlib, for example, treats the visualization as an object, in which case it makes a lot of sense that to add confidence intervals you just query the bar objects for their positions and place line segments of the desired widths at the center of the tops of the bars (or wherever; you have full control over this). So, in general, I want a marriage of these two paradigms: quick development of a visualization based on data, but then the ability to switch to viewing and manipulating the visualization as a collection of instantiated objects with full control over their attributes.
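That object-model workflow can be sketched without any plotting library. The geometry below (illustrative function names, plain JavaScript) shows why centering CI segments means querying each bar's position rather than reusing the group label's position:

```javascript
// Sketch of the "visualization as objects" idea: lay out grouped bars,
// then query each bar's geometry to center a confidence-interval segment
// on it. All names are illustrative; no plotting library is involved.
function layoutGroupedBars(groups, seriesCount, groupWidth = 100, gap = 20) {
  const barWidth = groupWidth / seriesCount;
  const bars = [];
  groups.forEach((g, gi) => {
    for (let si = 0; si < seriesCount; si++) {
      const x = gi * (groupWidth + gap) + si * barWidth;
      bars.push({ group: g, series: si, x, width: barWidth });
    }
  });
  return bars;
}

// Place each CI at the horizontal center of its own bar,
// not at the shared x-label position of the group.
function ciSegments(bars, halfWidth = 6) {
  return bars.map(b => {
    const cx = b.x + b.width / 2;
    return { group: b.group, series: b.series, x1: cx - halfWidth, x2: cx + halfWidth };
  });
}

const bars = layoutGroupedBars(['A', 'B'], 2);
const cis = ciSegments(bars);
console.log(cis[0]); // { group: 'A', series: 0, x1: 19, x2: 31 }
```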
I am open to revisiting, given any developments Periscope has made to this end.
Yeah, the hack you described for CIs is typical of "80% charting". We have a list of probably thousands of longtail visualization requests and we're way past the 80/20 point.
These days customers who want to go 100% use the Python/R editors and do their custom visualization there. So you do your SQL query like usual, but then pipe it to Python/R for the visualization. Have you tried that, and has it worked for you? Or do you prefer another model?
Now, I'm not saying that directly applies to your problems, but there's good reason this is a saying.
Going from 80% to 100% will take as long as, or longer than, going from 0 to 80%, because the last 20% is all edge cases that are a nightmare to test and mentally exhausting to comprehend.
Sometimes it's just never cost-effective to go from 80% to 100%, and it's best to leave it as is, even if some manual work is left over.
Am I missing something?
Does this help?
This will be very helpful for cases that use large datasets...
I built a visualization using dc.js, and working with large datasets was the biggest pain point for me.
We would love to know your:
- use case
- number of data points
- ops on data on the server side
You can mail us at firstname.lastname@example.org
To build the visualization, I used 3 datasets in CSV format from a Kaggle competition, and I implemented the charts using dc.js and Leaflet.js. The charts were interactive, and I managed to filter the data even in the map.
The largest dataset was 284 MB, which was still ok and didn't crash my browser.
There were 2 drawbacks to my approach: 1) All the data was in the browser; if my data were bigger (~1 GB), it would crash my browser. 2) If I deployed the visualization to a server (for example AWS), rendering would be extremely slow, as it has to download all the data to the browser...
So here is the thing with our DataModel. Every time you perform an op on a DataModel, it creates another instance. Performing multiple such operations creates a DAG where each node is an instance of DataModel and each edge is an operation.
We have auto interactivity, which propagates data (dimension) pulses along the network. Any node which is attached to a visualization receives those pulses and changes the visual.
So far I have not found any relational interface which exposes this DAG and an API for it to the user. Hence we thought of building this.
Having said that, we might use some established relational interface and do the propagation ourselves.
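For anyone curious, the DAG-plus-pulse idea can be sketched in a few lines of plain JavaScript. These are illustrative names only, not Muze's internals: each operation derives a child node from its parent, and a pulse walks the graph so any node with a visual attached can react.

```javascript
// Minimal sketch (not Muze's internals) of the DataModel DAG: every
// operation derives a new node, and a "pulse" propagates down the graph
// so each node attached to a visual can update it.
class Node {
  constructor(rows, parent = null) {
    this.rows = rows;
    this.children = [];
    this.onPulse = null;               // a visual attaches a callback here
    if (parent) parent.children.push(this);
  }
  // Relational selection: derives a child node instead of mutating in place.
  select(pred) { return new Node(this.rows.filter(pred), this); }
  // Visits each node exactly once: O(N) in the number of nodes.
  propagate(pulse) {
    if (this.onPulse) this.onPulse(pulse);
    this.children.forEach(c => c.propagate(pulse));
  }
}

const root = new Node([{ city: 'NY', v: 1 }, { city: 'SF', v: 2 }]);
const ny = root.select(r => r.city === 'NY');   // edge = operation, node = instance

const updated = [];
ny.onPulse = p => updated.push(p.city);         // pretend a chart re-renders here

root.propagate({ city: 'SF' });                 // e.g. a user clicks 'SF' somewhere
console.log(updated); // ['SF'] -- the attached visual received the pulse
```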
The thing that always struck me about Power BI (and also Qlik) is that it is very much a model-first tool. Visualization is secondary to the model, to the extent that much of the friction I see in new users has been treating it as a reporting/layout/visualization tool when, in fact, it is a data modeling tool with a visualization engine strapped on.
One of the big drawbacks with Power BI is that it has a terribly inefficient implementation for propagating filter contexts for visual interactions (this is their translation of your "auto interactivity, which propagates data (dimensions) pulse along the network"). I do not know the internal implementation, but I am relatively certain that visual interactions are ~O(N!) in the number of visual elements on a report page, based on my experience of performance scaling across a wide range of reports. Regardless, one of the best practices is to limit a Power BI report page to a small number of visualizations (recommendations of the cutoff value vary, and types of visuals can also impact this).
If I understand you correctly, you are calculating the minimum set of recalculations/re-renderings necessary, based on the data element that a user has interacted with. This should be something much closer to O(N) in the number of visuals to propagate user selections to other visuals. I am making an assumption that most visuals should interact, as typically the scope of a single report should have a high degree of intersection of dimensionality across all report elements.
I do not know of any analytics engine that exposes the sort of DAG and associated API you are discussing, either. The reason for my initial question was simply because that sort of engine is a product in and of itself. There are plenty of columnstore databases (and following other paradigms, but optimized for OLAP workloads) out there. It seems like biting off a lot to tackle both the data engine and the visualization tier at the same time.
The big reason that I ask is that this sort of approach to visualization seems to me to benefit greatly from a data model that supports transaction-level detail. The type of interactivity that you expose is extremely powerful. I have seen interactive tools hamstrung by data models that do not allow sufficient interaction. As soon as you put interactivity in front of users, in my experience, they want to do more with the data. If you are limited to datasets that can live comfortably in the browser, that seems a showstopper to me, as it will require pre-aggregation to fit most of the datasets I've seen; pre-aggregation negates many benefits of interactive data exploration.
I'll be taking a much further dive into your product either this weekend or next. I'm very interested.
Which is why performing this in a browser environment, even for a small amount of data (say 10k rows), is a nightmare. There are ways you can address this, but in the browser you hit the limit pretty soon.
We wanted to validate the concept first, hence we have built it for the browser only. But we would love to hear / learn / discuss this with you before we go ahead and build the data model on the server.
Another ambiguity with interaction is the visual effect of an interaction. Questions like: do you really want all your charts to be cross-connected? An in-house survey showed us there is no certain answer. And what kind of visual effect should happen on interaction differs from person to person and is a function of the use case. Which is why we have chosen to make the behaviour configurable, like:
    muze.ActionModel.for(...canvases)      /* for all the charts on the page */
        .enableCrossInteractivity()        /* allow default cross interactivity */
        .for(tweetsByDay, tweetsByDate)    /* but only for the first two canvases in the example */
        ...
    })                                     /* on selection (mouse click or brush), filter the data */
We are still writing docs for this. We hope to finish all these docs in two weeks' time.
You're hitting a very important question in your fourth paragraph about ambiguity of desired effect from interaction. I often catch myself thinking I've heard every use case and built most of them in various viz tools. But I have learned that I am always wrong when I think that. I frequently encounter people asking for new things and it is always a toss-up whether what they want is trivial and novel or impossible and obvious.
I tend to be a data-guy much more than a viz-guy, but I fully understand the value of viz for actually presenting knowledge. Like I said, I'm interested in trying out your tool more.
WebAssembly is on our radar and is coming soon. But we just wanted to release a super-early version of what we've built so far.
Will figure out the effort and roadmap and then keep you updated on the plan.
If so, I think this would definitely be worthy of a blog post on exactly how you did it.
Every time I search for domain names, they are all taken by squatters who want 5- to 6-figure amounts and have been sitting on them for more than 5 years, 10 in some cases.
Of course I email them back laughing, but it does mean I have to choose something completely different.
I'm definitely not going to do a snappa.io-to-snappa.com move, lol.
I went through the tutorial and I have to say... oh man, this is amazing. Building a Tableau clone is now possible! I hope you guys don't go under, because it's going to take me a while, but I'm super excited!
Does this work on mobile? Also, when I click "Play" the chart takes at least 1-2 seconds to render. Is that just your code-running engine, or does every visualization have that lag?
The web framework fetches the data, does some additional checking on the data and schema, processes the visible code, and renders it. That is probably why you are seeing the lag.
Also, there are a few areas where Muze's performance needs to be improved. We are doing a release to address this soon.
Plotly seems like it's just the charting/graphing layer. A common use case (and increasingly an expectation) is that a series of graphs on a single page be responsive and cross-filterable. For instance, if you click on a single element in one chart, it should filter the related charts accordingly. Additionally, these filters should build on each other, and the developer/analyst should be able to define that.
Really, you're now doing a form of data modelling and in the domain of BI, and Plotly isn't going to help you figure any of this out.
Tableau and Power BI have gotten traction by building products that not only include but prioritize this form of modelling. Once you define your data model, you get the multi dimensional charting for free.
The appeal to me of this library (not having done a thorough G2) is an open source alternative to those products that integrates charts and data modelling easily.
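The cross-filtering contract described above can be sketched as a shared store (hypothetical names, no library assumed): a click publishes a predicate, predicates from different charts compose, and every linked chart re-renders against the combined subset.

```javascript
// Hypothetical sketch of cross-filtering: charts share one store; a click
// publishes a filter predicate, predicates compose, and the other charts
// re-render against the combined subset. Illustrative names only.
function createFilterStore(rows) {
  const filters = new Map();               // source id -> predicate
  const charts = [];
  const current = () =>
    rows.filter(r => [...filters.values()].every(pred => pred(r)));
  return {
    register(chart) { charts.push(chart); },
    clickFilter(sourceId, pred) {          // filters build on each other
      filters.set(sourceId, pred);
      charts.forEach(c => { if (c.id !== sourceId) c.render(current()); });
    },
  };
}

const rows = [
  { region: 'East', year: 2017 }, { region: 'East', year: 2018 },
  { region: 'West', year: 2018 },
];
const store = createFilterStore(rows);
let seen = [];
store.register({ id: 'byYear', render: data => { seen = data; } });
store.register({ id: 'byRegion', render: () => {} });

store.clickFilter('byRegion', r => r.region === 'East');  // click a bar
store.clickFilter('byMap', r => r.year === 2018);         // second filter composes
console.log(seen.length); // 1 -- only East/2018 survives both filters
```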
- you feel you had a hard time achieving with Plotly?
- a feature you wanted that is not supported by Plotly?
I'd recommend pointing out the Plotly features you'd like to improve and seeing if the Muze team has solutions/improvements.
Vega is a descriptive version of d3. We found it hard to debug and to create complex viz with.
Vega-Lite is a more concise and intuitive version of Vega, though.
However, Muze was created to start directly from data, with layout creation, composable layers, automatic cross-interaction, and a robust interaction mental model. Muze is inspired by the Vega-Lite InfoVis paper and the Snap-Together Visualization paper.
The Vega-Lite paper and the layered grammar of graphics were the biggest motivations for writing Muze. Vega-Lite is still my go-to viz library for my IPython and JS work. Hence there are healthy intersections between Vega-Lite and Muze terminologies and concepts.
Would definitely love to check out new developments.
A number of the charts in the examples are cut off (height-wise). https://www.charts.com/muze/examples/view/heatmap for example (in Chrome 69). Seems fine in Firefox.
For the time being, you can click the play button in the top-right corner of the code section.
The codebase is being migrated to Polymer 2.0, along with better documentation.
I do not see any reason why this could not work with your API. If you're interested, some concrete/simpler examples are available at https://github.com/PolymerEl/multi-chart (also being ported / simplified; ETA next week).
"Project" is also a fairly common term to refer to open source tools and the community around them. See the nav bars for linuxfoundation.org or numfocus.org, for instance.
We will be making more products in the future, some OS, some paid. So yes, monetization is on the cards :) - but our first goal is to create really good value!
Also, what is the Reactjs story here? We went down the "tabular views" journey a while back and eventually settled on react-virtualized, which I think is the best of breed.
However, we will make sure it gets tested with all Linux distros before the next release.
Are you looking for integration?
> Also what is the Reactjs story here ?
Off-topic question, apologies: where and how can one create the start-page animation (the one in 3D, with the moving impulses and floating charts)?
explains some parts of Muze's composability. And this is an example: https://www.charts.com/muze/examples/view/composition-of-lay...
Apart from composable layers, Muze has:
- tabular layout (visual crosstab) created from data facets https://www.charts.com/muze/examples/view/crosstab-chart
- auto interactions https://www.charts.com/muze/examples/view/crossfiltering-wit...
- legends on any chart https://www.charts.com/muze/examples/view/gradient-legend
The website could be a bit more responsive. The charts overlap if I make the browser narrower.
Will upload the docs for this soon.
Is that required, or is there some way to generate and display these without the iframe?
However, not every operation supports formula storing. Operations like joining and grouping create new data.
We are updating the docs rapidly. All this info will be in the docs soon.
Muze is data-first. You start with data, apply any operations (if needed), then render; Muze automatically detects the right chart and renders it. Muze also lets you compose any kind of Cartesian visualization, as it follows the grammar of graphics.
So if I had to place these on a spectrum, it goes like this:
- d3 (very powerful, high learning curve, you can do anything)
- Muze (data-first, grammar-of-graphics oriented, compose any viz)
- FusionCharts (chart-first, lots of depth in configuration, but you can't extend it yourself with new chart types)
Hope this helps.