Hacker News
Facets: An Open Source Visualization Tool for Machine Learning Training Data (googleblog.com)
181 points by stablemap on July 17, 2017 | 9 comments

Very impressed to see the confusion matrix consist of the actual images in that deep-zoom style of rendering. We've implemented something similar in spirit in an image-processing machine learning application, but instead I have a traditional confusion matrix whose counts are "<a>" anchor links to a web page that displays all the constituent images. Nice work, Facets team.
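For anyone who wants the simpler linked-counts variant described above, here is a minimal sketch using only the Python standard library. It is not the commenter's actual code: the function name `confusion_table_html`, the `/images` base URL, and the `true`/`pred` query parameters are all hypothetical, and the page that lists the images behind each cell is assumed to exist elsewhere.

```python
from collections import Counter
from html import escape
from urllib.parse import urlencode


def confusion_table_html(y_true, y_pred, labels, base_url="/images"):
    """Render a confusion matrix as an HTML table whose counts are links.

    base_url and the true/pred query parameters are hypothetical; point
    them at whatever page lists the images for a given cell.
    """
    # Count how often each (true label, predicted label) pair occurs.
    counts = Counter(zip(y_true, y_pred))

    header = "".join(f"<th>{escape(p)}</th>" for p in labels)
    rows = ["<table>", f"<tr><th>true / predicted</th>{header}</tr>"]
    for t in labels:
        cells = []
        for p in labels:
            n = counts[(t, p)]  # Counter returns 0 for unseen pairs
            query = urlencode({"true": t, "pred": p})
            href = escape(f"{base_url}?{query}")  # escape & for HTML
            cells.append(f'<td><a href="{href}">{n}</a></td>')
        rows.append(f"<tr><th>{escape(t)}</th>{''.join(cells)}</tr>")
    rows.append("</table>")
    return "\n".join(rows)


if __name__ == "__main__":
    y_true = ["cat", "cat", "dog", "dog", "dog"]
    y_pred = ["cat", "dog", "dog", "dog", "cat"]
    print(confusion_table_html(y_true, y_pred, ["cat", "dog"]))
```

Each cell links to a filtered view, so the matrix stays compact and the images load only when someone drills in.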

I particularly like this language here... "Dive is a tool for interactively exploring up to tens of thousands of multidimensional data points, allowing users to seamlessly switch between a high-level overview and low-level details. ...Dive makes it easy to spot patterns and outliers in complex data sets." https://github.com/pair-code/facets#facets-dive

That's key functionality for drilling into our data with powerful, navigable dashboards and visualization tools. We're creating this seamless transition with some Python, Flask, and Bokeh tooling, but nothing as impressive as Facets. We've cued in all the domain-specific things of interest, but it's nice to see a general-purpose feature set on display with Facets.

I'm impressed. I would certainly have loved this when I was a practicing "data miner" (I'm old-ish). Anything that allowed me to better understand my data and models was welcome. I used DataDesk (https://datadesk.com) back then.

This tool should be useful to classic statistical modeling as well as DL.

Given how easy it is to point-click-slide complex data transforms into existence, there's also a danger of mucking things up and creating overfits and so on. But that's a minor consideration given the rather obvious benefits.

This looks amazing.

And... I keep waiting for MS to provide an add-in to Excel that will allow ML analysis and similar visualization.

Even better, someone could beat MS to it and do one for LibreOffice Calc.

They've got all the building blocks, really. For instance, check out SandDance: https://www.microsoft.com/en-us/research/project/sanddance/

It's just that Microsoft is so focused on getting you into Azure services that they're pushing these capabilities up there instead.

What you've said makes a lot of sense. There's a sea of researchers out there who do extremely complex PCA and all kinds of wizardry with Excel, and mountains of complex systems are built atop Excel both in academia and on Wall Street. MS really needs to realize that their moves in the Excel space could be huge.

Disclosure: I work at Google.

This is one of many things coming up that help make the adoption of ML easier. We'd love to hear more about what else we can do and/or what problems you're running into as you adopt machine learning (either TensorFlow-based or something else). Thanks!

I am curious about Jupyter support. It seems like it can run as a plugin to Jupyter, which would be ideal. I had a quick go at installing the Jupyter plugin, but the documentation is a bit lacking once you've installed it.

Here's how to use Facets Dive inside Jupyter Notebook.

1. Download and install as explained in "Enabling Usage in Jupyter Notebooks" at https://github.com/pair-code/facets

2. Open a new notebook and copy and paste the code from here: https://github.com/PAIR-code/facets/blob/master/facets_dive/...
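As a rough idea of what such a notebook cell looks like, here is a minimal sketch. It is not the code behind the truncated link above. It assumes the extension from step 1 is served at `/nbextensions/facets-dist/facets-jupyter.html` and exposes a `<facets-dive>` element with a `data` property; check both against the Facets README for your install. The helper name `facets_dive_html` and the sample records are made up for illustration.

```python
import json

# Assumption: where the Facets Jupyter extension is served after install.
FACETS_IMPORT = "/nbextensions/facets-dist/facets-jupyter.html"


def facets_dive_html(records, elem_id="elem", height=600):
    """Build the HTML that loads Facets Dive and hands it a list of dicts."""
    # Serialize the records; escape "</" so that a value containing
    # "</script>" cannot close the script tag early ("\/" is valid JSON).
    data_json = json.dumps(records).replace("</", "<\\/")
    return f"""<link rel="import" href="{FACETS_IMPORT}">
<facets-dive id="{elem_id}" height="{height}"></facets-dive>
<script>
  document.querySelector("#{elem_id}").data = {data_json};
</script>"""


if __name__ == "__main__":
    # Hypothetical sample data: one dict per data point.
    records = [
        {"label": "cat", "score": 0.9},
        {"label": "dog", "score": 0.4},
    ]
    html = facets_dive_html(records)
    try:
        # Inside a notebook this renders the visualization in the cell output.
        from IPython.display import HTML, display
        display(HTML(html))
    except ImportError:
        # Outside Jupyter, just show the generated markup.
        print(html)
```

If your data is in a pandas DataFrame, `json.loads(df.to_json(orient="records"))` gives a list of dicts in the shape this helper expects.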

It's interesting that I can't see any text on Google Blogs on mobile - Safari (iOS) with ad blocker "Focus". I can see the article on Chrome iOS though.
