
Microsoft’s visual data explorer SandDance open sourced - slowhand09
https://cloudblogs.microsoft.com/opensource/2019/10/10/microsoft-open-sources-sanddance-visual-data-exploration-tool/
======
devy
This is a repost from
[https://news.ycombinator.com/item?id=21224685](https://news.ycombinator.com/item?id=21224685)
(posted 7 days ago). Also, this wasn't the original news from Microsoft.

~~~
slowhand09
I didn't see it on YC, nor did I see it at MS. I saw it on Flowing Data. I
found it thru them, thus I posted it thru them. If they get left out, perhaps
they aren't needed, and thus we all have to do more legwork to find
interesting things. JMHO.

~~~
devy
That's fine - most of the time we discover information through a 3rd party,
not the origin.

But the HN guidelines[1] ask that you submit the original source.

    
    
    Please submit the original source.
    If a post reports on something found on another site, submit the latter.
    

Also, reposting the same news a week later adds little for people who already
saw it and may split the conversation across different threads. Occasionally
the mods allow news of great importance to be reposted, but I don't think
that is the case here, IMO.

[1]:
[https://news.ycombinator.com/newsguidelines.html](https://news.ycombinator.com/newsguidelines.html)

~~~
slowhand09
Posting it may be a repost to you, but not to me or others who missed it the
1st time (or 4th time) around. Diverting conversations... really?

~~~
dang
There are different ways of looking at this, of course, but HN's rules are
clear: (1) please post the original (as devy quoted above), and (2) a post is
a dupe if the story has had significant attention in the last year or so
([https://news.ycombinator.com/newsfaq.html](https://news.ycombinator.com/newsfaq.html)).

These rules maximize the overall quality of the site. It's certainly true that
not everybody sees every post; but if we went with that logic, there would be
no such thing as a duplicate at all. That would be bad. Front page space is
the scarcest resource on HN, and needs to be distributed across as many high-
quality stories as possible.

A great tool for running across interesting stories that you didn't see on HN
the first time round is the 'past' link in the top bar, which lets you browse
HN's top stories day by day.

------
ageitgey
Not sure why they put out a press release about this now. This has been out
on GitHub for like 6 months, right?

If you haven't seen it, SandDance is a graphing component where the elements
of the graph (like the bars in a bar chart) are made up of individual boxes
representing all the rows in the actual data. So in a bar chart, you can
highlight part of the bar and see the exact data point it came from.
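
For anyone who wants that in concrete terms, here is a minimal Python sketch of a unit bar chart layout. It is not SandDance's actual code: the function names, the `boxes_per_row` parameter, and the sample rows are all made up for illustration. Each row gets its own box inside the bar for its category, so a highlighted region of a bar maps straight back to the rows behind it.

```python
from collections import defaultdict

def unit_bar_layout(rows, key, boxes_per_row=3):
    """Assign every data row its own box inside the bar for its category.

    Returns (row_index, bar_index, col, level) tuples: bar_index picks the
    bar, and (col, level) is the box's slot inside that bar.
    """
    bars = defaultdict(list)
    for i, row in enumerate(rows):
        bars[row[key]].append(i)

    layout = []
    for bar_index, category in enumerate(sorted(bars)):
        for slot, row_index in enumerate(bars[category]):
            col, level = slot % boxes_per_row, slot // boxes_per_row
            layout.append((row_index, bar_index, col, level))
    return layout

def rows_in_region(layout, rows, bar_index, min_level):
    """Return the original rows whose boxes sit at or above min_level in a bar."""
    return [rows[r] for r, b, _, lvl in layout if b == bar_index and lvl >= min_level]

# Made-up sample data, for illustration only.
rows = [
    {"name": "a", "kind": "x"},
    {"name": "b", "kind": "y"},
    {"name": "c", "kind": "x"},
    {"name": "d", "kind": "x"},
    {"name": "e", "kind": "x"},
]
layout = unit_bar_layout(rows, "kind")
# "Highlight" the top part of the first bar and recover the rows behind it.
top_of_x = rows_in_region(layout, rows, bar_index=0, min_level=1)
```

With an ordinary aggregated bar chart, the selection step at the end has nothing to return, because the bar only stores a count.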

I appreciate MS releasing it, and it is a nice representation for some types
of interactive visualizations. But since it is a "modern" tool written as a
jumble of web views, JavaScript and React, it isn't exactly the most
performant thing.

------
filmgirlcw
This is awesome! I’d missed this news from last week, so I’m glad to see this
on the front page. I remember looking at this when it was first released (it
was Power BI only at first, right?) and trying to find a way to utilize it in
a data journalism project I was working on.

Love that it’s a VS Code extension (and apparently has been for 6 months,
shows how in the loop I am) — this looks super slick.

Disclosure: I work at Microsoft but not on this team, as evidenced by the fact
that I missed this news (ironically, part of my job is producing a weekly
developer news video show — clearly I just missed this).

------
breck
Very excited that this has been made open source. This was one of my favorite
projects to follow back in 2015 or so when I was at MS.

It's a very neat idea. The way I think of the core idea is that you give
instructions to all of your rows, and they arrange themselves in a way that
high level patterns become visible. That way you can always see the forest and
the trees. This is opposed to the current practice of generating new marks
from functions on the data to then see high level patterns. In the current
practice you can see the forest or the trees.
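
A rough way to picture the difference, as a Python sketch under my own assumptions (the function names and the sample data are hypothetical, and this is not how SandDance is implemented): aggregation produces new marks and drops the rows, while the unit approach gives each row a position under whatever "instruction" is active, so switching views is just every row moving from its old slot to its new one.

```python
from collections import Counter

def aggregate_marks(rows, key):
    """Conventional approach: one new mark (a count) per category; rows are gone."""
    return dict(Counter(row[key] for row in rows))

def unit_positions(rows, key):
    """Unit approach: every row keeps its identity and gets a (column, height) slot."""
    columns = {value: i for i, value in enumerate(sorted({row[key] for row in rows}))}
    heights = Counter()
    positions = {}
    for row_id, row in enumerate(rows):
        col = columns[row[key]]
        positions[row_id] = (col, heights[col])
        heights[col] += 1
    return positions

# Made-up sample data, for illustration only.
rows = [
    {"species": "cat", "size": "small"},
    {"species": "dog", "size": "large"},
    {"species": "cat", "size": "large"},
    {"species": "dog", "size": "small"},
]

counts = aggregate_marks(rows, "species")    # the forest only
by_species = unit_positions(rows, "species") # forest and trees
by_size = unit_positions(rows, "size")       # same rows, new instruction

# Each row's start and end slot when the view changes; an animated
# transition would interpolate between the two.
moves = {row_id: (by_species[row_id], by_size[row_id]) for row_id in by_species}
```

The `moves` table is what the dots jumping between layouts amounts to: one start and end position per row.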

A very interesting thing happened to me though when I played with this pattern
for a while (and this is anecdotal, so take with a grain of salt): it seemed
to damage my eyes. I would click around changing the views and the dots would
jump in all sorts of directions to the new layouts, and after a while I got
headaches and visual aura. Now, it was probably just a coincidence because I
was also a bit overworked at the time, but because it was so new, and in some
ways so unnatural (to see so many patterns of movement so quickly that you
don't generally see in nature), I decided not to take chances and moved away
from animated visualizations. Anyone else ever had anything similar happen?

~~~
fierarul
> A very interesting thing happened to me though when I played with this
> pattern for a while (and this is anecdotal, so take with a grain of salt):
> it seemed to damage my eyes. I would click around changing the views and the
> dots would jump in all sorts of directions to the new layouts, and after a
> while I got headaches and visual aura.

I suppose the animations are there just for show. Just like iPhone's parallax
effect which I find odd whenever I notice it.

PS: I assume you ruled out photosensitive epilepsy or something serious.

------
sails
The VS Code extension [0] seems pretty good; I tested it and found it fast and
useful on 10k records. I see this as a nice way to explore a new dataset prior
to a more extensive investigation.

[0]
[https://marketplace.visualstudio.com/items?itemName=msrvida....](https://marketplace.visualstudio.com/items?itemName=msrvida.vscode-sanddance)

------
kyeb
Wow! I worked next to an intern team this past summer that was attempting to
build SandDance into Azure Data Studio - super cool to see a big press release
from Microsoft mentioning an intern project as one of the major places
SandDance is available to use, especially less than 2 months after it was
handed off to the full-time employees.

------
_han
This is pretty cool. It reminded me of Hillview
([https://github.com/vmware/hillview](https://github.com/vmware/hillview)),
something I worked on a few years ago while interning at VMware research (a
few team members in fact used to work at Microsoft Research).

Hillview is a lot less fancy at this point, but the ambition is that it scales
a lot better (that is, to way more than 500K rows, on the order of trillions).

Paper here:
[http://www.vldb.org/pvldb/vol12/p1442-budiu.pdf](http://www.vldb.org/pvldb/vol12/p1442-budiu.pdf)

~~~
breck
Very interesting. "a trillion-cell spreadsheet for big data". Google Sheets
IIRC caps at 5M cells.

The HN mod DG built something pretty awesome back in 2009 called SkySheet.
Perhaps if he's reading he'll let us know what the limits of SkySheet were
back then?

------
skilesare
A company I worked for built this almost 10 years ago. Here is our video from
SIFMA where we gave a joint presentation with Microsoft. Interesting.

[https://www.youtube.com/watch?v=7KBaX_t_hvM](https://www.youtube.com/watch?v=7KBaX_t_hvM)

------
th0ma5
They probably thought they were on the trail of something great but came up
with something simply neat to play with, which is probably why it is now open
source.

------
degenerate
Working demo link:
[https://sanddance.js.org/app/](https://sanddance.js.org/app/)

------
Havoc
Is this any good for non-spatial stuff? The graphs look very spatial-y.

Either way...nice one MS

------
mey
Can the link be updated to
[https://cloudblogs.microsoft.com/opensource/2019/10/10/micro...](https://cloudblogs.microsoft.com/opensource/2019/10/10/microsoft-open-sources-sanddance-visual-data-exploration-tool/)

The original link is just a repeat with less detail than Microsoft's own
article.

~~~
40four
I was just about to post this, good call. The linked article provides no
information, just the comment:

 _Nice. I hadn’t heard about SandDance until now, but I’m saving for later._

The Microsoft blog is the place to be.

