
Visualizing TSNE Maps with Three.js - polm23
https://douglasduhaime.com/posts/visualizing-tsne-maps-with-three-js.html
======
yboris
PSA: Try UMAP instead of t-SNE

[https://github.com/lmcinnes/umap](https://github.com/lmcinnes/umap)

It's much faster and usually results in better clustering / representation.

~~~
jointpdf
I’m a huge fan of UMAP, but this [0] paper suggests that t-SNE can be tuned to
produce UMAP-like results (the algorithms are extremely similar—you can
recover t-SNE with certain UMAP parameter choices). One of the insights is to
use PCA first to better preserve the global structure.

For example, see figure 9 in the paper: the plot on the left is the typical
result of default t-SNE (distance between global structures not well-
represented, since everything is jammed together), and the plot on the right
is very UMAPish.

Basically, there are a lot of preprocessing and parameter choices involved in
producing these embedding plots, so it’s advisable to try to understand the
effects of these choices regardless of which algorithm you choose.

[0]:
[https://www.nature.com/articles/s41467-019-13056-x](https://www.nature.com/articles/s41467-019-13056-x)

~~~
tetris11
I thought UMAP's main advantage was being able to project new data without
having to recompute the embedding, whereas t-SNE still has to, which makes
persistent plots difficult.

------
throwaway_2047
[https://en.wikipedia.org/wiki/T-distributed_stochastic_neigh...](https://en.wikipedia.org/wiki/T-distributed_stochastic_neighbor_embedding)

In case someone is wondering, like me, what t-SNE is. I still don't
understand it after reading.

~~~
ArnoVW
It's an algorithm for projecting data to a lower dimension. E.g. you have an
Excel sheet with 20,000 rows (representing customers, for example) and 200
columns (representing blood pressure, height, weight, etc.).

What you want to do is "visualise" those 20,000 points in 2D or 3D so you can
get an idea of how the data is distributed. So you use t-SNE to "compress"
those 200 columns to 2 or 3, and you display that.

Traditionally you would use Principal Component Analysis (PCA), but that only
uses linear projection, and will not be able to capture data that has
non-linear relationships in the distributions.
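
To make the "linear projection" part concrete, here is a minimal PCA sketch in
plain Python. It is illustrative only: the function names (`center`,
`covariance`, `top_component`, `pca_project`) are made up for this example, and
it uses power iteration with deflation, which is just one simple way to find
the leading components, not what any particular library does.

```python
import math
import random


def center(rows):
    """Subtract the per-column mean so each column averages to zero."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(rows):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_component(cov, iters=200, seed=0):
    """Approximate the leading eigenvector of cov by power iteration."""
    rng = random.Random(seed)
    d = len(cov)
    v = [rng.gauss(0, 1) for _ in range(d)]
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:  # no variance left to find
            break
        v = [x / norm for x in w]
    # Rayleigh quotient gives the matching eigenvalue (v has unit length).
    eigval = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                 for i in range(d))
    return v, eigval


def pca_project(rows, k=2):
    """Project rows onto their first k principal components."""
    x = center(rows)
    cov = covariance(x)
    d = len(cov)
    components = []
    for _ in range(k):
        v, lam = top_component(cov)
        components.append(v)
        # Deflation: remove the found direction so the next pass finds another.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    # Each output coordinate is a dot product, i.e. a purely linear map.
    return [[sum(r[j] * c[j] for j in range(d)) for c in components]
            for r in x]
```

Calling `pca_project(rows, k=2)` on a list of equal-length numeric rows returns
one 2D point per row. Since every output coordinate is a weighted sum of the
input columns, structure that only shows up along a curve can get flattened
together, which is the gap t-SNE and UMAP try to fill.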

Another algorithm, sometimes more powerful and scalable, is LargeVis.

~~~
lmeyerov
UMAP has largely replaced t-SNE in our toolkit as one of our top go-to viz
pipelines. Unlike most examples out there, we post-process with k-nn to expose
the graph of correlations over arbitrary data sets -- bank accounts fraud
scores, cancer protein mutations, twitter bots, malware files, etc. -- and
then investigate. Algorithms like UMAP figure out this connectivity anyways
(see also: TDA), and it's useful for guiding subsequent explorations. If
you're doing an interactive analysis, like looking at data in a Jupyter
notebook, it's super powerful to expose that inferred connectivity and make it
interactive (on-the-fly filtering, clustering, etc.).

Tool-wise, we do it in a few lines over tables with many rows/columns via end-
to-end GPU acceleration using [https://www.RAPIDS.ai](https://www.RAPIDS.ai)
(GPU dataframes + UMAP) + Graphistry (GPU viz, which we make).
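
As a rough illustration of the k-nn post-processing idea, the sketch below
builds a nearest-neighbor edge list over already-embedded 2D points in plain
Python. It is not the RAPIDS or Graphistry API: `knn_edges` is a hypothetical
helper, and the brute-force O(n^2) search is only reasonable for small inputs.

```python
import math


def knn_edges(points, k=2):
    """Link each point to its k nearest neighbors.

    Returns a sorted list of undirected (i, j, distance) edges with i < j.
    """
    edges = {}
    for i, p in enumerate(points):
        # Distances from point i to every other point, nearest first.
        dists = sorted((math.dist(p, q), j)
                       for j, q in enumerate(points) if j != i)
        for d, j in dists[:k]:
            # Store each undirected edge once, keyed by the ordered pair.
            edges[(min(i, j), max(i, j))] = d
    return sorted((i, j, d) for (i, j), d in edges.items())
```

The resulting edge list can be handed to any graph tool for filtering or
clustering; the point is only that the embedding's neighborhoods become an
explicit graph you can explore.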

------
nestorD
Here is another very good interactive visualization of t-SNE, on Distill:
[https://distill.pub/2016/misread-tsne/](https://distill.pub/2016/misread-tsne/)

~~~
abhgh
I was about to post that article myself; good write-up on the pitfalls of
interpreting t-sne viz.

------
danaugrs
I've built an implementation of t-SNE in Go
([https://github.com/danaugrs/go-tsne](https://github.com/danaugrs/go-tsne))
and really like the fact that your visualization has a short Z dimension. Very
interesting effect.

------
jononor
Here is a fun demo of t-SNE, projecting MNIST digits interactively in your
browser. To start it, hit "iterate".
[https://nicola17.github.io/tfjs-tsne-demo/](https://nicola17.github.io/tfjs-tsne-demo/)

------
joshribakoff
I’d highly recommend
[https://github.com/react-spring/react-three-fiber](https://github.com/react-spring/react-three-fiber);
it reduces a lot of three.js boilerplate. I’m using it to visualize reactive
programs in
[https://rx-store.github.io/rx-store/](https://rx-store.github.io/rx-store/)

------
swiley
Writing an OSM map renderer in JS was the first modern JavaScript I ever
wrote. I did it to prepare for my last summer internship which was some pretty
intense modern JS.

