
Open sourcing Embedding Projector: a tool for visualizing high dimensional data - runesoerensen
https://research.googleblog.com/2016/12/open-sourcing-embedding-projector-tool.html
======
jph00
I'm really happy to see this. I demonstrated an (in some ways) more advanced
version of this a few years ago in my talk on TED.com - see the last few
minutes of
[http://ted.com/talks/jeremy_howard_the_wonderful_and_terrify...](http://ted.com/talks/jeremy_howard_the_wonderful_and_terrifying_implications_of_computers_that_can_learn)

Unfortunately back when I was at Enlitic we never got around to open sourcing
this, but I hoped my demo would encourage people to explore tools built on
projections. There's great potential to rapidly label data and improve models
using these kinds of tools. I'm certainly hoping to find time to return to
this myself sometime soon.

~~~
garysieling
Do you know what the best options are currently for the labelling part? I'm
looking for a tool to highlight phrases to train an entity recognizer. IBM
has something, but it's pretty expensive, and I'm not sure what this would be
called.

~~~
nl
Not entirely sure what you are looking for, but have you seen
[https://demos.explosion.ai/displacy-ent/](https://demos.explosion.ai/displacy-ent/)?

~~~
garysieling
This is the right sort of UI. I want to be able to highlight text and say what
type of entity is selected to train a new model.

It needs to be more of a workflow based system though, so you can upload 10s
of thousands of documents and tag them as quickly as is reasonable.

~~~
nl
I think BRAT can do that:
[http://brat.nlplab.org/index.html](http://brat.nlplab.org/index.html)

I've never used it for doing the annotation myself though.

There are a few other possible tools here:
[https://omictools.com/text-annotation-category](https://omictools.com/text-annotation-category)

~~~
garysieling
This looks like what I need, thanks!

------
ohitsdom
Looks cool, but couldn't quickly try it out. The website for Projector caused
Chrome to crash: it showed a "WebGL hit a snag" error.

[http://projector.tensorflow.org/](http://projector.tensorflow.org/)

~~~
nsthorat
Thanks for the feedback. What version of Chrome and what OS are you using?

~~~
ohitsdom
Version 54.0.2840.99 m on Windows 10.

------
nl
Is there a TF project to build (word) embeddings suitable for this? Gensim is
so easy to use, but the added flexibility of TF could be useful.

~~~
rnabel
I couldn't find one, but from what I gather from the docs[1] you can give it
basically any TSV file of high-dimensional data. You can have a dig around in
the repo here[2].

[1] [https://www.tensorflow.org/versions/master/how_tos/embedding...](https://www.tensorflow.org/versions/master/how_tos/embedding_viz/index.html)
[2] [https://github.com/tensorflow/tensorflow/tree/master/tensorf...](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tensorboard/components/vz_projector)
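
To make the "any TSV file" point concrete, here is a rough sketch of dumping vectors from any source (Gensim, TF, or anything else) into the layout I understand the Projector to load: one tab-separated row of floats per point, plus an optional metadata file with one label per line. The function name `write_projector_tsv` and the sample data are made up for illustration, and the no-header single-column metadata layout is my reading of the docs rather than something I've verified against the current code.

```python
import csv
import io


def write_projector_tsv(vectors, labels, vec_file, meta_file):
    """Write vectors as tab-separated rows and labels one per line.

    `vectors` is a sequence of equal-length float sequences; `labels` has
    one entry per vector. Both outputs are text-mode file-like objects.
    """
    if len(vectors) != len(labels):
        raise ValueError("need exactly one label per vector")

    writer = csv.writer(vec_file, delimiter="\t", lineterminator="\n")
    for vec in vectors:
        # Fixed precision keeps the file compact and diff-friendly.
        writer.writerow([f"{x:.6f}" for x in vec])

    for label in labels:
        # Assumption: a single-column metadata file carries no header row.
        # Labels containing tabs or newlines would need escaping first.
        meta_file.write(f"{label}\n")


# Hypothetical three-dimensional "embeddings", just to show the output shape.
vec_buf, meta_buf = io.StringIO(), io.StringIO()
write_projector_tsv(
    [[0.1, 0.2, 0.3], [1.0, 2.0, 3.0]], ["cat", "dog"], vec_buf, meta_buf
)
print(vec_buf.getvalue(), end="")
print(meta_buf.getvalue(), end="")
```

With real data you would pass open file handles instead of `StringIO` buffers and then load the two files through the Projector's data upload option.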

