

NIPS 2014 papers - lebek
http://cs.stanford.edu/people/karpathy/nips2014/

======
zackchase
I took a stab at trying to interpret the topics output by this run of LDA.
Green is one of the clearest: generally convolutional deep nets, image
classification, empirical work.

Brown seems to have picked up on linear algebra. "Vector", "matrix", "tensor"
and "decomposition" all get consistently labeled brown, as do "eigenvalues",
"orthogonal" and "sparse".

The rest are not as useful. Black almost always has "number", "set", "tree"
and "random", but little else. Purple at times seems to signify topic
modeling, but also contains "neural" and "feedforward". Blue seems to be the
stats topic, containing "Bayes", "regression", "Gaussian", and Markov
processes. But it also contains random words like "university" and
"international".

Overall, very interesting. I wonder if these topics would be even better
defined with a higher setting of k.

~~~
taneliv
Karpathy had a different interpretation (in the green bar at the top of the
page). For example, purple would be neuroscience.

In addition to adjusting k, another change that might be interesting would be
to also include previous years' papers in the model estimation. Changes in
component (topic) weights year-over-year could perhaps reveal something about
the topics, or the papers.

------
slashcom
Karpathy constantly shows the gap between "Anyone could've done that" and
"Yeah, but he _did_."

~~~
nl
It's only in the last 12 months that it became clear this was possible. The Ng
"Zero Shot Learning" paper came out at NIPS 2013, and given the lead time for a
paper like that, I think they must have started work at about that time.

~~~
nl
Wow, those downvotes are pretty strong! Clearly I'm wrong - what am I missing?

~~~
mlla
This is done using Latent Dirichlet Allocation (LDA). The original algorithm
was published by David Blei et al. over ten years ago; link to the paper:
[http://machinelearning.wustl.edu/mlpapers/paper_files/BleiNJ...](http://machinelearning.wustl.edu/mlpapers/paper_files/BleiNJ03.pdf)

There are many machine learning libraries that have good implementations of
LDA (e.g. Gensim), so it should be "relatively" straightforward to create the
topics and clustering based on the abstracts of the papers.
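To make the "relatively straightforward" part concrete, here is a toy, self-contained sketch of LDA via collapsed Gibbs sampling. This is not the code behind Karpathy's page, and in practice you would just feed the abstracts to a library implementation (e.g. Gensim's `LdaModel`); the function name `lda_gibbs` and all parameters here are illustrative only.

```python
import random


def lda_gibbs(docs, k, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Toy collapsed Gibbs sampler for LDA over tokenized documents.

    docs: list of documents, each a list of word tokens.
    k: number of topics.
    Returns the top 3 words for each of the k topics.
    """
    rng = random.Random(seed)
    vocab = sorted({w for d in docs for w in d})
    V = len(vocab)
    widx = {w: i for i, w in enumerate(vocab)}

    # z[d][n]: current topic assignment of the n-th token in document d
    z = [[rng.randrange(k) for _ in d] for d in docs]
    ndk = [[0] * k for _ in docs]       # doc-topic counts
    nkw = [[0] * V for _ in range(k)]   # topic-word counts
    nk = [0] * k                        # total tokens per topic
    for d, doc in enumerate(docs):
        for n, w in enumerate(doc):
            t = z[d][n]
            ndk[d][t] += 1
            nkw[t][widx[w]] += 1
            nk[t] += 1

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for n, w in enumerate(doc):
                t = z[d][n]
                wi = widx[w]
                # remove this token's counts, then resample its topic
                ndk[d][t] -= 1; nkw[t][wi] -= 1; nk[t] -= 1
                # full conditional p(z = t | rest), up to a constant
                weights = [
                    (ndk[d][t2] + alpha) * (nkw[t2][wi] + beta) / (nk[t2] + V * beta)
                    for t2 in range(k)
                ]
                t = rng.choices(range(k), weights=weights)[0]
                z[d][n] = t
                ndk[d][t] += 1; nkw[t][wi] += 1; nk[t] += 1

    # report the 3 highest-count words per topic
    return [
        [vocab[i] for i in sorted(range(V), key=lambda i: -nkw[t][i])[:3]]
        for t in range(k)
    ]
```

On real paper abstracts you would tokenize, strip stopwords, and run many more iterations; the colored-word view on the page then amounts to coloring each token by its most probable topic.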

~~~
jamessb
I think there might be confusion about what nl was referring to. Yes, the link
is to a list (produced by Karpathy) of papers on which LDA has been performed.

But one of the listed papers is also by Karpathy ("Deep Fragment Embeddings for
Bidirectional Image Sentence Mapping"), and I think this might be what nl is
complimenting as being done quickly.

~~~
nl
Yes, this is the case. Thanks.

------
redlabs4000
When the papers mention that code will be released, does that mean right now,
or when the conference happens? I couldn't find any links to the code in any
of the papers, including the Karpathy one.

------
j_juggernaut
Also check out the octopus visualization.
[http://cs.stanford.edu/people/karpathy/scholaroctopus/](http://cs.stanford.edu/people/karpathy/scholaroctopus/)

------
bra-ket
93 occurrences of "deep"

~~~
sushirain
I counted 15 occurrences in the deep learning topic.

------
nl
_A Multi-World Approach to Question Answering about Real-World Scenes based on
Uncertain Input_ is very cool.

The Karpathy paper, too.

I love the cross-modal work that's going on at the moment.

