
Unsupervised sentiment neuron - gdb
https://blog.openai.com/unsupervised-sentiment-neuron/
======
ericjang
Why are people being so critical about this work? Sure, the blog post provides
a simplified picture about what the system is actually capable of, but it's
still helpful for a non-ML audience to get a better understanding of the high-
level motivation behind the work. The OpenAI folks are trying to educate the
broader public as well, not just ML/AI researchers.

Imagine if this discovery were made by some undergraduate student who had
little experience in the traditions of how ML benchmark experiments are done,
or was just starting out her ML career. Would we be just as critical?

As a researcher, I like seeing shorter communications like these, as they
illuminate the thinking process of the researcher. Read ML papers for the
ideas, not the results :)

I personally don't mind blog posts that have a bit of hyped-up publicity. It's
thanks to groups like DeepMind and OpenAI that have captured public
imagination on the subject and accelerated such interest in prospective
students in studying ML + AI + robotics. If the hype is indeed unjustified,
then it'll become irrelevant in the long-term. One caveat is that researchers
should be very careful to not mislead reporters who are looking for the next
"killer robots" story. But that doesn't really apply here.

~~~
eanzenberg
Is it wrong to be critical of research? Back in my previous life of doing
basic research I scrutinized papers left and right.

[http://karpathy.github.io/2015/05/21/rnn-effectiveness/](http://karpathy.github.io/2015/05/21/rnn-effectiveness/)
towards the end has similar methodology and is 1.5 years old.

Hype is an interesting thing especially when it comes from laymen.

~~~
laingc
As someone familiar with the field, you likely know this already, but the
similarities between the Karpathy post from 2015 and this work from OpenAI are
likely because Karpathy is a founder and lead researcher at OpenAI.

~~~
eanzenberg
Ya but he's surprisingly absent from the paper's author list.

------
1024core
I don't know, but this seems a bit hyped in places.

They start with:

> Our L1-regularized model matches multichannel CNN performance with only 11
> labeled examples, and state-of-the-art CT-LSTM Ensembles with 232 examples.

Hmm, that sounds pretty impressive. But then later you read:

> We first trained a multiplicative LSTM with 4,096 units on a corpus of 82
> million Amazon reviews to predict the next character in a chunk of text.
> Training took one month across four NVIDIA Pascal GPUs

Wait, what? How did "232 examples" transform into "82 million"??

OK, I get it: they pretrained the network on the 82M reviews, and then trained
the last layer to do the sentiment analysis. But you can't honestly claim that
you did great with just 232 examples!
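A minimal sketch of the setup described above: features from a frozen, pretrained model, plus an L1-regularized logistic regression fit on only 11 labeled examples. Everything here is an assumption for illustration. `fake_features` is a hypothetical stand-in for the pretrained network's hidden state (synthetic 8-dimensional vectors with one informative unit, not real mLSTM activations), and the proximal gradient solver is just one simple way to fit an L1 penalty, not necessarily what the authors used.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def soft_threshold(value, amount):
    """Shrink value toward zero by amount; this is the L1 proximal step."""
    if value > amount:
        return value - amount
    if value < -amount:
        return value + amount
    return 0.0

def train_l1_logreg(X, y, lam=0.1, lr=0.1, steps=2000):
    """Proximal gradient descent for L1-regularized logistic regression."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        # Gradient step on the log loss, then shrink each weight toward zero.
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b  # the bias is not penalized
    return w, b

def fake_features(label, rng, dim=8, signal_index=3):
    """Hypothetical stand-in for frozen pretrained features (synthetic data)."""
    feats = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    feats[signal_index] = (2.0 if label == 1 else -2.0) + rng.gauss(0.0, 0.5)
    return feats

rng = random.Random(0)
train_y = [i % 2 for i in range(11)]   # only 11 labeled examples
train_X = [fake_features(label, rng) for label in train_y]
test_y = [i % 2 for i in range(200)]
test_X = [fake_features(label, rng) for label in test_y]

w, b = train_l1_logreg(train_X, train_y)
predictions = [
    1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0 for xi in test_X
]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
print("weights:", [round(wj, 3) for wj in w])
print("held-out accuracy:", accuracy)
```

The small labeled set only has to pick out which pretrained feature matters; all the expensive learning is assumed to have happened beforehand, which is the point being debated here.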

~~~
derefr
This actually demonstrates something very interesting, I think: you can take
an ML model trained with the "low-level prerequisite knowledge" of a subject,
and then _very quickly and easily_ teach it a high-level concept that relies
on that knowledge.

Which, now that I think about it, makes the human brain and its amazing
adaptive general-game-playing abilities a bit less mysterious. Since _we
humans_ all have these huge corpuses of sense-data we've been receiving
reinforcement signals about since birth, we've likely built up all sorts of
low-level models which we just use to predict the world for reflex responses a
little bit better and faster (speech models so we can respond to what people
are saying even as they're still saying it, visual models so we can throw
spears where lions are _going_ to be instead of where they are, etc.) But
those low-level predictive models make it nearly effortless to build higher-
level models.

I wonder if we'd take a giant leap forward in AI if we just managed to
scan+emulate a regular animal brain (say, of a rat), and then built the AI as
a neocortex-equivalent for that brain. It would have instant access to
thousands or millions of pre-trained low-level predictive models, which it
could easily discover as having outputs correlated to success and thus "attach
to" during its own training.

~~~
emcq
What you describe is exactly what practitioners in the field have been doing
for years. I think that's why the parent is a bit puzzled at the publication,
as it's difficult to understand what's novel.

~~~
Cybiote
Yes, I agree with you but with a caveat. Semi-supervised learning is well
known but has, I'll argue, recently fallen out of fashion in favor of throwing
gallons more of labeled data at a really big neural net, crossing your fingers
and hoping for the best. Usually, the neural net is either a really big conv-
net with a novel architecture or a biLSTM with some elaboration on attention
(which is actually closer to memory/state).

Most of the time, in neural net land, what people are doing with the fine
tuning part is taking a model trained on looaads of supervised data, chopping
off the head and using those features to train on smaller data. This OpenAI
method is different in that it used patterns it learned on its own, instead of
the recently more common technique of features extracted from a heavily label
trained model to reduce the supervised learning burden in a nearby domain.

Arguably yes, this is an ancient technique but it has mostly been forgotten
when it became clear that many problems are surmountable with a large enough
helping of GPUs and a small moon's worth of data. OpenAI's is a good idea
because it makes you say 'yeah that's obvious, pretrain a simple char rnn on
loads of free text and oh wait, why has no one tried this before!?'

What is interesting here is that such a straightforward method compares so
well to glittering methods that laboriously advanced the state of the art.
What I also found surprising was that there was a 'neuron' that was tracking
something very close to sentiment. Why?

A bit of thinking and I came to a simple idea. One way of looking at the LSTM
in the practical setting (as opposed to a theoretically Turing-equivalent
thing) is as a really big finite-state Rube Goldberg machine. In learning to
predict the next character, it makes sense that one set or part of a set of
states it can enter/track is extremely correlated with what we humans call
sentiment in review text.

In summary, the trained model can be thought of as a computable theory of
amazon reviews that also works really well on IMDB reviews (and probably short
but probably not sarcastic text reviews in general).

~~~
dnautics
thanks for clarifying this - it isn't _transfer learning_ at all, more like the
techniques for digging through LSTMs _post hoc_ to find the neuron
responsible for opening and closing quotation marks (insert karpathy youtube
vid here), except for a more "high level" feature - in this case, sentiment.

------
srush
If you are interested in looking at the model in more detail, we (@harvardnlp)
have uploaded the model features to LSTMVis [1]. We ran their code on amazon
reviews and are showing a subset of the learned features. Haven't had a chance
to look further yet, but it is interesting to play with.

[1]
[http://lstm.seas.harvard.edu/client/pattern_finder.html?data...](http://lstm.seas.harvard.edu/client/pattern_finder.html?data_set=32sentiment&source=states::states&pos=110&brush=28,31&queried=true&ex_cells=)

------
YCode
The synthetic text they generated was surprisingly realistic, despite being
generic.

If I were perusing a dozen reviews I probably wouldn't have spotted the AI-
generated ones in the crowd.

~~~
haddr
We are getting better and better with automatic text generation. I wonder who
will be the copyright owner of AI-generated text, comments, songs, etc.?

~~~
gallerdude
A weird thought: at some point AI short stories may be far more profound than
our own.

~~~
beaconstudios
at the moment, AI short stories are derivative, so it's unlikely. They may
well be better than the average, if trained on highly regarded works, but
they're not completely novel.

~~~
happycube
At the moment RNNs can't remember context, so they can make stuff that
_looks_ correct, but only on the surface.

I think that'll change, eventually...

~~~
visarga
We need some kind of hierarchical approach, and/or memory.

------
nl
So char-by-char models are the next Word2Vec then. Pretty impressive results.

It would be interesting to see how it performed for other NLP tasks. I'd be
pretty interested to see how many neurons it uses to attempt something like
stance detection.

 _Data-parallelism was used across 4 Pascal Titan X gpus to speed up training
and increase effective memory size. Training took approximately one month._

Every time I look at something like this I find a line like that and go: "ok
that's nice.. I'll wait for the trained model".

~~~
rspeer
Yeah, part of what let word2vec make such a splash that it became the one word
embedding model everyone has heard of, is that the word2vec team released
their model.

This is a really cool example OpenAI has, but I don't know why I should
ultimately care about their character model more than anyone else's if all
we've got is their description of how cool it is.

I hope OpenAI defies their reputation for closedness and releases the model.

~~~
gdb
Yep weights will be up soon!

EDIT: in fact, weights were up at launch:
[https://github.com/openai/generating-reviews-discovering-sen...](https://github.com/openai/generating-reviews-discovering-sentiment/commit/15bfb78e4d5e92d5b5129a8b6ad86b100349eb5e)

~~~
rspeer
Sorry for my pessimistic outlook, then! Thanks.

------
emcq
It's very difficult to understand what the contributions are here. From what
I've read so far this feels more of a proposal for future research or a press
release than advancing the state of the art.

* Using large models trained on lots of data to provide the foundation for sample efficient smaller models is common.

* Transfer learning, fine tuning, and character RNNs are common.

Were there any insights learned that give a deeper understanding of these
phenomena?

Not knowing too much about the sentiment space, it's hard to tell how
significant the resulting model is.

~~~
kleiba
* _advancing the state of the art_

It says right at the top: "we get 91.8% accuracy versus the previous best of
90.2%" on a standard sentiment corpus. In addition, their method needs less
training data than previous approaches.

* _Were there any insights learned that give a deeper understanding of these phenomena?_

The main appeal lies in the fact that a model trained on a (1) different and
(2) very general task basically "in passing" also learned to predict sentiment
(i.e., a specialized task that more or less arose from the domain the general
model was trained on), and pretty much through a single neuron (out of the
4096 used). The authors speculate that this might be a general effect that
could also be transferred to other prediction tasks.
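To make the "single neuron" point concrete, here is a rough sketch of classifying with one unit alone: pick the cutoff on that unit's activation that best separates labeled examples. The activations and labels below are made-up toy numbers, and `best_threshold` is a hypothetical helper, not code from the paper or its repository.

```python
def best_threshold(activations, labels):
    """Return (threshold, accuracy) for the rule: positive if activation > threshold."""
    values = sorted(set(activations))
    # Candidate cutoffs are the midpoints between neighbouring activation values.
    candidates = [(lo + hi) / 2.0 for lo, hi in zip(values, values[1:])]
    best_t, best_acc = None, -1.0
    for t in candidates:
        correct = sum((a > t) == bool(y) for a, y in zip(activations, labels))
        acc = correct / len(labels)
        if acc > best_acc:  # on ties, keep the first (lowest) cutoff
            best_t, best_acc = t, acc
    return best_t, best_acc

# Toy data: one unit's final activation per review, plus a 0/1 sentiment label.
unit_activations = [-1.2, -0.8, -0.3, 0.1, 0.4, 0.9, 1.3]
review_labels = [0, 0, 0, 1, 0, 1, 1]

threshold, accuracy = best_threshold(unit_activations, review_labels)
print("threshold:", threshold, "accuracy:", accuracy)
```

If a single scalar already separates the classes this well, the remaining units add little for this particular task, which is what makes the result read as a "sentiment neuron".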

~~~
emcq
If the main contribution here is the quality of the model and its interesting
and powerful representation of text, I hope OpenAI does something
disruptively different and releases the weights and trained model.

The accidental sentiment neuron is a function of the model, distribution of
the input dataset, and the optimizer finding nice saddle points. Insight into
these foundational components would make these results amazing. It sounded
like training on other datasets doesn't have the same sentiment properties,
which provides a lever to explore these concepts more.

At the moment it feels like the Google cat neuron. It attracted a lot of
intrigue but the individual contribution from that in terms of research was
more on the infrastructure side, and few people seem to refer back to that
publication at this point.

That said, OpenAI's mission in itself doesn't necessarily require novel
research. For example, the gym is fostering a competitive atmosphere for the
community to work on RL which hopefully leads to more progress in the field.

Training a model for a month is difficult and if it has captured interesting
phenomena it seems in the interest of the community to release the weights and
model. It would be hard for the community to reproduce this without a month of
compute and 82M Amazon reviews.

~~~
mappingbabeljc
Hi there, the weights and model are here:
[https://github.com/openai/generating-reviews-discovering-sen...](https://github.com/openai/generating-reviews-discovering-sentiment)

~~~
emcq
This is awesome, thanks! My apologies I must have missed it somewhere.

------
wackspurt
(Apologies for the slightly incoherent post below)

I've been noticing a lot of work that digs into ML model internals (as they've
done here to find the sentiment neuron) to understand why they work or use
them to do something. Let me recall interesting instances of this:

1. Sander Dieleman's blog post about using CNNs at Spotify to do content-
based recommendations for music. He didn't write about the system performance
but collected playlists that maximally activated each of the CNN filters
(early layer filters picked up on primitive audio features, later ones picked
up on more abstract features). The filters were essentially learning the
musical elements specific to various subgenres.

2. The ELI5 - Explain Like I'm Five - Python Library. It explains the outputs
of many linear classifiers. I've used it to explain why a text classifier
gave a certain prediction: it highlights features to show how much or little
they contribute to the prediction (dark red for negative contribution, dark
green for positive contribution).

3. FairML: Auditing black-box models. Inspecting the model to find which
features are important. With privacy and security concerns too!

Since deep learning/machine learning is very empirical at this stage, I think
improvements in instrumentation can lead to ML/DL being adopted for more kinds
of problems. For example: chemical/biological data. I'd be highly curious
what new ways of inspecting such kinds of data would be insightful (we can
play audio input that maximally activates filters for a music-related network, we
can visualize what filters are learning in an object detection network, etc.)

------
tshadley
"The selected model reaches 1.12 bits per byte."
([https://arxiv.org/pdf/1704.01444.pdf](https://arxiv.org/pdf/1704.01444.pdf))

For context, Claude Shannon found that humans could model English text with an
entropy of 0.6 to 1.3 bits per character
([http://languagelog.ldc.upenn.edu/myl/Shannon1950.pdf](http://languagelog.ldc.upenn.edu/myl/Shannon1950.pdf))
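For readers unfamiliar with the unit: bits per byte (or per character) is the average of -log2 of the probability the model assigned to each symbol that actually came next. A small sketch of that arithmetic follows; the probabilities are toy numbers and the function names are hypothetical.

```python
import math

def bits_per_symbol(probabilities):
    """Average -log2(p) over the probabilities assigned to the observed symbols."""
    return -sum(math.log2(p) for p in probabilities) / len(probabilities)

def nats_to_bits(nats):
    """Convert a cross-entropy loss reported in nats (natural log) to bits."""
    return nats / math.log(2.0)

# Toy example: the model gave these probabilities to the four symbols that occurred.
toy = bits_per_symbol([0.5, 0.25, 0.5, 0.125])  # (1 + 2 + 1 + 3) / 4
# A model that knows nothing spreads probability evenly over all 256 byte values.
uniform = bits_per_symbol([1 / 256] * 10)

print(toy, uniform)
```

A uniform guess over byte values costs 8 bits per byte, so a figure near 1.12 means the model has squeezed out most of the predictable structure in the text.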

------
itchyjunk
I would imagine stuff like sarcasm is still out of reach though. It seems hard
for humans to understand it in text based communication. Also using anything
out of the standard sentimental model might throw it off. "This product is as
good as <product x>" (where product x has been known to perform badly). I am just
trying to think of scenarios where a sentimental model would fail.

Sentimental neuron sounds fascinating too. I didn't realize individual neurons
could be talked about or understood outside of the concept of the NN. I am
thinking in terms of the "black box" it's often referred to as in some articles.

Since one of the research goals for OpenAI is to train a language model on
jokes[0], I wonder how this neuron would perform with a joke corpus.

----------------------------

[0] [https://openai.com/requests-for-research/#funnybot](https://openai.com/requests-for-research/#funnybot)

~~~
wackspurt
>>>Sentimental neuron sounds fascinating too. I didn't realize individual
neurons could be talked about or understood outside of the concept of the NN.
I am thinking in terms of the "black box" it's often referred to as in some
articles.

Yes, I agree. I recall seeing such individual neuron analysis before in
Karpathy's "The Unreasonable Effectiveness of Recurrent Neural Networks". He
takes a char-rnn that was trained to predict the next character for source
code and finds neurons that have learned to do parenthesis/bracket
opening/closing.

------
aabajian
I'm trying to understand this statement:

"The sentiment neuron within our model can classify reviews as negative or
positive, even though the model is trained only to predict the next character
in the text."

If you look closely at the colorized paragraph in their paper/website, you can
see that the major sentiment jumps (e.g. from green to light-green and from
light-orangish to red) occur with period characters. Perhaps the insight is
that periods delineate the boundary of sentiment. For example:

I like this movie. I liked this movie, but not that much. I initially hated
the movie, but ended up loving it.

The period tells the model that the thought has ended.

My question for the team: How well does the model perform if you remove
periods?

~~~
jcoffland
Why would that matter? Human understanding of sentiment would also go down if
you removed vital information such as punctuation.

~~~
aabajian
My point would be to see how much the model is relying on punctuation. It
could provide insight as to why character-based models outperform word-based
models for sentiment analysis.

------
d--b
Can someone explain what is "unsupervised" about this? I'm guessing this is
what confuses me most.

I think this work is interesting, although when you think about it, it's kind
of normal that the model converges to a point where there is a neuron that
indicates whether the review is positive or negative. There are probably a lot
of other traits that can be found in the "features" layer as well.

There are probably neurons that can predict the geographical location of the
author, based on the words they use.

There are probably neurons that can predict that the author favors short
sentences over long explanations.

But what makes this "unsupervised"?

~~~
fiter
I wouldn't expect that the neurons are orthogonal on a set of features which
we find interesting (sentiment, geographical location). They could be bound up
in some other basis of features that we do not find interesting. Other people
do not expect this because there are papers about how to incentivize neurons
to correspond to interesting features.

~~~
wackspurt
>> Other people do not expect this because there are papers about how to
incentivize neurons to correspond to interesting features.

Could you clarify that statement? Are you saying that it was unusual for this
group to find such a neuron? Also, I did not know that there are papers on how
to incentivize neurons to correspond to interesting features. Could you please
give me some references on those?

~~~
fiter
The paper I was thinking of is called: "InfoGAN: Interpretable Representation
Learning by Information Maximizing Generative Adversarial Nets"[0]. I do not
have experience training and investigating neural nets, but from what I read
in that paper, there's no reason to presume you'll find neurons that represent
a feature you're interested in. In the paper they alter the reward function to
get neurons that correspond to the features they are interested in.

[0]
[https://arxiv.org/pdf/1606.03657v1.pdf](https://arxiv.org/pdf/1606.03657v1.pdf)

------
huula
Machine Learning has become more and more like archaeology since people started
saying "empirically" more while providing only a single dataset or a few.

------
andreyk
I think it's fair to criticize this blog post for being unclear on what
exactly is novel here; pre-training is a straightforward and old idea, but the
blog post does not even mention this. Having accessible write ups for AI work
is great, but surely it should not be confusing to domain experts or be
written in such a way as to exacerbate the rampant oversimplification or
misreporting in popular press about AI. Still, it is a cool mostly-
experimental/empirical result, and it's good that these blog posts exist these
days.

For what it's worth, the paper predictably does a better job of covering the
previous work and stating what their motivation was: "The experimental and
evaluation protocols may be underestimating the quality of unsupervised
representation learning for sentences and documents due to certain seemingly
insignificant design decisions. Hill et al. (2016) also raises concern about
current evaluation tasks in their recent work which provides a thorough survey
of architectures and objectives for learning unsupervised sentence
representations - including the above mentioned skip-thoughts. In this work,
we test whether this is the case. We focus in on the task of sentiment
analysis and attempt to learn an unsupervised representation that accurately
contains this concept. Mikolov et al. (2013) showed that word-level recurrent
language modelling supports the learning of useful word vectors and we are
interested in pushing this line of work. As an approach, we consider the
popular research benchmark of byte (character) level language modelling due to
its further simplicity and generality. We are also interested in evaluating
this approach as it is not immediately clear whether such a low-level training
objective supports the learning of high-level representations." So, they
question some built in assumptions from the past by training on lower-level
data (characters), with a bigger dataset and more varied evaluation.

The interesting result they highlight is that a single model unit is able to
perform so well with their representation: "It is an open question why our
model recovers the concept of sentiment in such a precise, disentangled,
interpretable, and manipulable way. It is possible that sentiment as a
conditioning feature has strong predictive capability for language modelling.
This is likely since sentiment is such an important component of a review",
which I tend to agree with... train on a whole lot of reviews, it's only
natural to train a regressor for review sentiment.

------
eanzenberg
I think one of the most amazing parts of this is how accessible the hardware
is right now. You can get world-class AI results for less than the cost of
most used cars. In addition, with so many resources freely available through
open-source, the ability to get started is very accessible.

------
stillsut
> The model struggles the more the input text diverges from review data

This is where I fear the results will fail to scale. The ability to represent
'sentiment' as one neuron, and its ground truth as uni-dimensional seems most
true to corpuses of online reviews where the entire point is to communicate
whether you're happy with the thing that came out of the box. Most other forms
of writing communicate sentiment in a more multi-dimensional way, and the
subject of sentiment is more varied than a single item shipped in a box.

In other words, the unreasonable simplicity of modelling a complex feature like
sentiment with this method is something of an artifact of this dataset.

------
gallerdude
The neural network is savage enough to learn "I would have given it zero
stars, but that was not an option." Are we humans that predictable?

~~~
teraflop
The training data consisted of 82 million reviews, so I'm sure that phrase (or
slight variants) occurred hundreds of thousands of times.

~~~
visarga
Could be checked by counting n-grams to see just how much it differs from
other reviews.
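A rough sketch of that check, using word n-grams over a corpus: count how often each n-gram occurs and look up the phrase in question. The three "reviews" are made-up toy strings standing in for the real corpus, and `count_ngrams` is a hypothetical helper.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def count_ngrams(texts, n):
    """Count word n-grams across a list of texts (lowercased, whitespace split)."""
    counts = Counter()
    for text in texts:
        counts.update(ngrams(text.lower().split(), n))
    return counts

# Toy stand-in for the review corpus.
reviews = [
    "I would have given it zero stars",
    "would have given it zero stars if I could",
    "great product works well",
]

counts = count_ngrams(reviews, 3)
print(counts[("given", "it", "zero")])
print(counts.most_common(3))
```

Running the same count over the actual training reviews would show whether a generated sentence is a near-copy of a frequent phrase or something the model composed.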

------
anonymfus
This article is not accessible. It puts all textual examples into images and
even has some absolutely unnecessary animation. Please fix it.

~~~
ebildsten
Thanks for pointing this out! We've moved the textual examples into html,
added alt text for images, and will be reviewing feature posts for
accessibility

------
ChuckMcM
This is a great name for a band :-). That said, I found the paper really
interesting. I tend to think about LSTM systems as series expansions and using
that as an analogy don't find it unusual that you can figure out the dominant
(or first) coefficient of the expansion and that it has a really strong impact
on the output.

------
kamalbanga
What they have done is semi-supervised learning (Char-RNN) + supervised
training of sentiment. Another way to do it is semi-supervised learning
(Word2Vec) + supervised training of sentiment. If the first approach works better,
does it imply that character-level learning is more performant than word-level
learning?

------
mdibaiee
As far as I understand, it means that there must be a relation between a
character's sentiment and what the next character can (/should) be for neural
network to use this as a feature, am I right?

Does this mean we have unconsciously developed a language that exposes such
relations?

~~~
vhold
They muse about the reason behind the sentiment neuron in the paper.

"It is an open question why our model recovers the concept of sentiment in
such a precise, disentangled, interpretable, and manipulable way. It is
possible that sentiment as a conditioning feature has strong predictive
capability for language modelling. This is likely since sentiment is such an
important component of a review."

They go on to frame that as an important consideration for further work like
this:

"Our work highlights the sensitivity of learned representations to the data
distribution they are trained on. The results make clear that it is
unrealistic to expect a model trained on a corpus of books, where the two most
common genres are Romance and Fantasy, to learn an encoding which preserves
the exact sentiment of a review."

I'm wondering if a "funniness" neuron could be discovered in a model trained
on millions of jokes of various funniness, or what sorts of undiscovered
meaning there is in other neurons in this model.

------
kvh
Impressive the abstraction NNs can achieve from just character prediction. Do
the other systems they compare to also use 82M Amazon reviews for training?
Seems disingenuous to claim "state-of-the-art" and "less data" if they
haven't.

------
auvi
just wondering, how many AI programs (models with complete source code) has
OpenAI released?

~~~
tshadley
Lot of stuff here: [https://github.com/openai](https://github.com/openai)

------
du_bing
Training on a character-by-character basis is really incredible. It's quite
opposite to human intuition about language, but it seems a brilliant idea,
and OpenAI tried it out, great!

------
mrfusion
why did they do this character by character? Would word by word make sense?
Other than punctuation I'm not seeing why specific characters are meaningful
units.

~~~
tshadley
Word by word would require adding prior knowledge of words into the system,
and they're trying to "start from scratch" as much as possible.

------
djangowithme
Why is the linear combination used to train the sentiment classifier? Why does
its result get taken into account?

Is this linear combination between 2 different strings?

------
changoplatanero
What's the easiest way to make a text heatmap like the ones in their blog?

------
sushirain
Very interesting. I wonder if they tried to predict part-of-speech tags.

~~~
visarga
That would probably work. Karpathy's character based RNN could detect semantic
meaning in text and code.
[http://karpathy.github.io/2015/05/21/rnn-effectiveness/](http://karpathy.github.io/2015/05/21/rnn-effectiveness/)

------
grandalf
This has amazing potential for use in sock puppet accounts.

------
curuinor
moved that needle I guess

