Imagine if this discovery were made by some undergraduate student who had little experience in the traditions of how ML benchmark experiments are done, or was just starting out her ML career. Would we be just as critical?
As a researcher, I like seeing shorter communications like these, as it illuminates the thinking process of the researcher. Read ML papers for the ideas, not the results :)
I personally don't mind blog posts that have a bit of hyped-up publicity. It's groups like DeepMind and OpenAI, by capturing the public imagination, that have accelerated interest among prospective students in studying ML + AI + robotics. If the hype is indeed unjustified, it'll become irrelevant in the long term. One caveat is that researchers should be very careful not to mislead reporters who are looking for the next "killer robots" story. But that doesn't really apply here.
We were very surprised that our model learned an interpretable feature, and that simply predicting the next character in Amazon reviews resulted in discovering the concept of sentiment. We believe the phenomenon is not specific to our model, but is instead a general property of certain large neural networks that are trained to predict the next step or dimension in their inputs.
I think it says something very interesting about human language and information processing in general.
http://karpathy.github.io/2015/05/21/rnn-effectiveness/ towards the end has similar methodology and is 1.5 years old.
Hype is an interesting thing especially when it comes from laymen.
Agreed, as can attributing value to said hype.
An argument can generally be made for why many things "can be useful in X situation" (like helping a layperson understand ML), but that doesn't mean they have value in every context.
(Or even that it's a particularly good example for its contrived purpose - just that it could/might suffice if nothing better exists.)
Are you sure about that? We're talking about a model which understands sentiment and can generate fake reviews to boost fake products. I can easily see this being picked up by the AI hype journalists. In fact, this model could even be used for nefarious purposes.
Markov-chain generators have been around for a while, and have been used to throw off spam detectors. This should not stop research, but instead grow more research into adversarial usage of machine learning models.
Surely you mean AI journalist bots that can generate articles about how AI review bots generate fake reviews?
Some people can't stand to miss an opportunity to remind everyone about how smart they are.
However... one criticism I have of the article is that their first graph doesn't start the y-axis at zero, giving a false impression of how much their method improves on others.
They start with:
> Our L1-regularized model matches multichannel CNN performance with only 11 labeled examples, and state-of-the-art CT-LSTM Ensembles with 232 examples.
Hmm, that sounds pretty impressive. But then later you read:
> We first trained a multiplicative LSTM with 4,096 units on a corpus of 82 million Amazon reviews to predict the next character in a chunk of text. Training took one month across four NVIDIA Pascal GPUs
Wait, what? How did "232 examples" transform into "82 million"??
OK, I get it: they pretrained the network on the 82M reviews, and then trained the last layer to do the sentiment analysis. But you can't honestly claim that you did great with just 232 examples!
Which, now that I think about it, makes the human brain and its amazing adaptive general-game-playing abilities a bit less mysterious. Since we humans all have these huge corpuses of sense-data we've been receiving reinforcement signals about since birth, we've likely built up all sorts of low-level models which we just use to predict the world for reflex responses a little bit better and faster (speech models so we can respond to what people are saying even as they're still saying it, visual models so we can throw spears where lions are going to be instead of where they are, etc.) But those low-level predictive models make it nearly effortless to build higher-level models.
I wonder if we'd take a giant leap forward in AI if we just managed to scan+emulate a regular animal brain (say, of a rat), and then built the AI as a neocortex-equivalent for that brain. It would have instant access to thousands or millions of pre-trained low-level predictive models, which it could easily discover as having outputs correlated to success and thus "attach to" during its own training.
Most of the time, in neural net land, what people are doing with the fine-tuning part is taking a model trained on looaads of supervised data, chopping off the head, and using those features to train on smaller data. This OpenAI method is different in that it uses patterns the model learned on its own, instead of the more recently common technique of reusing features extracted from a heavily label-trained model to reduce the supervised learning burden in a nearby domain.
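That usual supervised flavour is easy to sketch with an ImageNet-pretrained ResNet as the donor model (the model, layer, and shapes here are illustrative, not anyone's exact setup). Contrast it with OpenAI's version, where the donor was trained with no labels at all.

```python
# Sketch of "chop off the head" transfer from a supervised donor model.
import torch
import torch.nn as nn
from torchvision import models

backbone = models.resnet50(pretrained=True)   # trained on labeled ImageNet
backbone.fc = nn.Identity()                   # "chop off the head"
backbone.eval()

with torch.no_grad():
    batch = torch.randn(8, 3, 224, 224)       # stand-in for a small new dataset
    features = backbone(batch)                # (8, 2048) reusable features

# All the new task needs is a small head trained on these features.
head = nn.Linear(features.shape[1], 2)
logits = head(features)
```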
Arguably yes, this is an ancient technique, but it was mostly forgotten once it became clear that many problems are surmountable with a large enough helping of GPUs and a small moon's worth of data. OpenAI's is a good idea because it makes you say 'yeah, that's obvious: pretrain a simple char RNN on loads of free text and... oh wait, why has no one tried this before!?'
What is interesting here is that such a straightforward method compares so well to glittering methods that laboriously advanced the state of the art. What I also found surprising was that there was a 'neuron' tracking something very close to sentiment. Why?
A bit of thinking and I came to a simple idea. One way of looking at the LSTM in a practical setting (as opposed to as a theoretically Turing-equivalent thing) is as a really big finite-state Rube Goldberg machine. In learning to predict the next character, it makes sense that one set, or part of a set, of states it can enter/track is extremely correlated with what we humans call sentiment in review text.
In summary, the trained model can be thought of as a computable theory of Amazon reviews that also works really well on IMDB reviews (and probably on short, non-sarcastic text reviews in general).
This feels a bit different, in that yesterday I would have had no strong intuition that "a char-rnn can detect text sentiment better than sota". Looking now, I can rationalize that idea. I get why it might make sense, but it was non-obvious.
Do you disagree with any of that? (And if you do, I'd love to see this distance of transfer in the literature; it's always cool to read up on these things.)
Also, the technique is quite novel: this is not pre-trained nets on labeled data; it is an unsupervised generative model.
Future research directions are exciting: Unsupervised prediction of the next frame in a video, and then being able to one-shot learn a wide range of visual tasks.
 https://arxiv.org/abs/1412.6056 "Predicting Deeper into the Future of Semantic Segmentation"
This is what my NLP company (Luminoso) does -- we train a domain-general model of word meanings on a lot of data, then do the last step on the probably-small amount of specific data you actually have.
Even customers who are knowledgeable about machine learning usually haven't heard of the idea before. They've been assuming that the only way to do NLP is to get millions of labeled examples. Or to get a thousand labeled examples and put them into the kind of off-the-shelf algorithm that needs millions of labeled examples, which of course goes poorly.
The main interesting thing is that none of the 82 million Amazon reviews were labeled; only the 232 examples were.
For me this opens my mind to new opportunities when training deep learning models. For example, I can do the same for images: train a network to recognize objects and later use the same network to predict, say, sentiment or prettiness. And the best thing is that I don't need a lot of labeled examples for the second phase of training!
This LSTM simply learned to predict the data. It didn't learn some other supervised task.
This would be more similar to autoencoder pretraining, but even that is not quite the same.
With the 232 examples it learned what negative-sentiment sentences look like and which words occur in them.
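To make the two-stage recipe concrete, here's a rough sketch, not OpenAI's code: a character n-gram TF-IDF stands in for the pretrained mLSTM's 4096-dim hidden state (in the blog post those features come from a month of unsupervised training on ~82M reviews), and the texts and numbers are made up.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

labeled_texts = ["great product, works perfectly",
                 "arrived broken, total waste of money",
                 "love it",
                 "do not buy this"]                      # the paper uses ~232
labels = np.array([1, 0, 1, 0])

featurizer = TfidfVectorizer(analyzer="char", ngram_range=(1, 3))
X = featurizer.fit_transform(labeled_texts)              # stand-in for LM features

# The only supervised step: an L1-regularized linear classifier on top,
# which is what lets so few labeled examples go so far.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=1.0)
clf.fit(X, labels)
print(clf.predict(featurizer.transform(["this was wonderful"])))
```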
If I were perusing a dozen reviews I probably wouldn't have spotted the AI-generated ones in the crowd.
"I couldn’t figure out how to stop this drivel. At worst, it was going absolutely nowhere, no matter what I did.Needles to say, I skim-read the entire book. Don’t waste your time."
is there a sarcasm neuron in there too?
I think that'll change, eventually...
It would be interesting to see how it performed for other NLP tasks. I'd be pretty interested to see how many neurons it uses to attempt something like stance detection.
Data-parallelism was used across 4 Pascal Titan X gpus to speed up training and increase effective memory size. Training took approximately one month.
Every time I look at something like this, I find a line like that and go: "OK, that's nice... I'll wait for the trained model."
This is a really cool example OpenAI has, but I don't know why I should ultimately care about their character model more than anyone else's if all we've got is their description of how cool it is.
I hope OpenAI defies their reputation for closedness and releases the model.
EDIT: in fact, weights were up at launch: https://github.com/openai/generating-reviews-discovering-sen...
Well an unsupervised technique that learns this much meaning from text is amazing! I meant it when I said this might supplement word2vec, and that would make it one of the most important breakthroughs in years.
The comments critical of OpenAI don't make a lot of sense. They have always been very good at releasing stuff, and my comment about waiting for a trained model should be read as jealousy over not being able to train it myself.
Does not compute.
Although in this case, they did post the weights quickly.
* Using large models trained on lots of data to provide the foundation for sample-efficient smaller models is common.
* Transfer learning, fine-tuning, and character RNNs are common.
Were there any insights learned that give a deeper understanding of these phenomena?
Not knowing too much about the sentiment space, it's hard to tell how significant the resulting model is.
It says right at the top: "we get 91.8% accuracy versus the previous best of 90.2%" on a standard sentiment corpus. In addition, their method needs less training data than previous approaches.
* Were there any insights learned that give a deeper understanding of these phenomena?
The main appeal lies in the fact that a model trained on a (1) different and (2) very general task basically "in passing" also learned to predict sentiment (i.e., a specialized task that more or less arose from the domain the general model was trained on), and pretty much through a single neuron (out of the 4096 used). The authors speculate that this might be a general effect that could also be transferred to other prediction tasks.
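One simple way such a single unit could be located is to correlate each hidden-state dimension with the labels and take the strongest. (The write-up describes noticing it via how few units the L1-regularized classifier ends up using; the sketch below is just an alternative, run on synthetic data so it executes.)

```python
import numpy as np

def find_strongest_unit(H, y):
    """H: (n_docs, n_units) final hidden states, y: (n_docs,) 0/1 labels."""
    Hz = (H - H.mean(axis=0)) / (H.std(axis=0) + 1e-8)
    corr = (Hz * (y - y.mean())[:, None]).mean(axis=0)   # per-unit correlation
    return int(np.abs(corr).argmax())

rng = np.random.default_rng(0)
H = rng.normal(size=(200, 4096))
y = rng.integers(0, 2, size=200).astype(float)
H[:, 1234] += 3 * (y - 0.5)          # plant a fake "sentiment" unit at index 1234
print(find_strongest_unit(H, y))     # -> 1234 on this synthetic data
```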
The accidental sentiment neuron is a function of the model, distribution of the input dataset, and the optimizer finding nice saddle points. Insight into these foundational components would make these results amazing. It sounded like training on other datasets doesn't have the same sentiment properties, which provides a lever to explore these concepts more.
At the moment it feels like the Google cat neuron. It attracted a lot of intrigue but the individual contribution from that in terms of research was more on the infrastructure side, and few people seem to refer back to that publication at this point.
That said, OpenAI's mission in itself doesn't necessarily require novel research. For example, the gym is fostering a competitive atmosphere for the community to work on RL, which hopefully leads to more progress in the field.
Training a model for a month is difficult, and if it has captured interesting phenomena it seems in the interest of the community to release the weights and model. It would be hard for the community to reproduce this without a month of compute and 82M Amazon reviews.
> We were very surprised that our model learned an interpretable feature, and that simply predicting the next character in Amazon reviews resulted in discovering the concept of sentiment.
And then they write:
> We believe the phenomenon is not specific to our model, but is instead a general property of certain large neural networks that are trained to predict the next step or dimension in their inputs.
So they can't explain why a phenomenon is occurring, but they think that it generalizes to other contexts.
I find it all very unconvincing. Is this kind of writing common in the deep learning literature?
I've been noticing a lot of work that digs into ML model internals (as they've done here to find the sentiment neuron) to understand why they work or use them to do something. Let me recall interesting instances of this:
1. Sander Dieleman's blog post about using CNNs at Spotify to do content-based recommendations for music. He didn't write about the system performance but collected playlists that maximally activated each of the CNN filters (early layer filters picked up on primitive audio features, later ones picked up on more abstract features). The filters were essentially learning the musical elements specific to various subgenres.
2. The ELI5 - Explain Like I'm Five - Python library. It explains the outputs of many linear classifiers. I've used it to explain why a text classifier made a certain prediction: it highlights features to show how much or how little they contribute to the prediction (dark red for a negative contribution, dark green for a positive one).
3. FairML: Auditing black-box models. Inspecting the model to find which features are important. With privacy and security concerns too!
Since deep learning/machine learning is very empirical at this stage, I think improvements in instrumentation can lead to ML/DL being adopted for more kinds of problems, for example chemical/biological data. I'd be highly curious as to what new ways of inspecting such kinds of data would be insightful (we can play the audio input that maximally activates filters for a music-related network, we can visualize what filters are learning in an object detection network, etc.).
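The "maximally activating inputs" trick from item 1 (and the audio remark above) is simple enough to sketch: score every example in a dataset with one unit or filter's activation and keep the top-k. Names and shapes below are illustrative only.

```python
import numpy as np

def top_k_activating(examples, activations, unit, k=5):
    """activations: (n_examples, n_units) per-example unit activations."""
    order = np.argsort(-activations[:, unit])[:k]
    return [examples[i] for i in order]

# Dummy data: 1000 "clips" scored against 64 units.
acts = np.random.rand(1000, 64)
clips = [f"clip_{i}" for i in range(1000)]
print(top_k_activating(clips, acts, unit=7))
```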
For context, Claude Shannon found that humans could model English text with an entropy of 0.6 to 1.3 bits per character (http://languagelog.ldc.upenn.edu/myl/Shannon1950.pdf)
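Handy when comparing a trained char model against that figure: frameworks usually report cross-entropy in nats, and bits per character is just that value divided by ln 2. The loss value below is made up, not OpenAI's number.

```python
import math

loss_nats_per_char = 0.90
bits_per_char = loss_nats_per_char / math.log(2)
print(f"{bits_per_char:.2f} bits/char")   # ~1.30, vs Shannon's human 0.6-1.3 range
```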
The sentiment neuron sounds fascinating too. I didn't realize individual neurons could be talked about or understood outside the context of the whole network. I'm thinking of the "black box" framing it's often given in articles.
Since one of the research goals for OpenAI is to train a language model on jokes, I wonder how this neuron would perform on a joke corpus.
Yes, I agree. I recall seeing such individual-neuron analysis before in Karpathy's "The Unreasonable Effectiveness of Recurrent Neural Networks". He takes a char-rnn trained to predict the next character of source code and finds neurons that have learned to track parenthesis/bracket opening and closing.
"The sentiment neuron within our model can classify reviews as negative or positive, even though the model is trained only to predict the next character in the text."
If you look closely at the colorized paragraph in their paper/website, you can see that the major sentiment jumps (e.g. from green to light-green and from light-orangish to red) occur with period characters. Perhaps the insight is that periods delineate the boundary of sentiment. For example:
I like this movie.
I liked this movie, but not that much.
I initially hated the movie, but ended up loving it.
The period tells the model that the thought has ended.
My question for the team: How well does the model perform if you remove periods?
This seems to have to do with a pretty deep understanding of grammar; the model waits until the low-level neurons have something to pass up (the decoding of a complete unit of meaning) before using that to update its sentiment neuron.
A lot of next-character or next-word prediction ends up working like this - internally, the model keeps some state and makes big changes to its understanding at points that have to do with the structure of the stream.
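If someone wanted to check the "jumps happen at periods" observation against the released weights, here's a rough sketch. The per-character encoder and the unit index are assumptions on my part, not the repo's documented API.

```python
import numpy as np

SENTIMENT_UNIT = 2388   # index often quoted for the released model; treat as an assumption

def biggest_jumps(text, char_cell_states, k=5):
    """char_cell_states(text) -> (len(text), n_units) cell state after each char."""
    trace = char_cell_states(text)[:, SENTIMENT_UNIT]
    deltas = np.abs(np.diff(trace))
    top = np.argsort(-deltas)[:k]
    return [(int(i), text[i + 1], float(deltas[i])) for i in top]

# Usage idea: run it on a review with and without periods and see whether the
# largest single-character changes still line up with sentence boundaries.
```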
I think this work is interesting, although when you think about it, it's kind of normal that the model converges to a point where there is a neuron that indicates whether the review is positive or negative. There are probably a lot of other traits that can be found in the "features" layer as well.
There are probably neurons that can predict the geographical location of the author, based on the words they use.
There are probably neurons that can predict that the author favors short sentences over long explanations.
But what makes this "unsupervised"?
Could you clarify that statement? Are you saying that it was unusual for this group to find such a neuron? Also, I did not know that there are papers on how to incentivize neurons to correspond to interesting features. Could you please give me some references on those?
Had they created a next-move predictor for chess, they wouldn't have been surprised to find a neuron representing the aggressiveness of the player.
It's a good result on its own but the word "unsupervised" is a bit annoying.
For what it's worth, the paper predictably does a better job of covering the previous work and stating what their motivation was: "The experimental and evaluation protocols may be underestimating the quality of unsupervised representation learning for sentences and documents due to certain seemingly insignificant design decisions. Hill et al. (2016) also raises concern about current evaluation tasks in their recent work which provides a thorough survey of architectures and objectives for learning unsupervised sentence representations - including the above mentioned skip-thoughts. In this work, we test whether this is the case. We focus in on the task of sentiment analysis and attempt to learn an unsupervised representation that accurately contains this concept. Mikolov et al. (2013) showed that word-level recurrent language modelling supports the learning of useful word vectors and we are interested in pushing this line of work. As an approach, we consider the popular research benchmark of byte (character) level language modelling due to its further simplicity and generality. We are also interested in evaluating this approach as it is not immediately clear whether such a low-level training objective supports the learning of high-level representations." So, they question some built-in assumptions from the past by training on lower-level data (characters), with a bigger dataset and more varied evaluation.
The interesting result they highlight is that a single model unit is able to perform so well with their representation: "It is an open question why our model recovers the concept of sentiment in such a precise, disentangled, interpretable, and manipulable way. It is possible that sentiment as a conditioning feature has strong predictive capability for language modelling. This is likely since sentiment is such an important component of a review", which I tend to agree with... train on a whole lot of reviews, and it's only natural to end up with a regressor for review sentiment.
This is where I fear the results will fail to scale. The ability to represent 'sentiment' as one neuron, and its ground truth as uni-dimensional seems most true to corpuses of online reviews where the entire point is to communicate whether you're happy with the thing that came out of the box. Most other forms of writing communicate sentiment in a more multi-dimensional way, and the subject of sentiment is more varied than a single item shipped in a box.
In other words, the unreasonable simplicity of modelling a complex feature like sentiment with this method is something of an artifact of this dataset.
Does this mean we have unconsciously developed a language that exposes such relations?
"It is an open question why our model recovers the concept of sentiment in such a precise, disentangled, interpretable, and manipulable way. It is possible that sentiment as a conditioning feature has strong predictive capability for language modelling. This is likely since sentiment is such an important component of a review."
They go on to frame that as an important consideration for further work like this:
"Our work highlights the sensitivity of learned representations to the data distribution they are trained on. The results make clear that it is unrealistic to expect a model trained on a corpus of books, where the two most common genres are Romance and Fantasy, to learn an encoding which preserves the exact sentiment of a review."
I'm wondering if a "funniness" neuron could be discovered in a model trained on millions of jokes of various funniness, or what sorts of undiscovered meaning there is in other neurons in this model.
Is this a linear combination between two different strings?