
Long Short-Term Memory-Networks for Machine Reading - jonbaer
http://gitxiv.com/posts/tfkjEgw9x4KSi2GnH/long-short-term-memory-networks-for-machine-reading
======
vonnik
Great paper. For anyone who needs a primer on LSTMs, this one has a GIF:
[http://deeplearning4j.org/lstm.html](http://deeplearning4j.org/lstm.html)

~~~
jimfleming
Chris Olah also has a great introduction:
[http://colah.github.io/posts/2015-08-Understanding-LSTMs/](http://colah.github.io/posts/2015-08-Understanding-LSTMs/)

------
nl
We're _so_ close to getting rid of feature engineering for text processing.

As someone who does this professionally, this is a good thing!

~~~
transpy
Can you elaborate on this, please? My goal is to work in text processing
professionally, and I have the intuition that text should 'learn' its own
features. I'm learning Python, reading about machine learning, and practicing
my coding all the time.

~~~
existencebox
Recently I had to build an algorithm that did a form of substring detection.
As part of this process I had to generate feature vectors so the actual model
could classify tokens of the string as part-of-substring or not. Before that,
though, I had to do several levels of tokenization, normalization, and
preprocessing to get the raw text into a form the substring classifier could
use effectively.

Ideally, if you have a good parse of the text from step 0, you don't need to
do nearly as much munging/processing yourself, and can just focus on the
thrust of your specific algo rather than on cleaning and generating the
feature data that drives it.
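To make the pipeline above concrete, here is a minimal sketch of that kind of hand-built preprocessing, tokenize and normalize raw text, then emit a per-token feature dict for a downstream part-of-substring classifier. All function and feature names are illustrative, not taken from the paper or any particular library:

```python
# Hypothetical sketch of the pipeline described above: tokenization,
# normalization, then per-token feature vectors for a classifier.
import re

def tokenize(text):
    # Normalize (lowercase) and split on runs of non-word characters.
    return [t for t in re.split(r"\W+", text.lower()) if t]

def token_features(tokens, i):
    # Classic hand-engineered features -- exactly the kind of work that
    # learned representations aim to replace.
    tok = tokens[i]
    return {
        "token": tok,
        "length": len(tok),
        "is_digit": tok.isdigit(),
        "prev": tokens[i - 1] if i > 0 else "<BOS>",
        "next": tokens[i + 1] if i < len(tokens) - 1 else "<EOS>",
    }

tokens = tokenize("Call me at 555-0100 today.")
features = [token_features(tokens, i) for i in range(len(tokens))]
```

Each `features[i]` would then be vectorized and fed to whatever model does the actual part-of-substring labeling.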

I'm personally very skeptical as to whether this will make feature
engineering disappear entirely, if for no other reason than that we do a lot
to tweak our features beyond just _getting_ them, whether that be second-order
processing, aggregation, smoothing, or transformation. That said, like the
parent poster, I am VERY hopeful that new techniques will arise that can cut
into that overhead at least a little.
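As a small illustration of the second-order tweaks mentioned above, even once features exist, they often get transformed or smoothed before the model sees them. The helpers and values below are made up for illustration:

```python
# Illustrative feature post-processing: transformation and smoothing.
import math

def log_transform(counts):
    # Compress heavy-tailed count features, a common transformation.
    return [math.log1p(c) for c in counts]

def moving_average(scores, window=3):
    # Smooth a per-token score with a centered moving average,
    # shrinking the window at the sequence boundaries.
    half = window // 2
    out = []
    for i in range(len(scores)):
        lo, hi = max(0, i - half), min(len(scores), i + half + 1)
        out.append(sum(scores[lo:hi]) / (hi - lo))
    return out

raw_counts = [0, 1, 10, 100]      # e.g. token frequency features
scores = [0.0, 1.0, 0.0, 1.0, 0.0]  # e.g. noisy per-token scores
```

Neither step produces new information; both reshape existing features, which is why this overhead survives even when the raw features come from a learned model.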

