

Pattern Recognition in Texts Using Complex Networks - Rod
http://arxiv.org/abs/1007.3254

======
sqrt17
Can someone who has more of a network theory background say why this would be
interesting?

From an NLP angle, neither what they're doing (text classification) nor how
they're doing it (constructing a co-occurrence matrix) sounds particularly
novel, and the network-theoretic properties they get from the unweighted,
undirected form of the co-occurrence matrix don't seem to give any valuable
insights.
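
For readers without the background, here is a minimal sketch of the kind of pipeline under discussion: build an unweighted, undirected word co-occurrence network and read off simple network-theoretic properties (degree, local clustering). The whitespace tokenizer, the `distance` parameter and both function names are assumptions made for illustration, not the paper's actual method.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(text, distance=1):
    """Build an unweighted, undirected co-occurrence graph.

    Two distinct words are linked if they occur at most `distance`
    tokens apart. Tokenization is naive lowercase whitespace splitting
    (an illustrative assumption).
    """
    tokens = text.lower().split()
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + distance]:
            if other != word:  # skip self-loops
                graph[word].add(other)
                graph[other].add(word)
    return dict(graph)


def local_clustering(graph, node):
    """Fraction of a node's neighbor pairs that are themselves linked."""
    neighbors = graph[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    linked = sum(1 for a, b in combinations(neighbors, 2) if b in graph[a])
    return linked / (k * (k - 1) / 2)


graph = cooccurrence_graph("the cat sat on the mat")
degree = {word: len(nbrs) for word, nbrs in graph.items()}
```

Per-node numbers like these (or their averages over the whole text) would be the features fed to a classifier; whether they add anything over standard co-occurrence features is exactly the question.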

As a comparison, see the 2009 workshop on text graphs
<http://www.textgraphs.org/ws10/index.html> or papers such as Gaume et al.
(2007), "Semantic associations and confluences in paradigmatic networks"
<http://w3.erss.univ-tlse2.fr/textes/pagespersos/gaume/resources/Gaume_Duvignau_Vanhove_final.pdf>

Did I mention that the physics people totally ignore all the (interesting and
non-trivial) existing literature on the topic? It's a bit as if a CS/NLP
person wrote a paper on an information-theoretic approach to physics while
totally ignoring the physics bits in it.

------
mark_l_watson
I just gave the PDF a quick read, and it looked useful enough to put in my NLP
permanent reading collection. There seems to be growing momentum in both NLP
research and the number of good papers that are freely available. Last month,
someone posted a link to "ICWSM – A Great Catchy Name: Semi-Supervised
Recognition of Sarcastic Sentences in Online Product Reviews" on HN - another
interesting and potentially useful paper.

------
hooande
Interesting that they used the research equivalent of a Minimum Viable
Product. From the paper:

 _Of course, more complicated semantic network models are certainly possible.
For instance, one could construct a weighted network. However, we sought the
simplest possible model which could distinguish between fictional and
non-fictional written storytelling._
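
To make the quoted trade-off concrete, here is a rough sketch of how a weighted co-occurrence network collapses to the simplest unweighted one. Linking only adjacent words, and the names `weighted_edges` and `unweighted_edges`, are assumptions for illustration and are not taken from the paper.

```python
from collections import Counter


def weighted_edges(text):
    """Count how often each unordered pair of adjacent words co-occurs."""
    tokens = text.lower().split()
    weights = Counter()
    for a, b in zip(tokens, tokens[1:]):
        if a != b:  # ignore immediate repetitions of the same word
            weights[frozenset((a, b))] += 1
    return weights


def unweighted_edges(weights):
    """Discard the counts, keeping only whether a pair ever co-occurs."""
    return set(weights)


weights = weighted_edges("to be or not to be")
edges = unweighted_edges(weights)
```

The unweighted form throws away frequency information, which is what makes it the "simplest possible model" in the quote.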

