

Analyzing stylistic similarity amongst authors - lingben
http://markallenthornton.com/blog/stylistic-similarity/

======
mirimir
This is very cool work! Some years ago, I was interested in text mining. I
ended up playing with latent semantic analysis using Lucene etc. But that was
a largely random choice, driven by the availability of open-source software
and online discussion.

However, as cool as stylistic analysis is, I'm concerned about the implications
for online anonymity (which I consider valuable). But maybe the risk is
limited by the short length of typical texts and by the false positive rate. I
welcome suggestions for further reading.

------
nicolewhite
Very cool. I wonder why the author decided to use igraph in R rather than in
Python, since he was already using Python for the frequencies.

~~~
cronbachs_beta
Hi - I'm the author. Glad you liked it! I'm afraid I had no particularly good
reason for switching to R for the visualization other than that I'd used R for
network graphs in the past so I already had code written.

~~~
nicolewhite
I can understand that. I find it's easier to use igraph in R, especially if
you're going to be doing visualizations. igraph is alright in IPython
notebooks, but getting the visualizations to work is a bit of a pain, whereas
it works out of the box in RStudio.

I don't even really like igraph for visualizations, though. It's great for
graph algorithms like community detection, but for visualizations I'll usually
jump into something interactive like visNetwork. Check out this slide, for
example:
[http://nicolewhite.github.io/neo4j-presentations/RNeo4j/Visu...](http://nicolewhite.github.io/neo4j-presentations/RNeo4j/Visualizations/Visualizations.html#32)

~~~
cronbachs_beta
Yeah, igraph's visualizations aren't perfect. I've explored using d3 a bit in
some of my other work (e.g. [http://markallenthornton.com/blog/price-of-flavor/](http://markallenthornton.com/blog/price-of-flavor/)). It's great for
interactive graphs, but it starts to tax the browser pretty heavily for larger
ones (though perhaps that's just my poor coding). Thanks for the visNetwork
suggestion - I'll have to check that out!

