

Should We Teach Literature Students How To Analyze Texts Algorithmically? - espeed
http://www.fastcolabs.com/3016699/should-we-teach-literature-students-how-to-analyze-texts-algorithmically

======
mathattack
_Gottschall believes that these problems (the humanities’ increasing
irrelevance, decreasing popularity, and the subjects' inability to ever
answer questions conclusively) are not only connected, but can be solved by
implementing the kind of data mining tools and computational textual analysis
carried out by the likes of Patrick Juola._

I can't see this saving the humanities. I see it further burying the
humanities. The humanities live by asking deep fundamental questions about,
well, being human. Being upstaged by a computer hurts them more than helps.

I'm a big believer in the power of algos like the ones discussed. I'm also a
big believer that humanities are one of the few fields that can retain value
in a post-MOOC world. (You can learn accounting and programming online. Small
group discussions are best for Shakespeare.) But I think that algorithms in
literature will hurt the field.

~~~
Houshalter
One can have a discussion online. In person might be preferable but it's a
pretty good alternative.

I don't know much about the field but it doesn't seem like algorithmic
analysis would be terribly useful. But there are some really interesting
things that can be done.

And enough with the "computers vs humans" thing. It's just a tool that helps
with analysis by doing things that would be extremely time-consuming for a
human, like counting words or looking for statistical patterns.
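To make the "tool, not rival" point concrete, here is a minimal sketch (toy text, Python standard library only) of the kind of mechanical counting a computer does in seconds:

```python
from collections import Counter
import re

# A toy passage; any text would do.
text = """Call me Ishmael. Some years ago, never mind how long precisely,
having little or no money in my purse, I thought I would sail about a little."""

# Tokenise into lowercase words and tally frequencies: exactly the kind of
# bookkeeping that is tedious by hand but trivial for a machine.
words = re.findall(r"[a-z']+", text.lower())
counts = Counter(words)

print(counts.most_common(3))                    # most frequent words
print(sum(len(w) for w in words) / len(words))  # average word length
```

Nothing here interprets the text; it just automates the tallying a human critic would never do by hand.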

------
azerty
I'd argue the question is rather "CAN we teach literature ...".

tl;dr -> you can't in European, post-modernism-influenced universities,
because it's a maths-less culture.

Not so long ago I was still a linguistics student, trying to specialize in
NLP. Eventually I dropped out. I dropped when I was asked to apply a
multivariate model used to classify English texts into genres to a corpus of
texts written in French. Mind you, I wasn't asked to follow a specific
methodology and re-do what had been done for the English texts. No. I was
asked to directly "translate" the English-specific linguistic features, and
then reuse the associated weights of the original model. I guess this would be
the equivalent of trying to, say, classify pictures of cats using a model
built to classify pictures of cars, by somehow translating the cat pictures
into car pictures beforehand. I have a few more examples of this kind, but
really there was little to no mathematical culture in my university's
linguistics department, where it can take an NLP researcher 2 minutes to
figure out that 95% of 10000 = 9500. So I dropped out.
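The methodological problem described above is easy to demonstrate. A toy sketch (the feature words, weights, and sentences are all invented for illustration) of why feature weights learned on English say nothing about French text:

```python
from collections import Counter

# Invented feature words and weights: a toy "genre model" keyed entirely
# to English function words.
english_weights = {"the": 0.8, "whilst": 1.5, "shall": 1.2}

def score(text, weights):
    # Weighted count of the feature words present in the text.
    counts = Counter(text.lower().split())
    return sum(weights.get(word, 0.0) * n for word, n in counts.items())

english_text = "the knight shall ride whilst the sun sets"
french_text = "le chevalier chevauchera tandis que le soleil se couche"

# The English-trained features fire on English text, but they simply do
# not occur in French, so the reused model sees nothing at all.
print(score(english_text, english_weights))
print(score(french_text, english_weights))
```

The French text scores exactly zero on every feature: "translating" the features and keeping the weights is not a methodology, it's a category error.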

I dropped out, yeah, but why did I even get into this curriculum in the first
place? Because we, continental people, don't believe everything can be reduced
to a universal formalism. Especially in fields such as semiotics. There are
always deeper truths beneath whatever absolute formalism you bring to the
table. A text is an alchemy and, to us, it is inconceivable that some kind of
periodic table of semiotic elements is ever to be discovered. Meaning has no
pre-defined ontology waiting for us to discover it. Adios, Mendeleïev.

What does it lead to? It leads to intellectual confrontation. Confrontation
against the American devil. We gave you a name: formalist-positivist. You're
in the wrong. We shall impale you eventually, and so we spit on your
dictatorial formalisms (even Shannon's information theory, shhh...), even
though leaders as important as Saussure or Lévi-Strauss called for a
mathematization of the field.

So why did I embark on this boat? Because there was a promise. The promise to
study language in itself and by itself. You start by reading Saussure's Cours
de linguistique générale, then you're asked to shed light on his definitions
of langue and parole. It's hard. It's very hard because a) it's not formal and
b) you're subject to the arbitrary judgement of whoever grades your work, just
like in ... philosophy. So you dig, deeper and deeper, and at some point you
figure out: a) Saussure never wrote the aforementioned book (his disciples
did, just like ... Jesus) and b) he wrote in one of his last letters that he
did not believe any concept in linguistics was grounded on a solid basis. But
they don't really care. It's in the goddamn book. It's worth teaching, heh.
And that makes you want to quit. You might ask: "So it's like philosophy?
There is no real foundation, it's just a field culture?" I don't know. What I
know for sure is that I've NEVER had any philosophy courses (this includes
philosophy of language).

Fortunately, there is some cognitive linguistics. With numbers and stats, with
stddevs and error margins. What are these things? What's the role of these
meta-numbers in understanding the phenomenon being observed? You ask about it.
You're told it's not in the course. You insist because, well, the devil is in
the details, and these numbers seem pretty important. Alas, it's too
complicated for a freshman. Finally, you're expected to do an absurdly
difficult exercise: to compile a word list for a priming experiment. You work
your ass off, taking inspiration from the little physics you know (because
THAT is real science), and deliver a 60-page paper only to see that other
students just dumped a one-page explanation. The professor frowns at your
work. Something must be wrong with you.

So, that's basically the state of linguistics in my country (France). It's not
really philosophy, it's not real science, it lies between heaven and
underground caves : right in a pool of mud.

My conclusion is that you can't teach algorithmic methods to literature
students here, because almost nobody in the field can. We want the
irreducibility of alchemy but still think we can reach this goal while getting
away without maths. Are we hoping we'll find a way to whisper magic words at
the cauldron and get results out of the process? Probably.

It's really a shame, because there are plenty of good theories; unfortunately,
very few people have the skills to implement them, and they are very badly
marketed (published in French, for instance ...). And the post-modern flavour
the whole field has isn't even a bad thing. I recently read Deleuze's "What Is
Philosophy?" and it's just amazing. Of course, it's not crystal clear, but if
you can read between the lines, you start making connections with recent
advances: vector space models, compositional distributional models of meaning
(Oxford's quantum-categorical grammars), dynamical models of cognition
(Polytechnique, Petitot), fractal patterns in reasoning (Groningen, Atkinson).
All of them using advanced maths. Can I understand the details? Absolutely
not. Does my gut tell me these guys have put their hands on something
promising? Absolutely yes. But I suck at maths, I suck, so I dropped out to
avoid sucking at maths even more.

Anyway it doesn't matter. I'm looking for a crappy 9-5 job, now, cobbling
together web-forms or something.

------
voyou
I'm surprised this article doesn't mention Franco Moretti[1], who is probably
the foremost literary scholar using algorithms to analyse texts. Moretti
advocates what he calls "distant reading" (the opposite of close reading),
which would allow you to study trends across the whole history of world
literature, rather than just in individual books one at a time.

[1]: [http://www.versobooks.com/authors/64-franco-moretti](http://www.versobooks.com/authors/64-franco-moretti)
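A minimal sketch of the distant-reading idea, with a hypothetical three-text "corpus" standing in for the thousands of novels a distant reader would aggregate over: track one word's relative frequency by year instead of close-reading any single book.

```python
# Hypothetical miniature corpus of (year, text) pairs. In a real study
# each entry would be a whole digitised novel.
corpus = [
    (1810, "the marriage was arranged and the estate entailed"),
    (1850, "the factory smoke darkened the marriage announcement"),
    (1890, "the detective examined the factory ledger"),
]

def relative_frequency(word, text):
    # Occurrences of `word` as a fraction of all words in `text`.
    words = text.split()
    return words.count(word) / len(words)

# One word's trajectory across the corpus: a (very) distant reading.
trend = {year: relative_frequency("factory", text) for year, text in corpus}
print(trend)
```

The point is the shape of the output: a trend line over a whole corpus, rather than an interpretation of any individual text.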

------
eseehausen
This sort of thing does nothing to undermine the very real criticisms raised
by literary theory. I understand that it's confusing and frustrating for
people not versed in the theory (much like the code written to process the
texts would be to somebody who does not know how to program), but the naive
turn to "objectivity" in this case isn't just a step back from today's theory;
it's a step back from New Criticism, which peaked in the mid-20th century.

That's not to say that NLP and things like this aren't vitally important. It
can be convincingly argued that they're more productive and important in terms
of immediate social good. However, I'm pretty sure we already have a field
(besides computer science) that does this sort of work: it's called forensic
science.

~~~
sdoering
Well, for me, actually having studied this, theories like Structuralism,
Russian Formalism, or Semiotics are better suited to literature than theories
not so rooted in "objectivity".

So no, not a step back (imho) -- this could provide some good takes on
literature with interesting tools.

------
j2kun
This article isn't discussing pedagogy by any means, but whether Literature as
a _profession_ should take up algorithmic text analysis as a scientific tool
to promote or refute literary theories and analyze writing style.

I think the latter is good, but teaching students to appreciate literature is
far more pressing than teaching them to tally up word counts to unmask JK
Rowling's latest book.
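For what it's worth, the word-tallying behind the Rowling unmasking can be sketched roughly like this (the snippets and the tiny function-word list are invented; real stylometry uses hundreds of features and better distance measures):

```python
from collections import Counter

# A handful of function words; real studies use far larger lists.
FUNCTION_WORDS = ["the", "of", "and", "to", "a"]

def profile(text):
    # Relative frequency of each function word: a crude stylistic fingerprint.
    words = text.lower().split()
    counts = Counter(words)
    return [counts[w] / len(words) for w in FUNCTION_WORDS]

def distance(p, q):
    # Sum of absolute differences between two fingerprints.
    return sum(abs(a - b) for a, b in zip(p, q))

# Invented snippets standing in for known writing samples.
author_a = "the wand of the wizard and the owl flew to a tower"
author_b = "profit margins of a firm and the cost to scale"
disputed = "the cloak of the giant and the broom to a castle"

d_a = distance(profile(disputed), profile(author_a))
d_b = distance(profile(disputed), profile(author_b))
print("closer to A" if d_a < d_b else "closer to B")
```

The content words differ entirely; it's the unconscious rates of "the", "of", and so on that carry the attribution signal.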

~~~
sdoering
Sadly this post isn't really about teaching (university) students anything
about literature. If it were, they would have to look at very different
theories/schools of literary study.

There are quite a lot of schools/theories that could greatly benefit from a
technically supported empirical method.

For example, intertextuality[1], the school that essentially says every text
has links to other texts: quotes, passages that are plagiarism, different
stances on shared topics, or simply membership in the same genre.

So yeah, detecting hidden "links" between texts could give you interesting
insights, or might tell you that there could be "hidden" texts, still unknown
to us because they were lost.
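A crude sketch of how such "links" might be surfaced automatically: word n-grams shared between two texts flag candidate quotes or borrowings (the sample sentences here are invented).

```python
def ngrams(text, n=3):
    # The set of all n-word sequences in the text.
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def shared_passages(a, b, n=3):
    # N-grams common to both texts: candidate quotes or borrowings.
    return ngrams(a, n) & ngrams(b, n)

source = "in the beginning was the word and the word was with god"
novel = "he wrote that in the beginning was the word scrawled on vellum"

print(shared_passages(source, novel))
```

Scaled up to a large corpus (with indexing instead of pairwise comparison), the same idea underlies plagiarism detectors and quotation-tracing tools.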

Reading a lot of Umberto Eco (and writing my final thesis on one of his books)
showed me what it means for literary works to be nodes in a network. Reading
"The Name of the Rose" (and not just watching the movie) really drove home
this point.

Especially if you know a little bit about history or the basic structure of
the "detective story" genre ;-)

[1]:
[https://en.wikipedia.org/wiki/Palimpsests:_Literature_in_the...](https://en.wikipedia.org/wiki/Palimpsests:_Literature_in_the_Second_Degree)
(the German version is much more elaborate)

------
MWil
Historical Literature students, sure.

Scholars are still working out who authored some of the Federalist Papers, and
they're using NLP methods like those described to do so. Of course, you'd be
surprised at the amount of effort that might have gone into obfuscating that
information.

"Hey, Jefferson, you write like you're me or Adams, I'll write like I'm Adams
or you, and he'll write like..."
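The classic study here is Mosteller and Wallace's, which leaned on rates of function words such as "upon" (Hamilton used it far more often than Madison). A toy version, with invented snippets standing in for each author's known essays:

```python
def rate_per_thousand(word, text):
    # Occurrences of `word` per thousand words of `text`.
    words = text.lower().split()
    return 1000 * words.count(word) / len(words)

# Invented snippets; the real study computed these rates over each
# author's known essays, then compared the disputed papers to both.
hamilton_sample = "upon reflection the burden upon commerce rests upon the states"
madison_sample = "the powers reserved to the states remain with the people"

print(rate_per_thousand("upon", hamilton_sample))
print(rate_per_thousand("upon", madison_sample))
```

Deliberate imitation, as the joke above suggests, would mean consciously suppressing exactly these unconscious habits, which is what makes it so hard to pull off.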

------
mrcactu5
If English majors won't do it -- I will. My guess is that machine learning can
approximate and formalize some of their delicate and thoughtful intuitions.

From what I have browsed in bookstores, graduate-level English and Sociology
texts can resemble the data structures in computer science textbooks.

