

Humanities Data: A Necessary Contradiction - diodorus
http://miriamposner.com/blog/humanities-data-a-necessary-contradiction/

======
acbart
I help teach a "Computational Thinking" undergraduate course for non-CS
majors. The topic at heart here is essentially Abstraction. Specifically "Data
Abstraction", since there's more than one kind of abstraction in a bigger
picture, but we don't try to tackle that nuance with our students.

We say that Abstracting an entity (the most vague word we've found, referring
to people, places, things, events, and even other abstractions) causes you to
lose details. That abstraction is contextualized to a stakeholder, since you
have to have some criteria for deciding what details you keep and lose. An
abstraction for a person would be different if it was for Facebook ("Name",
"Age", "List of Friends") or for a doctor ("Name", "Age", "Sickness").
Abstraction is powerful because it lets us concretize real, vague things into
something that a computer can understand and process ("Because computers are
very stupid!").
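
To make that concrete, here is a minimal Python sketch of the same person abstracted for two stakeholders. The class names and the sample values are made up for illustration; only the field lists come from the example above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SocialProfile:
    """A person as a social network might abstract them (hypothetical class)."""
    name: str
    age: int
    friends: List[str] = field(default_factory=list)


@dataclass
class PatientRecord:
    """The same person as a doctor might abstract them (hypothetical class)."""
    name: str
    age: int
    sickness: str


# One real person, two stakeholder-specific abstractions (sample values invented).
profile = SocialProfile(name="Ada", age=36, friends=["Charles", "Mary"])
record = PatientRecord(name="Ada", age=36, sickness="influenza")

# Compare which details each abstraction keeps that the other drops.
profile_fields = set(vars(profile))
record_fields = set(vars(record))
only_social = profile_fields - record_fields
only_medical = record_fields - profile_fields
```

Neither class is "the" person; each keeps only what its stakeholder cares about, which is the point about losing details.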

I often discuss how an abstraction loses a lot of valuable data. One student
was working with NFL games, representing them as a set of properties like ("away
team name", "home team name", "away team wins", "home team wins"). That
representation ignores how many interceptions were made, the weather that day,
the feelings that the crowd had, how much effort the players put into training
- a lot of "humanist" data.
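
A rough sketch of that loss in Python, assuming a game starts out as a richer record. The key names, the helper functions, and every value in `raw_game` are invented placeholders, not the student's actual data.

```python
# The four properties the student's abstraction kept (names are assumptions).
KEPT_PROPERTIES = (
    "away_team_name",
    "home_team_name",
    "away_team_wins",
    "home_team_wins",
)


def abstract_game(raw_game: dict) -> dict:
    """Project a richer record down to the kept properties only."""
    return {key: raw_game[key] for key in KEPT_PROPERTIES}


def lost_details(raw_game: dict) -> list:
    """List the details the abstraction silently throws away."""
    return sorted(set(raw_game) - set(KEPT_PROPERTIES))


# Placeholder record: everything here is made up for illustration.
raw_game = {
    "away_team_name": "Team A",
    "home_team_name": "Team B",
    "away_team_wins": 0,
    "home_team_wins": 1,
    "interceptions": 3,
    "weather": "rain",
    "crowd_mood": "tense",
}

game = abstract_game(raw_game)
dropped = lost_details(raw_game)
```

Printing `dropped` next to `game` is one way to make students look at what they chose to leave out, not just what they kept.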

This article makes me wonder what goes through my students' minds. No one's
complained about making these abstractions - they've never suggested we're
ruining a game, a historical event, or whatever we're abstracting. But I think
maybe we should encourage the students to think about what happens when you
abstract something and what it means for the entity and not just the
stakeholder.

~~~
miriamkp
That's really interesting and helpful — thanks. I've heard CS people talk
about abstraction a lot, but hadn't realized they were using the term in the
way you describe.

~~~
acbart
I don't know how common my definition is among the CT community.
Frustratingly, few researchers operationally define the term (cardinal sin in
Education research!) and establish graded metrics based on it. I haven't met
any other researchers who have resisted our definition, but there are other
definitions. Repenning has one that is particularly... distinct from my
own:

http://www.cs.colorado.edu/~ralex/papers/PDF/SIGCSE10-repenning.pdf

------
mcguire
I would like to recommend Matthew Jockers' book, _Macroanalysis_, if you saw
something interesting in this article.

Jockers is, among other things, a scholar of Irish and Irish-American writers
in the 19th and 20th centuries (IIRC). Apparently, according to previous work,
Irish-American authors had "gone quiet" in the first decades of the 20th
century, a fact that needed explanation. One of Jockers' early results,
however, was that the quiet period was an artefact of the limited set of
authors studied: the "close reading" model that produces statements like,
"we’ve immersed ourselves so deeply in our source material that we’re attuned
to its nuances" also means that the majority of source material hasn't been
examined at all. In this case, a certain subgroup of authors, male and on the
east coast of the US, _may_ have stopped producing, but that output was more
than replaced by other authors.

