
Code is not Literature - antifuchs
http://www.gigamonkeys.com/code-reading/
======
ef4
Asking people what they've read "just for the heck of it" is the wrong
question, because _code is not linear_ , so it's extremely ungainly to read
without purpose.

But as soon as the skilled code reader has a purpose in mind -- a question to
answer -- he or she can rapidly find a meaningful narrative. Put into that
context, programmers read code constantly, and the more they read the better
they get.

So I don't like the "nobody actually reads code" claim. It's a strawman. When
I tell people to read code, it's always in the context of "pick something you
want to understand or fix, and read with that purpose in mind." Not "the Linux
kernel is like Moby Dick, you should really read it all."

~~~
girvo
Yes I agree. But, for myself, even (or especially) with a purpose in mind, I
_decode_ , not read per se. I've recently been working through the selfoss
source and adding some new features, and this is the tack I've taken through
it -- decode the source to find where the feature should be added. So I agree
with you, and the OP.

~~~
ef4
True, but that's just a question of density. There are books written in
English that are so dense that they too need to be decoded.

------
jacobolus
I think the author has a somewhat limited definition of “literature”, though
he ultimately comes to the right conclusion that code must be “studied”, not
“read”. It’s true, code is typically less linear than a pulp novel, but other
types of literature are also involved, with layered meanings, which must be
examined carefully, with reference material handy, and lots of flipping back
and forth between sections. For instance, poems, philosophical treatises,
historical analyses, and math textbooks must all typically be read this way.

~~~
NoodleIncident
Have you ever had a reason to read just part of a book? Pages 236-241 of Moby
Dick, with no intention to ever read the rest? That seems to be a crucial
difference between the two activities.

~~~
Zancarius
Moby Dick is probably a poor example. Something like the Iliad or the Bible
might be appropriate since extensive studies often focus on a few chapters or
verses. Yes, one normally reads the entirety of the work, but studying
specific parts seems more apropos.

------
bgilroy26
>It was just basically the way you solve some kind of an unknown puzzle—make
tables and charts and get a little more information here and make a
hypothesis. In general when I’m reading a technical paper, it’s the same
challenge. I’m trying to get into the author’s mind, trying to figure out what
the concept is. The more you learn to read other people’s stuff, the more able
you are to invent your own in the future, it seems to me.

I really enjoyed reading this article, but I would argue with its headline.
Based on the author's experience and the example from Donald Knuth, it seems
like the best way to read code is to go through it multiple times to the point
where you could reimplement it or provide complete documentation for it.

The literary analog for code reading might be writing a scholarly reader's
companion to a book.

You can't write a secondary source for a work of literature by reading it once
through like a drugstore thriller or romance. A literary analyst would read
the book through completely >3 times and spend hours on certain key passages.
They would take extensive notes reconstructing the inner workings of the
characters, the relationships between them, and key themes. Once the work has
been comprehensively understood, the scholar can write out in an expository
manner what is going on in the piece of literature, the same way that a
thoroughly digested piece of software can be rewritten based on the mental
model that develops as you read.

Obviously software and novels do not map completely one onto the other. I
think the key similarity is that they both can be created with sufficient
complexity to require taking multiple passes and following along with the
author, building something similar yourself in order to truly understand them.

------
thangalin
Why do we still embed natural language descriptions of source code (i.e., the
reason why a line of code was written) within the source code to the exclusion
of intrinsically linked separate documents?

[http://i.stack.imgur.com/JlUiE.png](http://i.stack.imgur.com/JlUiE.png)

The potential advantages include:

\- More source code and more documentation on the screen(s) at once

\- Ability to edit documentation independently of source code (regardless of
language?)

\- Write documentation and source code in parallel without merge conflicts

\- Real-time hyperlinked documentation with superior text formatting

\- Quasi-real-time machine translation into different natural languages

\- Every line of code can be clearly linked to a task, business requirement,
etc.

\- Documentation could automatically timestamp when each line of code was
written (metrics)

\- Dynamic inclusion of architecture diagrams, images to explain relations,
call-graph hierarchies, etc.

\- Single-source documentation (e.g., tag code snippets for user inclusion in
manual[s]).

~~~
shalmanese
Because we're stuck in a tyranny of flat text files as a representation of
code.

There have been countless proposals over the years for some kind of richer
file format for representing code and they have all been busts because so much
of our tooling, assumptions, interoperability and culture is centered on flat
text code that it's proven impossible thus far to switch.

~~~
krapp
But text files are easy to generate, easy to edit, and simple to read. What
alternatives are there which remain language and tool agnostic?

~~~
sp332
I can make a file format as tool-agnostic as you like, if it doesn't actually
have to have any features.

~~~
Crito
Being tool-agnostic _is_ a feature. The most important feature.

If there are other known features that are more important, they would have
taken off by now.

------
mbrock
Christopher Alexander, the architect who introduced the theory of "pattern
languages," wrote the introduction to Richard P. Gabriel's "Patterns of
Software." He says:

"In my life as an architect, I find that the single thing which inhibits young
professionals, new students most severely, is their acceptance of standards
that are too low. If I ask a student whether her design is as good as
Chartres, she often smiles tolerantly at me as if to say, 'Of course not, that
isn't what I am trying to do. I could never do that.'"

Then: "That standard must be our standard. If you are going to be a builder,
no other standard is worthwhile."

And so he asks the same thing about programming.

"But at once I run into a problem. For a programmer, what is a comparable
goal? What is the Chartres of programming? What task is at a high enough level
to inspire people writing programs, to reach for the stars? Can you write a
computer program on the same level as Fermat's last theorem? Can you write a
program which has the enabling power of Dr. Johnson's dictionary? Can you
write a program which has the productive power of Watt's steam engine? Can you
write a program which overcomes the gulf between the technical culture of our
civilization, and which inserts itself into our human life as deeply as
Eliot's poems of the wasteland or Virginia Woolf's The Waves?"

Maybe code is just _bad_ literature?

------
greenyoda
" _Once I’ve completely rewritten the thing I usually understand it pretty
well and can even go back to the original and understand it too. I have always
felt kind of bad about this approach to code reading but it's the only thing
that's ever worked for me._"

This strategy may work for small programs, but it doesn't scale to large
programs. For example, most people aren't going to have the time to refactor
Firefox or the Linux kernel to figure out how they work.

Also, it's hard to tell a lot about a large program just by reading a listing
of the source code. Certain things about the code become much more obvious if
you step through the running code with a debugger. To extend the author's
analogy of a program being a scientific specimen: the code is a _living_
specimen whose _behavior_ can be studied, not just a dead specimen that can be
stained and looked at under a microscope.

~~~
skybrian
That's a good point about running it in the debugger.

However, even with a large program, sometimes I find it helpful to write a
smaller program that does much the same thing as a small part of it. For
example, last year I wrote a debugger frontend in Dart, based on the Chrome
DevTools debugger. Whenever I wanted to implement something I'd first look at
how the Chrome debugger did it.

Currently I'm working on a reimplementation of the React framework, also in
Dart.

------
scott_s
Every time I have a serious question about how something works in the Linux
kernel, I use it as an excuse to do a dive into the code:
[http://lxr.free-electrons.com/](http://lxr.free-electrons.com/)

I still look through other sources, including man pages, books and a lot of
googling. But sometimes I just want to _see_ what it is I'm dealing with. I do
this with all code bases I deal with. I think it's a good practice to get
into.

~~~
dugmartin
I like to read cross-referenced code too. I built "SherlockCode" a while ago
as a generic tool to browse code but I haven't done anything with it in a
while. Here is a sample of a symbol in a file in jQuery:

[http://sherlockcode.com/demos/jquery/#!/src/attributes.js:pr...](http://sherlockcode.com/demos/jquery/#!/src/attributes.js:prop)

------
golergka
This reminds me of how I listen to music.

If I meet a track that I really like, I don't just listen to it. I put it on
the decks, try to mix it with something else and listen how it interacts with
it. I put it on the grid, sample loops, hits and small sounds. If you don't
understand what I'm talking about, here's a video of Four Tet doing something
similar to Jackson's Thriller:

[http://www.youtube.com/watch?v=TUDsVxBtVIg](http://www.youtube.com/watch?v=TUDsVxBtVIg)

Sometimes I analyze its structure, laying empty loops in mute tracks
alongside it. Sometimes I try to recreate synths that are used. Sometimes I go
to whosampled.com and try to recreate the sampling process.

I'm sure writers do the same with literature they read, too.

------
thebear
Perhaps the most important insight to be gained from this article is Abelson's
statement that "a lot of times you crud up a program to make it finally work
and do all of the things that you need it to do, so there’s a lot of
extraneous stuff around there that isn’t the core idea." There is an old blog
entry by Joel Spolsky that elaborates on this phenomenon:

[http://www.joelonsoftware.com/articles/fog0000000069.html](http://www.joelonsoftware.com/articles/fog0000000069.html)

------
scottcha
I agree with the author that code may not be literature. Taken from the
opposite line of reasoning there have been movements in the past to make
literature more like code. Specifically I'm thinking of Oulipo (which included
Calvino as probably the most famous) on bringing new structures to literature
including some generative ones which could be thought of as programming or
combinatorics.

[http://en.wikipedia.org/wiki/Oulipo](http://en.wikipedia.org/wiki/Oulipo)

~~~
thedudemabry
A great group blog that keeps tabs on interactive and generative literature is
[http://grandtextauto.org/](http://grandtextauto.org/). I highly recommend it.

I wonder how blurred the lines can truly become between code and literature,
though. If a piece of code is primarily intended to be read and discussed,
does that make it literature?

------
thebear
I would be interested to know what the OP thinks about stepping through code
as opposed to reading it. To me, reading code and stepping through it in a
debugger are two complementary ways of understanding it. I call those the
static and the dynamic way of viewing code.
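
A minimal sketch of the two views, in Python rather than any language named in this thread. The `classify` function and the helper names `static_view` and `dynamic_view` are made up for illustration. The static view answers "what are all the possible outcomes?" by inspecting the syntax tree without running anything; the dynamic view answers "what actually happens for this input?" by recording the lines executed during one call.

```python
import ast
import sys

# Hypothetical sample program to study; line 1 of SOURCE is blank.
SOURCE = '''
def classify(n):
    if n < 0:
        return "negative"
    if n == 0:
        return "zero"
    return "positive"
'''


def static_view(source):
    """Static view: every constant the code could return, in source order."""
    tree = ast.parse(source)
    found = [
        (node.lineno, node.value.value)
        for node in ast.walk(tree)
        if isinstance(node, ast.Return) and isinstance(node.value, ast.Constant)
    ]
    return [value for _, value in sorted(found)]


def dynamic_view(source, func_name, *args):
    """Dynamic view: the result and the source lines run for one call."""
    namespace = {}
    exec(compile(source, "<sample>", "exec"), namespace)
    executed = []

    def tracer(frame, event, arg):
        # Only follow frames that belong to the sample program.
        if frame.f_code.co_filename != "<sample>":
            return None
        if event == "line":
            executed.append(frame.f_lineno)
        return tracer

    sys.settrace(tracer)
    try:
        result = namespace[func_name](*args)
    finally:
        sys.settrace(None)
    return result, executed


if __name__ == "__main__":
    print(static_view(SOURCE))
    print(dynamic_view(SOURCE, "classify", 0))
```

The static pass lists all three branches at once, while the traced call shows only the path one input takes, which is roughly the trade-off between reading and stepping.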

~~~
rfnslyr
Interesting. Does something exist where I can visually trace through how my
javascript code is being executed? It would be wonderful to have a visual
representation.

~~~
lesserknowndan
Most browsers have a "Development Tools" option that provides a number of
tools including a step-through debugger similar to those in
XCode/Eclipse/Visual Studio.

I personally use the development tools in Safari most of the time.

~~~
rfnslyr
Really? Why Safari in particular?

~~~
lesserknowndan
I tend to use Safari (on Mac OS) as my go-to browser, so the simplest answer
is probably familiarity as corresponding functionality exists in Chrome and
Firefox which I've had no problems using when needed. That said, IE is not so
good.

I think Chrome and/or Firefox provide more support for live editing of the
current web page, however I've never made much use of that functionality.

------
auggierose
I try to read as little code as possible. If I have to read the code of
somebody else (other than code review) it is usually because the code contains
some flaw; only rarely because I genuinely don't know how the code does what
it claims to do.

Don't read code. Read papers. Build a model of your algorithms etc. in your
mind. Describe this model in a wiki. Translate the model into interfaces. Then
write the code that implements those interfaces.

------
taeric
Of course, the whole point of "literate programming" is to provide hints and
structure for the human reader. This is done not by creating some structure in
the code that makes sense to the compiler and a human, but by breaking up a
program into pieces that are put together later.
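
As a rough illustration of "pieces that are put together later", here is a toy tangler in Python. It is only a sketch: the `<<name>>` chunk notation is borrowed loosely from noweb-style tools, the chunk names and contents are invented, and real literate programming systems handle indentation, multiple files, and the documentation side, all of which this ignores.

```python
import re

# Hypothetical chunks, written in the order that suits a human reader.
CHUNKS = {
    "main": "<<read input>>\n<<compute total>>\nprint(total)",
    "read input": "numbers = [1, 2, 3]",
    "compute total": "total = sum(numbers)",
}

# A reference to another chunk looks like <<chunk name>>.
CHUNK_REF = re.compile(r"<<(.+?)>>")


def tangle(name, chunks, _stack=()):
    """Expand chunk references recursively into plain program text."""
    if name in _stack:
        raise ValueError(f"circular chunk reference: {name}")
    return CHUNK_REF.sub(
        lambda match: tangle(match.group(1), chunks, _stack + (name,)),
        chunks[name],
    )


if __name__ == "__main__":
    print(tangle("main", CHUNKS))
```

The chunks can be introduced in whatever order the explanation needs; only the tangling step imposes the order the language requires.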

I feel that this is really nothing that a good compiler couldn't do with a
higher level language today. However, in doing so I would wind up with a
heavily polluted namespace of helper methods and such that really don't help
me understand what I was trying to do.

So, in the vein of reading code. I've only read a few sections of "The
Stanford GraphBase," as I just got it a couple of weeks ago, but I can already
tell this would have been a much better introduction to a few graph algorithms
than I had in my undergrad.

Further, all of the "literate" programs I have written have been much easier
for me to jump back into. Precisely because I have much of my "decoding"
notes. So, code isn't literature, because we don't write it with a narrative
for humans in mind. But, there is no real reason we couldn't.

------
jackfoxy
I like the OP's summation that we should approach code reading as code
decoding. My interest in literate code and readable code has recently
accelerated in conjunction with my interest in code correctness. I think the
way forward in both these contexts is through functional programming.

In particular I, and the IT shop at Tachyus, have chosen F# as the way to go
forward for a number of reasons. Sticking to readability, F# (and other FPs to
a greater or lesser extent) allow production code that "reads" more
expressively in terms of conveying what the code is actually accomplishing to
the reader (and to the compiler) rather than the frequently tangled
instructions to the compiler on _how_ to accomplish the task coming from
traditional imperative and OO languages. F# also has some very useful tools to
emit a form of literate code that produces publication ready HTML or MD,
[http://tpetricek.github.io/FSharp.Formatting/](http://tpetricek.github.io/FSharp.Formatting/)
(This project will soon be accepted as a top-tier project by the F# Software
Foundation, [http://fsharp.org/](http://fsharp.org/)) It may not be _to the
letter_ of Knuth's idea of literate programming, but certainly in the spirit.

I did _read_ some code lately. Actually I had to go so far as stepping through
it in the debugger to properly _decode_ it,
[http://jackfoxy.com/transparent-heterogeneous-parallel-async...](http://jackfoxy.com/transparent-heterogeneous-parallel-async-with-fsharp/)
(the code snippets here have tool-tips in my article, just one of the features
available with FSharp.Formatting), but this is really the exception in F#. The
vast majority of code is easily accessible to any programmer of reasonable
quality (with proper introduction to FP) in any IT shop. The deeper functional
stuff like Continuation Passing Style and Applicative Functors (e.g.
heterogeneous parallel async) in most cases is already available in core
libraries. And when not, a literature search and/or getting in touch with the
FP community helps.

------
pjmorris
I think code is sort of a combination of literature, to-do and shopping lists,
and directions to somebody's house, written from your own perspective. There
are recurring themes and characters, but it can get lost in a sea of detail.

------
yareally
Code can be literature in specific cases, such as the Shakespeare programming
language. It's just a whimsical, esoteric language, like lolcode, but it reads
like the Immortal Bard himself was an early adopter of learning to program.

Example of a conditional statement:

    
    
      Juliet:
        Am I better than you?
    
      Hamlet:
        If so, let us proceed to scene III.
    

[http://en.wikipedia.org/wiki/Shakespeare_(programming_langua...](http://en.wikipedia.org/wiki/Shakespeare_\(programming_language\))

~~~
Solarsail
Shakespeare is only superficially readable. In reality, the logic executed by
the program is very nearly unrelated to the way the program reads to a human.
Essentially it's a cute syntax over the same old FORTRAN, much like lolcode.
It was a missed opportunity to represent logic as meaningful literature.

------
freshhawk
Seems like a worse metaphor to me: naturalists don't examine specimens in
order to learn how to make better animals, but that's precisely the reason
coders are expected to improve by reading code (although I always assumed that
"reading code" meant reading it over and over to get a detailed understanding
but apparently that was just me?).

What is the goal of reading literature we're talking about? We're mixing up
reading a book for pleasure and gaining a deep understanding of a piece of
literature to become a better writer.

Reading a piece of code or a book once is not going to do anything to your
skillset as a producer; at least books are specifically written to be read
once for pleasure. The equivalent for code would be using a piece of software,
not reading the code once.

If you want to be a better writer then you get a deep understanding of a piece
of literature, the same applies to code. I have recently read a lot of code,
because I was debugging/modifying a library I was using (the Requests lib in
Python). It's very nicely written and I did get some good ideas from it, but
it was work.

I don't think the metaphor is flawed at all. I think that this was a result of
coders thinking that people would get better at writing by reading literature
or that this was the point of literature seminars. I guess a lesson in
understanding other disciplines at least a little bit before trying to take
lessons from them?

------
jeffdavis
It's a cultural, psychological, linguistic mixup more than anything else.
People do read code all the time, they just hesitate to respond to a question
like "what code have you read recently". It's hard to answer that question in
English without implying that you have completely read a program (rare) that
was completely written (in other words, "finished", which is even more rare).

If you asked a different question, like "explain how you read code in the
course of a typical project or experiment" you will get a ton of examples.
They might describe how they look to understand the basic data structures, and
then imagine some sample data flowing through the algorithm to understand the
purpose, and then examine the details, edge cases, and interactions to see why
some non-obvious choices were made. Then they might describe how they use this
to find what parts of the code should be generalized, specialized, or extended
to fit new functionality.

It might be interesting to incorporate code reading into an interview to see
the strategies that people use. It would be quite difficult to make it a fair
question, though, because patterns vary widely and it often takes more than an
hour or so to adapt.

~~~
henrik_w
I think it is understood that we all read code during a typical project. I
think what the OP is referring to is code reading for the purpose of improving
your skill in general, to get exposed to code that you wouldn't normally see
by just working as usual.

------
0xdeadbeefbabe
Having just implemented a specification where the spec was less useful than
some source, I'd say that literature is not code for sure. And as someone who
has read literature, though I wasn't an English major like the author--you
know, it seems like they encourage English majors to treat writing as a
specimen--it seems true that code is not literature either; it doesn't even
compare for entertainment value, for example.

~~~
thedudemabry
That's an interesting point, that literature can also be approached as a
dissection. But I think that Mr. Seibel's observation still holds. Literature
is primarily intended to be read and discussed, while code is primarily
intended to perform a job. One lends itself to an experiential conversation
and the other to a very contextual explanation.

------
jdnier
Great quote from the article: "But then it hit me. Code is not literature and
we are not readers. Rather, interesting pieces of code are specimens and we
are naturalists. So instead of trying to pick out a piece of code and reading
it and then discussing it like a bunch of Comp Lit. grad students, I think a
better model is for one of us to play the role of a 19th century naturalist
returning from a trip to some exotic island to present to the local scientific
society a discussion of the crazy beetles they found: 'Look at the antenna on
this monster! They look incredibly ungainly but the male of the species can
use these to kill small frogs in whose carcass the females lay their eggs.'"

------
m0nastic
I imagine getting a bunch of people to sit around and project a cookbook
recipe up on a screen.

------
amasad
As someone who also tried to hold code reading groups I agree 100% with the
conclusion.

The first code reading session I held, I chose underscore.js and it was a
successful code reading session, because -- unlike most libraries and programs
-- a functional utility library was a nice linear read with mostly
self-contained functions. However, when we got to more complex programs and
libraries with more code to handle accidental complexity (e.g. handle browser
and DOM inconsistencies, or UNIX fragmentation etc.) it was considerably harder
to read and the presenter found themselves jumping between different code
paths and functions like they were debugging the program.

------
markm208
The main problem I see is that code is read left to right, top to bottom (for
the most part) but it is rarely, if ever, written that way. The order that
decisions are made is almost as important as the decisions themselves. But, we
lose almost all of that order or 'context'. Worse, although we can place
comments in the code, we cannot attach comments to the evolution of code.
Evolutional comments could describe why things are changing in the proper
context and make reading code a lot easier.

~~~
cousin_it
> we cannot attach comments to the evolution of code

I guess Google has spoiled me. When reading code, I constantly look at its
development history - commit messages, diffs and line-by-line "blame", linked
bugs and code review threads. If you have good tools for that, there's much
less need for inline comments.

------
snorkel
Code is not literature because literature only contains the highlights worth
knowing where code has to provide the comprehensive instructions for
everything to operate.

A good code reader should be like a tour guide, and a good tour guide doesn't
visit every single building and street in a neighborhood but rather describes
the historical context of the neighborhood and then visits a few interesting
places.

------
famousactress
_" The point of such a presentation is to take a piece of code that the
presenter has understood deeply and for them to help the audience understand
the core ideas..."_

I do get lots of value out of that. My favorite example is Beazley's GIL talk:
[http://www.youtube.com/watch?v=Obt-vMVdM8s](http://www.youtube.com/watch?v=Obt-vMVdM8s)

------
henrik_w
For me, the fastest way of understanding code is a mix (back and forth) of
_reading_ it and _running_ it. Some questions on how it works are more easily
answered by running it and seeing what happens, while other questions are
better answered by reading (e.g. what are all the possible cases here?).

------
vorg
At the very least, when we program we're writing a story to whoever might need
to understand it later on.

------
dspillett
Oh, but it could be:
[http://en.wikipedia.org/wiki/Shakespeare_(programming_langua...](http://en.wikipedia.org/wiki/Shakespeare_\(programming_language\))

------
NAFV_P
May I bring up that whether or not you read code, you ain't gonna read it
literally. Imagine following all jump statements without fail. That's the
machine's job, not yours.

------
ktf
Some literature really does need to be "decoded" in a similar way, however.
(Speaking as someone who recently finished _Ulysses_...)

------
n1ghtmare_
Emmm, so there are now "code reading groups"? I need to get with the times.
Is this a new thing?

------
fredgrott
Oh my effing hell

This is how I code and read code..damn and I thought I never would see the day
where someone finally got it..

------
dschiptsov
I am not an English major, but I am pretty sure that the idea of becoming a
writer by reading pieces of other people's texts is wrong. This is simply not
enough. There is a "second component" in good writing, and it is not just
about language usage.

One could read Salinger or Pamuk or Sartre or Hesse to realize that this
second component is much more important, while masters like Nabokov, whose
speciality is playing with words, might show you that wording is also
important.

The transition from reading to writing one's own texts, not imitating or
copy-pasting, is also not clear, and, of course, one never could become a good
writer only by excessive reading. Writing and speaking are different cognitive
tasks from reading or listening.

So what? Reading of good code is important, it teaches style, how to be brief,
concise, precise. But where to find the good code? Well, the recursive list
functions in Scheme are worth reading. Some parts of Haskell Prelude are worth
reading, some macros of Common Lisp, etc.

The code of "the top writers" is worth reading. Code from PAIP or On Lisp or
SICP is an obvious example, while some code, like that from Practical CL, which
is mostly a mechanical translation of OO stuff, only adds more confusion.

So, reading "good" code is still a must, the same way that reading Catcher
In The Rye or Zen And Art Of Motorcycle Maintenance or Atlas Shrugged is still
a must.

But programming is about writing, which means expressing one's own ideas and
realizations and understanding, so one must have these in the first place.

In this sense programming is like writing poetry - it must emerge and form
in one's mind before it could be written down. The best poetry is written
exactly like this - committed to the paper suddenly as it emerges, without any
later changes.

This reflects the process of "emergence" of ideas or proofs in the mind of
scientists who continue to pursue a problem for years - suddenly it is
here, as if it came from the subconscious. It seems that the best code, like
these classic Lisp procedures or parts of Prelude, has been written this way.

Of course, reading Java is as meaningless as reading graphomaniacs or some lame
and lengthy political pamphlet in a third-rate newspaper.

~~~
mbrock
"In this sense programming is like writing poetry - it must emerge and form
in one's mind before it could be written down. The best poetry is written
exactly like this - committed to the paper suddenly as it emerges, without any
later changes."

Are you sure that's true? Can you cite some examples?

Lisp is famous for its interactivity: the read-eval-print loop, SLIME, Lisp
Machines, Emacs, etc. Avid Lisp hackers even edit code inside of running
systems. The "bottom-up approach" to programming (as advocated by Paul Graham)
is almost the opposite of what you describe, isn't it?

Generally speaking, I think both programmers and poets work in a dynamic way
with their texts: moving stuff around, seeing what works, doing experiments,
asking others, etc.

That's one reason why Knuth's idea of literate programming seems so academic
and remote for most programmers: how are you going to keep all of that text
up-to-date when you start refactoring?

~~~
dschiptsov
I would say that there is no contradiction with the bottom-up approach, and it
was popularized before PG by the SICP lectures, with the image manipulation DSL
for making these beautiful recursive image patterns.

Each of your iterations in a bottom-up process could be based on a small
insight after thinking about a subproblem. Later one just re-uses one's own
realizations and adapts them to new requirements.

Also I think that it should be not just a linear bottom-up process, but a
recursive one, where you regularly "call yourself" with the old problem, but a
"new you, evolved with experience". Starting from the bottom, from basic
building blocks, is crucial. The only "addition" is that nothing will be set in
stone and you should come back to "simplify" and refactor even what is at the
very bottom.

I also never advocated Knuth's idea or that whole programs should be printed
as books (while some procedures such as map or append are worth printing
and framing).

As for poetry, well, I think almost every youth wrote some in his late teens
or early twenties, and yes, I told it wrong: not a whole poem emerges in one's
mind, but a few central passages, the main scheme, to which some ornaments
could be added later.

