
You Don't Read Code, You Explore It - doty
http://prog21.dadgum.com/194.html
======
ChuckMcM
It is an interesting observation. I expect it is the difference in expertise
though. The author compares reading code snippets in Dr. Dobbs (which dates
them probably 10 or 15 years ago) to reading stories.

Here is the thing, everything you read in a story is supposed to convey
imagery of things like what you may have already experienced, they are already
internalized, you "see" them when you read as if you were there.

Allow me to use another popular space as an example, music. When you are first
reading sheet music, you see notes on a stave, key signatures, different
shapes representing different durations. At first you mechanically take that
understanding and laboriously turn it into actions on your instrument. But
after a while, if you do it enough, the shapes become recognizable as rhythms,
the tones in the staves become tones not symbols, and then you stop "reading"
music, you look at it and you can hear what it will sound like. And by that
time you can make your instrument do whatever you hear.

Coding is not entirely different, at some point you don't see syntax, you see
algorithm, you see inter-relationships of data structures, you see flow. After
a number of years of coding I got to the point where I could see what code was
doing pretty easily (except for obfuscated code which is always jarring on
first look). I stop seeing code loops and start seeing iterative processing,
if statements are branches on a path.

Anything in words or symbols is code for something else. Whether it's a murder
mystery, a symphony, or a sorting algorithm, the words and symbols are there to
express the idea so that inside your head you can understand it. I think it is
all reading, though :-)

~~~
coolsunglasses
Have you learned Haskell yet? I think you'd find it's a very rich way to see
and think about code if you did.

Starter guide here for any interested:
[https://gist.github.com/bitemyapp/8739525](https://gist.github.com/bitemyapp/8739525)

~~~
ChuckMcM
I've got the book 'Learn you a Haskell for great good' and I've gone through
it. But to be honest I have always resonated more with structural
languages than functional ones.

~~~
coolsunglasses
LYAH is slow and doesn't communicate the compelling parts of Haskell at all.
There is no functional/structural dichotomy. Structural isn't a category of
programming languages. There are no programs without structure. You may be
mistaking syntax ( {} blocks vs. s-expr ) for having something to do with
semantics.

It's also possible somebody has told you Haskell is a declarative language a
la Prolog. It is not.

Try this:
[https://gist.github.com/bitemyapp/8739525](https://gist.github.com/bitemyapp/8739525)

Specifically:
[http://www.seas.upenn.edu/~cis194/lectures.html](http://www.seas.upenn.edu/~cis194/lectures.html)

------
curtis
I've pretty much come to the conclusion that you don't understand code by
reading it. Understanding someone else's code is almost always a reverse
engineering exercise. It's often necessary to actually run the code repeatedly
to understand it. This should really come as no surprise. It's likely that the
guy that wrote the code didn't write it all at once, but rather wrote it in
increments testing along the way. If he couldn't write it without testing
every little bit, it's unlikely that you can read it without doing the same.

~~~
mden
Good observation and I completely agree. On a slight tangent I've been
spending time thinking if I had the ability to change the high school
curriculum, how would I do it? So far the thought that has had the most appeal
to me is to introduce "reverse engineering" as a core subject. I don't mean
reverse engineering in the software or hardware sense, I mean it in the sense
of a collection of logical techniques for figuring out the logic behind
phenomena or problems. In some sense "forward engineering" requires
reverse engineering from the conceptual finish of a product to its current
nonexistent state. Science is the reverse engineering of the laws of nature.
But rather than have schools teach any of those skills, they teach
encyclopedic knowledge. Knowledge is useful, but without the skills to analyse
and use it, it's just wasted effort and brain space.

------
spankalee
And this is one of the main reasons I massively prefer type annotations,
statically analyzable languages and IDEs over dynamically typed languages,
very loose languages and plain text editors.

Not being able to see what types code deals in, jump to definition, and find
usages makes me feel crippled when exploring a new codebase.

One of my wishes for programmers everywhere is that tools like Github and
BitBucket start analyzing projects and letting you navigate better. I think
that could save thousands of engineer hours.

This topic alone is a huge reason I started using Dart and eventually joined
the team. Trying to figure out how a very large JavaScript codebase works is
so incredibly painful, doing so in Dart is incredibly easy. This is also where
very reflective libraries like Guice go wrong, and why, in my opinion,
metaprogramming should be used very carefully and sparingly.

~~~
omarhegazy
Code navigation is hard in plain text editors, but definitely doable. I think
the cost of setting up code navigation and jump-to-definition in Emacs is
lower than the cost of dealing with how slow and bloated IDEs are. But, hey,
to each his own.

Although I think Light Table is trying to be like what you're saying,
incredibly liberating in its ability for code navigation. But it's still
young.

As for type annotation and static analysis ... you don't need to have static
analysis for what you're talking about. Yes, static analysis works best for
static languages, but you can have code auditing in dynamic languages. An
auditor can dig through and profile your stuff while tests are running and
learn about interactions between various entities in your code. And if you're
doing anything in a dynamic language properly you're doing a lot of tests.
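
To make the runtime-auditing idea concrete, here is a minimal sketch in Python, assuming the standard `sys.settrace` hook is an acceptable stand-in for a real auditor. The names `trace_calls`, `parse`, `to_int`, `total` and `test_total` are hypothetical and only there for illustration; it records which function called which while a test runs.

```python
import sys
from collections import Counter


def trace_calls(func, *args, **kwargs):
    """Run func and count the caller -> callee edges observed at runtime."""
    edges = Counter()

    def tracer(frame, event, arg):
        # The global trace function receives a "call" event for each new frame.
        if event == "call":
            caller = frame.f_back.f_code.co_name if frame.f_back else "<top>"
            edges[(caller, frame.f_code.co_name)] += 1
        return None  # no per-line tracing needed for a call graph

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args, **kwargs)
    finally:
        sys.settrace(previous)  # always restore whatever tracer was active
    return edges


# Hypothetical code under audit, plus the test that exercises it.
def parse(text):
    return text.split(",")


def to_int(item):
    return int(item)


def total(text):
    result = 0
    for item in parse(text):
        result += to_int(item)
    return result


def test_total():
    assert total("1,2,3") == 6


if __name__ == "__main__":
    for (caller, callee), count in sorted(trace_calls(test_total).items()):
        print(f"{caller} -> {callee}: {count}")
```

The catch is visible in the sketch itself: it only learns about code paths the tests actually execute, so the picture is only as complete as the test suite.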

~~~
spankalee
Having to analyze a running program increases the work and complexity for
indexing code by an _incredible_ amount. When you can simply design a language in
such a way as to be analyzable in the first place, I don't see why you'd want
to require so much more work out of your tools, and especially your tool
writers.

Take my wish for Github to be more navigable. In a statically analyzable
language they can run the analyzer and update their index on every commit. If
they have to run tests then they have to basically add an entire continuous
integration service which is a _lot_ more work, resources and security risk.
Why make it harder than necessary?
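
As a rough illustration of indexing without running anything, here is a sketch that uses Python's standard `ast` module to record where names are defined and where they are called. `index_source` and `SAMPLE` are hypothetical names, and the matching is purely by name, with no type resolution, so treat it as a toy rather than a real analyzer.

```python
import ast
from collections import defaultdict


def index_source(source, filename="<memory>"):
    """Return (definitions, usages) for one source file, without executing it."""
    tree = ast.parse(source, filename=filename)
    definitions = {}
    usages = defaultdict(list)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            definitions[node.name] = node.lineno
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            # Only plain-name calls are indexed; method calls such as
            # text.split(...) need type information this sketch does not have.
            usages[node.func.id].append(node.lineno)
    return definitions, dict(usages)


SAMPLE = '''\
def parse(text):
    return text.split(",")

def total(text):
    return sum(int(x) for x in parse(text))
'''

if __name__ == "__main__":
    definitions, usages = index_source(SAMPLE)
    print("defined:", definitions)
    print("called: ", usages)
```

Something like this could run on every commit with nothing but a parser. The skipped `text.split` call is the kind of gap that a statically typed language lets the indexer close.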

~~~
omarhegazy
You're right. I guess I'm just a huge sucker for the whole dynamic languages
with a REPL thing. I mean, Ruby is just so goddamn neat. The stuff that takes
me 2 lines in Ruby can take me 10-15 in Java or C, and the difference only
gets bigger as the program gets larger.

Maybe I should try out static languages "done right" (type inference, etc.)
before I give up on them.

------
coolsunglasses
One of my ideas I've been kicking around for the last year or two has been to
write a book on learning to read code properly. I would submit that one of the
greatest weaknesses of your typical programmer is that they don't know how to
read other people's code.

On another note, being able to read other people's code is one of the
strengths of Haskell. The Functor/Monad/Monoid/etc stuff becomes a way to
know, based on a common vernacular, exactly what kind of interface is being
exposed and what sort of data structures you're working with.
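
For anyone who hasn't seen the idea, here is a loose Python analogue of a Monoid-style interface. It is not how Haskell's type classes work, and `Sum`, `Joined`, `empty`, `combine` and `mconcat` are names chosen for this sketch. The point is only that once a type is known to follow the shared interface, generic code over it can be read without opening the type.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Sum:
    """Numbers under addition; the identity element is 0."""
    value: int

    @staticmethod
    def empty():
        return Sum(0)

    def combine(self, other):
        return Sum(self.value + other.value)


@dataclass(frozen=True)
class Joined:
    """Strings under concatenation; the identity element is ''."""
    value: str

    @staticmethod
    def empty():
        return Joined("")

    def combine(self, other):
        return Joined(self.value + other.value)


def mconcat(kind, items):
    """Fold any monoid-like values, starting from the type's identity."""
    return reduce(lambda acc, item: acc.combine(item), items, kind.empty())


if __name__ == "__main__":
    print(mconcat(Sum, [Sum(1), Sum(2), Sum(3)]))
    print(mconcat(Joined, [Joined("ab"), Joined("cd")]))
```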

~~~
jfsurban
I know one book about code reading: "Code Reading: The Open Source
Perspective". I have just read the first chapters but it's mostly C/C++ I
think.

------
userbinator
I've read a lot of code (both source and otherwise) and found that in many
cases one of the biggest barriers to understanding the system as a whole is
actually abstraction/indirection -- or more precisely, the often-excessive use
of such. Following execution across multi-level-deep call chains that span
many different files in different subdirectories feels almost like
obfuscation, and all the pieces that are required to understand what happens
in a particular case (important when e.g. looking for a bug) are scattered
thinly over the whole system such that it takes significant effort to collect
them all together.

In my experience, the majority of existing codebases I've worked with tend to
be this way, although there are exceptions where everything is so simple and
straightforwardly written that reading them is almost an enlightening
experience.

------
Chris_Newton
Program comprehension seems like a fascinating research field if you enjoy
figuring out why we do the things we do as software developers. It seems that
we do indeed explore code, but we do so in somewhat systematic and predictable
ways, which in turn might give us ideas about how to _write_ our code to be
more readable.

If anyone is interested, I suggest the publications of Anneliese von
Mayrhauser and A. Marie Vans from the mid-90s as a possible starting point.
They did a lot of work to reconcile earlier theories and paint a more unified
big picture. Václav Rajlich is another name to search for, with several
interesting publications in the early 2000s.

------
kaeluka
This is related to something that I pondered yesterday. In mainstream
languages, imperative programs require you to read the whole thing, while
functional programs allow you to look at a function and derive its meaning
only from its implementation: there can't be another influence on the outcome,
as there are no side effects and types are -- in ML/Haskell family members --
precise. So referential transparency, in a sense, helps you to read code
depth-first, only looking at the relevant bits now.

As soon as you branch off from basic type systems and add in, for instance,
subtyping or type classes or existential quantification, this ability is
getting weakened. Now, in order to understand what a piece of code does, you
need to understand some context: 'which implementation is it?', 'what do the
possible implementations have in common?'. The effects of these small
complexities add up until there's too much information to be kept in
biological memory at the same time, and the oldest thunk of information gets
purged.

I do think that typed and pure languages have big advantages here: the
information that I need is available immediately from looking at the type of
the reference to it. If the types aren't funky, I can assume that it will
terminate, throw no exception, not suffer from data races -- I can exclusively
think about _what_ it does, not how (btw, I'm not having one specific
reference language in mind right now, I'm just thinking about what would be
possible).

------
necubi
Here's a challenge for somebody: create a code review/pull request tool that
helps you really _understand_ the code changes. Some IDEs do an ok job of this
for static code (IntelliJ for Java, Emacs haskell-mode), but I've never seen a
good tool for giving insight into how a diff is changing a program at the
structural level.

~~~
beliu
Might I suggest looking at [https://sourcegraph.com](https://sourcegraph.com)?
Currently it's just code search, and doesn't do code review. But it does
_understand_ the code at a function/class/module level. And that makes it
fundamentally different from a lot of other code reading tools out there that
just treat code as text. For code search, that means it can do things like
show you how other people actually call a given function. For code review, it
would make it possible to reason about changes to functions/classes/packages
instead of lines of text. (Disclaimer: I'm one of the creators of Sourcegraph;
would love to hear your feedback)

~~~
bosie
"For code search, that means it can do things like show you how other people
actually call a given function."

i just tried it and that is a pretty big claim.

e.g.
[https://sourcegraph.com/code.google.com/p/go/symbols/go/code...](https://sourcegraph.com/code.google.com/p/go/symbols/go/code.google.com/p/go/src/pkg/net/http/cgi/Handler:type/.examples?page=2)

Am i wrong or isn't this pretty much what Intellij calls "Usage"? At least I
don't see the difference, could you please explain it?

general feedback: the website is too slow. it takes me a couple of seconds to
load each page, rendering the page pretty much useless.

~~~
beliu
Thanks for the feedback; we're working on site speed as a top priority. The
examples are similar to the usages feature from your IDE, but 1) the examples
are drawn from all the open-source code we've indexed online and 2) we support
non-statically-typed languages like Python, JS, and Ruby.

------
thom
Part of my difficulty reading large code bases is there's often no
particularly good entry point to start from. Towards the end of my time in OO
languages I was struck by Reenskaug and Coplien's DCI [1] - many people just
don't model a system's use cases as first class concepts. The most important
part of a system - what it actually does! - doesn't get explicitly mentioned
in a lot of people's code.

[1]
[http://en.wikipedia.org/wiki/Data,_context_and_interaction](http://en.wikipedia.org/wiki/Data,_context_and_interaction)

------
softbuilder
I rewrite code to explore and understand it. This actually started as a bad
habit - "Oh, I can't believe this person did X, I'm going to change that to
Y". Do that enough and you quickly figure out exactly _why_ they did X, and
they were either smarter than you or much more familiar with the problem
domain. But now you know a little more, you have a little more respect for the
code, a little more humility, etc..

Nowadays I take it as a given that I'm probably wrong, but I start rewriting
anyway. Worst case (and most common case) I have to toss the code. But I
learn. Plus there's a different place your brain goes when you feel like you
control the code vs. looking at it behind glass.

~~~
plorkyeran
I see rewriting code to understand it as basically the ultimate form of taking
notes while reading something, since you can run your "notes" and verify that
they're actually correct, and you can't get away with just glossing over
things you don't get.

------
anigbrowl
I'm baffled by the absence of tools for diagram generation from code and
things like contextual highlighting in, well, every code editor. Even code
folding seems pretty primitive. Bret Victor & the LightTable team are
proposing some significant innovations, but most IDEs make me feel like I'm
trying to explore a room through a keyhole.

------
beliu
I find it surprising that in 2014, most of the tools we use to read code treat
it mostly like any other text. With the exception of some IDEs for some
languages (e.g., Eclipse and Java), very few apps for browsing code actually
understand its structure, i.e., the hierarchy of symbols and namespaces and
the implicit graph defined by module imports, function calls, type references,
etc. We're relying more and more on external, often open-source libraries, and
are therefore spending more and more of our time reading through other
people's code. Yet the tools don't seem to have caught up.

------
collint
Code is easy to read if you understand the domain.

Code is nearly impossible to read if you don't understand the domain.

~~~
hackinthebochs
Yup. This is really all that needs to be said. We need tools that can help
bridge this gap. Good code visualization could help here. I'm surprised
there's such little activity in this area.

------
javert
My technique for reading code is to find out where execution starts and go
from there, following what the code does at runtime.

If you do it any other way, it won't necessarily make sense. This is really
the only way to do it. (Though I'd be interested in hearing other
perspectives.)

This was a hard-won lesson for me because we programmers tend to make the
control flow of our programs start at the _bottom_ of source files.
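
A typical layout, sketched in Python with hypothetical names: the helpers come first and the place where execution actually starts is the last thing in the file, so reading top to bottom means meeting `load` and `summarize` before knowing why they exist.

```python
def load(args):
    """Convert raw string arguments to integers."""
    return [int(a) for a in args]


def summarize(numbers):
    """Reduce the numbers to a small report."""
    return {"count": len(numbers), "sum": sum(numbers)}


def main(argv):
    # Execution starts here: follow the calls outward from this function.
    numbers = load(argv)
    report = summarize(numbers)
    print(report)
    return 0


if __name__ == "__main__":
    # Arguments are hard-coded for the sketch; a real script would
    # usually pass sys.argv[1:] instead.
    main(["1", "2", "3"])
```

Starting at the `__main__` guard and following `main` into `load` and then `summarize` gives the runtime order. The file order is the reverse.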

------
rmthompson
There are many things that you read that are neither enjoyable, nor easy to
understand. Especially at a cursory read. That doesn't make the word less
appropriate, nor does it make the word explore any more appropriate. I don't
explore a quantum physics textbook, nor do I explore a journal article on
tubulins. I read, I jot down notes, and I read some more.

------
georgeoliver
Don't professional readers (i.e. writers, lit critics, or just close readers)
also read literature differently than casual readers?

~~~
vanderZwan
I'm not in any of those professions, so I can't say for sure, but at the very
least I assume they understand the medium itself at such a deep level they
automatically break down any story they read to different stages. For example:
the deeper, structural level, the broader social context the work was created
in, the literary context (other works explicitly and implicitly referenced),
and so forth.

I know something like that happens with me now when I look at art after going
through art school - Michael Parsons's "Talk about a painting: A cognitive
developmental analysis" is an excellent paper on the topic. AFAIK it's only
available behind a paywall though:

[http://www.jstor.org/discover/10.2307/3332812?uid=3738736&ui...](http://www.jstor.org/discover/10.2307/3332812?uid=3738736&uid=2&uid=4&sid=21103680072071)

------
sparkie
I like Tim Daly's viewpoint that we shouldn't just be writing _code_, but we
should write _manuals_ - giving high level overviews of our problems and
specifying their implementation details inside the manuals. His talk "Literate
Programming in the Large"[1] covers why.

The example he gives in his talk is the Axiom algebra system[2], which was
revised to use literate programming style - the source code, with usage
examples, is contained entirely within the books.

[1]:[https://www.youtube.com/watch?v=Av0PQDVTP4A](https://www.youtube.com/watch?v=Av0PQDVTP4A),
[slides]: [http://daly.axiom-developer.org/TimothyDaly_files/publicatio...](http://daly.axiom-developer.org/TimothyDaly_files/publications/DocConf/LiterateSoftwareTalk.pdf)

[2]:[https://en.wikipedia.org/wiki/Axiom_%28computer_algebra_syst...](https://en.wikipedia.org/wiki/Axiom_%28computer_algebra_system%29)

------
dugmartin
I wrote this a couple of years ago to help read and explore code:

[http://sherlockcode.com/demos/jquery/](http://sherlockcode.com/demos/jquery/)

It's recently seen a spike of interest and I've started working on it again.
The beta sign up link is still active if you are interested in getting
updates.

------
girvo
Yes, yes, yes! I agree wholeheartedly. Exploratory programming is my new
favourite weapon for learning about new codebases, new languages, new
everything.

I just started a new job at a really interesting agency. I got put on to a 12
month old project, a huge web application, that started life overseas, moved
back here to Australia, and according to git-blame has then moved through the
hands of nearly 15 developers, a solid 70% of whom don't work here anymore (most were
contractors).

So, the codebase is a mess. But, with Xdebug and a neat client for it that
gives an interactive console when you hit a breakpoint, two weeks later I'm
already understanding the twists and turns far better than I ever hoped for!

------
shurcooL
I agree, I think we need better tools for exploring code. [1]

I'm currently envisioning (and trying to build) something where the types/func
definitions are hyperlinks, and they jump to definition in an overlaying
window similar to when you navigate in Spotify (the web based player). So you
can quickly explore something without losing context.

[1]
[https://twitter.com/shurcooL/status/156526541214457856](https://twitter.com/shurcooL/status/156526541214457856)

~~~
ludwigvan
RapGenius for reading code?

~~~
shurcooL
I've heard a lot about it, but I never knew what it was. I just looked it up.
Pretty neat. Yeah, definitely a similar idea to some degree.

------
naland
For someone who has been in the field more than thirty years, this author
speaks too lightly about some important figures. I stop 'reading' or
'exploring' and start writing: all the CS world is a stage, and MS and Int
merely players. However, I downgrade the article.

------
halayli
I think there should be a university course that teaches how to read code; the
good, the bad and the ugly.

------
known
You Don't Read Code, You Debug It.

------
ttflee
The way to explore a body of code really depends on how much time and
effort one can pour into it:

- To grep it

- To debug it

- To read over it

- To rewrite it

------
psychometry
We know. It's just an expression.

------
DonHopkins
The road to fail is paved in goto intentions.

