
Nobody's just reading your code - chauhankiran
http://akkartik.name/post/comprehension
======
bsimpson
One of the things that sets a good programmer apart is the willingness to
fearlessly dig into someone else's code. I've heard it said before that you're
responsible for every line of code you ship to your users. It follows that you
shouldn't treat your dependencies as black boxes. Getting a stack trace that's
3 or 4 levels deep in Django/React/<insert library here>? Dig under the covers
and understand what's happening. Include that information in a bug report, and
better yet, open a PR offering to fix it.

You'll learn a whole lot and be more effective if you let your curiosity
expand beyond code you/your team has written.

~~~
curtis
> ... willingness to fearlessly dig into someone else's code.

I've done this many times. The problem is that it takes a lot of time, and
there's no way you can dig through more than a fraction of a large codebase.
You gotta pick your battles.

~~~
nathanaldensr
Pragmatism wins the day, for sure. I'll happily read someone else's code
(e.g., an open-source GitHub repository I rely on) if it's necessary to
troubleshoot issues or implement a feature. There needs to be a _reason_
(i.e., motivation) to read code.

~~~
deckard1
On GitHub and in the node/JS ecosystem, this is how it will play out:

      1. Spend hours digging through layers and layers of crufty JS.
      2. Find the issue (and the package responsible for the issue).
      3. Find out there is a GitHub issue for the issue.
      4. Find out there is a pull request for the issue.
      5. Find out the pull request is sitting in position #35 of #128 pull requests going back to 2015.
      6. Go binge drinking.

~~~
rudasn
7\. (next morning) Fork the project, apply your PR and any others you like,
and use your fork in your codebase.

8\. (optional) Thank the creators and maintainers for the time and effort they
put into it.

~~~
yoz-y
This is not always an optimal solution, because by doing this you have
essentially become a maintainer of a dependency you initially wanted to 'just'
use.

~~~
RobertRoberts
This is sadly why I keep my dependencies to a minimum.

After coding for almost 20 years now, full time, I have never regretted not
using a dependency. But I have regretted using one multiple times.

(the regret is like a walk of shame during the "separation and cleanup" phase
at the end, which makes me question the "got work done faster" phase at the
beginning....)

~~~
bryanrasmussen
In 20 years you've never started work on something and after a bit said damn
it I think I'll use that library that other guy developed?

If you've ever switched out dependency X for dependency Y I suppose you
regretted not using Y to begin with.

If you've ever stopped working on your own solution to a problem and instead
used a solution provided in a library didn't you regret not just using the
library from the beginning?

~~~
RobertRoberts
> _In 20 years you've never started work on something and after a bit said
> damn it I think I'll use that library that other guy developed?_

nope, I am just saying what I said, I never regretted not using one, and have
regretted investing in a few that later caused more headaches than they were
worth.

When we switched from inhouse code to a dependency, we never regretted doing
it on our own to start with because there was so much insight gained from this
and many other side benefits (like direct control, intuitive understanding of
how our code worked, etc...).

But when you add a dependency that you later have to work around, you
simply can't fix it the same way you can your own code, and you just hate the
mess in a totally different way.

When it's your own code, you can fix anything. And replace it a piece at a
time if need be. With a dependency, there are usually catches, hacks or
workarounds, or conflicts built up over years that finally have come to a head.
And it sucks to fix and in many cases if I had waited even a little while, or
did more research at the time, I would have picked a different dependency or
none at all.

There are a lot of dependencies we use, I don't think you can run a business
properly these days without them. They just are not a part of our core systems
anymore. We use them for tertiary systems and addons, things that are
replaceable. Then our core systems can't be hijacked by stuff like the latest
NPM debacle.

~~~
Ace17
Yeah, this also matches my (10+ year) experience as a professional dev ...
with one exception, though: the "maintenance-can-of-worms" packages. These are
packages which, by design, can never be considered as finished, and
periodically need to be updated to stay relevant in _your_ application.

There are three reasons for this:

1) Those packages implement an ever-growing pile of tricks/heuristics to
convincingly solve problems from the domain. They include: physics engines,
SMT solvers, compilers/jitters/optimizers, computational geometry packages,
video encoders ...

2) Those packages implement a unification layer over an ever-growing/evolving
set of underlying APIs/protocols/formats. They include: SDL, ncurses, curl,
ImageMagick ...

(these are not to be confused with "bug-factories" packages, which might not
solve a complex problem, but still require constant updating to fix the
current bugs, and benefit from the new ones).

~~~
wolfgke
> There are three reasons for this:

> 1) [...]

> 2) [...]

Where is the third reason?

~~~
LasersGoPew
Classic off-by-one error.

------
alister
Reading code just for fun was a thing in the early and mid years of UNIX (e.g.,
7th edition or System V). People eagerly passed around faint 10th generation
photocopies of Lions' printout of, and commentary[1] on, the UNIX source code.
The annual USENIX conference had a popular short course in which they went
through the entire UNIX kernel line by line. Why was that a thing back then
and not now? Certainly UNIX was an amazing piece of work, but there are
amazing works today -- the modern browser for example.

I think that the difference is that the totality of the UNIX kernel was
comprehensible for a single mind. The UNIX kernel in 1983 was less than 20,000
lines of code, almost all in C, and more than 75% was not machine-
dependent.[2] The amazing software of today, something worthy of reading for
fun, is literally millions of lines of code, written in multiple languages you
don't know, and evolving so fast that whatever you read might be completely
different a few months later. There's no joy in knowing a tiny part of that.
It would be like reading 10 pages from the middle of a novel, a novel so big
that you know you'll never finish it.

[1]
[https://en.wikipedia.org/wiki/Lions%27_Commentary_on_UNIX_6t...](https://en.wikipedia.org/wiki/Lions%27_Commentary_on_UNIX_6th_Edition,_with_Source_Code)

[2]
[https://en.wikipedia.org/wiki/History_of_Unix#1980s](https://en.wikipedia.org/wiki/History_of_Unix#1980s)

~~~
Klathmon
This is still done today, just with other software.

I've read blog posts that go through the entire redux javascript library line
by line, or review an entire chunk of React.

Even just yesterday here on HN was a submission that just contained shaders
from Wolfenstein.

~~~
Cthulhu_
There's a category of articles out there nowadays that do "$library in $num
lines of code", reducing popular libraries or frameworks down to their core -
filtering out all the edge cases, so to speak. I've seen them for Angular,
React, Redux, etc. I think that a lot of frameworks and libraries would do
well to have a "core" codebase somewhere, something pure but unoptimized and
without edge cases that is easily digestible and explains the core idea of
it.

~~~
passivepinetree
Do you mind linking me to the React one? I was unable to find it, but maybe my
Google-fu is just not good today.

------
jesperlang
Reading code always felt awkward, slow and ineffective. The cognitive load is
huge, you have to compute some things while memorizing others and at the same
time keep track of flows. That's why I always have pen and paper while diving
into a new big code base, you simply can't keep it in your head. But something
we forget often is... the goal of reading code is almost always to
_understand_ a system or some part of it. And reading code for me is a
terrible way to do it.

Code is just one level of abstraction and one way of seeing things. We should
have multiple ways of "reading" a system depending on what level we want to
understand it. No, I don't think that rushed, out of date "system
documentation" you wrote will do. I still think this is not solved in a
satisfying way. We could find inspiration from things that work well, just
imagine how Google Maps lets you look at the world at many different levels,
from continents down to street level. Also it provides different views of each
level, street maps are useful for tourists, terrain maps for hikers and
satellite images for someone who wants to study vegetation. How you will "read"
a system depends on what you are trying to understand.

~~~
stenecdote
As the author of the post, just wanted to reply and say I'm totally open to
the notion that the solutions I proposed will not solve the problem. I'm much
more sure about the problem than I am about the solution.

A google maps zoom system for code sounds awesome, although I have no idea how
that might work.

~~~
lugg
Brackets inline editor comes to mind. Just replace CSS definitions with
function definitions (I think it might already do this for ?)

The zooming functionality you're thinking about probably wouldn't be very
practical. (Zooming in down to opcode or out to high level framework calls.)

But simply being able to inspect definition inline would do.

------
mjevans
One of the reasons I love golang* is how the entire source code for the
standard library is just a few easy clicks away from the documentation on how
to use it.

As an example: When I want to do something and the interfaces I'm seeing
provide friction (XML parsing not /quite/ handling normal, slightly incorrect,
HTML) I can get a better idea of what the library is doing behind the scenes.
I got an example of how to use other exposed interfaces of the library as a
tool kit for my own iteration (which added some state tracking and cleanup of
that mess) instead of relying on it to decode everything and dump it back for
me.

Other parts of the library are similar, maybe it does 90% of what your use
case needs, and you can extend those interfaces with your own library to add
the corner cases that are required in your specific use case.

*(even if it can be tedious at times because it's sort of a mid-level language; it abstracts some REALLY bread and butter things that probably can't be lived without, but it doesn't shield you from true complexity pitfalls)

~~~
goldfire
Rust's library documentation has this feature as well; each item has a [src]
link that takes you straight to the code being documented; see for instance
[0], and you'll find the link to the right side of the heading. This even
works for crates not in the standard library because it's actually a feature
provided by rustdoc, the standard documentation generator; for example, you
can get to the source for functions in serde (the most popular Rust
serialization library) directly from its documentation [1].

[0]
[https://doc.rust-lang.org/std/vec/struct.Vec.html#method.tru...](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.truncate)

[1]
[https://docs.serde.rs/serde_json/ser/fn.to_string_pretty.htm...](https://docs.serde.rs/serde_json/ser/fn.to_string_pretty.html)
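For comparison, some languages let you pull up library source without leaving the interpreter. A minimal sketch in Python, assuming a standard CPython install where the standard library ships as `.py` files; `json.dumps` is just an arbitrary example target:

```python
import inspect
import json

# Fetch the source text of a standard-library function at runtime.
source = inspect.getsource(json.dumps)

# Show where the function lives on disk, then its first few lines.
print(inspect.getsourcefile(json.dumps))
print("\n".join(source.splitlines()[:5]))
```

This only works for code implemented in Python; functions implemented in C have no source to return this way.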

------
antirez
This is a bit Western-centric maybe, because I see Chinese developers reading
tons of code, to the point that I receive an incredible amount of Redis PRs
about conceptual bugs that can never happen in practice, since some Chinese
developer is reading the code and doing the math in her/his head.

~~~
heavenlyblue
Any chance you could share a link to one of these, if possible? A matter of
curiosity.

~~~
__s
First PR I saw on redis GH:
[https://github.com/antirez/redis/pull/4714](https://github.com/antirez/redis/pull/4714)

That same user also made
[https://github.com/antirez/redis/pull/4685](https://github.com/antirez/redis/pull/4685)

This may be?
[https://github.com/antirez/redis/pull/4622](https://github.com/antirez/redis/pull/4622)

This was apparently critical, but also kind of an edgecase:
[https://github.com/antirez/redis/pull/4568](https://github.com/antirez/redis/pull/4568)

& one more from someone else:
[https://github.com/antirez/redis/pull/4568](https://github.com/antirez/redis/pull/4568)

------
quickthrower2
I love reading code written in Elm.

I always dive in and have a look. The reason I love it is that Elm is quite
restrictive, there is a way of doing things and you can't deviate too much.
Therefore reading someone else's code is usually pretty easy.

In addition in Elm you can clone and run an Elm package and see the examples,
do some "time travelling debugging" all within 30 seconds :-), again because
there is one way of building and debugging things. No "Oh no browserify! I
normally use webpack" or "WTF all that global npm shit I need to install"
moment.

Just once per computer:

       npm install -g elm


Then do this

       git clone ...
       elm-reactor


And open your browser to localhost:8000

I'd probably say Elm is more Zen than Python!

------
kazinator
We read code for all kinds of purposes other than to make an intended change.
Like

How the heck does this behavior (good or bad, expected or not) come about?

Is this correct behavior assured in every case?

What are all the possible values of these poorly documented configuration
parameters or other inputs?

Will this apparently working operation break under concurrency?

Why is this so slow for these inputs, or under these conditions? / Why is this
so unexpectedly fast?

Is this doing nasty covert things I don't want, in addition to the overt
functionality that I want?

Is this using a particular library or OS feature to do something obvious, or
did they roll their own?

How will this scale in time, memory use or whatever for certain large inputs
that are too impractical to actually supply just to answer this question?

...

~~~
stenecdote
Hi kazinator, thanks for your comment (I'm the post's author). When I
originally wrote the post, I actually had a whole section about "reading for
questions", which discussed how the best way to read without modifying a
program is to read to answer focused questions you have about the codebase. It
kind of got lost when I shifted focus to active interaction versus passive
exploration, but I agree that reading with questions in mind is far superior
to just reading the code to understand it generally.

------
encoderer
I’d like to spend more time reading great code. I think I’d get something out
of it.

Possibly more important, though, I’ve always had a fearlessness to dive into a
codebase. I’ve built a little thing called Cronitor and recently I was giving
a talk on building our first server agent in Go. Long story short, there were
a few people surprised that I studied a couple crond implementations to figure
out some important details for proper monitoring and I was surprised they were
surprised. I don’t write much C or any C++ but as long as you approach it with
some fearlessness and a simple plan you can learn from unfamiliar codebases.
Spend up to an hour figuring out how a project is structured (hopefully a lot
less) and then just skim/grep to find a spot probably close to what you’re
looking for. Spread out from there.

The best way to learn a new language is to build something useful and learn
what you need along the way. I think studying code can be done the same way.

~~~
bsimpson
Another way to approach this: review other people's code at work (e.g. code in
another language or for another product). Even if you don't know the syntax of
a language, you can probably follow enough of it to learn something
interesting. A commit is going to be a much more easily-digestible unit than a
whole codebase, and you can go ask the author to explain anything that you
don't understand. Plus, over time you might understand enough to help out on
that project, or help integrate it with yours.

------
abcd_f
How about some good _annotated_ code? :)

[http://www.kohala.com/start/tcpipiv2.html](http://www.kohala.com/start/tcpipiv2.html)
\- 15K line implementation of TCP/IP stack from BSD. Fantastic book, reads
like a novel.

[http://www.pbrt.org/](http://www.pbrt.org/) \- fairly detailed write up on
building a pretty decent 3D renderer

------
markpapadakis
Studying codebases is one of my rewarding hobbies. I’ve learned more from
that practice
([https://medium.com/@markpapadakis/interesting-codebases-159f...](https://medium.com/@markpapadakis/interesting-codebases-159fec5a8cc))
than from most technical books I’ve read (by the way, most technical books
aren’t particularly great).

------
pjungwir
I agree most programmers are too quick to write before they read, but I'm not
sure why it has to be "for fun". Reading code takes effort, just like
understanding any new complex thing. The primary virtue here is _patience_.
It's a lot like debugging actually: the goal is to understand.

Setting aside the "for fun", of course I read my co-workers' code, but also I
read my dependencies' code quite often. In particular when you have something
that is meant to be extended, the docs never completely cover all the use
cases, and the quickest way can be to start reading.

Some examples from my own recent work are reading code from the Ruby gems
devise, devise_token_auth, and spree, and also reading Django class-based
views (all things meant to be extended). One trick I've learned in Ruby is `cd
$(bundle show spree_core)`. From there you can grep, find, read, even add your
own `puts` or whatever.
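The same trick carries over to other ecosystems. Here is a rough Python analogue, offered as a sketch rather than anything the gems above provide: it locates an installed package's source directory and greps it. The helper name `grep_package` is made up for illustration, and the standard-library `json` package is only a stand-in for whatever dependency you are digging into.

```python
import importlib.util
import re
from pathlib import Path

def grep_package(package: str, pattern: str):
    """Yield (path, line number, text) for source lines in a package matching pattern."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        raise ValueError(f"{package!r} is not an importable package with a source directory")
    regex = re.compile(pattern)
    for root in spec.submodule_search_locations:
        for path in sorted(Path(root).rglob("*.py")):
            text = path.read_text(encoding="utf-8", errors="replace")
            for number, line in enumerate(text.splitlines(), start=1):
                if regex.search(line):
                    yield path, number, line.strip()

# Example: find where the json package defines its exception classes.
for path, number, line in grep_package("json", r"class \w+Error"):
    print(f"{path.name}:{number}: {line}")
```

From the printed file and line you can open the source and, as with `puts`, drop in a temporary `print`.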

Actually I _have_ been reading some code for fun: Postgres! I have an
extension that adds temporal foreign keys, and I'm slowly porting it from
plpgsql to C. Reading the code for standard foreign keys is very helpful. Same
thing for a lot of other extensions I've written in the last few years. It has
helped me a ton to read others' extensions and the core code.

~~~
Kagerjay
This is what I've been working on for the past few months. I've been writing a
lot of spaghetti code in the past and come back to find how poorly I organized
it. I'm suppressing the urge to immediately write code, but rather, look at
the big picture first, digesting the information, even if it means prolonging
writing anything for a day or several hours.

I usually write pseudocode first before doing anything now. Either on paper,
as code comments, or as a list. Sometimes I find myself writing tests on paper
as well, it helps out a lot too. A programming language is just a means to an
end, what they all have in common is a set of logic to follow.

Then I debug. I just fork the code, comment things out, add a little bit of
functionality, etc. When I have a grasp on it, I write some tests (TDD).

The codebase I find easiest and refer to a lot is TodoMVC, or any similar
one that has different implementations (different languages) of the same
general solution. All the core logic between languages is mostly the same, so
it's easy to dig through and understand a different programming language and
how it's generally organized.

One thing I remember learning from experienced polyglot programmers: it's
extremely helpful to have an existing codebase you've written (such as
TodoMVC) and to port that same example over to a different language to learn
that language's nuances.

------
peterburkimsher
"we usually read when we want to edit... That hacking produces better
comprehension than passive, linear reading fits with what we know about
learning... solid understanding emerges from active exploration, critical
examination, repetition, and synthesis. Hacking beats passive reading on three
out of four of these criteria."

This doesn't just apply to learning programming languages - also learning
Chinese.

[https://pingtype.github.io/docs/failed.html](https://pingtype.github.io/docs/failed.html)

I've had much more success since writing my own tool to translate the text I'm
interested in, rather than being spoon-fed some patronising phrases to recite
from a textbook.

------
djhworld
The thing about code reading for me is you have to have decent tooling.

It's pointless reading code on GitHub, especially for a codebase you are not
familiar with, you need to get the codebase open in an IDE to be able to dive
in quickly and back again to build the map out in your head. I don't think you
can read a portion of code from top to bottom without having tooling....unless
the piece of code is really short and just one or two files

~~~
wkz
The one online tool that I use for this kind of work is Woboq Code Browser.
It's only C/C++ projects and not many are freely available. The Linux kernel
is though and I spend a lot of time in that code base. glibc, gcc and llvm are
also available.

[https://code.woboq.org/](https://code.woboq.org/)

~~~
guruz
Which projects would you like to see on code.woboq.org? Maybe we can add them.

------
dizzystar
Sometimes reading code saves your butt. I've had to make very minor changes to
a code base to make it compatible with Python 3, and I've had to add things to
HTML libs to prevent execution of JavaScript.

Taking an open source library at face value can sometimes put you in a
precarious situation. For me, it's an exercise of following the errors or
trying to break the code. I'm not really sure if that counts as reading.
Reading a large code base, in my experience, calls for a pencil and paper.
It's a lot more work than passively reading and trying to hold all the
information in my head. Reading code isn't the same as reading a novel.
Perhaps better guidance on what actively reading code means is in order?

------
fghtr
[https://www.fsf.org/blogs/community/who-actually-reads-the-c...](https://www.fsf.org/blogs/community/who-actually-reads-the-code)

"Two-and-a-half months later I received an email from someone who not only
managed to find the comment, but also managed to guess the code had to be
rot13'ed."
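For anyone who hasn't run into rot13: it rotates each letter 13 places, so applying it twice gives back the original. A minimal sketch in Python using the standard `rot_13` codec; the hidden string below is a made-up example, not the comment from the FSF post:

```python
import codecs

def rot13(text: str) -> str:
    """Rotate each ASCII letter 13 places; all other characters are left alone."""
    return codecs.encode(text, "rot_13")

# Hypothetical hidden comment, invented for this example.
hidden = "Vs lbh pna ernq guvf, rznvy zr."
print(rot13(hidden))         # decodes the message
print(rot13(rot13(hidden)))  # applying it twice returns the original
```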

------
mattbierner
On this spectrum, there's also "read by refactoring"
([https://www.jamasoftware.com/blog/read-by-refactoring/](https://www.jamasoftware.com/blog/read-by-refactoring/)).
You'll need good refactoring tooling and good versioning (as you'll almost
certainly want to discard many of your trial refactorings) but I've found that
the approach helps me evaluate how my mental model matches up with reality. It
also generally leads to a more readable codebase too.

------
mannykannot
I must disagree with this sentiment, though I understand where it is coming
from:

"Clean, solidified abstractions are like well-marked, easy-to-follow paths
through a forest — very useful if they lead in the direction we need to go,
but less useful when we want to blaze arbitrary new paths through the forest."

Using the same analogy, if you want to blaze a new trail through the woods,
you need a map of the territory, and a map for this territory takes the form
of higher-level abstractions with clear, explicit semantics and minimal but
effective, and unambiguous, interfaces.

One of the biggest problems in modifying or reusing existing code is the risk
of breaking higher-level consistency, for example by violating an implicit
constraint that is necessary for correctness (the biggest Ethereum errors have
been of this form, where the implicit assumptions included when and how often
initialization is performed, and that a certain library will be present.)

If the clean code movement has been delivering code that works as step-by-step
instructions for getting from A to B, but not as a map, it may be because
breaking code down into small functions and classes is easier, more visible
and more measurable than making coherent abstractions with consistent and
minimal interfaces at a higher level.

~~~
stenecdote
To push back on this point a little bit, I think we agree that a map is one of
the most useful things you can have but disagree on what the right type of map
is.

While I'm not anti-abstraction, I think people often impose poorly chosen
high-level abstractions on top of messy lower-level components (akkartik calls
this Authoritarian High Modernism in a comment
([https://lobste.rs/s/gtxi5y/nobody_s_just_reading_your_code_h...](https://lobste.rs/s/gtxi5y/nobody_s_just_reading_your_code_how_we#c_0joqn2))
on Lobsters). This approach leaves readers having to understand a bad
abstraction and its internals, a map, its territory, and inconsistencies
between the two.

As I understand it (please correct me if I'm wrong), you want people to do the
work to find better high-level abstractions. I agree that it would be good if
we could do this more often. I just also think we can find other ways to give
people maps that work with our existing, messy codebases that don't have good,
minimal abstractions already.

~~~
mannykannot
Yes, the start of your last paragraph states what I would like to see, and the
statement I took issue with specifically dismisses 'clean, solidified'
abstractions, not messy, failed attempts at it, yet I am also very well aware,
from personal experience, that we often have to deal with code that is not
like that (even though, in some cases, written by people who thought they were
doing SOLID work.)

The question is what to do about it, especially given that abstraction and the
separation of concerns are not suddenly going to fix things after all these
years of being more honored in the breach. I do not think that there is a
programming style that leads to the writing of understandable code without
having a coherent vision of the big picture, because I do not think you can
understand the purpose of code in the small without knowing its place in the
big picture (except insofar as the author has successfully separated the
independent concerns, which brings us back to my original point about the
value of abstraction.)

One can still modify such code, but the less you understand about it, the more
likely it is that you will make mistakes, and there is also a real possibility
that the result will be harder to understand than the original - this vicious
circle is one reason why much-modified code is usually difficult to
understand.

While I don't hold out much hope for programming style to save us from this
situation, I think we might do better with tools to help us understand code.
In particular, I sometimes find myself doing program slicing by hand (what
ways are there to get from A to B that modify X, for example), and there is
some tool support for doing so, though availability is spotty at best.
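To make "program slicing" concrete, here is a toy backward slice in Python. It is only a sketch under heavy assumptions: it handles straight-line code with no branches, loops, or aliasing, and the `Stmt` record and the five-line program are invented for illustration. Real slicers work on control-flow and dependence graphs.

```python
from typing import NamedTuple

class Stmt(NamedTuple):
    line: int
    defines: str      # variable assigned by this statement
    uses: frozenset   # variables read by this statement

def backward_slice(program, target_var):
    """Return the lines that can affect target_var's final value.

    Straight-line code only: no branches, loops, or aliasing.
    """
    relevant = {target_var}
    lines = []
    for stmt in reversed(program):
        if stmt.defines in relevant:
            lines.append(stmt.line)
            relevant.discard(stmt.defines)  # this definition shadows earlier ones
            relevant |= stmt.uses           # now its inputs matter instead
    return sorted(lines)

# Hypothetical program being sliced:
#   1: a = input()
#   2: b = 10
#   3: c = a + 1
#   4: b = b * 2
#   5: x = c + a
program = [
    Stmt(1, "a", frozenset()),
    Stmt(2, "b", frozenset()),
    Stmt(3, "c", frozenset({"a"})),
    Stmt(4, "b", frozenset({"b"})),
    Stmt(5, "x", frozenset({"c", "a"})),
]
print(backward_slice(program, "x"))  # lines 2 and 4 only touch b, so they drop out
```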

------
supermatt
"But you can’t expect a map to tell you what questions to ask, and it makes no
sense to read a map linearly from top to bottom, left to right."

But you can read/view a map as an illustration to get an overall view of the
layout of things and how they are connected.

I read code. It surprises me that people deem themselves an expert in some 3rd
party lib without actually understanding the internals.

------
jmadsen
I had to use a js library for something today.

npm install downloaded 113 new packages. I'm not even a javascript programmer;
I'm saddled with being "full-stack"

I'm supposed to read all that every time I need a new library? Or write it
myself, when I have absolutely no idea how to build something that
complicated, if I had time, negating the whole point of OSS in the first
place?

~~~
aepiepaey
The point made is to read code to become a better programmer. That doesn't
mean you have to read all code you ever use.

------
Moodles
In a similar vein, I often wonder just how many people actually check
cryptographic algorithms. We'd all probably agree that it is good not to roll
your own crypto algorithms, or even your own implementation of known crypto such as
AES. But how many experts in the world are there for AES and secure hashing? A
lot of the major MD5 and SHA1 work all came from Marc Stevens' team. How many
experts are there in this field, really? "Provable security" is another issue:
Shoup, Menezes and Koblitz have all expressed concern about the current state
of the art, I think.

~~~
simias
I'm far from an expert in cryptography but I've had to dig into OpenSSL's
codebase on more than one occasion and every time it left me with a deep sense
of uneasiness. It's really not what I would deem good code: too many macros,
too many potentially confusing API conventions (inconsistent return values in
case of error, unclear resource lifetimes, ...), not enough comments and
documentation...

It's not the worst C codebase I've ever seen (far from it) but for something
as critical I'd put the bar very high in terms of code quality and OpenSSL
doesn't even come close.

------
W0lf
I wouldn't have thought to be in the minority when stating that I actually
read a lot of foreign code out of curiosity. Especially when dealing with
architectural decisions, I like to dig into codebases I consider to be good
and professional (a rarity in the professional/proprietary development world
unfortunately). For instance, reading the Chromium source code, or Mozilla's
for that matter, gives you a lot of good input since both codebases are
sufficiently complex and deal with a lot of layers and stacks.

~~~
albertgoeswoof
How do you know a codebase is good before you invest time digging into it?

~~~
W0lf
I don't. There are a few indicators though like the overall structure of the
source files, how they are laid out on the file system. Minor details like
inconsistent formatting. Typically if one or more of those indications is
messed up, I'm pretty sure the rest of the codebase is also not in a good
state. I still do skim through some files to confirm my expectation though.

------
henesy
One of the best parts of the Plan 9 system is that all of the
kernel/command/library source is on hand and readily available to consult at
any given time.

Jointly, an important detail is that the complexity of said source is kept to
a sane minimum and general style trends mean that most, if not all, of the
source is formatted similarly and legibly. The system is so compact that you
can keep most of the system source in your head at one time if you really need
to.

~~~
henesy
A fun tidbit from the Plan 9 compilers only really able to be found by reading
the source:
[https://groups.google.com/forum/#!msg/comp.os.plan9/uMF7A4gk...](https://groups.google.com/forum/#!msg/comp.os.plan9/uMF7A4gkJsw/kNWBQ8U93LcJ)

------
mark_l_watson
One thing that helps me make reading code in code reviews less passive is to
add occasional temporary print statements and then run the unit tests again
that cover the code I am reading. Seeing variable values satisfies some
curiosity. Sometimes I will use a debugger to step through code I am
reviewing but for some reason I prefer print statements to satisfy any
curiosity I have about how the code works.
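One way to do this without scattering prints through the code is a throwaway decorator. A minimal sketch in Python; `trace`, `apply_discount`, and the test are all hypothetical stand-ins for whatever is under review:

```python
import functools

def trace(func):
    """Temporary helper: print each call's arguments and return value."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        print(f"TRACE {func.__name__} args={args} kwargs={kwargs} -> {result!r}")
        return result
    return wrapper

# Hypothetical function under review, decorated only while reading the code.
@trace
def apply_discount(price, percent):
    return round(price * (1 - percent / 100), 2)

# Stand-in for the existing unit test that covers it.
def test_apply_discount():
    assert apply_discount(200.0, 25) == 150.0

test_apply_discount()
```

Remove the decorator before approving the review; it is there only to watch values flow through while the tests run.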

------
your-nanny
Do I sit back in my comfy chair on an evening and just read through some code
repo? Well, actually, yeah, I do... Though not terribly often. Do I sit back
and read through programming bloggers explaining their code? Yes, pretty
often, and based on HN, that's pretty true of a lot of programmers.

Still, I'm more likely to find myself browsing Wikipedia for math and CS.

------
svat
Peter Seibel, the author of the _Coders at Work_ book mentioned in the post,
actually discussed this in a blog post:
[http://www.gigamonkeys.com/code-reading/](http://www.gigamonkeys.com/code-reading/)

It's not quite true that _none_ of the programmers interviewed in the book
routinely read code for fun. In his blog post, Seibel mentions the exceptions:

> First, when I did my book of interviews with programmers, Coders at Work, I
> asked pretty much everyone about code reading. And while most of them said
> it was important and that programmers should do it more, when I asked them
> about what code they had read recently, very few of them had any great
> answers. Some of them had done some serious code reading as young hackers
> but almost no one seemed to have a regular habit of reading code. Knuth, the
> great synthesizer of computer science, does seem to read a lot of code and
> Brad Fitzpatrick was able to talk about several pieces of open source code
> that he had read just for the heck of it. But they were the exceptions.

The Knuth exception is notable: he _does_ routinely read code for fun. In the
book he mentions things like

> I’ve got lots of collections of source code. I have compilers, the Digitek
> compilers from the 1960s were written in a very interesting way. They had
> their own language and they used identifiers that were 30 characters long
> but very descriptive, and their compilers ran circles around the competition
> at the time—this company made the state-of-the-art compilers of 1963 or ’64.
> And I’ve got Dijkstra’s source code for the THE operating system. […] I
> collected it because I’m sure it would be interesting to read if I had time.
> One time I broke my arm—fell off a bike—and I had a month where I couldn’t
> do anything much, so I read source code that I had heard had some clever
> ideas in it that hadn’t been documented. I think those were all extremely
> important experiences for me.

And then a passage quoted in the post about how he reads code, with a Fortran
compiler as example (not repeating it here; it's in Seibel's blog post:
[http://www.gigamonkeys.com/code-reading/](http://www.gigamonkeys.com/code-reading/)),
after which the interview (and the book) ends with Knuth saying:

> don’t only read the people who code like you.

(The examples in the book of people who last seriously read code years ago
are also interesting, e.g. Douglas Crockford mentions some programs he read, and
there are multiple pages in the book about Guy L. Steele reading the TeX
program.)

Anyway, back to Knuth: I think the way he reads and digests code (or even
papers:
[http://blog.computationalcomplexity.org/2011/10/john-mccarth...](http://blog.computationalcomplexity.org/2011/10/john-mccarthy-1927-2011.html?showComment=1319546990817#c6154784930906980717))
addresses a lot of the points the linked post makes. Even when “just reading”
code, or anything for that matter, you _are_ supposed to be doing the things
the post mentions in “hacking versus passive reading”: active exploration,
critical examination, synthesis.

For example, Knuth read the code of ADVENT (aka _Colossal Cave Adventure_ ),
and loved the code (not just the game itself) so much that he rewrote it to
share with others to read, in his own preferred literate-programming (CWEB)
style that matches the way he thinks
([http://www.literateprogramming.com/adventure.pdf](http://www.literateprogramming.com/adventure.pdf)).
This definitely doesn't sound like “passive reading” to me.

Nevertheless, great post.

~~~
stenecdote
Upvoted for careful scholarship and a useful addendum (I'm OP)!

I sometimes wonder whether a lot of Knuth's greatness comes from doing more of
the stuff everyone knows they should do but don't. If you read this interview
([https://github.com/kragen/knuth-interview-2006](https://github.com/kragen/knuth-interview-2006))
with Knuth,
he talks about how he was nervous he wouldn't be able to learn calculus so
just decided to do all the problems instead of just the assigned ones.
Unsurprisingly, partly because he's Knuth and we all know Knuth can do math,
he ends up really good at calculus:

> But Thomas’s Calculus would have the text, then would have problems, and our
> teacher would assign, say, the even numbered problems, or something like
> that. I would also do the odd numbered problems. In the back of Thomas’s
> book he had supplementary problems, the teacher didn’t assign the
> supplementary problems; I worked the supplementary problems. I was, you
> know, I was scared I wouldn’t learn calculus, so I worked hard on it, and it
> turned out that of course it took me longer to solve all these problems than
> the kids who were only working on what was assigned, at first. But after a
> year, I could do all of those problems in the same time as my classmates
> were doing the assigned problems, and after that I could just coast in
> mathematics, because I’d learned how to solve problems. So it was good that
> I was scared, in a way that I, you know, that made me start strong, and then
> I could coast afterwards, rather than always climbing and being on a lower
> part of the learning curve.

~~~
svat
Yes exactly. It's always inspiring to see how (in his case) you just start
doing things simply and methodically with focus, and eventually you get very
far and become better than anyone else.

------
kruhft
Somebody read my code once and commented on my improper use of 'mod' or
something like that. It then turned into a long discussion on how they might
be incorrect, and so on and so forth on Reddit, back and forth on Lisp.

I love Lisp. It's a great language with great programmers.

------
Stwerner
I've been thinking about this idea for a while and love the reimplementation
idea. I really want to start live-streaming myself coding (or pair
programming) and have been trying to come up with some ideas that would end up
being interesting to watch.

One idea was taking a popular open source tool with a good test suite and
trying to get the tests to pass, but I think checking code out at a point
before a new feature is implemented and reimplementing it could end up being
really interesting. Could even make it more interactive by sharing the
checkout point so viewers can do an implementation themselves, and then doing
a stream walking through different interesting implementations.

------
dmurthy
A few years ago in my previous company there was a major push to convert a lot
of the codebase to Python. A byproduct of this conversion was an easier way to
build UI using predefined widgets. Yet no one was using this functionality due
to the mindset of years of TCL. I honestly hated TCL. So I took it upon myself
to explore the entire new code base and go about building UI panels. Initially
I just couldn't understand how the code mapped out until one day I just
printed out some of the code and read it repeatedly till it made sense. The
whole exercise was an eye opener.

------
OutThisLife
Best way to figure out someone else's codebase, IME, is to just dive in heavy
and start breaking things. Figuring out what's connected to what and why by
removing chunks of code that you need to understand.

------
z3t4
I read the source code for the libs all the time, mostly to see what methods
and properties they expose. I learn something new every time. This is
JavaScript/NodeJS. I don't miss compiled .dll's

------
chukye
I once proposed that a team read the code of Redux before adding it to the
codebase (if you don't know, Redux is a very small library, very easy to
read). But no one seemed to care; they were just saying things like "Well,
I'll read this blog post that teaches us how to use it", and I was like: "BUT,
if you read the code you will know _exactly_ how to use it! Why not go to the
source instead of reading other people's opinions?!"

------
gwbas1c
Reading code is difficult and time-consuming. To really understand a function
you need to try it with different inputs, either with interactive testing or
in a debugger; AND you need to know what the functions it calls do.

The second part, knowing what the code the function is calling does, is nearly
impossible when they are other functions in the program itself.
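
The first part, trying a function with different inputs, can be sketched in a
few lines of Python. This is only an illustration: `probe` and `parse_port`
are hypothetical names, with `parse_port` standing in for whatever function
you are trying to understand.

```python
def probe(func, inputs):
    """Call func on each input and record the result or the exception raised."""
    observations = []
    for value in inputs:
        try:
            outcome = ("ok", func(value))
        except Exception as exc:  # broad on purpose: we are exploring behavior
            outcome = ("raised", type(exc).__name__)
        observations.append((value, outcome))
    return observations


def parse_port(text):
    """Hypothetical function under study: parse a TCP port number."""
    port = int(text)
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port


if __name__ == "__main__":
    # Mix ordinary, edge-case, and plainly wrong inputs to map the behavior.
    for value, outcome in probe(parse_port, ["80", " 443 ", "0", "http", None]):
        print(f"{value!r:>8} -> {outcome}")
```

It does nothing for the second part: you still have to know what `int` (or
any other callee) does with each of those inputs.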

------
INTPenis
I love to dig through people's code. Unfortunately I know myself enough to know
that it can be a giant time sink if I let myself go.

In Swedish there's a phrase, "snöa in", literally to get snowed in, and
that's what happens to me.

So after enough years in the biz I try to hold myself back from this habit.

------
superasn
As a corollary, that's also why it's so important and terrifying to release
open source projects.

You're often wondering who is gonna dig into this code of yours and what
they'll think of your coding, esp. when you're not following best practices
and TDD.

------
mar77i
If you want to exercise reading concise and easily understandable standard C
code, grab a toy project from suckless.org. Even though I write my code more
defensively in some ways, that stuff is really straightforward to hack on.

------
SilentCrossing
Interesting read, but I find comparing code to a map too simplistic. I would at
least say that code is more like a puzzle of a map. You may or may not
understand each puzzle piece and where it fits in. That is my experience with
reading code.

Code is like ogres, and ogres are like onions. They have layers of complexity :P

------
TheMagicHorsey
I read code to learn all the time. And I don’t get paid to program.

------
poindontcare
The only way to be sure you understand everything is to learn quantum
mechanics, and perhaps string theory too, just to be safe. But in practice
modesty dictates that we limit ourselves to learning as much as we need to get
the job done.

