
Ask HN: how do you read code? - ktharavaad
As a computer programmer, I've realized that one of the best ways to improve myself is to read the code written by the masters of the art and try to emulate them. This is helped by the enormous amount of open source code out there.<p>However, along the way, the very act of reading code has become a stumbling block in my journey.<p>I'm posting this because I want to get a perspective on how other programmers approach this problem. When faced with a huge chunk of code, how do you guys read it? Do you read it line by line? Do you put it into an IDE, look at the outline and simply jump into the functions you are interested in?<p>Do you read through the "main" function first and then branch out to the utility functions, or do the reverse, where you read the utility functions and subroutines first and then figure out how they are put together?<p>Please share with me some of the tips and tricks of code reading that you have discovered.
======
wheels
I don't just read code for the hell of it, I'm usually trying to get something
specific out of it, so usually I try to figure out where the core of what I'm
reading is at (starting with grep), occasionally cutting away the surrounding
code and perhaps irrelevant comments with my editor.
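
As a rough illustration of the grep-first approach, here is a minimal Python sketch that walks a source tree and reports matching lines. The name `grep_tree`, the default suffix list and the example pattern are assumptions made for illustration; plain `grep -rn` does the same job from a shell.

```python
import re
from pathlib import Path


def grep_tree(root, pattern, suffixes=(".py", ".c", ".h")):
    """Yield (path, line_number, line) for every source line matching pattern."""
    regex = re.compile(pattern)
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        # errors="replace" keeps the scan going past files with odd encodings.
        with path.open(encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if regex.search(line):
                    yield path, number, line.rstrip("\n")
```

Looping over something like `grep_tree("src", r"def parse_\w+")` (both arguments hypothetical) gives a short list of places to start reading.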

That said, I'm not downplaying reading code; it's an important skill. But you
don't have to just _read_ the code of the masters; you can jump in and add to
and fix bugs in stuff from a wealth of OSS projects and there you'll get
feedback from those folks too as you get involved.

I personally get a much better understanding of a block of code if I treat it
as a living thing and hack around with it seeing what breaks it and what makes
it go.

------
psyklic
In my experience, writing a lot of your own code teaches you MUCH more about
good design than looking at others'. Most of the design process is actually in
the reasoning behind WHY a certain process was employed, which is not easily
evident from code alone!

But since you asked:

(1) Run the code and understand what the software DOES. If a library, write a
driver program which tries out most of the functions.

(2) Look at the high-level classes (or the major organizational structures,
e.g. source files) and figure out the purpose of each (e.g. CLogger -> logs
errors to a file or the display)

(3a) (the fun way) Come up with a cool, simple feature you'd like to add, or
something simple you'd like to change. Do it. Repeat.

-OR-

(3b) (the boring way) Find the main() function and step through the code with
a debugger, stepping over function calls which are self-explanatory. Step into
ones you're unsure about.

In reality, this all depends on how well the code is written. Understanding
some code just isn't worth the price of admission.
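
For step (1), a driver program can start out very small. The sketch below is only one possible shape, assuming a Python library: it lists the public functions a module exposes and then exercises a couple of them. The standard `json` module stands in for the library being explored, and `public_api` and `drive_json` are hypothetical names.

```python
import inspect
import json  # stand-in for the library being explored


def public_api(module):
    """Return {name: signature text} for the public functions a module exposes."""
    api = {}
    for name, member in inspect.getmembers(module, inspect.isfunction):
        if name.startswith("_"):
            continue
        api[name] = str(inspect.signature(member))
    return api


def drive_json():
    """Exercise two of the discovered functions with a simple round trip."""
    text = json.dumps({"answer": 42}, sort_keys=True)
    return json.loads(text)
```

Printing `public_api(json)` shows what there is to try; the driver then grows one call at a time.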

~~~
nostrademons
You can't do either in isolation. If you just write code, you end up
ghettoizing yourself into a style that works for you but has little to do with
how others program, and misses out on all the nifty tricks they've discovered.
If you just read code, you don't understand why the author employed the tricks
they did, as you say.

I usually alternate between the two: find a good piece of code. Then try to
modify it to do something different. Then look at the style of the code around
the parts you touched, and see if there's a better way to do your
modification. Then modify something else in the general vicinity, and repeat.

When I started at Google, I basically just hacked the Google webserver to
display "Hello world" in various parts of the result page until I understood
what it was doing enough to start contributing to my team.

------
gills
I don't find 'main' and friends to be very meaningful, but sometimes it's ok
to start there, if you don't have anything in particular to find. And
structure is usually an artifact of behavior, so I almost always start with
trying to understand the behavior.

I like to first familiarize myself with the overall program flow. With an IDE
I can usually do this with 'called from' or similar views; if I can run the
code, throwing stack traces paints a nice picture of potentially-interesting
control paths, taking note of interesting details along the way.
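
If the code happens to be Python, dumping a stack trace from inside a function of interest can be sketched as follows. `callers`, `load_config` and `start_server` are hypothetical names; only the `traceback` calls come from the standard library.

```python
import traceback


def callers(limit=5):
    """Return the innermost function names on the call path, ending with our caller."""
    # extract_stack() lists frames oldest first; drop the last entry (callers itself).
    frames = traceback.extract_stack()[:-1]
    return [frame.name for frame in frames[-limit:]]


def load_config():        # hypothetical function under investigation
    return callers()


def start_server():       # hypothetical caller
    return load_config()
```

Calling `start_server()` returns a list whose last two entries are `start_server` and `load_config`, which is the control path that reached the function being studied. `traceback.print_stack()` prints the same information with file names and line numbers.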

Other than that, there's usually something I'm looking for when I dive into a
piece of code. So I find that function or something that looks like the right
area and start poking around the places it goes and the places it comes from,
see what parameters are moving around, what side effects are taking place,
etc.

I usually try to build a mental model of what _I_ would build if I were trying
to solve the same problem. It's usually not spot-on, but just doing a little
thinking ahead of time helps me navigate the program more easily, predict what
is going to happen, and be aware of recognizable handholds.

I try to keep an eye out for abstractions and patterns so I can keep a
slightly more terse mental model and hopefully fit more of the program in my
head. If it's a really big chunk of code I'll keep pencil and paper handy and
map out the program's design as I go.

Oh yeah -- keep docs handy. Every now and then they actually help.

------
carlosrr
Github is very good for reading code since it can be done through the browser.
If I am going to be using a particular library, I take a look at the commit
history and read the code on the more interesting commits.

Also, hanging out in IRC rooms is very helpful. Many people ask for code
reviews before committing to the libraries and you get to see how the library
evolves and what the design decisions behind the changes are.

------
bluishgreen
Profile the code. The place where most of the time is spent is probably the
main loop of the program, or at least will give you a clue to the main
function. Of course you can ask someone or read the docs, but profiling it has
the added advantage of giving you the map of implementation to the intended
algorithm.

Further, when you profile, you can get a call graph nicely drawn out using many
utils available online. This instantly gives you a big picture.
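
For Python code, a minimal version of this can be sketched with the standard `cProfile` and `pstats` modules. `profile_call`, `inner_loop` and `main_loop` are hypothetical names; the workload is made up purely to have something to measure.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10, **kwargs):
    """Run func under cProfile; return (result, report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


def inner_loop(n):          # hypothetical hot spot
    return sum(i * i for i in range(n))


def main_loop():            # hypothetical entry point
    return [inner_loop(20_000) for _ in range(20)]
```

Printing the report from `profile_call(main_loop)` lists the functions by cumulative time, so the main loop and its hot spots show up near the top. `pstats.Stats` also offers `print_callers()` and `print_callees()`, which give a textual view of the call graph.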

I also second stepping in, adding a feature, fixing a bug, etc.

------
llimllib
You need this book, it's excellent: <http://www.amazon.com/dp/0201799405> .

~~~
GeneralMaximus
+1 for this book. It starts with diving into simple UNIX tools (echo, wc) and
then moves to more complex programs such as Apache. It does not tell you _the_
way to read code, but it has several nifty techniques that any developer
diving into a huge codebase might like to know.

------
lallysingh
Main's usually pretty useless for nontrivial apps. Usually some sort of
internal/library framework structures the system.

These usually work well together:

1. Some way to cross-reference source files, e.g. ctags.

2. Source-code level documentation. Doxygen, javadoc, whatever you have.

3. Stack traces of the running app. Set breakpoints and get stack traces. If
your tools will let you do this, get periodic stack traces (e.g. once every
10th of a second) for a while. Look at what the code's doing.
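
The periodic stack traces in item 3 can be approximated in Python without a debugger. This is only a sketch under assumptions: `sample_stacks` and `busy_work` are hypothetical names, and `sys._current_frames()` is a CPython-specific call.

```python
import collections
import sys
import threading
import time
import traceback


def sample_stacks(target, interval=0.1):
    """Run target() in a worker thread and count where it is at each sample."""
    counts = collections.Counter()
    worker = threading.Thread(target=target)
    worker.start()
    while worker.is_alive():
        # sys._current_frames() maps thread ids to each thread's current frame.
        frame = sys._current_frames().get(worker.ident)
        if frame is not None:
            stack = traceback.extract_stack(frame)
            counts[" > ".join(entry.name for entry in stack[-3:])] += 1
        time.sleep(interval)
    worker.join()
    return counts


def busy_work():            # hypothetical workload
    deadline = time.monotonic() + 0.3
    while time.monotonic() < deadline:
        sum(range(1000))
```

`sample_stacks(busy_work, interval=0.01).most_common()` shows which call paths the worker was on most often, which is a cheap way to see what the code is doing.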

I just went through two systems totaling 1.5 million lines of code.

First hint: don't even think of going through all the code. Instead, figure
out what's important to you, and start putting together a demand-paged map of
what you need to know.

Second hint: get a paper notebook and start writing things down soon. You'll
save yourself a _lot_ of re-researching facts that way.

------
swombat
I wrote a brief article on the topic:

<http://www.swombat.com/recognising-and-learning-good>

I think that the best way is not just to read, but to try to recreate the code
or part of the code using the original code as a guide. It's fascinating how
many decisions you'll come to that were taken in the original code too.

The good thing about doing it this way is that you'll really understand why
those decisions were made. Imho you learn the most this way.

------
safetytrick
I need to read more code... but when I do get the time to read code I never
start with "main". I start by searching for a feature I am interested in and
then branch out to see how this feature fits in with the whole. I normally
use N++ and open a huge number of files at once; it's important to me to be
able to switch between files quickly and I don't want the bloat of an IDE.

------
geedee77
This is a really good question and one that I'm having trouble answering. I
find that I don't read through code in a definable way, it's more organic. I
think it really comes down to (as previously mentioned by someone) why I'm
reading through it.

For example, at work sometimes I need to read through other people's code for
reviews / bug fixes / enhancements. In that case I'll read the bits that
matter or the bits whose purpose I'm not sure about. Other times I might read,
for example, the source to jQuery or a site or something to find out how
something is being done. In that case I'll try and follow where it does things
by jumping between functions as the compiler / interpreter would.

Where I read code is, again, something that isn't hard and fast. I'll more
often than not just open it in a text editor (notepad as I'm a Windows man)
but sometimes I'll load it into an IDE if it's particularly long or I'm having
trouble following it in notepad.

Good question though!

------
russell
Reading code for large systems is rarely fun. I usually start with a specific
objective, a bug or feature, to teach myself the system. I root around until I
arrive at a solution. The first bug is expensive, but they become cheaper as I
learn. I have tried systematic top to bottom examination, but I find my mind
quickly turns to glue.

Good tools are essential. If you are programming Java, use Eclipse. I hated it
for the first year, but now I tolerate it. Whatever your language, use its
IDE. I am also partial to UltraEdit. Write some good tools. I wrote a version
of Unix find for Windows, among other things.

If you want code that's fun to read try the Python Cookbook.
<http://code.activestate.com/recipes/langs/python/> is a good resource.

------
sh1mmer
I start by cleaning code a lot.

I've found that often it's better to make the code more legible (indenting,
IDE, etc.) than to just try parsing it as it stands.

I would say that unless the code is already in great shape, tidying it up a
bit normally means comprehension happens much more quickly.

------
arebop
My last job involved a lot of code review. I usually started with static
analysis. Cyclomatic complexity is crude, but it seems like a reasonable
heuristic for identifying important or interesting portions of the code.

~~~
bmj
I think this is a really good way to learn, particularly if you can review
code written by a more senior programmer. If you work on the same project, you
have a good understanding of the problem domain, so the review provides
insight into potentially different ways of approaching the problem.

------
muon
For a new code base that I will be working on:

- First read the documentation
- On A3 paper, draw what I understood; to me, a picture gives more clarity
- Understand the files/directory/toolchain structure etc.
- Use ctags and cscope
- Navigate by functionality

For others, I skim through books/blogs/open source code

------
rubentopo
Depends on the language. If I'm reading Java code, then I'll use Eclipse
(ctrl+h is much better than ctrl+right click when searching for a method's
definition; it's the only way I can follow code that uses interfaces a lot).

If you're reading C code, vim + cscope + grep work great.

~~~
rubentopo
This is an excellent article on the subject, hope you enjoy it:

<http://mags.acm.org/communications/200810/?pg=38>

------
karim
I use emacs and its find-tag feature to jump to the definitions of functions I
don't understand.
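
What a tag lookup does underneath can be sketched for a Python code base with the standard `ast` module. This is not how etags or ctags are implemented; `build_tag_index` is a hypothetical helper that only shows the idea of mapping names to definition sites.

```python
import ast
from pathlib import Path


def build_tag_index(root):
    """Map each function or class name to the (file, line) places defining it."""
    index = {}
    for path in sorted(Path(root).rglob("*.py")):
        try:
            tree = ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        except (SyntaxError, UnicodeDecodeError, ValueError):
            continue  # skip files that do not parse; a real tagger would report them
        for node in ast.walk(tree):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                index.setdefault(node.name, []).append((str(path), node.lineno))
    return index
```

Looking a name up in the returned dictionary gives the file and line to jump to.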

------
kstenson
If the code has any unit tests, starting from there can be a great
diving-in point.

~~~
jwickett
Forget unit tests; if the code lacks documentation, I'd first look at the
functional tests. The goal is to read the associated comments to determine
what functionality is actually being tested. From that, you'll be able to
understand what the module/program is actually supposed to do.

------
tptacek
Usually with a cross-referencer, like Doxygen.

~~~
dreur
I use Eclipse's Resource finder (Ctrl+Shift+R), Type finder (Ctrl+Shift+T) and
outline (Ctrl+Shift+O) to jump to interesting parts.

------
known
I use GDB to read code.

