
Ask HN: What code samples should programmers read? - _Microft
What are examples of code that every programmer should have read and analysed because they show a particularly deep insight into the problem, a creative attempt at a solution or otherwise outstanding proficiency?

Goal is to learn something from excellent pieces of code, much like it is practised in art by studying Old Masters, pieces of literature or musical compositions.

PS: I admit to exaggeration in the headline / first paragraph ("should have") btw ;)
======
daliwali
Here's an overlooked one: your dependencies.

Every programmer should read the (presumably open source) code they depend on,
but almost nobody does it. Some look at documentation, but not everyone even
does that. What you may find is that some of the code you depend on is
garbage, and may even motivate improvements.

This may be less feasible for giant messes such as most front-end tooling and
frameworks that exist today.

~~~
seanwilson
> Every programmer should read the (presumably open source) code they depend
> on, but almost nobody does it.

Has this ever been beneficial? I generally just check if a library is popular
and maintained and that's always been enough for me. If the code really is
horrendous, nobody is going to be using it, and if you always read the hundreds
of thousands of lines of code you depend on, you'd never get anything done.

~~~
daliwali
Popularity is not an indicator of anything other than popularity. McDonalds is
the most popular burger sold in the world, that does not make it good.

Length is an important consideration and most never consider it, partly why
software is notorious for getting bloated over time. At some point it would be
easier to rewrite code from scratch with less complexity and overhead.

~~~
horsawlarway
You sound very close to advocating a "Not invented here" approach.
([https://en.wikipedia.org/wiki/Not_invented_here](https://en.wikipedia.org/wiki/Not_invented_here))

There are some rare cases where that makes sense (Keyword: RARE), but most
times I've seen developers take that stance, it leads to longer development
times, less consistent code, more maintenance work, and generally pretty
shitty outcomes.

Popularity is ABSOLUTELY an indicator of things other than just popularity.
However you need to understand why the thing is popular and vet that against
your needs.

Finally: McDonalds IS good. I don't eat there often, but they provide a
consistently satisfactory experience for their customers. There's a reason you
can find McDonalds nearly everywhere on the planet.

It's popular to diss on companies that have become large iconic chains, but
they're large iconic chains for a DAMN good reason. Plus, they're solving
supply chain issues you aren't even considering, much like large popular
libraries are handling use-cases and problems you don't even know about.

Can I make a better burger? Sure. Can I make a better burger for the same
price that McDonalds charges, as consistently as McDonalds does? Fuck no.

The same probably applies to your software.

~~~
khedoros1
> Finally: McDonalds IS good. I don't eat there often, but they provide a
> consistently satisfactory experience for their customers. There's a reason
> you can find McDonalds nearly everywhere on the planet.

McDonald's is consistent, predictable, cheap, generic, and ubiquitous.
Sometimes those are good qualities. "Good enough" is usually good enough,
after all.

Regionally, there are a half-dozen choices that I'll take over McD's: Less
iconic world-wide, but as cheap or cheaper, just as consistent, and more
pleasing to the palate. I think that's closer to what they were advocating:
Not necessarily building it in-house, but also not taking popularity as a
reliable indicator of code quality (beyond a bar of minimum acceptability).

~~~
horsawlarway
If it hadn't been for the last sentence:

> At some point it would be easier to rewrite code from scratch with less
> complexity and overhead.

I'd agree, and probably wouldn't have commented.

But I think being all of these things:

> consistent, predictable, cheap, generic, and ubiquitous.

Means you have a damn good product. No one is claiming you can't get something
better, but come on, I'd love to have software that's consistent, predictable,
cheap, generic, and ubiquitous.

If those words describe a popular library, and you decide instead to build
your own thing in-house... you better have a really _REALLY_ good reason for
doing it.

~~~
khedoros1
> Means you have a damn good product. No one is claiming you can't get
> something better, but come on, I'd love to have software that's consistent,
> predictable, cheap, generic, and ubiquitous.

Note that I never said "consistently good" or "predictably reliable". There
are a lot of things in the world that are known to be low-quality, single-use
crap that are _still_ everywhere.

> If it hadn't been for the last sentence:[...]

I think it depends on their idea of where "at some point" lies.

------
mafribe
For those who understand networking: the Mirage TCP/IP networking stack [1] in
pure OCaml is a must. It's an object of extreme beauty, and possibly the most
eloquent argument for types, type inference and algebraic data types I can
think of. The TCP state machine is mostly specified at type level [2],
preventing numerous potential bugs in one fell swoop.

Reading this code is probably most enlightening if you have already written
networking protocols.

NB: this has nothing to do with OCaml, other comparable languages with ADTs
(Scala, Rust, Haskell, F#) would be similarly suitable.

[1] [https://github.com/mirage/mirage-tcpip](https://github.com/mirage/mirage-tcpip)

[2] [https://github.com/mirage/mirage-tcpip/blob/master/lib/tcp/s...](https://github.com/mirage/mirage-tcpip/blob/master/lib/tcp/state.ml)
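
For readers who want the shape of the idea without reading OCaml, here is a rough sketch. It is not Mirage's code: the names `State`, `Event`, `TRANSITIONS` and `next_state` are made up for illustration, and only a handful of connection-setup transitions from the TCP specification are covered. Python cannot reject a bad transition at compile time the way OCaml's variants and pattern matching can, so this version only shows the principle of closed sets of states and events with every legal transition spelled out, and it checks at runtime.

```python
from enum import Enum, auto


class State(Enum):
    CLOSED = auto()
    LISTEN = auto()
    SYN_SENT = auto()
    SYN_RCVD = auto()
    ESTABLISHED = auto()


class Event(Enum):
    PASSIVE_OPEN = auto()
    ACTIVE_OPEN = auto()
    RECV_SYN = auto()
    RECV_SYN_ACK = auto()
    RECV_ACK = auto()


# Every legal (state, event) pair is listed; anything else is an error.
TRANSITIONS = {
    (State.CLOSED, Event.PASSIVE_OPEN): State.LISTEN,
    (State.CLOSED, Event.ACTIVE_OPEN): State.SYN_SENT,
    (State.LISTEN, Event.RECV_SYN): State.SYN_RCVD,
    (State.SYN_SENT, Event.RECV_SYN_ACK): State.ESTABLISHED,
    (State.SYN_RCVD, Event.RECV_ACK): State.ESTABLISHED,
}


def next_state(state: State, event: Event) -> State:
    """Return the state reached from `state` on `event`, or raise ValueError."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"illegal transition: {event.name} in {state.name}"
        ) from None


# Client side of the three-way handshake.
client = next_state(State.CLOSED, Event.ACTIVE_OPEN)
client = next_state(client, Event.RECV_SYN_ACK)
```

In a language with algebraic data types the table becomes a pattern match, and the compiler can warn when a case is missing, which is the part this sketch cannot reproduce.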

~~~
int08h
Only looked at link [2], but wow. That state machine is brilliant.

Don't know OCaml but am learning Rust and I see what you mean about the
universality of how types make this possible.

Thanks for pointing this out.

------
elorm
Check out 500 Lines or less
[https://github.com/aosabook/500lines](https://github.com/aosabook/500lines)

I just can't recommend it enough. All the projects are open source, so you can
review the source code and still be walked through the code by the book. You'd
learn why the programmers made certain trade-offs and how the applications
became better off for it.

~~~
andai
500 lines or _fewer_ ;)

~~~
waqf
500 lines or fewer lines; but 500 lines or less code.

~~~
songshu
Yes! I often find myself making this point -- i.e. that the supposed problem
indicates that the reader has made an incorrect assumption about an imaginary
elision. Cf. "5 items or less [stuff]", "$5 or less [money]".

------
petercooper
This doesn't entirely match the criteria in your question but I think dipping
into _the standard library_ for your language (if it has one) is a good idea.
It's not only useful to see what you're using but it can also show you how
language experts use the language (which can be useful stylistically) and be
an archaeological exercise in the history of the language and its priorities
(particularly true with Ruby's stdlib, I found).

~~~
real-hacker
Agree. I sometimes read Python standard lib for fun.

------
fenomas
I'm not a C person, but I've often heard that the Quake source code is good,
practical-not-necessarily-elegant code that's worth emulating.

[https://github.com/id-Software/Quake](https://github.com/id-Software/Quake)

~~~
H4CK3RM4N
Wasn't Quake the one where they did a really good approximation of the inverse
square root because they needed it?

~~~
dmurray
That was Quake III
[https://stackoverflow.com/questions/1349542/john-carmacks-un...](https://stackoverflow.com/questions/1349542/john-carmacks-unusual-fast-inverse-square-root-quake-iii)

------
jonsen
IEEE CODE OF CONDUCT [PDF]:

[http://www.ieee.org/about/ieee_code_of_conduct.pdf](http://www.ieee.org/about/ieee_code_of_conduct.pdf)

ACM Code of Ethics and Professional Conduct:

[https://www.acm.org/about-acm/acm-code-of-ethics-and-profess...](https://www.acm.org/about-acm/acm-code-of-ethics-and-professional-conduct)

~~~
nailer
> We will not engage in (list of bad things) via cybertechnology or otherwise.

cybertechnology?

~~~
andai
COMPUTERMAGIC

------
mtreis86
I am learning lisp and was recommended to read "Paradigms of AI Programming"
as the examples are all given in lisp.

The later book is "Artificial Intelligence A Modern Approach" which is written
with pseudocode examples. The site includes other languages than lisp; python,
java, js, scala, and c#.

PAIP lisp code:
[http://www.norvig.com/paip/README.html](http://www.norvig.com/paip/README.html)

AIAMA code: [https://github.com/aimacode](https://github.com/aimacode)

------
shezi
Following Handmade Hero is always a good idea.

[http://handmadehero.org/](http://handmadehero.org/)

~~~
malydok
Wow, this is impressive. The dedication required is immense, over 380 episodes
recorded already.

~~~
Tyr42
Yeah, there's a lot of them. I think the best parts so far (I'm at 300ish) were
the dll hot reloading, the software pixel renderer with SSE instructions, and
some of the stuff around how the debug machinery works.

------
golergka
Fabien Sanglard wrote great reviews of various codebases, including git, Doom
3 and others.

[http://fabiensanglard.net/](http://fabiensanglard.net/)

------
Walkman
The Architecture of Open Source applications:
[http://aosabook.org/en/index.html](http://aosabook.org/en/index.html)

These are very detailed, very well written articles from well-acknowledged
developers about their OSS project.

------
jacquesm
Hashlife and anything else implemented by Norvig, he's one of the best
programmers that I've had the pleasure of reading code from.

Plenty of tricks to be learned there, as well as fantastic structure.

~~~
abecedarius
Agreed, his code shows craft at every level.

Is a hashlife by him on the web? The first page of google results only turned
up an older comment by you.

~~~
jacquesm
It used to be, let me see if I can dig it up.

Grr, nope :( That sucks.

Ok, so try this instead:

[http://norvig.com/sudoku.html](http://norvig.com/sudoku.html)

Not quite as satisfying but it gives you the general idea, his whole approach
is elegant and super direct.

~~~
abecedarius
Oh well, thanks! I'll read the Sudoku solver after I get around to writing my
own -- it helps to keep you from just nodding along.

~~~
jacquesm
I sent an email to see if I'm either mis-remembering or it got taken down.

Ok, received answer from Peter Norvig, I must have mis-remembered who wrote
what I read, but in the interest of completeness here is a life implementation
by Peter Norvig (but not hashlife):

[https://github.com/norvig/pytudes/blob/master/Life.ipynb](https://github.com/norvig/pytudes/blob/master/Life.ipynb)

~~~
abecedarius
Nice. I saw one working like that in Clojure and rewrote it in my own Lisp:
[https://github.com/darius/squeam/blob/master/eg/bag-life.scm](https://github.com/darius/squeam/blob/master/eg/bag-life.scm)
(the bag is like Python's Counter).
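
For anyone who hasn't seen the bag-based approach, a minimal Python sketch of the general idea follows. It is neither Norvig's notebook code nor the linked Scheme file; the names `neighbors`, `step` and `blinker` are just illustrative.

```python
from collections import Counter


def neighbors(cell):
    """Return the eight cells surrounding (x, y)."""
    x, y = cell
    return [(x + dx, y + dy)
            for dx in (-1, 0, 1)
            for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0)]


def step(live):
    """Advance a set of live cells by one generation."""
    # Every live cell adds one to the tally of each of its neighbours, so
    # the bag ends up holding the live-neighbour count of every cell that
    # has at least one live neighbour.
    counts = Counter(n for cell in live for n in neighbors(cell))
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}


# A horizontal blinker flips to vertical and back again.
blinker = {(0, 1), (1, 1), (2, 1)}
```

The board is just a set of coordinates, so it is unbounded and empty space costs nothing.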

------
simon_acca
All of the Norvig Jupyter notebooks:
[http://norvig.com/ipython/README.html](http://norvig.com/ipython/README.html)

------
nxc18
Really, no one has said Linux yet? This is very surprising, given how often it
is put on a pedestal for its excellent design...

Realistically this is because it is in fact a big messy pile of 'at least it
works'.

Still, it is worth studying as an example of a work that was architected to
support open source contribution from thousands of developers.

~~~
qb45
So which parts did you study and actually understand?

I have contributed to device drivers a few times, for example. And I wouldn't
really recommend this part of Linux for learning. Maybe they work but code
readability is often neglected and it's not unusual to see whole functions
without a single line of comment or files/modules without even a short
explanation of what they are trying to achieve.

Maybe the core is better.

------
franzwong
You should read the code of an open source project that you often use. You
might not have the "ah ha" moment when reading something unfamiliar.

------
mpfundstein
ffmpeg source code :-) it's beautiful C code and you will learn how to build a
maintainable, modular system with just the tools that C gives you.

~~~
ktta
Doesn't ffmpeg have a reputation for bad code merge practices? I've heard that
libav[1], a fork of ffmpeg, has a much better focus on code quality. Note that
ffmpeg frequently merges any new code from libav, so ffmpeg is almost a
superset of both.

For anyone wondering why this is, here is a good explanation.[2]

[1]:[https://www.libav.org/](https://www.libav.org/)
[2]:[http://blog.pkh.me/p/13-the-ffmpeg-libav-situation.html](http://blog.pkh.me/p/13-the-ffmpeg-libav-situation.html)

------
db48x
A metacircular interpreter for Scheme. You can even watch a lecture where this
is presented ([https://ocw.mit.edu/courses/electrical-engineering-and-
compu...](https://ocw.mit.edu/courses/electrical-engineering-and-computer-
science/6-001-structure-and-interpretation-of-computer-programs-
spring-2005/video-lectures/7a-metacircular-evaluator-part-1;) see also part
2).

~~~
_Microft
Thanks! By the way, there's a trailing semi-colon on the link that needs to be
removed for it to work.

~~~
majewsky
Fixed link:
[https://ocw.mit.edu/courses/electrical-engineering-and-compu...](https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-001-structure-and-interpretation-of-computer-programs-spring-2005/video-lectures/7a-metacircular-evaluator-part-1)

------
Animats
Fang.[1]

Fang is a utility program for UNIVAC 1108 computers, written in 1972. UNIVAC's
EXEC 8 had threads and async I/O for user programs, decades before UNIX. The
machines were shared-memory multiprocessors. FANG uses those capabilities to
parallelize copying jobs. The UNIVAC mainframes had plenty of I/O parallelism
and many I/O devices, so this was a significant performance win.

See especially "schprocs". Those are the classic primitives from Dijkstra: P,
V, and bounded buffers. That technology predates Go by 40 years. Here's
Dijkstra's P function:

    
    
        .
        .
        .         DIJKSTRA P FUNCTION
        .
        .
        .         LA,U      A0,<QUEUE>
        .         LMJ       X11,P
        .         <RETURN>                      X5 DESTROYED
        .
        P*        TS        QHEAD,A0            LOCK THE QUEUE
                  LX        X5,QN,A0            LOAD QUEUE COUNT (note: load)
                  ANX,U     X5,1                BACK UP THE COUNT (note: Add Negative, i.e. subtract)
                  SX        X5,QN,A0            REPLACE THE COUNT IN THE QUEUE (note: store)
                  TN        X5                  DO WE NEED TO DEACTIVATE HIM ? (note: Test Negative)
                  J         PDONE               NO.  SKIP DEACTIVATION (note: Jump, i.e. branch)
                  ON        TSQ=0               (note: this is an assembly-time ifdef)
                  LX        X5,QHL,A0           LOAD BACK LINK OF QUEUE
                  SX        X5,QHL,X4           PUT INTO BACK LINK OF ACTIVITY
                  SX        X4,QFL,X5           CHAIN ACTIVITY TO LAST ACTIVITY
                  SA        A0,QFL,X4           CHAIN HEAD TO NEW ACTIVITY
                  SX        X4,QHL,A0           MAKE THE NEW ACTIVITY LAST ON QUEUE
                  CTS       QHEAD,A0            RELEASE PROTECTION ON QUEUE HEAD
        SCHDACT*  DACT$     .                   DEACTIVATE PROCESS (note: system call)
                  OFF
                  ON        TSQ                 (note: for later version of OS with alt wait fn)
                  C$TSQ     QHEAD,A0            WAIT FOR C$TSA (note: system call)
                  OFF
                  J         0,X11               RETURN AFTER ACTIVATION
        .
        PDONE     CTS       QHEAD,A0            UNLOCK THE QUEUE (note: not a system call, just a macro. Stores 0.)
                  J         0,X11               RETURN
    

(Notes:

Instruction format is

    
    
       OPERATOR  REG,OFFSET,INDEXREG
    

The "TS" instruction is "Test and Set". That's atomic. If the flag is already
set, an interrupt occurs and the OS does a thread switch. CTS just clears the
flag. Later versions of the OS support C$TSQ and C$TSA, where the OS queues
waiting test and set operations.

X4 is the "switch list", the local data for the thread.)

[1]
[https://www.fourmilab.ch/documents/univac/fang/](https://www.fourmilab.ch/documents/univac/fang/)
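
For readers who don't read 1108 assembly, here is a rough Python analogue of the P operation above together with its counterpart V. This is a sketch, not a translation: the class and method names are invented, and where FANG chains the waiting activity onto the queue and deactivates it, this version leans on a condition variable to hold the waiters.

```python
import threading


class CountingSemaphore:
    """Dijkstra-style semaphore; a negative count means threads are waiting."""

    def __init__(self, count=0):
        self.count = count
        self._wakeups = 0                    # wakeups granted by v(), not yet used
        self._cond = threading.Condition()   # stands in for the queue lock

    def p(self):
        with self._cond:                     # lock the queue
            self.count -= 1                  # back up the count
            if self.count < 0:               # do we need to deactivate the caller?
                while self._wakeups == 0:
                    self._cond.wait()        # releases the lock while blocked
                self._wakeups -= 1

    def v(self):
        with self._cond:
            self.count += 1
            if self.count <= 0:              # at least one thread is waiting
                self._wakeups += 1
                self._cond.notify()


# One thread blocks in p() until another calls v().
sem = CountingSemaphore(0)
log = []
worker = threading.Thread(target=lambda: (sem.p(), log.append("woke")))
worker.start()
sem.v()
worker.join(timeout=5)
```

Unlike the explicit queue in the original, this sketch makes no promise about the order in which waiters are released.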

------
pjc50
I don't think there's an "every", because not every programmer can read every
language and there isn't even a common union of languages that you can
universally expect.

In that regard Donald Knuth's work in the fictional assembler MIX for TAOCP is
worth reading - it's at one remove from any real system. Knuth's TeX source is
also quite unique.

------
sigjuice
As a Linux user, I find it very useful to set up the source and debug symbol
repositories for the distribution that I am running. This way I have a large
body of code easily accessible which matches what I am actually running. I
usually grep for error/log messages, or examine core dumps or hung/misbehaving
programs using gdb.

------
emodendroket
I don't think there is such a thing. Learning about common algorithms is a
good idea but programs aren't novels. People pay a lot of lip service to
reading code but practically nobody does it.

[http://www.gigamonkeys.com/code-reading/](http://www.gigamonkeys.com/code-reading/)

~~~
travmatt
I disagree in part, I read lots of code.

Typically if I don't know how to solve a problem / use a library, I'll go onto
github and search out projects either by their dependencies or structures I
know must be present in the code I want. Then I'll pick out 20-30 projects and
compare and contrast. Funnily enough, I read most all of the source code to
Etsy's 'Artsy' iPhone app because I wanted to get a sense of how a
professional shop structured their iPhone code.

As for understanding the code, the author is very correct in that it's
tough to implicitly understand what's happening in the code. But to that point
I use Reveal to peek inside iPhone apps, or node-nightly + chrome devtools to
walk through js code. Since my goal isn't to understand the whole of a
program, rather parts I'm interested in, it's worked out quite well.

I've always wanted to read code for very secure applications (like Signal or
SecureDrop), but I fear I don't understand crypto/infosec enough to understand
the context the code was written in.

~~~
emodendroket
I don't know, perhaps my perspective is warped by starting in Windows
development, but I pretty much never do anything like that to evaluate a
library.

------
Clubber
A half joking response (only half), I would say your own code you wrote a year
ago.

~~~
ChicagoBoy11
Every time I've done that I can never quite figure out if the guy who wrote it
is a tremendous genius or a complete fool.

------
hendry
plan9 sources or
[http://git.suckless.org/sbase/](http://git.suckless.org/sbase/)

~~~
pyroinferno
Why you're at it:
[https://bitbucket.org/inferno-os/inferno-os/](https://bitbucket.org/inferno-os/inferno-os/)

~~~
happy-go-lucky
> Why you're at it:

Not to sound pedantic, but did you mean "While you're at it:"?

------
kowdermeister
Read something that matters to you, some library or snippet that you use a lot
or depend on. Context matters a lot in getting motivated to comprehend
things. Otherwise you will just browse some random code with little to no
connection and lack of understanding how the authors got there.

One source code I look into from time to time is Three.js:
[https://github.com/mrdoob/three.js/](https://github.com/mrdoob/three.js/) to
discover more details over the documentation.

------
zachwill
One of the pieces of code I read through that helped me the most: Beautiful
Soup. If you're a Python developer, I wholeheartedly recommend you read
through it -- although I do think Leonard has removed some of his more
disdainful comments from BS2. When I originally read it, I loved how you could
really feel how much he hated all the XML/HTML parsing gotchas. I think it's
the only time I've laughed out loud reading through code because of humorous
comments and TODO notes to self.

------
donatj
Golang itself. It's soooo readable and contains so much wisdom.

------
EliRivers
Fast inverse square root as seen in Quake (as mentioned in other comments)

[https://en.wikipedia.org/wiki/Fast_inverse_square_root#Overv...](https://en.wikipedia.org/wiki/Fast_inverse_square_root#Overview_of_the_code)
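
A hedged Python transliteration of the trick, for anyone who wants to poke at it without a C compiler. The magic constant and the single Newton step follow the widely published Quake III routine; the function name and the use of `struct` to reinterpret bits are this sketch's own choices, and it assumes a positive input.

```python
import struct


def fast_inverse_sqrt(x: float) -> float:
    """Approximate 1 / sqrt(x) for positive x using the bit-level trick."""
    # Reinterpret the bits of the 32-bit float as an unsigned integer.
    i = struct.unpack("<I", struct.pack("<f", x))[0]
    # Magic constant minus half the bit pattern gives a first guess.
    i = 0x5F3759DF - (i >> 1)
    # Reinterpret the integer bits as a 32-bit float again.
    y = struct.unpack("<f", struct.pack("<I", i))[0]
    # One Newton-Raphson iteration refines the guess.
    return y * (1.5 - 0.5 * x * y * y)


approx = fast_inverse_sqrt(4.0)   # close to 0.5
```

Python does the final arithmetic in double precision, so results will not match the C version bit for bit, but the approximation stays well within 1% of the true value.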

------
EternalData
Thanks for bringing this thread of thought to light! Sometimes I look through
Stack Overflow not only for solutions to particular problems, but particularly
elegant solutions in a proactive manner. I often just browse through profiles
of people with tons of karma.

------
kybernetikos
The book Beautiful Code has some examples along these lines with interesting
commentary too.

------
QuotesDante
The D3 source code. Since this is made of around 30 modules, one might start
with
d3-quadtree: [https://github.com/d3/d3-quadtree](https://github.com/d3/d3-quadtree).

------
evanwolf
I love this. Writers ask the same question. Teaching literature and
composition starts with reading the works of greats who came before, learning
what made them great, and incorporating their skills into your own tool box.

------
mmphosis
[http://www.linuxfromscratch.org/lfs/view/stable/](http://www.linuxfromscratch.org/lfs/view/stable/)

------
steverit
[http://norvig.com/spell-correct.html](http://norvig.com/spell-correct.html)

------
taw55
Lugaru Gametick. A successful indie game written by a high school student.

[https://hg.icculus.org/icculus/lugaru/file/97b303e79826/Sour...](https://hg.icculus.org/icculus/lugaru/file/97b303e79826/Source/GameTick.cpp)

~~~
cercatrova
Have you seen Overgrowth? It's a spiritual successor by the same guys. They
have very interesting development videos on YouTube.

[https://www.youtube.com/user/WolfireGames/videos](https://www.youtube.com/user/WolfireGames/videos)

------
kruhft
Read the source code to Emacs. It has everything a growing programmer needs:
portability, compilers and interpreters, user interface(s), language design,
build systems...

[http://www.gnu.org/software/emacs](http://www.gnu.org/software/emacs)

------
kenoyer130
[https://www.amazon.com/Framework-Standard-Annotated-Referenc...](https://www.amazon.com/Framework-Standard-Annotated-Reference-paperback/dp/0768682088)

Even if you hate C#/Java, it's an amazing book on how to write solid, clean code.

------
oddthink
Whenever I look at the Tcl source code, I'm impressed by how clean it is:
[https://github.com/tcltk/tcl](https://github.com/tcltk/tcl)

------
notburnt
I remember having a few aha moments when reading through the Redis code.

~~~
rents
Which version did you read? I would be interested in giving it a look through,
but I expect the newer version must have grown in complexity.

~~~
notburnt
Don't exactly remember but I think it was right before/around 2.X.

------
likelynew
smallpt: Global Illumination in 99 lines of C++:
[http://www.kevinbeason.com/smallpt/](http://www.kevinbeason.com/smallpt/)

------
rijoja
sox is kinda nice. Read a lot of it ages ago but I would like to revisit it
now with my marginally improved math skills.

------
navyad
Find popular GitHub repositories for your tech and explore them for fun!!

------
pyroinferno
I am surprised no one said Linus Torvalds's double pointer problem.

