Ask HN: What open source project, in your opinion, has the highest code quality? - chefqual
======
akavel
I hold the source code of Go standard library & base distribution (i.e.
compiler, etc.) in very high regard. Especially the standard library is, in my
opinion, _stunningly_ easy to read, explore and understand, while at the same
time being well thought through, easy to use (great and astonishingly well
documented APIs), of very good performance, and with huge amounts of (also
well readable!) tests. The compiler (including the runtime library) is
noticeably harder to read and understand (especially because of sparse
comments and somewhat idiosyncratic naming conventions; that's partly
explained by it being constantly in flux). But still _doable_ for a human
being, and I guess probably significantly easier than in most modern
compilers. (Though I'd love to be proven wrong on this account!)

At the same time, the apparent simplicity should not be mistaken for lack of
effort; on the contrary, I feel every line oozes with purpose, practicality,
and to-the-point-ness, like a well sharpened knife, or a great piece of art
where it's not about that you cannot add more, but that you cannot _remove_
more.

~~~
eptcyka
I'd agree, but only as far as aesthetics go. When you have to understand the
time complexity and runtime characteristics of the standard library sorting
algorithms, I think Go does a very bad job - the standard `sort.Sort(data
sort.Interface)` will run poorly if the data is already mostly sorted. I
expect these kinds of things to be documented properly.

~~~
jehlakj
I was pretty certain most libraries shuffle then quicksort. No need for
documentation. Does go not do this?
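
A minimal sketch of the shuffle-then-quicksort idea, in Python. It only illustrates why a random shuffle makes the running time independent of the input's initial order; it says nothing about what Go's `sort` package actually does. The names `shuffled_quicksort`, `_quicksort` and `_partition` are made up for this example.

```python
import random


def shuffled_quicksort(items, rng=None):
    """Return a sorted copy of items: shuffle first, then quicksort."""
    rng = rng or random.Random()
    data = list(items)
    rng.shuffle(data)  # destroys any pre-existing order in the input
    _quicksort(data, 0, len(data) - 1)
    return data


def _quicksort(data, lo, hi):
    while lo < hi:
        p = _partition(data, lo, hi)
        # Recurse into the smaller side and loop on the larger one,
        # which keeps the recursion depth logarithmic.
        if p - lo < hi - p:
            _quicksort(data, lo, p - 1)
            lo = p + 1
        else:
            _quicksort(data, p + 1, hi)
            hi = p - 1


def _partition(data, lo, hi):
    # Lomuto partition using the last element as the pivot.
    pivot = data[hi]
    i = lo
    for j in range(lo, hi):
        if data[j] < pivot:
            data[i], data[j] = data[j], data[i]
            i += 1
    data[i], data[hi] = data[hi], data[i]
    return i
```

Without the shuffle, this last-element pivot would degrade to quadratic time on already sorted input; with it, that worst case depends on the random choices rather than on the data.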

~~~
xapata
Ever heard of Timsort?

~~~
dgacmu
People were still finding bugs in common implementations of timsort as of 3
years ago. It's not unreasonable to stick with a somewhat more conservative
choice for a core library function until there's more reason to have
confidence in the implementations of timsort.

~~~
arayh
This comment got me looking into timsort bugs. This was a really interesting
read: [http://envisage-project.eu/proving-android-java-and-
python-s...](http://envisage-project.eu/proving-android-java-and-python-
sorting-algorithm-is-broken-and-how-to-fix-it/)

------
tzury
SQLite.

and for this reason alone!

[https://www.sqlite.org/testing.html](https://www.sqlite.org/testing.html)

    
    
        As of version 3.23.0 (2018-04-02), the SQLite library consists of approximately 
        128.9 KSLOC of C code. (KSLOC means thousands of "Source Lines Of Code" or, in 
        other words, lines of code excluding blank lines and comments.) 
    
        By comparison, the project has 711 times as much test code and test scripts - 
        91772.0 KSLOC.
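
A quick check of the arithmetic in the quoted figures, as a Python sketch. The two numbers are the ones quoted above; the variable names are mine.

```python
# Figures quoted above from the SQLite testing page (version 3.23.0).
library_ksloc = 128.9
test_ksloc = 91772.0

# How many times larger the test code is than the library itself.
ratio = test_ksloc / library_ksloc
print(f"test code is about {ratio:.2f} times the size of the library")
```

The quotient comes out just under 712, so the "711 times" in the quote is that figure rounded down.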

~~~
danmaz74
Automated testing is useful and good. But I really feel it's reached a level
of fetishisation that is quite concerning.

Testing code is code which needs to be written, read, maintained, refactored.
Very often nowadays I have to wade through tests which test nothing useful,
except syntax. Even worse, with developers who adopt the mock-everything
approach, I often find tests which only verify that the _implementation_ is
exactly the one they wrote, which is even worse: it makes refactoring a pain,
because, even if you rewrote a method in a better way which produces exactly
the results you wanted, the test will fail.

So, the ratio of testing code vs implementation code is a completely wrong
proxy for code quality.

EDIT: I'm not criticising SQLite and their code quality - which I never
studied - but the idea that you can judge code quality for a project just by
the ratio of test code vs implementation code.

~~~
faitswulff
They actually have to test to that degree to follow aviation standards
(DO-178b [0]) because they're used in aviation equipment.

Dr. Hipp said he started really following it when Android came out and
included SQLite and suddenly there were 200M mobile SQLite users finding edge
cases:
[https://youtu.be/Jib2AmRb_rk?t=3413](https://youtu.be/Jib2AmRb_rk?t=3413)

Lightly edited transcript here:

> It made a huge difference. That that was when Android was just kicking off.
> In fact Android might not have been publicly announced, but we had been
> called in to help with getting Android going with SQLite. [Actually], they
> had been publicly announced and there were a bunch of Android phones out and
> we were getting flooded with problems coming in from Android.

> I mean it worked great in the lab it worked great in all the testing and
> then [...] you give it to 200 million people and let them start clicking on
> their phone all day and suddenly bugs come up. And this is a big problem for
> us.

> So I started doing following this DO-178b process and it took a good solid
> year to get us there. Good solid year of 12 hour days, six days a week, I
> mean we really really pushed but we got it there. And you know, once we got
> SQLite to the point where it was at that DO-178b level, standard, we still
> get bugs but you know they're very manageable. They're infrequent and they
> don't affect nearly as many people.

> So it's been a huge huge thing. If you're writing an application deal ones,
> you know a website, a DO-178b/a is way overkill, okay? It's just because
> it's very expensive and very time-consuming, but if you're running an
> infrastructure thing like SQL, it's the only way to do it.

[0]: [https://youtu.be/Jib2AmRb_rk?t=677](https://youtu.be/Jib2AmRb_rk?t=677)
"SQLite: The Database at the Edge of the Network with Dr. Richard Hipp"

~~~
Nokinside
SQLite is very high quality software, but they use a DO-178B "inspired" testing
process. As far as I know they don't have a version of the software that is or
can be used in safety-critical parts, despite their boasting.

They say on their site that:

> Airbus confirms that SQLite is being used in the flight software for the
> A350 XWB family of aircraft.

Flight software does not imply safety critical parts of avionics. It can be
the entertainment system or some logging that is not critical.

~~~
SQLite
Correct. The key word is "inspired". Multiple companies have run a DO-178B
cert on SQLite, I am told, but the core developers did not get to participate,
and I think the result was level-C or -D.

While all that was happening 10+ years ago, I learned about DO-178B. I have a
copy of the DO-178B spec within arms reach. And I found that, unlike most
other "quality" standards I have encountered, DO-178B is actually useful for
improving quality.

I originally developed the TH3 test suite for SQLite with the idea that I
could sell it to companies interested in using SQLite in safety-critical
applications, and thereby help pay for the open-source side of SQLite. That
plan didn't work out as nobody ever bought it. But TH3 and the discipline of
100% MC/DC testing was and continues to be enormously helpful in keeping bugs
out of SQLite, and so TH3 and all the other DO-178B-inspired testing and
refactoring of SQLite has turned out to be well worth the thousands of hours
of effort invested.
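
For readers unfamiliar with the term: MC/DC (modified condition/decision coverage) asks, roughly, that every condition in a decision be shown to independently change the decision's outcome. A minimal Python sketch of the unique-cause form of that check follows. `decision`, `mcdc_covered` and the test vectors are hypothetical illustrations and have nothing to do with how TH3 is actually built.

```python
from itertools import combinations


def decision(a, b, c):
    # Hypothetical decision under test; not taken from SQLite.
    return a and (b or c)


def mcdc_covered(func, n_conditions, vectors):
    """Return indexes of conditions shown to independently affect func."""
    covered = set()
    for v1, v2 in combinations(vectors, 2):
        diff = [i for i in range(n_conditions) if v1[i] != v2[i]]
        # A pair demonstrates independence when exactly one condition
        # differs between the two vectors and the outcome flips.
        if len(diff) == 1 and func(*v1) != func(*v2):
            covered.add(diff[0])
    return covered


tests = [
    (True, True, False),
    (False, True, False),
    (True, False, False),
    (True, False, True),
]
covered = mcdc_covered(decision, 3, tests)
print(sorted(covered))
```

Here four test vectors are enough to cover all three conditions, whereas exhaustive testing of the same decision would need eight.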

The SQLite project is not 100% DO-178B compliant. We have gotten slack on some
of the more mundane paperwork aspects. Also, we aggressively optimize the
SQLite code base for performance, whereas in a real safety-critical
application the focus would be on extreme simplicity at the cost of reduced
performance.

However, if some company does call us tomorrow and says that they want to
purchase a complete set of DO-178B/C Level-A certification artifacts from us,
I think we could deliver that with a few months of focused effort.

~~~
moron4hire
I just bought a copy of DO-178C after reading these posts here and the
Wikipedia article on it. $290, but if it's good, it should be worth it, right?

~~~
SQLite
I haven't seen -C, only DO-178B, though I'm told there isn't much difference.
It is not a page-turner. It took me about a year to really understand it.

------
moviuro
OpenBSD and Co. (OpenSSH, etc.)

* [https://cvsweb.openbsd.org/src/](https://cvsweb.openbsd.org/src/)

* [https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/usr.bin/ssh/](https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/usr.bin/ssh/)

* [https://www.libressl.org/](https://www.libressl.org/)

* [https://cvsweb.openbsd.org/src/usr.sbin/ntpd/](https://cvsweb.openbsd.org/src/usr.sbin/ntpd/)

~~~
rurban
For sure not. OpenBSD makes no attempt at proper performance, which is
critical for a kernel. There are so many naive ad-hoc data structures and
algos, it's a shame to walk through.

~~~
codemusings
How is performance related to code quality? That makes no sense. If anything,
if you had to inline ASM for example, the code would suffer from readability.

~~~
iainmerrick
Shouldn’t good performance be one of the goals of good code?

~~~
moviuro
Depends on the design goals. If you want secure code, you'll make it readable.
Here's true(1):

    
    
      :
    

A single "noop" in a 755 file.

A C true would be:
[https://cvsweb.openbsd.org/src/usr.bin/true/true.c?rev=1.1&c...](https://cvsweb.openbsd.org/src/usr.bin/true/true.c?rev=1.1&content-
type=text/x-cvsweb-markup)

Here's a much faster true(1) if you need it:
[https://github.com/coreutils/coreutils/blob/master/src/true....](https://github.com/coreutils/coreutils/blob/master/src/true.c)

~~~
josefx
How is the coreutils true faster?

I would expect the openbsd true to be the fastest, it doesn't need to spawn a
subshell and it doesn't do more than the posix specification requires (afaik
--help/--version should be ignored).

~~~
moviuro
Experience shows it's faster. It's just weird, but it's like that.

    
    
      time { for i in $(seq 1 10000); do /path/to/true; done; }

~~~
iainmerrick
What are you comparing against what there? Two C executables with the same
compiler flags on the same OS?

------
kristopolous
NetBSD.

Why? I was able to do substantial changes to the kernel when I was a teenager
(late 90s), mostly on my first try. There was no giant wall of abstraction I
had to climb over or some huge swath of mutually interacting code I had to
comprehend. There was also nothing that required fancy code navigation and the
creation of something like the ctags database in order to find out what on
earth was happening.

No action at a distance or lasagna style dereferencing or mysterious type
names that are just typedef'd and #define'd around dozens of times back to
something basic like char. No fancy obscure GNU preprocessor extensions or
exotic programming patterns.

Nothing had obtuse documentation that tried my patience or required much more
than enthusiasm and basic C knowledge.

I did things like got a wireless card working from code written for one with a
similar chipset and got various other things like the IrDA transmitter on my
laptop at the time to do a slattach and thus work as a primitive wireless
network - all in the late 90s.

I likely had no idea what, say, the difference between network byte order and
host byte order was at the time or how the 802.11b protocol worked or what a
radiotap header was or any of that. The separation of concerns was so good
however, that none of that knowledge was actually needed.

Compare that to say, the Qualcomm compatible WWAN I just dealt with over the
past few weeks where I needed to have in-depth knowledge of an exhaustive
number of things (very specific chipset and network details) to get a basic
ipv4 address working. Then I needed to read up on GNSS technology and NMEA
data to debug codes over USBmon to get the GPS from the wwan working. Then
after I had the qmi kernel modules doing what I wanted and the qmi userland
toolsets, I had to write some python scripts to talk to dbus to get the data
from the modemmanager that I needed in order to log the GPS. All the
maintainers of these pieces were very nice and helpful and I have nothing
negative to say. This is just how it usually is these days.

Back then however, I wasn't a good programmer, I was likely pretty terrible in
fact but with the NetBSD codebase I was able to knockout whatever I wanted
every time, fast, on a 486.

I miss those days.

~~~
akavel
What's your relation with it nowadays? I'm very curious about NetBSD but never
tried it yet. I sincerely wonder what's your opinion on it now, and why you
speak about the situation only as "those days" now? :)

~~~
kristopolous
I have no idea, haven't kept up with it. I'd recommend 1.x (<=4) any day
though, simply for the education alone.

I don't really use it these days because I need systems that future cheap devs
can maintain and once you enter userland it takes commitment and time I simply
don't have to stay with netbsd.

Debian permits me to usually not have to care and that's pretty invaluable

------
hyperman1
It's older than some HN posters, but the GPLed DOOM source code was one I
liked.

The performance reached by the game was considered impossible until Carmack
did show us otherwise. So I expected lots of ASM and weird hacks, especially
as compiler optimization wasn't as good as it is today.

Surprise, surprise, the thing was easy to read, easy to get going, easy to
port, reasonably documented. It has shown me what a good balance between
nice code and usable code is.

If you want to browse:
[https://github.com/id-Software/DOOM](https://github.com/id-Software/DOOM)

~~~
NoSirRah
FYI, "Surprise, surprise" is generally meant sarcastically. I think you mean
"Surprisingly".

[https://www.merriam-
webster.com/dictionary/surprise,%20surpr...](https://www.merriam-
webster.com/dictionary/surprise,%20surprise)

~~~
hyperman1

      I can speak English!  I learn it from a book!
    
    

Though not always very good. Thanks for the bugfix.

------
bsandert
This is not necessarily about the code, but I've been really impressed for a
while by the lodash project and its maintainer's dedication to constantly keep
the number of open issues at 0. Any issues get dealt with at record speed,
it's quite a sight to see.

[https://github.com/lodash/lodash/issues](https://github.com/lodash/lodash/issues)

~~~
rootlocus
Not necessarily relevant, but 15% of the issues are labeled "wontfix".

~~~
adimitrov
With such a big project, being quick to hand out wontfix isn't necessarily a
bad thing. To be honest, seeing as this project is used by a huge part of the…
rather diverse JS crowd, 15% wontfix is _astoundingly low_.

------
curtisz
Strictly talking about code quality, I will nominate RCP100, which is a small,
virtually unknown, now-abandoned routing software written in C [0]. I started
programming with C way back in the 90s, and this is one of only _two_ projects
I can recall being immediately struck by the beauty of the code (Redis being
the other). I know almost nothing about the author but he seems not to want to
be known by name. You can browse the source on Github [1], which I uploaded
myself, since you can only get a tarball from sourceforge. Anyway, as someone
else mentions, C is usually a mess, but RCP100 struck me as beautiful.

[0] [http://rcp100.sourceforge.net/](http://rcp100.sourceforge.net/)

[1]
[https://github.com/curtiszimmerman/rcp100](https://github.com/curtiszimmerman/rcp100)

~~~
hansoolo
Maybe you just send the guys an email ;)

~~~
curtisz
I actually did send fan mail to the author, heh, thinly-disguised as a
courtesy to let them know that I mirrored their project on Github.

------
zubairlk
I'm surprised no one has mentioned the Linux Kernel!

[https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...](https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/)

It is quite clean when you consider the task that it accomplishes.

Being able to compile across multiple architectures/endianness, 32/64-bit,
scale up/down from server/desktop/router/phone, while accepting
contributions from thousands of people...

~~~
aortega
It's not clean at all. Thousands of different styles, no single convention on
function-naming, etc.

Want a clean kernel, go look at the BSDs.

------
brentjanderson
Going to throw Elixir Lang into the mix.

- The tooling is excellent.

- The code is well-documented and readable.

- The core team committed to never needing to introduce breaking changes.

The Elixir community tends to produce work that is actually considered "Done".
An elixir package is not stale when it hasn't seen a commit in a few months.
Instead, the feeling is: "It's feature complete and only needs maintenance
from here on out."

[https://github.com/elixir-lang/elixir](https://github.com/elixir-lang/elixir)

~~~
flaque
> The core team committed to never needing to introduce breaking changes.

Is this why Elixir seems to have many different ways of doing the same thing
though?

~~~
JamesUtah07
I think that's one reason. The other is that classic erlang (Elixir is built
on top of the erlang beam vm) sometimes does things one way but elixir has a
more elegant way of doing the same thing, however, in elixir you can still
call into erlang libraries to achieve the same thing if that's more familiar
to you.

------
binalpatel
Scikit Learn comes to mind - not just because I can dig into the source code
and immediately know what's happening, but also for the stellar documentation
that goes above and beyond telling you what the functions do.

For example their Cross Validation documentation is amazing:

[http://scikit-
learn.org/stable/modules/cross_validation.html...](http://scikit-
learn.org/stable/modules/cross_validation.html#cross-validation-iterators)

------
andygrunwald
When we take the language into consideration, I would like to mention Redis.

Often codebases written in C are a mess to understand, a mess to read. The
Redis source code is understandable even without deep knowledge of C.

~~~
christophilus
Yep. I was going to say Redis and SQLite. Both are really well commented. They
almost read like a manual.

~~~
andoma
Although still in beta, I'd like to add BearSSL to the mix of well written and
documented C libraries. In particular compared to the OpenSSL "documentation".
It's also nice to see an TLS implementation without any memory allocations at
all.

------
maaaats
I'd prefer if people said _why_ they consider the code good, instead of
throwing out a bunch of random projects.

------
miguendes
Python: I really like requests, scikit-learn, the Path module from the
standard library, Keras, Django.

C: Redis, SQLite, Lua.

Java: Joda Time, Guava

~~~
chillydawg
Joda Time is one of my all time favourite libraries.

After struggling with JVM stdlib time nonsense, JodaTime was a breath of fresh
air and actually made programming with time fun.

~~~
hellofunk
Java 8 time module is now considered the replacement for JodaTime for new
projects. It is separate from the older Java time libraries, and fixes many of
the problems in Joda. Give it a try!

------
guidovranken
ARM mbed TLS [1], Amazon S2N [2], nginx [3] have a super consistent code style
throughout and are prime examples of how C application programming should be
done (in my opinion).

[1] [https://github.com/ARMmbed/mbedtls](https://github.com/ARMmbed/mbedtls)

[2] [https://github.com/awslabs/s2n](https://github.com/awslabs/s2n)

[3] [https://github.com/nginx/nginx](https://github.com/nginx/nginx)

~~~
ac
+1 for s2n. It's one of the select few C codebases that is actually a pleasure
to read.

------
_the_inflator
There are too many in very different domains and languages.

However, I opt for jQuery here. It is one of the greatest examples of how
constant refactoring and thoughtful usage of design patterns get you a very
long way.

If you are designing JavaScript libraries, pls have a look at jQuery. So many
great design decisions aka great code quality.

~~~
AmericanChopper
Pushing all dom manipulation through global evals seems like the exact
opposite of thoughtful design to me. I have a long list of places where I want
to implement strict CSPs, but can’t purely for minor use of jQuery.

------
jaequery
Sequel, a database ORM for Ruby:
[https://github.com/jeremyevans/sequel](https://github.com/jeremyevans/sequel)

The quality of the code is amazing, it's simple to use and even simpler to
look through the docs to reason about.

I also want to praise the author of the library (Jeremy Evans), his support
through the IRC is second to none, you can talk directly with him pretty much
on a daily basis.

And even after 8+ years, the project is still constantly being updated (last
commit 4 days ago). I haven't seen too many projects of this calibre,
especially when it is run mostly by a single person.

------
ChrisRackauckas
Julia. Julia / Julialang is so pedantically tested and the names are pretty
meticulously chosen. The algorithms in Base are almost all generic and handle
a very wide variety of inputs without catering to them. If you want to learn
Julia, along with good software engineering, looking at the Base library is
quite recommended.

~~~
tlamponi
Did not look too much into it but at least a packager from Alpine Linux does
not think Julia's compiler ecosystem is clean/easy to work with:
[http://lists.alpinelinux.org/alpine-
devel/6248.html](http://lists.alpinelinux.org/alpine-devel/6248.html)

But as said, I did not really check this claim for validity myself...

~~~
ChrisRackauckas
Julia requires patched versions of things like LLVM in order for all tests to
pass because upstreaming bugfixes takes time. This has given some Linux package
managers an issue since they try to build using system LLVM/OpenBLAS/etc. with
the known bugs. I agree this does cause some distribution problems, but as a
scientist and mathematician I do like that the standard distribution of Julia
uses the most numerically correct versions (as of current knowledge) of the
dependencies as it can, and has a test to identify known potential issues. To
me this is good practice.

But anyways, I was talking about the Julia Base library and its numerical
routines. I just look at the Julia code and don't touch the build systems.

------
nazri1
Does assembly count? Prince of Persia's source code (not really open
source...): [https://github.com/jmechner/Prince-of-Persia-Apple-
II](https://github.com/jmechner/Prince-of-Persia-Apple-II)

One look at any of the assembly files and you can get a sense of how properly
organized the source code is.

~~~
bovermyer
Thanks for that! I love looking at historical game code.

------
mpasternacki
I'm a bit surprised nobody mentioned qmail yet:
[https://cr.yp.to/qmail.html](https://cr.yp.to/qmail.html)

~~~
informatimago
I don’t know about qmail, but postfix sources are really nice.

~~~
tptacek
They're better than most C software of the era, but not better than qmail ---
qmail has a better vulnerability record than Postfix does (perhaps because it
does less, but that's beside the point).

------
potta_coffee
Granted I haven't read much open source code but when I was working in Flask,
I found the source code to be awesomely clear and well-documented. I actually
learned quite a bit about Python by reading Flask code. Also, no-one could
explain "g" in a way that made sense, but the source code made it obvious.
Would recommend reading it if you're into Python at all.

~~~
83457
g?

~~~
jxub
The global request object if I recall properly.

------
dredmorbius
I'd like to suggest:

1. Don't simply list projects.

2. Give some notion of _why_ you're nominating code.

3. A sense of what you consider to be quality.

Enough to spark discussion, inquiry, or comparison. Doesn't have to be much.

This is rudimentary, but affords purchase:
[https://news.ycombinator.com/item?id=18037815](https://news.ycombinator.com/item?id=18037815)

This does not:
[https://news.ycombinator.com/item?id=18038047](https://news.ycombinator.com/item?id=18038047)

(Both reference the same project.)

------
philliphaydon
Both Vue and PostgreSQL. Both have great code bases. Amazing documentation. And
amazing communities.

~~~
chiefalchemist
Nice to see you include / mention docs and community. I believe a code-based
product has a UX. That UX is the code (with comments), documentation and
community. That UX is your (i.e., a dev / engineer) end to end experience with
"the product." It's not simply the code.

Put another way, there's more to a product that's easy and sensible to work
with than code quality.

------
graki
I'm surprised nobody cited TeX from Knuth. It's an absolute standard in quality
of implementation, documentation and computer science background. Perhaps
unsurpassed.

------
jacques_chester
I definitely admired PostgreSQL's code when I first looked at it.

Projects written in C require a fair amount of care and discipline to be
scaled up to larger codebases and teams. PostgreSQL is such a codebase.

I've also seen various parts of Spring's codebase and found all of it to be
consistently solid and careful. They take a lot of care to structure carefully
and comment immaculately.

Disclosure: I work for Pivotal, which sponsors Spring. Which is why Spring is
highly visible in my working life.

------
nojvek
Typescript

Even though it’s a fairly complex transpiler, the authors did a good job
modularizing and leaving lots of contextual comments on what each part does.

Also typescript baseline tests are a simple but very effective way to get lots
of coverage on the compiler.

I’ve read source code for Babel, typescript, coffeescript and flow. Typescript
architecture stands out.

Typescript not only does fascinating things like magical code completion
abilities and great tooling for IDEs but their codebase has been an
inspiration for me to build better front end code.

I may be a bit biased since I’ve worked at Microsoft before.

~~~
ioddly
I found the TypeScript type checker pretty hard to read through, though it may
be my lack of, well, almost any knowledge about type theory. I didn't dig much
into the other parts of the codebase however. What parts of it do you enjoy
reading?

~~~
nojvek
While submitting a PR, the parser, lexer and emitter were fairly easy to
understand.

------
daniel-levin
LLVM and associated projects such as clang. Bazel is good too. OkHttp and
Retrofit by Square.

~~~
tom_mellior
I've been working with LLVM for a few years and I still find the code
difficult to navigate and badly documented. And every single function's
argument list is a random jumble of pointers and references (almost all
arguments _should_ be references, but many aren't).

~~~
anarazel
Indeed. And it's not just medium to low-level stuff that's not well
documented, it's the high-level stuff too. I personally don't mind that much
if I have to spend a few minutes to understand something on a very local
scope, but if the bigger picture is unclear, that's quite bad. For LLVM one
largely has to grep for a bunch of other users and try to figure it out from
that.

While I think it has some clear deficiencies, I found a lot of e.g. the
optimization passes in GCC a lot easier to read. It's probably above par, but
e.g. [https://github.com/gcc-mirror/gcc/blob/master/gcc/gimple-
ssa...](https://github.com/gcc-mirror/gcc/blob/master/gcc/gimple-ssa-store-
merging.c) is really well explained imo.

------
helium
The requests codebase is really well written and it has a beautiful api

[https://github.com/requests/requests](https://github.com/requests/requests)

~~~
Drdrdrq
I like Kenneth's comments:
[https://github.com/requests/requests/blob/master/requests/pa...](https://github.com/requests/requests/blob/master/requests/packages.py)

------
kbr2000
Tcl! See
[https://www.tcl.tk/doc/engManual.pdf](https://www.tcl.tk/doc/engManual.pdf)
to start to understand why. (For code written in Tcl itself there's also some
proposed conventions:
[https://www.tcl.tk/doc/styleGuide.pdf](https://www.tcl.tk/doc/styleGuide.pdf))

------
a-saleh
I really liked the clojure core, I read it quite a lot when learning the
language.

I have heard good things about sqlite, and some day, I plan to read it :-)

------
unixhero
Dolphin Emulator

[https://dolphin-emu.org/](https://dolphin-emu.org/)

~~~
delroth
We try to keep up, but the truth is that it's a 15-year-old C++ codebase
implementing some weird hardware in even weirder ways. We're far from where
we'd want to be code quality wise -- close to no automated testing
infrastructure, code is full of module-level globals, inconsistent
conventions, etc.

~~~
swsieber
How would you even test an emulator except manually? It seems like automated
website testing, but even worse. I guess screenshots + scripted input?

That seems like it'd be terrible to try to get running reliably.

------
zengid
The JUCE C++ library is very nice:
[https://github.com/WeAreROLI/JUCE](https://github.com/WeAreROLI/JUCE)

------
garyclarke27
I’m no C expert so I’m somewhat guessing, but to me, PostgreSQL source looks
remarkably clean, well structured and nicely commented.

------
itsoggy
The Quake 3 source was fairly good...

~~~
charlchi
As a C beginner getting into writing larger projects, especially in that sort
of context, the quake source has been my reference on how to structure my
code.

------
batteryhorse
I was going to say the GNU version of /bin/false and /bin/true, but I actually
took a look at the source and it is terrible.

~~~
panic
The original /bin/true is probably the highest quality code ever written, but
unfortunately I don’t think the license is OSI approved:
[http://trillian.mit.edu/~jc/%3B-)/ATT_Copyright_true.html](http://trillian.mit.edu/~jc/%3B-\)/ATT_Copyright_true.html)

~~~
hyperpallium
Gosh, it seems like copyright lawyers will stop at nothing.

------
mixedbit
Python core libraries have great code. You can open pretty much any module and
be able to understand the source without much context.

~~~
akvadrako
I don't know how you can say this. The standard lib isn't even very pythonic,
let alone "great" along other dimensions.

~~~
blattimwind
Agreed. Almost every time I've looked deeply into stdlib code I was surprised
by how hard to follow it is and how frequently antipatterns are employed.
Doubly so for anything near a C module.

I consider the Python stdlib in a similar vein as the C++ stdlib or Boost:
Yes, some useful bits in there, but (1) lots of rot (2) you don't want to have
your code look anything like it.

------
noir_lord
In PHP land where I spend time for work.

Hands down Symfony.

------
utahcon
Golang and Kubernetes have been highly regarded as high quality. I
particularly found the Golang code for Kubernetes to be well documented and
well architected.

------
jimhefferon
Knuth did a good job on TeX, and it has been closely examined for many years
since so there are very few bugs.

~~~
informatimago
The TeX language itself, and the logs and error messages of TeX are so bad,
that I would hardly believe it.

~~~
svat
Are you sure you aren't thinking of LaTeX?

TeX (plain TeX, not LaTeX) has phenomenally good logging and error messages
IMO — everything you need is there, each error message comes in a “formal” and
“informal” form and points you to exactly the place the error happened, and
TeX lets you fix things on-the-fly without restarting the program. All this of
course assumes you use TeX the way it is described in the manual (The
TeXbook). The experience is opposite with LaTeX, so I find it worth giving up
all the convenience of LaTeX just for the wonderful experience with TeX.

As for “the TeX language”, there is no such thing. As Knuth has said many
times, TeX is designed for typesetting, not programming. Sure it has macros to
save some typing, but if you're writing elaborate programs in it (as is nearly
inevitable if you're using LaTeX) you're doing something wrong. Knuth said:

> When I put in the calculation of prime numbers into the TeX manual I was not
> thinking of this as the way to use TeX. I was thinking, “Oh, by the way,
> look at this: dogs can stand on their hind legs and TeX can calculate prime
> numbers.”

But of course LaTeX does every such thing imaginable :-)

More on TeX not being a programming language:
[https://cstheory.stackexchange.com/a/40282/115](https://cstheory.stackexchange.com/a/40282/115)

On the TeX error experience:
[https://news.ycombinator.com/item?id=15734980](https://news.ycombinator.com/item?id=15734980)

~~~
yesenadam
_Plain TeX_ is different to _TeX_ :

..."virgin" TeX...knows just primitive commands, no macros. Plain TeX is the
set of macros (developed by Knuth) which makes TeX usable in everyday life of
a typist. ... The available commands can be classified into primitive commands
and macros. ... The "virgin" TeX knows only the primitive commands. ...
Formats (plain TeX, LaTeX, etc.) extend TeX's vocabulary by defining macros.
...For example, plain TeX defines macros \item, \rm, \newdimen, \loop, etc.
Plain TeX defines about 600 macros.

[https://tex.stackexchange.com/questions/97520/what-is-
plain-...](https://tex.stackexchange.com/questions/97520/what-is-plain-tex)

~~~
svat
Yes of course; see this answer I wrote about typesetting with “virgin” TeX:
[https://tex.stackexchange.com/a/388360/48](https://tex.stackexchange.com/a/388360/48)
(it's not easy). “Virgin” TeX is never (and was never) used by typical users,
and is used only by the system administrator (or these days, the people behind
the TeX distributions) to pre-load formats (like plain or LaTeX).

Knuth wrote both the TeX program and the “plain” set of macros; when you start
`tex` it is with `plain` that it starts up, and _The TeXbook_ describes both
the TeX program and the plain format without being careful to distinguish what
comes from where (you have to look at Appendix B to see the proper definition
of plain.tex), so when we speak of TeX as Knuth intended/imagined it to be
used, it is plain TeX that is meant.

------
begriffs
OpenBSD

[https://www.openbsd.org/goals.html](https://www.openbsd.org/goals.html)

------
DanWaterworth
SQLite

------
zevv
Lua: [https://www.lua.org/](https://www.lua.org/)

------
cellover
To me that would be Appleseed rendering engine.

[https://github.com/appleseedhq/appleseed](https://github.com/appleseedhq/appleseed)

Even though I can't code C++, I can read it here and understand most of it
(besides the maths).

------
pa7ch
I particularly like reading code from Upspin (upspin.io). It's probably
partly because I think the project design is interesting and I write Go.
Regardless, it's a great ground-up Go project by some of the original Go
authors and contributors.

Very well organized code, and it feels like they got the project off the
ground, fixed bugs for a few months, and then largely stopped maintaining it
because it just works (I use it), which lends some credibility to their coding
style. Of course, I'd like to see the project evolve conceptually, but right
now it reliably does what it says it does, for a project that hasn't even cut
a single release.

------
silur
radare2 -
[https://github.com/radare/radare2/](https://github.com/radare/radare2/) More
GNU than actual GNU sources, more UNIX than the linux kernel. Huge codebase
but extremely easy to get involved with, orthogonal design with no compromise
on speed. Best codebase I ever encountered

------
gameswithgo
PostgreSQL and Quake3 are good candidates. Both are C codebases which are
surprisingly readable even by relative novices.

------
gorb314
I think musl libc [1] has good quality code. If anything their build system is
great. It makes the code much easier to navigate.

[1] [https://www.musl-libc.org](https://www.musl-libc.org)

~~~
izabera
[https://git.musl-
libc.org/cgit/musl/tree/src/string/strcspn....](https://git.musl-
libc.org/cgit/musl/tree/src/string/strcspn.c#n14)

[https://git.musl-
libc.org/cgit/musl/tree/src/stdio/vfprintf....](https://git.musl-
libc.org/cgit/musl/tree/src/stdio/vfprintf.c#n561)

ah yes, good quality code

~~~
AndyKelley
I still think musl overall is quite readable, but my goodness, that switch
statement in your second example. What a monster. I didn't think it was
possible to be this confusing without the preprocessor.

------
ckorhonen
On the JavaScript side I've enjoyed reading the code for Backbone and
Underscore, helped also by the awesome in-line documentation. Very easy to see
what is going on.

Also big fan of Sidekiq for similar reasons.

------
tmilard
BabylonJs is wonderfully clean code, made for writing 3D on the web.
[https://www.babylonjs.com/](https://www.babylonjs.com/)

------
epynonymous
most people are talking about clean code and good design constructs, but i
feel that many are missing the point: we're talking about code quality here.
design is the grit and grind that all developers go through to develop great
software. certainly there are better designed software projects out there,
which leaves them more maintainable and prone to fewer bugs, but the fact of
the matter is that for complicated code, designs go through many iterations
and refactorings over time, e.g. the linux kernel. all software projects have
bugs, even well designed or well tested software. but the significance of good
testing and good processes is not being highlighted here: unit testing, code
coverage, functional testing, end to end testing, scale testing, performance
testing, code review, fault injection, debuggability, test automation, static
code analysis, etc. i am shocked not to see lots of discussion of these things
and of testing techniques (aside from the sqlite mention). probably a more
developer friendly crowd here at hn, but testing is a significant and game
changing part of what separates developers from great developers.

------
TekMol
I like the Laravel framework. It has a clean style to it.

------
jedberg
Postgres.

~~~
sgt
Definitely agree with this. Both the documentation and code are of excellent
quality. Others that come to mind are sqlite and zeromq.

------
bsaul
my first experience with high quality code was with the quake2 engine.

i was both amazed by the simplicity of the architecture (a huge single event
loop), and the attention to code presentation and indentation.

~~~
SmellyGeekBoy
Interesting to see so many John Carmack projects in this thread. He's a good
candidate for "best programmer of all time", if there were such a thing.

------
i_feel_great
Gambit, Chicken, Racket, Chez and Guile Schemes

------
otakucode
Does 'Physically Based Rendering' count? It's a book... which is also source.
It was written as only the 2nd work of true 'Literate Programming' that I know
of. I believe Knuth wrote a book about TeX which was the first example. But
basically it is prose interleaved with source, readable as a book.

------
jMyles
Twisted. Not only highly organized and sensibly delineated, but also a lot of
fun to read - borderline comical at places.

~~~
iso-8859-1
How do you think the asyncio (formerly Tulip) sources compare?

~~~
jMyles
asyncio is more modern, more stylish, and more concrete.

Twisted is more timeless, more patterned, and more self-aware.

I can imagine Twisted's asyncio reactor becoming its default (and the Twisted
flow control slowly declining in importance), but Twisted's protocols, control
structures, and execution models becoming more popular.

Twisted has undergone a great resurgence in quality engineering since asyncio
became more viable - this was surprising to me, but is actually probably
reasonably consistent with the historical influence of the standard
library.

Overall, I think that Twisted is a great project; I almost always reach for it
when my python codebase becomes mature enough to need more thoughtful
abstractions around network I/O.

------
fapjacks
Actually, I think early versions (like from pre-1.0 through maybe 1.5 or so)
of Docker had some very high quality code and was also very pleasing to look
at. It was very clean and super approachable and readable, and I felt sort of
like how the NetBSD commenter felt as described in their comment.

------
mehrdadn
"Highest" I don't know, but "code whose quality I look up to", then:

For C: Process Hacker and some similar code that is designed like and written
around Windows kernel APIs:
[https://github.com/processhacker/processhacker/blob/master/p...](https://github.com/processhacker/processhacker/blob/master/phlib/basesup.c)

For C++: Some of the Boost code, and stuff like it, such as P-Stade Oven:
[https://github.com/himura/p-stade/blob/master/pstade/pstade/...](https://github.com/himura/p-stade/blob/master/pstade/pstade/oven/detail/concat_iterator.hpp)

For others: (need to look later, I forget)

------
robbick
Can't say I've seen enough to be confident on the best library but redux
([https://github.com/reduxjs/redux](https://github.com/reduxjs/redux)) is just
so simple, and has great, readable/understandable code.

~~~
icc97
In Dan Abramov's excellent egghead redux course [0] he implements
`createStore`, which is the core of redux, from scratch. It's simple enough to
post here:

    
    
      const createStore = (reducer) => {
        let state;
        let listeners = [];
    
        const getState = () => state;
    
        const dispatch = (action) => {
            state = reducer(state, action);
            listeners.forEach(listener => listener());
        };
    
        const subscribe = (listener) => {
            listeners.push(listener);
            // unsubscribe
            return () => {
                listeners = listeners.filter(l => l !== listener)
            };
        };
    
        // populate initial state
        dispatch({});
    
        return { getState, dispatch, subscribe };
      };
    
    

[0]: [https://egghead.io/lessons/react-redux-implementing-store-
fr...](https://egghead.io/lessons/react-redux-implementing-store-from-scratch)
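
A short usage sketch for the `createStore` above, repeated here so the snippet runs on its own. The `counter` reducer and the `INCREMENT`/`DECREMENT` action types are made up for illustration; they are not taken from the course or from the Redux sources.

```javascript
// createStore, following the version posted above.
const createStore = (reducer) => {
  let state;
  let listeners = [];

  const getState = () => state;

  const dispatch = (action) => {
    state = reducer(state, action);
    listeners.forEach((listener) => listener());
  };

  const subscribe = (listener) => {
    listeners.push(listener);
    // Return a function that removes this listener again.
    return () => {
      listeners = listeners.filter((l) => l !== listener);
    };
  };

  // Populate initial state.
  dispatch({});

  return { getState, dispatch, subscribe };
};

// Hypothetical reducer: a plain counter starting at 0.
const counter = (state = 0, action) => {
  switch (action.type) {
    case 'INCREMENT': return state + 1;
    case 'DECREMENT': return state - 1;
    default: return state;
  }
};

const store = createStore(counter);
const seen = [];
const unsubscribe = store.subscribe(() => seen.push(store.getState()));

store.dispatch({ type: 'INCREMENT' });
store.dispatch({ type: 'INCREMENT' });
unsubscribe(); // later dispatches are no longer recorded in `seen`
store.dispatch({ type: 'DECREMENT' });

console.log(store.getState()); // 1
console.log(seen);             // [ 1, 2 ]
```

The whole store is a closure over `state` and `listeners`, which is why it fits in a comment.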

------
coldnose
After spending about a month of concerted effort poring through the zlib
sources, looking for vulnerabilities, I can say that zlib is the most
astonishingly bug-free code I've ever seen. But in the conventional
understanding of "code quality", it's pretty bad.

~~~
wrasee
Julian Storer (JUCE library) did a talk on code quality using zlib as an
example. Might be interesting to you if you've not seen it already.

[https://www.youtube.com/watch?v=SIAAvv1O7Gg](https://www.youtube.com/watch?v=SIAAvv1O7Gg)

------
beefhash
I'd have to go with Monocypher. It makes very tasteful use of comments,
functions and macros to maximize readability and clarity.

[https://github.com/LoupVaillant/Monocypher](https://github.com/LoupVaillant/Monocypher)

------
rileyraver57
Toybox by
Landley([https://github.com/landley/toybox](https://github.com/landley/toybox))
is probably the best example of a modern c implementation I have ever seen.
Surprised no one has mentioned it yet.

------
ezequiel-garzon
I admit I don’t have the knowledge to make my own assessment, but I’ve read
some downright poetic praise on djb’s work [1], and more than once.

[1] [https://cr.yp.to/](https://cr.yp.to/)

------
Dowwie
For _Rust_ , many say that the regex crate sets a high standard for
excellence: [https://github.com/rust-lang/regex](https://github.com/rust-
lang/regex)

------
markpapadakis
I study codebases as a hobby. I highly recommend Seastar, Folly, Aeron and
Disruptor, SQLite, PostgreSQL, LMDB, Tensorflow, Hashicorp’s vault, and the
Linux Kernel projects as prime examples of high quality codebases.

------
chubot
For clean C++, I like leveldb (a key-value DB library) and re2 (a regex
engine). Random files from each of them:

[https://github.com/google/leveldb/blob/master/table/table_bu...](https://github.com/google/leveldb/blob/master/table/table_builder.cc)

[https://github.com/google/re2/blob/master/re2/nfa.cc](https://github.com/google/re2/blob/master/re2/nfa.cc)

------
macco
I really like the source code of prosemirror:

[https://github.com/ProseMirror/](https://github.com/ProseMirror/)

It's not typical JS, but very good nonetheless.

------
Theodores
The open source code I know from web development has to be fixed with various
hacks - PHP and the frontend javascript that goes with it. Therefore the code
I know is not 'highest code quality'. If it was 'highest code quality' then I
would not know the code.

Therefore the highest code quality is likely to be in projects where I do not
have to go under the hood, e.g. the Chromium project where all contributors
are vastly more educated and capable than myself.

------
eloycoto
Libuv is one of my favourites
[https://github.com/libuv/libuv](https://github.com/libuv/libuv)

------
partycoder
Good code bases that inspired larger projects: MINIX, KHTML

------
schaefer
With respect to the C++ language: there was a book published in 1996,
Large-Scale C++ Software Design by John Lakos. He's about to publish the
second edition of the book while also expanding its reach to span two volumes.

Anyhow, while we await the publication of that book, John has been working at
Bloomberg. Some of the code written there has been published to GitHub [1].
He's also done a five hour lecture series [2], available on Safari Online (paid
service), that covers the topics of his book and introduces the open source
Bloomberg repo as an example of code written in that style.

I can't offer you a review as I've just found this all myself, but I'll be
eagerly studying it along with some of the other items mentioned here.

[1] [https://github.com/bloomberg/bde](https://github.com/bloomberg/bde)
[2][https://www.safaribooksonline.com/videos/large-scale-c-
livel...](https://www.safaribooksonline.com/videos/large-scale-c-livelessons-
workshop/9780134049731/)

------
epynonymous
linux kernel, purely on the reasoning that it's probably one of the most
used pieces of software out there; along those lines, probably also the kernel
libraries and user libraries like libstdc that are a part of it. i don't know
how the linux kernel is tested, but i know that it gets production testing on
different platforms at large scale, as it is probably the most used open
source in the market.

------
TangoTrotFox
I would not judge things on aesthetic quality, but simply on results. In
general code faces difficulties that grow exponentially with time, size,
and the number of contributors. Millions of lines of code, thousands of
contributors, decades of development and it's still at the top of its game? In
spite of its complete lack of aesthetic appeal, that's the Linux kernel.

------
sparkling
High quality code and one of the best APIs i ever used:
[https://github.com/requests/requests](https://github.com/requests/requests)

Best source code layout, architecture, maintainability:
[https://github.com/rg3/youtube-dl](https://github.com/rg3/youtube-dl)

------
agentultra
As far as C++ code goes, the Lean Prover is really well maintained:
[https://github.com/leanprover/lean](https://github.com/leanprover/lean)

I'd also say GHC is quite good.

And Pandoc as well.

I don't think I can compute enough variables to consider the "highest"
though... so the aforementioned are only examples of what I think are good.

------
edoo
The Linux kernel of course. In userland I have to say lib QT. I've used a lot
of APIs and QT is always a pleasure to work with.

~~~
SmellyGeekBoy
I'm a Linux fanboy myself but come on - we're talking about nearly 30 years'
worth of commits from thousands (tens of thousands?) of developers.

The only thing I can say is that with this in mind it's actually a lot better
than I'd expect - testament to Linus's iron fist, perhaps.

------
nojvek
Redis. I have to say antirez not only is an amazing engineer but from the way
the code is written, you can see he is a very clear thinker.

I hold the Redis codebase up as an example of what good C code should be, and
the OpenCV codebase as an example of what C code should not be. The OpenCV
codebase is really inconsistent, with quite a bit of unreadable spaghetti
sauce.

------
moneysconcerned
CVEdetails.com lists the number of (reported) vulnerabilities by year for
software projects that have a CVE identifier.

Here's bitcoind: [https://www.cvedetails.com/product/22744/Bitcoin-
Bitcoind.ht...](https://www.cvedetails.com/product/22744/Bitcoin-
Bitcoind.html?vendor_id=12094)

------
rataata_jr
XMonad window manager written in Haskell.

~~~
iso-8859-1
What do you think about the GHC sources in comparison?

------
dhuramas
I am surprised no one mentioned
ScyllaDB ([https://github.com/scylladb/scylla](https://github.com/scylladb/scylla)).
Redis and ScyllaDB have often been pointed out as examples of good
codebases to look at for C/C++ devs.

------
Dawny33
Gensim : [https://github.com/RaRe-
Technologies/gensim](https://github.com/RaRe-Technologies/gensim)

[Can't speak for the 'highest' part of the qn, but Gensim upholds very high
code quality standards]

------
kostarelo
I like Spectrum for both their architecture and code quality.
Node.js/JavaScript.

[https://github.com/withspectrum/spectrum](https://github.com/withspectrum/spectrum)
https://spectrum.chat

------
nunobrito
Referred by thousands and available since 2004 without one single bug reported
in the last decade:
[http://users.telenet.be/AphexSoft/](http://users.telenet.be/AphexSoft/)

It is not yet on Github.

------
sv12l
Pretty sure PostgreSQL will have a place at the top quarter of this page.

------
numeromancer
Pari:

[http://pari.math.u-bordeaux.fr/git/pari.git](http://pari.math.u-bordeaux.fr/git/pari.git)

I prefer the early versions, before it was softened up for the vulgo.

------
cantagi
GTKmm. GTK uses GObject to implement inheritance between C structs and it's
easy to go wrong when extending. GTKmm wraps GTK in C++. It's a joy to use and
is safer.

------
SoylentOrange
I like the design philosophy behind BoringSSL:

[https://boringssl.googlesource.com/boringssl](https://boringssl.googlesource.com/boringssl)

If some portion of the library is overly complex, look into the use case and
delete it wherever possible. It maintains a long-term bound on code
complexity, which I quite like.

Edit: a nice explanation on the design philosophy here
[https://www.imperialviolet.org/2015/10/17/boringssl.html](https://www.imperialviolet.org/2015/10/17/boringssl.html)

------
anuraaga
I am very lucky that there are too many great open source libraries out there
to label one with the "highest" quality.

------
hysan
Any React and React Native suggestions?

------
hmsync
Spring Framework

1\. Elegant structure 2\. Strict code style 3\. Project size is not too large
4\. Have detailed documentation

------
jankotek
For Java I would say H2 SQL DB. It is small, compact, packed with features and
good abstraction.

------
tom-jh
Nobody mentions Android. Any examples of good quality code on Android?

------
winkdinkerson
I guess my vote for Matt's Script Archive is going nowhere..

------
rzvme
I would suggest Laravel!

------
cmarschner
Torch has the best code of DNN libraries I have seen so far.

------
ddtaylor
A lot of the KDE source code is well written and maintained.

~~~
_pmf_
Qt sources too (which has a lot of overlap in people and mindshare with KDE).
Mostly.

------
praveenster
zeromq. Both the code and documentation are very good.

------
halayli
Postgresql, llvm, Python, sqlite are pretty up there.

------
anticensor
Debian is the best with its rigid QA procedures.

------
aloukissas
My nomination would go to the chromium project.

------
mikkelam
Any iOS Swift/ObjC related projects?

------
joelbirchler
Kubernetes is extremely well designed.

------
novaRom
Python (official cpython)

~~~
vfinn
There's a nice introductory lecture series on CPython internals on Youtube
that tries to cover how the interpreter works and how the python code maps to
bytecode by going through the cpython source:
[https://www.youtube.com/watch?v=LhadeL7_EIU](https://www.youtube.com/watch?v=LhadeL7_EIU)

------
unixhero
LibreOffice

------
charlysl
xv6

------
se7entime
Linux

------
lowry
Lua. It has everything a good C project should have: small size, simple
build system, portability by using the simplest constructs and not ifdefs, a
clear and well-defined scope that no one dares to trespass on.

~~~
Quenty
You can see a mirror of Lua here!
[https://github.com/lua/lua](https://github.com/lua/lua)

------
auslander
openbsd

------
qualawhat
Start by defining quality.

~~~
moneysconcerned
Software quality:
[https://en.wikipedia.org/wiki/Software_quality](https://en.wikipedia.org/wiki/Software_quality)

Software metric:
[https://en.wikipedia.org/wiki/Software_metric](https://en.wikipedia.org/wiki/Software_metric)

''' Common software measurements include:

\- Balanced scorecard
\- Bugs per line of code
\- Code coverage
\- Cohesion
\- Comment density[1]
\- Connascent software components
\- Constructive Cost Model
\- Coupling
\- Cyclomatic complexity (McCabe's complexity)
\- DSQI (design structure quality index)
\- Function Points and Automated Function Points, an Object Management Group standard[2]
\- Halstead Complexity
\- Instruction path length
\- Maintainability index
\- Number of classes and interfaces[citation needed]
\- Number of lines of code
\- Number of lines of customer requirements[citation needed]
\- Program execution time
\- Program load time
\- Program size (binary)
\- Weighted Micro Function Points
\- CISQ automated quality characteristics measures

'''

Category:Software metrics
[https://en.wikipedia.org/wiki/Category:Software_metrics](https://en.wikipedia.org/wiki/Category:Software_metrics)
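
To give a feel for how mechanical some of these metrics are, here is a rough sketch of two of them. Both definitions are simplifying assumptions: cyclomatic complexity is approximated as one plus the number of branching keywords and short-circuit operators found in the text, and comment density as the share of non-blank lines that start with `//`. The function names and the `sample` snippet are made up for the example, and a real tool would parse the code rather than pattern-match it.

```javascript
// Approximate McCabe complexity: 1 + number of branching constructs.
// No parsing is done, so keywords inside strings or comments count too.
const approxCyclomaticComplexity = (source) => {
  const branchPattern = /\b(?:if|for|while|case|catch)\b|&&|\|\|/g;
  const matches = source.match(branchPattern) || [];
  return matches.length + 1;
};

// Share of non-blank lines that begin with a line comment.
const commentDensity = (source) => {
  const lines = source
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line !== '');
  if (lines.length === 0) return 0;
  const comments = lines.filter((line) => line.startsWith('//')).length;
  return comments / lines.length;
};

// Hypothetical input: one function with two branches and one comment line.
const sample = [
  '// clamp a value into a range',
  'function clamp(x, lo, hi) {',
  '  if (x < lo) return lo;',
  '  if (x > hi) return hi;',
  '  return x;',
  '}',
].join('\n');

console.log(approxCyclomaticComplexity(sample)); // 3
console.log(commentDensity(sample));             // 1 comment line out of 6
```

Numbers like these are cheap to compute, which is part of why they say little about quality on their own.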

------
gupi
well, if trolling is permitted, I would say that "Hello World" example has the
most exquisite code.

in most cases "Hello World" is open-source, but I still don't know if it can be
called a "project"

------
theboywho
It's funny to see nobody is even questioning the question.

What does it even hold as a value to be the project of the highest code
quality in the world ? How can it exist as a consensus if we can't even agree
on best practices ?

If it's for learning purposes, why even look for the ONE project with the
HIGHEST quality ? Just go by any GOOD ENOUGH project.

I see this all the time: what's the best editor, the best color scheme, the
best font, etc.

How about we just start saying: what's a good enough X for my purpose ?

~~~
erpellan
Sometimes you need a recipe book, other times you want to lose yourself in a
masterpiece.

------
hyperpallium
Just wanted to mention some bias in successful open source projects: they are
often structured as a number of similar plug-in pieces, like youtube-dl for
different video publishers.

This is great for open source, because you can easily discover and navigate to
the part you want, and change it. You might need to understand the plugin
interface - or you might not. This flat architecture makes it easy for people
to contribute, an important aspect of a successful open source project.
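
To make the flat plug-in shape concrete, here is a minimal sketch. It is not how youtube-dl is actually implemented; the `extractors` list, the `matches`/`extract` interface, and the two `.test` sites are all invented for illustration.

```javascript
// Each plug-in is a plain object with the same two-method interface.
// The sites and URL patterns below are invented for the example.
const extractors = [
  {
    name: 'examplevideo',
    matches: (url) => /^https?:\/\/(www\.)?examplevideo\.test\/watch\//.test(url),
    extract: (url) => ({ site: 'examplevideo', id: url.split('/watch/')[1] }),
  },
  {
    name: 'sampletube',
    matches: (url) => /^https?:\/\/sampletube\.test\/v\//.test(url),
    extract: (url) => ({ site: 'sampletube', id: url.split('/v/')[1] }),
  },
];

// The core only knows the interface, not any particular site.
const extractInfo = (url) => {
  const extractor = extractors.find((e) => e.matches(url));
  if (!extractor) {
    throw new Error(`no extractor for ${url}`);
  }
  return extractor.extract(url);
};

console.log(extractInfo('https://sampletube.test/v/abc123'));
```

Supporting a new site means adding one more object to the list, and a contributor can do that without reading the other entries.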

But it's not the ideal architecture for every project. In some cases, a
cleverer, harder to understand approach is more elegant, shorter, more
efficient, simpler.

Of course... one might argue that ease of understanding is more important than
anything else.

~~~
leetcrew
the only thing more important than understanding is shipping on time. but how
are you going to ship on time if you can't understand it?

