There was a blog post from Peter Seibel recently called [Code Is Not Literature](https://gigamonkeys.com/code-reading/) which pointed out that although everyone seems to agree reading code is great, very few tend to do it regularly.
I feel like I get a lot more out of messing with / hacking on code than I do from reading it. I'm sure people vary, but I've got loads more out of open source contributions to sometimes small projects, and not very much out of trying to do something like read the code for the Glasgow Haskell Compiler or something.
So for code to read, I think having an in is crucial, at least for me. So I'd say, find a cool, maybe small open source project, look at the issue tracker if it has one, and try to implement something. You'll only really know if it's good code (or why) after you start trying to change it.
> You'll only really know if it's good code (or why) after you start trying to change it.
I like this and it is true. In addition, I have found that good code also tends to survive refactors despite having the quality of being easy to change.
The Ask HN post specifically mentioned good code but what follows are some (of my subjective) thoughts about the benefits of reading bad code. Bad code is haphazard and varied and good code is “samey”. You are a pattern matcher and this is part of your training. Good code will make more of an impact on your understanding if you have read a lot of bad code beforehand. Being able to read and understand bad code is a far more lucrative skill to have in the workplace than being able to quickly understand good code. In a big team you will spend a lot (most if you’re senior) of time reading other people’s code. Best to get good at it.
I have yet to be presented with a steaming mess of code to read and explain to an interviewer; yet it is the very first thing most new joiners face at any seniority level.
Good point on the benefits of reading bad code. One of the best developers I've worked with said he'd become a good developer by refactoring a lot of bad code.
I think the difficulty with this advice is that it's hard to know where the Venn diagram overlaps between "having an in" and "good code". A lot of the code I read because I "have an in" is, I think, not very "good" in the sense of being enlightening in furtherance of my craft. It usually just looks like code I'd already write.
I typically find the recommendations people make like "here's an example of particularly good code for this particular language / style / architecture" to be more enlightening.
It really depends on the level that you're at in your coding ability. When I was a younger programmer I had the fortune to have a CS professor who believed in reading good code and encouraged us to do so.
Nowadays if I want to learn some new thing I'll find the 'right' code to read. Someone who has done something similar to what I am trying to do. But when I was younger and still trying to understand more basic concepts I just wanted to read 'good' code. I wanted to know how to structure programming logic.
Now I don't want to know how to structure logic unless I'm really interested in learning some newfangled concept. Typically something having to do with concurrency since there always seem to be new ways to express concurrency in programming.
Most of the code I look to read these days is because I want to start using a new library and I want to see how someone else has done it.
Big agree. Tinkering with code to some specific end is gonna teach you a lot more than just scrolling through files, unless you're just curious about the basic architecture or how they implemented some specific thing.
Either way, have a goal in mind when reading code. If nothing else, take notes on what you learn.
It's worth noting, Go's stdlib isn't perfect, but that's part of why it's so good. You can see how they've deprecated certain methods/fields/etc while maintaining backwards compatibility.
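To make the deprecation point concrete, here is a minimal sketch of the same pattern in Python rather than Go. Go marks such identifiers with a `Deprecated:` paragraph in the doc comment; Python's closest analogue is raising a `DeprecationWarning`. The names `parse_config` and `parse_config_v2` are hypothetical and not taken from any real library.

```python
import warnings


def parse_config_v2(text: str) -> dict:
    """Parse 'key=value' lines into a dict (the current, preferred API)."""
    pairs = (line.split("=", 1) for line in text.splitlines() if "=" in line)
    return {key.strip(): value.strip() for key, value in pairs}


def parse_config(text: str) -> dict:
    """Deprecated: use parse_config_v2 instead.

    Kept so that existing callers continue to work unchanged.
    """
    # Hypothetical old entry point: warn, then delegate to the new one.
    warnings.warn(
        "parse_config is deprecated; use parse_config_v2",
        DeprecationWarning,
        stacklevel=2,
    )
    return parse_config_v2(text)
```

The old name keeps its behavior, so nothing breaks, while the warning and docstring steer new code toward the replacement.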
Most code is not worth reading. Even well structured codebases are mostly composed of code which is not worth reading.
The difference in a well structured codebase is that some of the code prevents you from having to read huge amounts of other code. All code is bad, it starts out bad just by existing, its only redeeming quality is preventing you from having to deal with more bad code.
Everyone thinks they write good "clean" code, and it's never true. Good programmers are good because of the architecture of their code, not because a single excerpt of code in isolation looks a certain way.
What you really want to read about are good designs. Read APIs, models, concepts, schemas, etc.
Another comment mentioned the Go standard library, and I totally agree. But stop at the APIs, if you look inside, you'll see that it's also mostly garbage. It's good because the APIs are good, and you don't have to read the rest.
I don't know, I argue APIs and models are "code" in the informal sense. Architecture is important, but it's even harder to understand "good architecture" than code, not without the architect right there explaining it to you. Many decisions were made for some reason that you can only understand through experience.
There's definitely a code component to them, but most of reading about and understanding them would come from reading the surrounding documentation or reasoning. Most of that text would be prose, rather than a language consumed by any computer system.
I'm not sure what you mean by "understand good architecture." One thing that makes an architecture good is its simplicity and clarity. If it's hard to understand, it can't be that good. I can say for certain: if you understand the problem being solved but not the architecture, then it is for sure a bad architecture.
One thing to look for is programs that use advanced techniques as opposed to programs that are just simply clear or well-commented. For instance, instead of learning a new programming language, you should learn how programming languages are made. Most of the secrets that computer science majors know that you probably don't were learned in compilers class.
(Unfortunately many people are chasing the Holy Grail of "functional programming" and never finding it because "functional programming" is a pale shadow of what's possible when you understand how compilers work: this is how Common LISP and Scheme are so much more profound than, say, Haskell)
It's a little out of date but I was lately thinking about the Scott Adams adventure games of the early 1980s that were written with a specialized interpreter which could be implemented in BASIC but was also implemented in assembly language for better performance. See
If you tried to implement a game like that directly in a language like BASIC you would be driven nuts because that kind of game is fundamentally "object oriented" in that there are a number of things like rooms and items that are all mostly the same except they are different in some ways and trying to code that with IF, THEN and ELSE is bad enough even before GOTO gets added to the mix. The thing is that GOTO becomes quite benign and even useful when it is used to implement interpreters.
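As a rough illustration of why an interpreter beats hand-written IF/THEN chains here, the sketch below keeps rooms and items as plain data and runs every command through one small interpreter. It is not the actual Scott Adams format; the tables, the `step` function, and the three verbs are all hypothetical.

```python
# Hypothetical world data: every room is the same kind of record.
ROOMS = {
    "forest": {"desc": "You are in a forest.", "exits": {"north": "cave"}},
    "cave": {"desc": "You are in a dark cave.", "exits": {"south": "forest"}},
}
ITEMS = {"lamp": "forest", "key": "cave"}  # item name -> starting location


def new_game() -> dict:
    """Fresh mutable state; the tables above are never modified."""
    return {"room": "forest", "items": dict(ITEMS)}


def step(state: dict, command: str) -> str:
    """Interpret one 'verb noun' command against the data tables."""
    verb, _, noun = command.strip().lower().partition(" ")
    room = ROOMS[state["room"]]
    if verb == "go":
        target = room["exits"].get(noun)
        if target is None:
            return "You can't go that way."
        state["room"] = target
        return ROOMS[target]["desc"]
    if verb == "take":
        if state["items"].get(noun) != state["room"]:
            return "You don't see that here."
        state["items"][noun] = "inventory"
        return f"Taken: {noun}."
    if verb == "look":
        here = sorted(i for i, loc in state["items"].items() if loc == state["room"])
        return room["desc"] + (" You see: " + ", ".join(here) + "." if here else "")
    return "I don't understand."
```

Adding a room or an item means adding a row of data, not another branch of control flow, which is the property the comment above is pointing at.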
So back in the day you would study systems like that to stretch your skills, today I would look at compilers and related technology, like the Jena rules engine.
> Unfortunately many people are chasing the Holy Grail of "functional programming" and never finding it because "functional programming" is a pale shadow of what's possible when you understand how compilers work: this is how Common LISP and Scheme are so much more profound than, say, Haskell
Huh? What does this even mean? Both languages are very different from each other and the only thing they really have in common is having functions — how is one more “profound” than the other or any less worthy of being “functional programming”, for whatever that means to you?
I've met a few young programmers who heard somewhere that object-oriented programming was bad and they want to get the enlightenment of functional programming that they've heard about. Frequently they travel from job to job like itinerant martial artists always looking for somewhere where they practice the true technique but they always seem disappointed as it is just as easy if not easier to screw up handling errors with monads than it is with exceptions and they find analogies like "a monad is like a burrito" just get them more confused.
which many people will struggle with because like many other production rules engines in LISP (and many other examples of simple compilers), there is hardly any code! Contrast that to the orders of magnitude larger rules engine Drools
which is so crazy-complicated primarily because the Drools language is Java-based so you need all sorts of things that Clara or CLIPS don't need. Note that both of these systems use variations of this algorithm
where a set of rules can be compiled to a network of transitions that can happen when facts are added to the knowledge base so it is very much an example of compiler technology. (That said, Drools supports a lot of features that Clara doesn't and also supports a very advanced RETE-like algorithm that can exploit parallelism that early versions couldn't.)
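For readers who have not seen a production rules engine, here is a minimal sketch of forward chaining, assuming facts are plain tuples. It is deliberately naive: it re-evaluates every rule on every pass, which is exactly the work a RETE-style network is designed to avoid. It is not how Clara, CLIPS, Drools, or Jena are implemented, and `Rule`, `run`, and `grandparent_rule` are hypothetical names.

```python
from typing import Callable, NamedTuple


class Rule(NamedTuple):
    name: str
    # Given the current facts, return the facts this rule would derive.
    fire: Callable[[frozenset], set]


def grandparent_rule(facts: frozenset) -> set:
    """parent(a, b) and parent(b, c) => grandparent(a, c)."""
    parents = [f for f in facts if f[0] == "parent"]
    return {
        ("grandparent", a, c)
        for (_, a, b) in parents
        for (_, b2, c) in parents
        if b == b2
    }


def run(rules: list, initial: set) -> set:
    """Naive forward chaining: loop until no rule derives anything new."""
    facts = set(initial)
    while True:
        derived = set()
        for rule in rules:
            derived |= rule.fire(frozenset(facts))
        new = derived - facts
        if not new:
            return facts
        facts |= new
```

For example, `run([Rule("grandparent", grandparent_rule)], {("parent", "ann", "bob"), ("parent", "bob", "cy")})` adds the single fact `("grandparent", "ann", "cy")` and then stops because a second pass derives nothing new.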
What are these itinerant code students trying to achieve with all this? Performance? Productivity? Lower defects?
Safety critical code seems to still be usually in C/C++, for reasons having nothing to do with the language being safe. Why are these people not studying Erlang, MISRA C, Rust, some kind of formal proof solver language, or OpenCL?
What do people aspire to one day work on that would make the Haskell or LISP more appealing than a more locked down, static checking, kind of language?
Programming culture is more confusing than the code itself!
At risk of sounding like Peter Thiel I think a lot of this thinking is driven by mimesis rather than thinking. That is, there is a lot of talk about how
functional >> OO
and "OO sux" and all that. I think most of these people hadn't worked in the industry enough to have a clear picture of what better productivity and lower defects would look like but instead they were looking for a movement to join.
Mastery, perhaps. They see some sort of role model and figure the tools they use are a roadmap to becoming a distinguished engineer themselves. If only it was that easy.
It is interesting how functional programming has to consider data organization completely differently, but sometimes it is just that, a tool or methodology, not a philosophy to subscribe to. Ideally, the good (or future good) engineers realize this and try to understand the less talked about soft skills needed to succeed.
I guess that's probably a lot of why I don't get it, I never had any programmer role models. I looked up to Adam and Jamie, and I never saw them doing serious code.
Yep! For a while when I was really little I thought maybe I wanted to make video games, but... that's really hard.
The stuff I was really excited about was mostly "tech in the physical world" related.
Although with the AI boom, and how I'm too clumsy for jobs involving driving... I might be making a lot more if I'd at least tried to focus on the pure programming stuff instead of embedded, even though I never had any real talent for advanced algorithms...
> which is so crazy-complicated primarily because the Drools language is Java-based so you need all sorts of things that Clara or CLIPS don't need
From the Drools website:
"Drools is a Business Rules Management System (BRMS) solution. It provides a core Business Rules Engine (BRE), a web authoring and rules management application (Drools Workbench), full runtime support for Decision Model and Notation (DMN) models at Conformance level 3 and an Eclipse IDE plugin for core development."
No expert here, but it doesn't come across as a fair apples-to-apples comparison.
Something like Drools ought to have a relatively small core and then additional systems built on top of it.
Drools even has some projects like OptaPlanner and jBPM that are in different source code repositories.
The core in Drools really is dramatically more complex than the core of Clara and some of that really is the higher level of functionality (Drools supports both hash-based and ordered indexes for data because you need that for complex event processing, Clara has only hash indexes) but a lot of it is that the syntax of Drools is much more complex because it mixes Java expressions and statements with the rules languages thus it needs a real parser.
I was working on a project with Drools and had extreme difficulty because Drools error messages didn't make a lot of sense to me so I did a lot of looking at the Drools source code, running the compiler in the debugger and such and still didn't get very far. I switched over to the Jena Rules engine
which still emits lousy error messages but I quickly was able to understand everything about the Jena Rules Engine, get really good at writing extensions, and use it for things the developers said were unsupported such as using it as the control plane of a data processing engine that bridged stream and batch processing: I'd use Jena rules to control the process of setting up and tearing down reactive streams that would do a data processing job. That is, I got so good at the Jena Rules Engine that I kept discovering new things I could do with it.
Clara is very similar to the legendary CLIPS rule engine from the golden age of AI
and if you find Clara is too small to understand, CLIPS is even smaller. The massive reduction in code size comes from being able to lean on the affordances of LISP to develop a DSL for writing rules, if you have to write a parser and all the stuff that comes with defining an external DSL rules language your code starts to get much more complex.
Jena Rules Engine is written in Java and I think it is very nice code and you'll learn a lot about RETE engines and other advanced programming concepts by studying it; you might even get more out of it than you get out of Clara. It was common at the tail end of the golden age of A.I. for people to write a first draft of a system in Common Lisp and then end up rewriting it in C++ for performance. Many of the ideas you get out of Lisp programming apply just as well to other programming languages but require a huge amount of elbow grease and in-depth understanding of compiler technology, whereas you can frequently use macros and similar affordances in Lisp to implement very sophisticated ideas with postage-stamp-size code (that, granted, many people find challenging to understand).
Isn’t this just a case of right tool for the right problem? If you were writing, say, an IntelliJ plugin I think you would much rather prefer Java over Clara rules, and not a subjective opinion of profoundness.
Also, having asked about Haskell, I'm not sure why you brought Java into this!
Peter Norvig's Pytudes was recently posted here. I think that's some of the best code I've read, although they're only small problems and not a bigger project. Still very much worth a read, he goes through the whole problem solving through code process.
Same, actually. Perhaps some of it is due to the naming convention. For instance, in the Lisp interpreter, he tends to use "parms". Which I assume is short for "parameters".
That makes me think of parmesan cheese -- "params" would be a better fit.
Give up on “good code”. Its pursuit is how junior devs pester senior devs based on a delusion that such a thing is possible.
Good code is working code, code that pays the bills. Focus instead on writing code you can throw away easily, code that you are wholly unattached to and is isolated enough that rewriting it won’t cost absurd hours.
Believing that "good code" is good enough to deliver business value is short-sighted.
Good code is highly maintainable so that you can continue to meet business objectives in a timely manner without regressions. Often this means (ironically) taking a little bit of extra time early on to think about how to make your code readable and "simple enough" for someone else to be able to jump in and maintain it.
The problem with this “maintainability” argument is the presumption a) that it will be maintained (note I recommended rewriting frequently) or that b) it’s in conflict with meeting business objectives.
Rewriting throws away years of accumulated edge cases handling. Suddenly the thing doesn't work because one particular model of printer needs an undocumented command to enable some feature, or users are inputting bad data because you forgot the checks you accumulated...
Seems not ideal for end users, unless you're working with microservices or something with well defined specifications that people are actually paying attention to.
Depends on the field, but what I work on generally evolves so much over time that by the time I'm ready for a rewrite, the "edge cases" I had to account for when I started are either solved, partially solved, or can be isolated into some tiny part of the codebase that could be ported over from the previous code.
Besides, "rewrite" here doesn't mean "new repo, new project, new everything"; it means reimplementation, usually based on the lessons learned from the previous implementation, and that does include edge case handling, as well as expanded functionality to "underwrite" or justify the effort spent on the rewrite.
Rewriting frequently (reference a) is often out of the hands of the developer because of (reference b) business objectives. Writing code is entirely in the hands of the developer only at the time of writing, not a "henceforth and forever" sort of situation.
And I suppose a dev who only cares about paying the bills would still resonate with this advice. It's not their problem once the suits take charge of the product and give a thumbs up. Why bother with future dev maintenance? The suits will pay for that bridge when it burns down, and having that happen is 100x easier than trying to explain good code upkeep that delays business for a few weeks.
> Focus instead on writing code you can throw away easily, code that you are wholly unattached to and is isolated enough that rewriting it won’t cost absurd hours.
> Good code is highly maintainable so that you can continue to meet business objectives in a timely manner without regressions.
I think you essentially agree on what people should do, regardless of whether you call this 'good code' or just maintainable code.
Write tests. Give things good names. Use comments to explain why you're doing something, not how to program. Don't copy and paste the same implementation x times because that way there's only one place to fix it.
> Don't copy and paste the same implementation x times because that way there's only one place to fix it.
But also sometimes it makes more sense to copy and paste over trying to fit an abstraction where it shouldn't be. :)
I think the purpose of questions like the ones by OP is not to figure out "rules" (which are useful only for beginners) but to figure out where and why rules were broken. Sometimes (often) the answer is time, but that in and of itself is a useful example.
Good intermediate (I suppose Sr. in our industry) level code is notoriously difficult to find examples of and mentor toward.
>But also sometimes it makes more sense to copy and paste over trying to fit an abstraction where it shouldn't be. :)
depends on goals. diverging modules should be copy-pasted, two modules that rely on the same functionality should be consolidated (not necessarily abstracted, but synchronized somehow). Both are common cases, so there's no general advice on which is better.
>Good intermediate (I suppose Sr. in our industry) level code is notoriously difficult to find examples of and mentor toward.
so much industry code and knowledge is proprietary, so I imagine that is by design. even intermediate code has a bunch of value to a company, even if the company lets go of that engineer to make their earnings report 0.1% higher.
> But also sometimes it makes more sense to copy and paste over trying to fit an abstraction where it shouldn't be. :)
I find that's a smell of limited languages: Maybe a language has poor error handling semantics, maybe it's not expressive enough to make a parameter generic.
It can also be a smell of not understanding the language well enough, too. Maybe there is no need to copy and paste, but the programmer didn't understand the language well enough to make a generic abstraction.
I agree. I've had the unfortunate experience of moving from more expressive languages and interesting problems to less expressive languages and boring problems as the size of my TC and company go up. :')
But you can only do what your tools allow you to do.
"maintainable" code is disposable. Modern web stacks are built on the idea that you can delete a file and re-implement its interface with a different service behind it later on. Instead of writing code which will be easy to modify, write a good interface which solves your problems in a way that's easy to reason about, and feel free to write garbage implementations that will get "refactored" (thrown away) every year or so
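A minimal sketch of that idea, assuming a tiny session-store interface: callers depend only on the `SessionStore` protocol, so the implementation behind it can be thrown away and replaced without touching them. `SessionStore`, `DictStore`, `ListStore`, and `login` are hypothetical names made up for this illustration.

```python
from typing import Optional, Protocol


class SessionStore(Protocol):
    """The interface callers depend on; the part meant to stay stable."""

    def put(self, session_id: str, user: str) -> None: ...

    def get(self, session_id: str) -> Optional[str]: ...


class DictStore:
    """First, throwaway implementation: a plain in-memory dict."""

    def __init__(self) -> None:
        self._data = {}

    def put(self, session_id: str, user: str) -> None:
        self._data[session_id] = user

    def get(self, session_id: str) -> Optional[str]:
        return self._data.get(session_id)


class ListStore:
    """A deliberately crude replacement; callers cannot tell the difference."""

    def __init__(self) -> None:
        self._rows = []

    def put(self, session_id: str, user: str) -> None:
        # Drop any existing row for this id, then append the new one.
        self._rows = [(sid, u) for sid, u in self._rows if sid != session_id]
        self._rows.append((session_id, user))

    def get(self, session_id: str) -> Optional[str]:
        for sid, u in self._rows:
            if sid == session_id:
                return u
        return None


def login(store: SessionStore, session_id: str, user: str) -> str:
    """Caller code written only against the interface."""
    store.put(session_id, user)
    return f"welcome {store.get(session_id)}"
```

Swapping `DictStore()` for `ListStore()` (or, in a real system, for a client of some external service) changes nothing in `login`, which is the property being argued for.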
>Modern web stacks are built on the idea that you can delete a file and re-implement its interface with a different service behind it later on.
and that paradigm is exactly why my domain is the stark opposite of front end web development. Code I write may be used by thousands of other engineers over decades. I can't take into account every edge case, but I do try to write with the assumption that one day some archeologists will uncover that code as some Rosetta Stone. Of course, modern demands never let me reach that ideal, but it teaches care and good documentation.
This is what "top down" coding looks like. In "top down" coding you're told what to do by the boss, given too short a timeline, and forced to push something out the door.
Congratulations on figuring out how to create a massive churn rate among your actual good engineers.
I understand you're characterizing the word "good" as some bad definition of "good" used by some subset of others, but your quotation marks are doing a Herculean amount of lifting (i.e. it's hard to tell what you're saying). I think most people would consider "engineers who value solving problems" as "good" and vice versa.
I'm just using "good" here as defined by the parent comment I replied to, to make the point that I believe the engineers as he described wouldn't value solving problems as much as they value "good" code.
It goes hand in hand. We're still talking about engineers here, not pure computer scientists. And if we want to call them "engineers" they should understand short and long term ramifications of any decision they make, something reflected in the code they are responsible for. Few other engineers would ever accept a paradigm of "make fast, re-implement fast", but since that's a luxury a software engineer has, it comes more down to understanding what the business needs and using the right tools.
Taken at face value, this advice is probably just as bad as the opposite ("write perfect extensible modular code with 100% documentation and test coverage"). There is a lot of room in between where the enlightened developer can find happiness.
You misunderstand; nothing about what I suggest says you shouldn’t write, “perfect extensible modular code with full documentation and tests.”
If that’s what you need to do to solve the problem, do that. The point is to stop focusing on the code as the work product, and instead focus on the solution as the work product, of which code is one part.
If you're at the start of a company writing an MVP, then you're spot on. If the code isn't being written for a startup in the early stages, this doesn't make sense.
Writing bad code that just "gets it done" is the proverbial broken window. It's how you end up with shitty code-bases that get shittier with every change, until it all collapses under its own issues.
Doing this on established code-bases is basically what the classical duct tape programmer does. Sure, you deliver "business value" in the short term (normally to claim credit and gain favor) but at the expense of everything and everyone else.
I took it as a compliment. Some value "productive" engineers, while others value "curious" ones. curious ones are probably for companies that want to retain talent and invest a lot into R&D. productive ones are for when you need to get a product out fast before anything else.
Not saying one is better than the other. Sometimes being first across the line is make or break for a company. But I wish companies could be more honest about what they want.
Writing good code doesn't necessarily take more effort than writing bad code; in fact, my experience tells me otherwise. Teams that write good code iterate faster.
The point I'm trying to make is that the pursuit of "good" code is futile, as it does take more effort to achieve than writing functioning code.
But in a way you're right; if there is no cost to ensuring your code follows more conventions, go for it. That's exceedingly rare, however, as a situation to be in.
Everyone's opinion is based on the code they actually work with (and devs have lots of personal projects, which adds bias, because they think stuff is a good idea just because it works on 1000 line projects)...
I'm guessing SOLID and DRY are a big part, but lack of cleverness, language choice that doesn't require cleverness, and heavy reuse with libraries and frameworks, and choice of feature complete libraries is probably a lot of the speed.
Something in C is going to be more work than JS or Python or Dart and work takes time. Code you write takes more time than code that already exists, unless the code that exists disappears one day.
>Or is it because it's written in a manner, easy to understand, easy to extend, and easy to throw away and rewrite if necessary?
This. But IME when you write for this purpose you tend to end up with code that is in fact confusing, not documented, and not test covered. So you need to slow down even if your goal is to one day completely re-write that module.
Perfect abstractions are never perfect unless you completely architect out the product, down to every single edge case. That's virtually impossible with a large codebase, be it due to an API or even compiler level bug.
It's abstracted a little bit, but every abstraction is designed to lift a roadblock and reused multiple times like an old tool. And special effort has been spent on avoiding technical debt (hard coded stuff, untested stuff, overloaded or missing properties) to the point of doing things with less effort.
Statistically speaking, most developers are not working for startups. And of course, large companies have the sway and politics to go around not being first. e.g. Apple isn't worried about missing the market, they are trendsetters and they light a market up even when 10 years late to it.
My craft is not leaving a dumpster of a solution to the next people to look at my code when business logic inevitably changes. If someone's solution to seeing my code is "we gotta re-write it", I have either failed or a much more novel solution was discovered that makes my code irrelevant. I hope it's not the former.
Your “craft” is not to write code, it’s to solve problems. If you can’t be proud of the problem you’ve solved, you are no engineer. That’s the main difference between an engineer and a scientist.
Sure, and a civil engineer can solve a water leakage problem with duct tape. Real prideful moment.
As I said, it depends on the project. I'm not going to approach a leaky faucet the same way I would an industrial sewage pipe. Fortunately the industry is big and I can choose to work on larger problems where longevity is valued over throughput. You're fine whipping together a React App in a weekend while I'd be more on the end of maintaining the React repository. To each their own.
I am guessing that your definition of "problem" is a bit too narrow and simplistic.
> Your “craft” is not to write code, it’s to solve problems.
Writing code is part of solving problems.
The quality of code has an impact on the various aspects of how a group of people solve problems.
As a small example, consider onboarding time for a codebase/system. Longer onboarding time means lower profits at the end of the day for the organization (I'd also argue that longer onboarding times correlate with higher talent attrition). And code quality has a strong influence on onboarding time.
Code quality has an impact on the "debuggability" of your systems. How quickly can you fix stuff when things go wrong?
Code quality has an impact on the "deployability" of your systems. If your code is well-done it is easy to deploy, redeploy, etc.
I can cite maybe 10 more properties crucial to org health, which are influenced at least partly by code quality.
So, "the problem" is not as simple as it may seem at first glance.
Or maybe “code quality” really means how visually pleasing the code is, or maybe it means it takes up the least amount of disk space, or what if it means code that doesn’t use the letter ‘e’?
This is why the pursuit of “high quality code” is pointless; you are not an artist, you are aligned to solve a problem. If you do that, code or not, you are doing good work as an engineer. If you are not, you are not. Whether the code fits some arbitrary definition of “good” separate from your ability to solve a problem doesn’t play into it.
You refuse to define the problem in a comprehensive way in the first place. That's the meta-problem :)
The second issue is that you think it is impossible to define an "abstract good" in code quality given a specific context (team, product, market). The "abstract good" stems on its own for the given internal culture and market situation. Through some common sense examples, it is easy to see how "abstract good" wields influence on "practical parameters" critical to business survival/thriving.
As an analogy, I can say the "body is healthy". I am aggregating a bunch of metrics to say - "this is healthy". It doesn't mean the term "healthy" is meaningless. The term "healthy" has useful meaning although not at a mathematically precise level. One could even argue that the term "healthy" captures something even precise mathematics cannot capture (it's abstracted at a higher level). Apply similar argument to the term "code quality".
Edit: Maybe it is better to explore the idea of code quality "via-negativa". Find what's actively harming beneficial outcomes. And remove it. If you cannot find many harmful things, then it has high code quality.
You’re falling for the trap I’m suggesting you avoid. Stop caring about platonic ideals of what Good Code ought to be, and start focusing on how well the code solves the problems it was built to solve.
You can call that good code if you want, but my argument is to stop caring about the code’s “quality”, as a value it carries independent of the problem.
The physician operates "via negativa". He tries his best to find faults with the given body, and when he can't find any, he calls it "healthy".
The engineer/businessman can look at the code from an empirical point of view.
If onboarding is bad -> code is bad
If understandability is bad -> code is bad
If deployment is bad -> code is bad
And so on. As you eliminate these issues, your "code quality" increases (just like as you eliminate disease, the body becomes more healthy).
Look into say, Taiichi Ohno's Toyota Production Management - one associates "zero defect" ~= "quality". So, the term quality alludes to a continuous elimination of faults and shortcomings.
The aggregate placeholder/banner term 'code quality' stems from very firm practical sources, that can be inspected, amended and improved.
Sure, but you are in agreement with what I’m saying; good code implies specificity towards the line-by-line writing and structuring of the language (that’s what “code” is, more or less), and I argue that’s useless, as do you here by citing examples of how the code solves the larger problem in the system.
The first is the business problem, most important.
The second is the problem of maintaining and iterating on the solution to 1…
Without solving the first problem, the business dies and you don’t even need to care about the second problem… however the second problem can also kill the business if not solved eventually…
I am a self taught LAMP guy that wrote a mini saas that 34 companies pay for and use.
It pays the bills, and works surprisingly fast compared to most CRMs, but I can assure you, it is not good code. Even I'm pissed how shitty I let it get.
With the caveat that "easy to throw away" means "easy to understand the implementation front to back so you can know the full impact of throwing it away without crossing your fingers"
I don't know about front to back, a good module should be explicit on what it affects and _ideally_ abstract its implementation.
We were talking about what a module should aim to be, and that aim is a good proxy for a lot of properties of well architected applications, even when not perfected. And even when you don't plan to ever throw the module out.
A painter might study and practice enough to be as good as Michelangelo, but the attempt to be 'more good' or 'better than Michelangelo' stops as soon as he starts a serious painting. He has to finish the painting with the skill he has at that point.
I agree code should be unattached and you can throw it away. As soon as you start coding a program for a client, your attempts to become 'more good' as a coder and to write more ideal code, have stopped and it's time to make some code you can throw away or sell to the client.
But all the 'training' paintings you made on the journey to becoming 'good' have to be kept, because you trashed thousands of attempts along the way in pursuit of an ideal painting. The good painting is framed and hung on my wall. The commercial painting is sold to the client or trashed.
Yes, give up on 'good code' at work. Keep the ideal of good code as a direction to improve towards, not an end result in commercial works.
As soon as 'good' became explicitly defined/bike-shedded it died anyway..... it's an infinite direction, not a limited thing that can be defined and boxed up.
Comes down to semantics, really. Is "good" equivalent to "acceptable" or does it mean "to be striven for"?
Code that works is, usually, acceptable. But code that is acceptable while making no unnecessary maintenance trade-offs is much better. Good code is code that uses standard techniques in standard ways to achieve a result without being verbose or inefficient. But that is a much higher bar than code that is literally good enough.
>isolated enough that rewriting it won’t cost absurd hours.
I mean, juniors can do this. Once you're a senior, there will inevitably be some coupling you need to make in order to "pay the bills". Or you may make the first part of a system that will be a pain to re-write, even if it's the most elegant, readable code ever.
This is uncharitable. The GP's comment is short. What people have an opinion about is, "Give up on 'good code'" and, "working code, code that pays the bills ... code you can throw away easily, code that you are wholly unattached to".
Big claims require big justifications. I can't just go out and say "C++ is better than Javascript" and expect people to not scrutinize me because "well it was a short comment".
But hey, they do justify it in responses. So maybe they are indeed playing to their philosophy of "work fast"
I would say a take on this is good code is code that is worse than the next code you write. Don't pursue perfection, pursue improvement over time. Make things work but learn new stuff that makes your old stuff look foolish. Improvement is more important than perfection.
Is there any particular language you're looking for? I've found some languages hideous until I understood them and could appreciate their respective graces. Off the top of my head I can think of a couple of projects you and others may be interested in.
The first is Jones Forth (https://github.com/nornagon/jonesforth), start with jonesforth.S and move into jonesforth.f. I really enjoyed following along with it and trying my hand at making my own stack based language.
The other is Xv6, a teaching operating system from MIT (https://pdos.csail.mit.edu/6.828/2021/xv6.html), not all the code or implementations are top notch but it shows you non-optimized versions (just because they're simple and more readable) of different concepts used in OS design.
If you're interested in the embedded world, there is a really neat project I've been following that feels more structured and safe (as in fault-tolerant) while still staying pretty simple (both conceptually and in the code itself): Hubris and Humility (https://hubris.oxide.computer/).
I find that reading books rather than code tends to be more helpful in terms of finding good takes on what clean code is, more specifically books on refactoring or specific language-related features (like 'Effective Java' or 'Fluent Python'). The issue with just reading code is that many times you'll miss out on why the author chose the expressions or abstractions they used. Reading a book at least takes you through the author's thought process. For an alternative, you could always browse repositories which contain notes on refactoring as well, like this one (which does a good job summarizing some of the key principles from Fowler's book on refactoring):
Rust stdlib code is quite high quality although not particularly dense due to large amount of comments. Start from the docs, and click any source link: https://doc.rust-lang.org/std/vec/struct.Vec.html
I learned a ton about Java and pragmatic algorithm development from reading Dagger (https://github.com/square/dagger) and porting it to C#. It's small enough that you can grok it in a reasonable amount of time, but sophisticated enough that there's a lot to learn. (Yes it's deprecated in favor of Dagger 2, but the latter is a tougher slog IMO)
Actually most of the big Square OSS libraries are great to read - okio, okhttp, picasso.
The best – and I think the only – way to discover what good code looks like is to work with it.
Eventually, after working on, say, half a dozen code bases, you'll start to understand intuitively what good code is, providing you get lucky enough to find a good code base, or a code base with a significant amount of good code.
It's a long old journey, but once you have the skill, it never goes away. It's like learning a musical instrument or a foreign language. (By which I mean you can read as many books as you like about it, but without application, you haven't yet begun. Nevertheless, read the books.)
Warning: most developers never attain this skill, but almost all of them believe – truly believe – that they write good code; just as everyone thinks they are a good – nay excellent – driver.
Warning: no one writes good code. Good code becomes good through iteration, just as good writing becomes good by iteration/editing. The reason for this is obvious; but if you don't know why, then you haven't done enough yet.
Warning: everyone has biases. Learn to recognise yours and when you are applying them. Learn to ignore them and see things through a different lens. Explore with an open mind.
Iterating to good code is one of the most satisfying things you can do with software development.
Not exactly an emphasis on coding style itself, but I would recommend checking out the "The Architecture of Open Source Applications" to see examples of how some large and popular open source projects are structured.
Reading code is second to using it and working with it.
Someone can tell you "This is good code." but good for what? Why is it good?
Is it fast code? Is it highly maintainable? Is it well documented and kept up to date? Is it code that is highly reliable? Is it code that solves an important problem?
My rule of thumb is: Ugly code usually comes from ugly problems. Ugly code can often be some of the most valuable code, because... it does the ugly things! It does what we want 99.9% of the time, using heuristics, and other nasty stuff.
So don't judge code on if it is "good" or not. Judge it on if it does what the author intended, and if it doesn't suck too badly to read with no reason.
Code bases I've worked in and have opinions on:
Samba: Good code base, but you MUST understand the idioms of the codebase, or it is absolutely horrible. It also alas, has the wisdom of 20+ years of existence in it... so it isn't always pretty.
Illumos/OpenSolaris: Nice codebase. Get the SmartOS distribution and you can literally type a few commands and build an entire OS and userland.
FreeBSD: See above. Great codebase, and also, it can build userland + kernel, though it takes a few more commands. I'll admit I haven't read this one in 20 years. But I always found it a good codebase to work in back when :).
Grab the source for a library you use all the time, you know the useful one but the API feels a bit off... Download it and look at why the API is the way it is.
When looking at code, do NOT neglect looking at the history of a given file or piece of code, it often can teach you quite a bit. :)
The Elixir standard library is quite readable I’d say, but it all depends on what you’re interested in? I’d recommend learning Elixir as it’s immutable and a very simple language and does concurrency better than everything else (by inheriting from Erlang/OTP on which it is based) and the packages don’t seem to exhibit the needless churn that goes on in the world of JS, for example.
To folks starting out with Elixir, I suggest reading its standard library. From my experience, there was this aha moment when I started reading `Enum` module.
Also, Elixir's documentation is one of the best out there.
I would just find a heavily used, well-loved open source codebase used by lots of people in production. You need to define "good"/"bad" as actual objective qualities you can see in the world.
I would look for places the code seems to be really liked by its users. Maybe it's very reliable, or extensible, or fast, or something else. How do they achieve that? Why do the users say these things? How do they measure / focus / make tradeoffs to focus on those attributes?
Then for painful to use software & common painpoints, why does this happen? Is it a fundamental design decision? Is it just sloppy code? Is it just intentionally slow to be more user-friendly? Or hard to read code because the focus is on speed?
It's all about the tradeoffs and intentional choices...
Zulip is an open source chat system and has fantastic technical architecture docs, a team that cares a lot about code quality, and is very receptive to working with new collaborators. https://github.com/zulip
This isn't exactly a repo to look at, but the book "Clean Code" is a fantastic read for learning how to write good code. It does have a lot of examples in it, and does a great job explaining everything. https://github.com/jnguyen095/clean-code/blob/master/Clean.C...
I don't like clean code, and like the author I find the things presented in Clean Code to just make incredibly inscrutable code. The worst code bases I have seen have been the ones with no code plans at all, and the ones that use Clean Code and Gang of Four like a bible, both are equally mazes of spaghetti. Clean Code I really just don't agree with, and I think this author lays it out well. Special design patterns and tons of polymorphism usually end up creating multiples of complexity in the effort to reduce DRY at any cost (and often are abstractions that are only ever used one or two times anyways).
Ultimately I think the single most important rule for clean code is: skinny controller, fat model. If you are doing batch data, then this applies still I think. You should have all of the logic you can in the model, avoid data objects. And the code paths that alter things should be as thin as possible. I honestly think it is better to have a 5k line model if it avoids more.
The most unclean code I have seen usually falls into the abstracted out processes in financial institutions where they follow Clean Code advice and everything is a bunch of functions passing around some big fat objects full of getters and setters, and changing any functionality means adding changes somewhere in the process to check state and alter the state, which leads to loops and if statements everywhere to see which account type it is at each point etc.
But in OOP programming it's supposed to be objects sending messages to each other. Every object should know everything about itself, which is what a fat model demands. Any more abstraction than that seems to get in the way.
The popular OOP seems to be exactly the opposite of what OOP was meant to be. If you get into the original intentions, it honestly starts to sound more like FP.
Popular OOP passes Structs around (they just call them records or POJOs or whatever) through a bunch of "classes" that do x. But you could rewrite the code from Java or C# or C++ into C or Cobol and it basically is the same. It's just imperative code with classes as a nice way of getting rid of globals.
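To make the contrast a bit more concrete, here is a minimal Python sketch of the two styles described above. Everything in it is hypothetical (the account kinds, the fee rule, the names); it is only one way to read "fat model, skinny controller", not anyone's actual production design.

```python
from dataclasses import dataclass


# Style 1: a data bag passed through procedural code.
# The record knows nothing; every caller has to re-check the account kind.
@dataclass
class AccountRecord:
    kind: str
    balance: float


def charge_fee_procedural(record: AccountRecord) -> None:
    if record.kind == "savings":
        fee = 0.0
    elif record.kind == "checking":
        fee = 0.0 if record.balance >= 1000 else 5.0
    else:
        raise ValueError(f"unknown account kind: {record.kind}")
    record.balance -= fee


# Style 2: a "fat model". The rules live with the data they apply to,
# so callers only send a message and never inspect the kind themselves.
class Account:
    def __init__(self, kind: str, balance: float) -> None:
        self.kind = kind
        self.balance = balance

    def monthly_fee(self) -> float:
        if self.kind == "savings":
            return 0.0
        if self.kind == "checking":
            return 0.0 if self.balance >= 1000 else 5.0
        raise ValueError(f"unknown account kind: {self.kind}")

    def charge_monthly_fee(self) -> None:
        self.balance -= self.monthly_fee()


def charge_all(accounts: list[Account]) -> None:
    # The "skinny controller": no business rules here at all.
    for account in accounts:
        account.charge_monthly_fee()
```

In the second style a new fee rule is a change in one place (the model), while in the first style every function that touches an `AccountRecord` tends to grow its own copy of the `if` chain.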
Interesting read. I read a companion book (or is it wholly unrelated?) called Clean Code for Python and I learned a whole lot! That book improved my Python more than anything else, honestly. That being said, I agree with the critique that obsessively committing to DRY is throwing the baby out with the bathwater.
Clean Code is dreadful, or at least the programming examples are.
It was a good book of its time in that it was influential and encouraged people to think more deeply about how to make code readable, but even when it was published I thought it had some terrible advice.
It's probably still just about worth reading, as long as you ignore all the code examples and appreciate that some of the thinking is out of date, and a lot of the rest is controversial at best.
I also find the style grates, as "Uncle Bob" is far too full of himself and e.g. "rips apart" someone's code to produce a worse refactor.
There’s so much conflicting advice in this book (like saying functions should be immutable then also saying the ideal function has 0 arguments, apparently mutating the class doesn’t actually make the function mutable to Bob Martin!), and the code examples themselves are horrible (the prime number generator for example).
I’ve heard people say it’s a good book if you just ignore all the bad stuff, but how are you supposed to know what the bad stuff is if you’re a beginner? I think it’s time to stop recommending this.
> like saying functions should be immutable then also saying the ideal function has 0 arguments, apparently mutating the class doesn’t actually make the function mutable to Bob Martin!
What does "functions should be immutable" have to do with mutating classes?
I should have been more specific, the way Bob Martin phrases it in the book is functions should have 0 side effects. Then he shows an example at the end of the chapter where the entire example works by creating a giant class full of member functions that mutate the owning class. I (and I think most people) assume that a function with 0 side effects means that if you call the function, the state of the object should be the same before and after the function call (and the rest of the external system should remain unchanged). But, according to the examples in the book, it seems like Bob Martin only considers it a side effect if some external state is modified as a result of the function call.
The best example of a function without side effects would be sin(x). You call the function with an input and it returns a completely new output. The function should be thread safe and easy to isolate because it never touches any outside state.
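A small Python sketch of the distinction, using made-up names (`Counter`, `increment`, `incremented`) purely for illustration. The zero-argument method only "works" by mutating its owner, while the free function behaves like `sin(x)`: same input, same output, nothing else touched.

```python
import math


class Counter:
    def __init__(self) -> None:
        self.total = 0

    def increment(self) -> None:
        # Zero explicit arguments, but it changes the object's state,
        # so calling it twice leaves the object different each time.
        self.total += 1


def incremented(total: int) -> int:
    # No side effects: the result depends only on the argument,
    # and nothing outside the function is modified.
    return total + 1


counter = Counter()
counter.increment()
counter.increment()  # the object is now in a different state than after the first call

same_every_time = incremented(0) == incremented(0)
also_pure = math.sin(0.5) == math.sin(0.5)
```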
Performance oriented code can be hard to maintain. Writing things in unintuitive ways to try to coerce the compiler into doing fewer instructions can result in code that's harder to read.
I’m hesitant to leave a comment because “good code” almost feels like answering “good food”, but I do have some opinions:
- Ratio of code:documentation INSIDE the source code
- Directory structure depth is “just right”; not too deep nor too shallow
- Number of dependencies is “just right”; don’t build things yourself, but also don’t import the whole world
- TTLD (Time To Local Dev); how simple is the getting started guide in terms of copy-pasta commands + automation + the right amount of context + easy-to-use tooling
- Code culture; follow industry best practices and make it clear where & why you deviate
My personal favorite one: `make todo_list`
We use keywords (TODO, OPTIMIZE, HACK, etc…) through the codebase and make them easily searchable with make helpers.
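For anyone who wants something similar without make, here is a rough Python sketch of the same idea. It is not our actual `make todo_list` target; the function name, the tag list and the default file suffix are assumptions chosen for the example.

```python
import re
from pathlib import Path

# Tags to look for; extend the alternation to match your own conventions.
TAG_PATTERN = re.compile(r"\b(TODO|OPTIMIZE|HACK)\b[:\s]*(.*)")


def find_tags(root: str, suffixes: tuple[str, ...] = (".py",)) -> list[tuple[str, int, str, str]]:
    """Return (path, line number, tag, message) for every tagged line under root."""
    results = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        lines = path.read_text(errors="ignore").splitlines()
        for lineno, line in enumerate(lines, start=1):
            match = TAG_PATTERN.search(line)
            if match:
                results.append((str(path), lineno, match.group(1), match.group(2).strip()))
    return results


if __name__ == "__main__":
    for path, lineno, tag, message in find_tags("."):
        print(f"{path}:{lineno}: [{tag}] {message}")
```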
In the early 1980s, I inadvertently agreed to produce a 100-page manual for the student programming environment I had designed and implemented at the University of BC. I stupidly decided to do it in TeX. The stupidity here was that (a) the only laser printer on campus was in another building and (b) the TeXBook didn't exist yet. All I had was a listing of a late prerelease of the Web source of TeX, along with some of Knuth's writings about TeX77, which was a substantially different system. I managed to do it, including formatting much of the manual in the style of Unix man pages (the environment itself lasted for several years, until mainframe timesharing for students went away for good). When I finally got my hands on the TeXBook, I found that I had learned much about TeX, including a few things that weren't so :).
Knuth's programming style is highly idiosyncratic, and there are many points with which I'd disagree. Furthermore, the choice of Pascal required all sorts of strange compromises. That said, I highly commend reading the source of TeX, to see how a brilliant computer scientist attacked a problem that is not nicely structured, using very restricted programming tools. You will not get much that you can copy into your programs, but it is an excellent, well-documented attack on an interesting and complex problem. (By the way, Knuth's TeX was written long before modern typesetting things like Unicode, PDF, and OTF fonts; you don't want to use Knuth's TeX for modern work; I use LuaTeX. But Knuth's version remains as a useful subject of study.)
Another good thing to study is Lions's commentary on Unix V6. This is interesting because it shows many of Unix's key abstractions implemented in a few thousand lines of code.
Yes - I wouldn't recommend someone use WEB to program in unless they were already Knuth-like, but the concepts and examples of "how it is done" are worth reading.
Just like if I were to write a book, it wouldn't be in the style of Plutarch's Lives, I'm still glad I read them.
I'd encourage you to first be opinionated about what you think makes good code. Maybe read some articles or books on it. Don't worry about being right or wrong--someone is definitely going to disagree with you no matter what you think.
And then read any code through that lens. Then read some different code and contrast it. What did you like more or less? What worked and what didn't? Where did it work and where didn't it?
Remember that code that is fantastic in some ways is often horrible in others (e.g. the legendary fast inverse square root).
Approaching it this way helps one consider the reasoning behind what makes certain code good, and forces one to examine the context of the code, which is also critical. And being opinionated helps you remember to apply those rules in the future.
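For anyone who hasn't seen it, this is roughly what the fast inverse square root looks like when ported to Python. The original is C; this port (using `struct` to reinterpret the float bits) is only a sketch to show the shape of the trick, and it is certainly not faster than `1 / math.sqrt(x)` in Python. It assumes a positive input that fits in a 32-bit float.

```python
import struct


def fast_inv_sqrt(x: float) -> float:
    """Approximate 1/sqrt(x) for positive x using the well-known bit trick."""
    # Reinterpret the 32-bit float as an unsigned integer.
    i = struct.unpack("<I", struct.pack("<f", x))[0]
    # The "magic" step: a shift and a subtraction give a first guess.
    i = 0x5F3759DF - (i >> 1)
    # Reinterpret the integer bits as a float again.
    y = struct.unpack("<f", struct.pack("<I", i))[0]
    # One Newton-Raphson iteration to refine the guess.
    return y * (1.5 - 0.5 * x * y * y)
```

Fantastic in that it gets a usable approximation out of a shift, a subtraction and one refinement step; horrible in that nothing in the code tells you why the constant works or how accurate the answer is.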
And there are definitely some languages to avoid altogether if one is trying to learn. I have opinions, but they'll just spark flame wars. Googling worst languages will work.
The fact is some programming languages - either by design or culture - just encourage unbelievable crap. Obviously, any code can be written well, but if one has to look for a needle in a haystack it's not worth it.
I don't really know what to expect. You want to see good production code, and the FOSS community is way way WAY better at this than some of the bubble gum you'd see in a professional setting.
Your question is too general, so I can't exactly give you a specific repository. I could direct you to BGFX[1] for a decent architecture of a cross platform renderer, but if you're not a graphics programmer, that may be a bad exercise, as you'd spend more time learning jargon than studying clean code. Or it uses patterns (or lack of, given graphics programming) that don't apply to your domain.
Good code is different depending on your sense of aesthetics, but also its purpose.
Good code for an enterprise-life-blood type of system is terrible code for a "let's check if this is an idea" type of prototype (and vice versa).
For a relatively large and mature project, someone might suggest Linux, but it's a bit hardcore in my opinion. FFmpeg is actually really nice, and you can wrap your head around the general idea of how that system works, to the point where you can comfortably add new options and even introduce new codecs and containers in a few days.
I would add to your points. Just read code. When dealing with code you are most of the time going to be dealing with 'good' and 'bad' code. Learn to read both. As both are going to be around. If you just go to 'good code' how do you know what 'bad' looks like? Also to your point what is 'good code' can be very subjective. I have one language I use a decent amount. There is a particular style that many use that they consider 'good code'. I think it is a terrible bit of style to do. I have my reasons for it. But good/bad is not necessarily a metric you can measure.
I would posit that discarding github out of hand is not a great start. Pick one of the big projects on there and follow what is pushed in. You will notice how others interact with the code. You will start to notice who checks in good stuff. Follow them. Also pick your language. A good style in one language could be a bad style in another, for example C++ vs Python. Python styles in C++ would drive the C++ guys nuts and the other way around.
This is an interesting article on the subject of reading code by the coder and writer Peter Seibel (he also wrote a nice book called Coders at Work with interviews with some programmers who have worked on interesting and widely used software): https://gigamonkeys.com/code-reading/
I’m sure I’ve seen some HN posts about this article, but I can’t remember the content of such.
Note: Readability, coherent organization, and logical correctness are often more important than obfuscated "hold-my-beer"/Heisenbug code.
In general, have a look at the Standard Template Library or Boost examples directory. Then some unit tests for simple GNU programs similar to what you are building, CLI command source like "ps" for OS interactions, and finally an OS kernel like Linux or *BSD. There are also several online classes offered by linuxfoundation.org etc.
Start with a small SBC like a pi4/BeagleBoard, and learn how to snapshot disk images (you will severely damage things while learning). There are also several open syntax formatting standards published by projects (and companies like Google), that will guide you on the local ecosystem.
Expect a Hazing in some places, as some folks tend to forget they were students once too.
It would also be wise to spend a few days studying security-auditing-tools, as one may learn to mitigate common ways people will try to break stuff. Detection and incident-handling is arguably more important than outright prevention.
When studying protocols, you can compare apples-to-apples because the protocol has to work the same way by design, but the implementation can vary. With compilers, you're getting a look into programming in its maximally symbolic form - and every strategy a compiler uses is one you can directly apply to abstract your own code. And commercial video games have another mode of apples-to-apples in that the original release - the dirty, meets-deadline stuff - often can be compared with fan remakes and patches, which have the luxury of an exact specification and no deadlines. To actually ship in industry, you have to accept and know good dirty code hacks, but it's worth comparing them to their counterparts.
When I'm coding / debugging with a framework of some kind (nodejs, dotnet, jdk, wordpress, whatever I'm using to deliver value at the time) I sometimes "step into" the framework or a library to see what happens. (Of course, I need open source for the framework/lib for this to work.) It sometimes opens a window to interesting code to read that's relevant to what I'm thinking about at the time. And sometimes doesn't.
Lots of source these days has auto-documentation comments. Good IDEs present that documentation, which helps guess what might be worth diving into the Step Into rabbithole.
Often, from a high-quality framework / library I learn a bunch about handling weird edge cases and about writing code for long-term maintainability. And, I often learn some useful constructs and techniques. (And, it's possible to learn useful things from not-so-high quality code too.)
An event/threading library for C#. I keep a fork in my Github because the original source was archived: https://github.com/GWBasic/retlang
Note that both examples are "functionally obsolete." The Fitbit studio environment is deprecated in favor of Android Watch; and if you're using C#, you should be using Tasks to get similar functionality to Retlang.
I would suggest starting with reading a spec for your favorite language. I got a lot of value from spending time with the Ecmascript (JS) spec, as well as the Postgres manual. Take something like Generators that may be less familiar to you, then plunk those expressions into sourcegraph or github search and find them used in real codebases. Try to understand why the developers chose to use a generator instead of another pattern.
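As a tiny illustration of the kind of question to ask, here is a sketch in Python rather than JS (Python generators are a close analogue, but the function names here are made up). The thing to notice is what the generator buys you: the caller decides how much work gets done.

```python
from itertools import islice
from typing import Iterator


def squares_list(limit: int) -> list[int]:
    # Eager version: computes and stores every value up front,
    # so the caller must know the limit in advance.
    return [n * n for n in range(limit)]


def squares() -> Iterator[int]:
    # Generator version: produces values one at a time, on demand,
    # and can describe an unbounded sequence.
    n = 0
    while True:
        yield n * n
        n += 1


# The consumer takes only what it needs from the unbounded generator.
first_five = list(islice(squares(), 5))
```

When you find a generator in a real codebase, a useful exercise is to ask which of these properties (laziness, unbounded sequences, early exit) the authors actually needed.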
I think reading code itself is only valuable when you need to explore a specific domain. Trying to extract coding patterns from an unfamiliar domain is very difficult; it comes with unseeable assumptions.
I would argue that if you find any successful open source project on Gitlab, Github, or the code forge of your choice, you've likely found some form of "good code".
Any code that is used and minimally hated by a large number of people is good code in my opinion. It's valued by its users, and ultimately that is what matters.
It may not be perfectly DRY, use popular abstractions, or whatever people today think of as ideal code, but successful software projects solve real problems every day. The authors have likely done a good job of balancing usability with code quality. And I think that's the best we should hope for.
Read source code from projects that are used by a wide range of people. Some tiny repository might look nice at first sight, but it could be a result of limited scope, and a production codebase always involves a large range of scope.
Even if there are some caveats/hacks in these codebases, they might have a valid cause and you can learn something from how the code evolved or how it (dirtily or gracefully) solves the issue.
For me, I read code instead of docs when I need to understand a dependency or tool better. That gives me direction and focus when reading code. You learn a lot from the process.
Start a reverse-engineering project like I did, and you'll quickly find yourself reading 10s of thousands of lines of assembly. Or at best, decompiled generic C.
What do you want to build? It's much easier to find code to read, and motivation to read it, when you have a goal to work towards. For example, when writing an interpreter, I read code for various interpreters. Looking at a variety of projects from different people gave me a lot of ideas and examples of different approaches to solving the same problem.
There was a HN thread months ago on this topic. I recall people suggesting well known github repos. I’m unsure which ones but you could start at github…
> I generally wrote my algorithms in Pascal (later in Modula or Ada and by Borland days, in C++) and then hand compiled down to assembly
Beast mode. This is a great way to understand more about how the high level code we write actually performs. I learned the basics of compilers long ago but never thought to apply it in this way, where you can have a reference implementation against which to test an assembly rewrite. Very cool!
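The same idea works at any level, not just Pascal-to-assembly. A minimal sketch in Python, with a bit-counting function standing in for "the algorithm" (the function names and the choice of popcount are my own, purely for illustration): keep the obvious version around and check the clever rewrite against it on lots of inputs.

```python
import random


def popcount_reference(x: int) -> int:
    # The obviously-correct version: count '1' characters in the binary string.
    return bin(x).count("1")


def popcount_fast(x: int) -> int:
    # The "hand-optimized" rewrite: clear the lowest set bit until none remain.
    count = 0
    while x:
        x &= x - 1
        count += 1
    return count


def check_against_reference(trials: int = 1000, seed: int = 0) -> None:
    # Differential test: both implementations must agree on random inputs.
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.getrandbits(32)
        assert popcount_fast(x) == popcount_reference(x), f"mismatch for {x}"


check_against_reference()
```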
Rather than seeking an impossible ideal (>2 people who agree on what 'good' code is)-- instead read ALL sorts of code. This will then help you learn to distinguish between what is good and not good, and what makes it so. Simplicity, clarity, and lack of surprise come to mind as concepts for 'good'. Others will have different ideas.
I recently came across Beanie. A Python ORM for MongoDb. A pleasure to work with and integrates well with FastAPI, the tests document the code well, and at this point it’s only as complicated as it needs to be.
Chromium. I don't know how it looks right now, but several years ago lots of its parts were really good. Modular, with a lot of extensibility points, easy to read. (That may have changed, so another opinion would be useful). It was quite different from Firefox code, that looked surprisingly convoluted.
The best code is no code. I would stop chasing these kinds of rabbits and start looking harder at the actual information being processed throughout. I'd also focus more on the actual business & customers over any specific coding practice.
You would hopefully find that the most compelling aspect of well-designed software systems is the data. In "data-driven" applications, 100% of the application state and configuration can be made to live in a database somewhere. In these scenarios, seeking code examples is not going to tell you much of anything.
My advice is to look at a bunch of SQL schemas (ideally, ones you know to be under successful products) and compare them to the problem area they support. Think about how you would answer questions a reasonable person might ask of that business by way of a query. Then, consider how much code you just now did not write to answer a realistic business problem.
Relational modeling can eliminate entire repositories worth of bullshit code that should have never existed in the first place. Do you want to train yourself to rely upon something that a true wizard can walk in and disapparate at the snap of his fingers?
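A toy version of that exercise, as a sketch only: the schema, table names and data below are invented, and SQLite via Python's `sqlite3` is just a convenient way to run it. The business question is "which customers bring in the most revenue?", and the whole answer is one query.

```python
import sqlite3

# In-memory database with a hypothetical two-table schema.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total_cents INTEGER NOT NULL
    );
    INSERT INTO customers (id, name) VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO orders (customer_id, total_cents) VALUES (1, 1500), (1, 2500), (2, 700);
    """
)

# "Which customers bring in the most revenue?" answered without any application code.
rows = conn.execute(
    """
    SELECT c.name, SUM(o.total_cents) AS revenue_cents
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY revenue_cents DESC
    """
).fetchall()
```

Compare that to the loops, dictionaries and accumulator variables you would otherwise write to get the same answer out of objects in memory.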
Good code? Code reading is more like understanding different cultures. Read codebases from big tech like Meta and Google, through to startup codebases like OpenAI and comma.ai, and you will see.
Thank you to everyone who responded! I’m overwhelmed by your responses and reading through each one. Getting this amount of comments means a lot to me, because I usually get something like 4 to 5 comments. Thank you.
Sometimes it helps to look at pull requests/merges on large repos. You can see exactly what the change was, what code was touched, and discussions on those changes so that helps you get a feel for what's good and bad.
My personal nomination for code you should read next: the source code for Android.
The problem with "good code" is that it tends to be code that -- more often than not -- is written by people I wouldn't trust to work on real-world large-scale software development.
On the other hand, any sufficiently large project that was successfully delivered necessarily contains almost exclusively "good code", because if it didn't, it would have collapsed under the overbearing weight of "bad code".
What makes the Android source code interesting reading:
- It's written by some of the best software engineers in the business, by any imaginable standard.
- It is an unimaginably successful project.
- It is code that deals with the gritty, enduring reality of programming in the large that cannot reasonably be addressed by toy samples of "good code". That's where real "good code" lives.
- As a programmer, that's the kind of scale I want to work on. Preferably as one of the principal engineers working on Android 1.0. But Android 14 wouldn't be awful.
Overwhelmingly, it is exceptionally good code. Occasionally it is less than happy code. But the places where it is less than happy are almost more interesting than the places where it's good.
The code is 14 years old. It's been through 14 major releases (34 minor releases). It started on phones with 320x200 displays, with megabytes of memory, and processors that could barely run a toaster. And now it runs on phones with 4k displays with 8GB of memory, on processors that are about 2.4 kilo-Crays.
If you're a junior programmer, every single line is better than what you're capable of writing. If you're a senior or intermediate programmer, there is serious food for contemplation. The question that should be asked at every turn: if I was on the Android 1.0 development team, what could I have done to make this happier code?
And I'd really like to see what the "Less code is better code" guru (a recent HN posting) could do with androidx/fragment/app/Fragment.java and friends. A perfect example of a "Good Code Guru" that I would not trust to work on any of my projects.
---
There is no such thing as bad code; but some code is happier than other code.
-- Herbie Hancock.
Or what Herbie Hancock would have said if he were a programmer instead of a jazz musician.
One approach might be to find individuals whose technical acumen you respect and look over their public projects (or alternatively their public contributions to bigger projects).
I like the Gecko and Chromium source code. My knowledge of JavaScript and browser APIs helps me understand browser source code even where there is no documentation.
As many have already said, what you'd consider "good code" depends on what you want to learn and achieve, but here are a few recommendations:
1. Busybox. It's basically a collection of common Unix utilities, from everyday commands like ls or cat to system daemons like crond and init. It's a good way to learn more about how Linux and other Unix-based systems work under the hood. Busybox applets are pretty independent from each other, so you can just take one and focus on it specifically, digging into the common library code if you need to. The utilities aren't as fully-featured as their GNU coreutils counterparts, which makes it easier to understand what they actually do. Busybox is not a toy project by any means, though; it's used in many embedded devices and leaner Linux distributions, Alpine being the prime example. However, it's written in some pretty dense C, with a fair bit of pointer magic involved, so if you don't understand things like the equivalence between an array name and a pointer to its first element, some things might not make sense.
2. The Go standard library. Unlike many programming languages, Go does not rely on much external code. Whereas Python delegates zip handling to zlib, handling of TLS connections to OpenSSL and so on, Go just includes all of this in the standard library, and it's all pretty readable Go code. If you want to understand many common algorithms or file formats, everything from sorting arrays to parsing JSON to sending and receiving HTTP requests or common cryptographic operations, all written in a readable style, in a language much higher level than C, just look at the Go stdlib. Go even fully implements everything needed for its own compilation, including linkers and assemblers. I haven't read these parts much, and a lot of that code is transpiled from C, so I can't say how good the code quality is.
3. Serenity OS. It's a hobby POSIX-based operating system written in C++, with no external dependencies, not even libc or libstdc++. They have their own homegrown implementations of every part of an operating system, from a monolithic kernel, to common Unix utilities, archive handling, audio and video codecs, common data structures like vectors (growable arrays) and hash maps, locks, mutexes and other concurrency primitives, a custom string implementation, a window server and a GUI library, including a fully-featured event loop system, their own window manager and many common GUI widgets, to actual applications and games. They even have a custom web browser with a custom web engine and JS interpreter. As a rule of thumb, if something is either in Busybox or in the Go standard library, there's a good chance it will also be in Serenity. Again, their utilities do much less than their non-Serenity counterparts and are far less optimized, but that also means there are a lot fewer layers of abstraction to deal with and that the general principles underlying their implementation are actually easier to understand. The fact that everything is in a single repo, neatly organized, written in one language with common conventions, just makes it really pleasant to read. Their code quality isn't always the best, but the fact that it's C++ and not C does make things easier. Even though I haven't actually used the OS (because of accessibility concerns), it's one of those repos that I always have cloned on my computer, and it's the first place I look if I'm curious how a particular feature or app can be implemented.
4. If you're in any way interested in AI, everything written by Andrej Karpathy, notably Micrograd and nanoGPT. There's also Tinygrad, a bigger but still understandable take on Micrograd. Unless you're an expert, you need to watch Andrej's YouTube videos to actually understand the code, but the feeling I got when I actually understood the principle behind Micrograd is one I will never forget. I consider it to be the most beautiful piece of code I've ever seen; it basically embodies the whole principle of what a neural net is in 200 lines of code. Everything else that the big libraries do is basically just implementations of actual models, optimization and glue code, such as for loading data. It's often crucial optimization, without which modern neural networks wouldn't be possible at all, but it's just optimization nonetheless.
5. Everything concerning Elixir: the standard library, other libraries written in it, and open-source Phoenix web apps. Deep down, it's basically a Lisp without the off-putting parentheses. There are a lot of lessons to be learnt there, from the power of macros and the fact that things like "if" can be written in the language itself instead of being a special construct, to the power of pattern matching and the pipeline operator, to the advantages of its concurrency model and functional programming in general.
To generalize beyond these specific examples, if you want to understand something, find a smaller version of it, and try understanding that. The smaller version might just be a git commit from a good few years ago, with far fewer features, but it's better if it is a different, more basic (but preferably not toy) implementation of the same app, feature or algorithm. Don't read V8, PyTorch or Postgres; read Lua, Tinygrad or SQLite instead.
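The claim in item 4, that the core idea of a neural net library fits in a couple of hundred lines, is easier to appreciate with a toy in hand. What follows is a minimal sketch in the spirit of Micrograd, not Karpathy's actual code: the `Value` class, its attribute names, and the choice of just three operations are assumptions made for illustration. Each value remembers which values it was computed from, and `backward()` walks that graph in reverse, applying the chain rule.

```python
import math


class Value:
    """A scalar that records how it was computed (hypothetical minimal sketch)."""

    def __init__(self, data, parents=()):
        self.data = float(data)
        self.grad = 0.0                 # d(output)/d(this value), filled in by backward()
        self._parents = parents         # the Values this one was computed from
        self._backward = lambda: None   # leaf nodes have nothing to propagate

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,))

        def _backward():
            # d(tanh x)/dx = 1 - tanh(x)^2
            self.grad += (1.0 - t * t) * out.grad

        out._backward = _backward
        return out

    def backward(self):
        """Run reverse-mode differentiation starting from this value."""
        order, seen = [], set()

        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)      # parents first: a topological order

        visit(self)
        self.grad = 1.0
        for node in reversed(order):    # output first, inputs last
            node._backward()
```

For example, with `x = Value(2.0)` and `y = Value(3.0)`, calling `backward()` on `x * y + x` leaves `x.grad` at 4.0 (that is, y + 1) and `y.grad` at 2.0. Gradients are accumulated with `+=` because `x` feeds into the result along two paths. A real library adds many more operations, tensors instead of scalars, and a great deal of optimization on top of this idea.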
Thanks for reminding me to read busybox's source code. You can learn a lot from seeing how common shell utilities are implemented, but it's a real headache to work out which tool comes from which package or git repo. Busybox is a good enough, battle-tested collection for all of this.
Most of my job involves reading bad code to understand what it's trying to do before I wedge some nugget of decency in there. The ability to understand the intentions of someone who has no idea how to express what they're trying to do is critical for maintaining other people's software.
Good code is working code, code that pays the bills. Focus instead on writing code you can throw away easily, code that you are wholly unattached to and is isolated enough that rewriting it won’t cost absurd hours.