Among the most convoluted codebases I've read is Tor. It works (apparently), and it isn't even particularly insecure per se (the code is littered with hard asserts that abort execution if an expected condition isn't met), but it is unnecessarily dense. Example: I use software to analyze the call graph (which function calls which function), and when I ask it to find potentially recursive loops (A() calls B() calls A(), etc.) it spews out tens of thousands of potential recursions.
By comparison, mbed TLS only has a couple of these, and a large project like OpenSSL has 50 or so.
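For anyone wondering what "find potentially recursive loops" boils down to: it is cycle detection on the call graph. Below is a minimal sketch in C, not the tool mentioned above. The names (has_recursion, MAX_FUNCS) and the adjacency-matrix representation are assumptions made for illustration; calls[f][g] being true means "function f calls function g".

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_FUNCS 8 /* hypothetical upper bound on functions in the graph */

enum { UNVISITED, IN_PROGRESS, DONE };

/* Depth-first search from function f. Returns true if a call chain
 * starting at f leads back to a function that is still on the chain. */
static bool visit(bool calls[MAX_FUNCS][MAX_FUNCS], size_t n, size_t f, int *state)
{
    state[f] = IN_PROGRESS;
    for (size_t g = 0; g < n; g++) {
        if (!calls[f][g]) {
            continue;
        }
        if (state[g] == IN_PROGRESS) {
            return true; /* back edge: g is already on the current call chain */
        }
        if (state[g] == UNVISITED && visit(calls, n, g, state)) {
            return true;
        }
    }
    state[f] = DONE;
    return false;
}

/* Returns true if the call graph over the first n functions has a cycle. */
bool has_recursion(bool calls[MAX_FUNCS][MAX_FUNCS], size_t n)
{
    int state[MAX_FUNCS] = { UNVISITED };

    if (n > MAX_FUNCS) {
        n = MAX_FUNCS; /* stay inside the matrix */
    }
    for (size_t f = 0; f < n; f++) {
        if (state[f] == UNVISITED && visit(calls, n, f, state)) {
            return true;
        }
    }
    return false;
}
```

A real analyzer would report every cycle and would have to deal with function pointers, which is presumably why the results are called "potential" recursions.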
Conversely, C software that isn't consistent in error signaling (return -1 on error in function A, return 0 in function B, set parameter int* err in function C, etc.), doesn't perform due error checking, has a spaghetti call graph, mindlessly performs multiplication (leading to overflows with certain inputs), or uses signed or unsigned int where size_t is better suited is usually susceptible to bugs and abuse (vulnerabilities). The projects I mentioned are very clean in this regard.
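To make the multiplication and error-signaling points concrete, here is a minimal sketch. It is not taken from any of the projects mentioned; checked_mul and alloc_array are hypothetical names, and the "0 on success, -1 on error, results through out-parameters" convention is just one consistent choice among several.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Convention assumed throughout this sketch: return 0 on success,
 * -1 on error, and hand results back through out-parameters. */

/* Multiplies a by b, refusing to wrap around SIZE_MAX. */
int checked_mul(size_t a, size_t b, size_t *out)
{
    if (a != 0 && b > SIZE_MAX / a) {
        return -1; /* a * b does not fit in size_t */
    }
    *out = a * b;
    return 0;
}

/* Allocates room for count elements of elem_size bytes each. */
int alloc_array(size_t count, size_t elem_size, void **out)
{
    size_t total;

    if (checked_mul(count, elem_size, &total) != 0) {
        return -1; /* the size computation itself would overflow */
    }
    /* malloc(0) may legally return NULL, so always ask for at least 1 byte. */
    *out = malloc(total != 0 ? total : 1);
    return (*out != NULL) ? 0 : -1;
}
```

The unchecked version, malloc(count * elem_size), silently allocates a tiny buffer when the product wraps, which is exactly the kind of input-dependent bug described above.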
I have learned a lot from reading the source code and watching it develop. It is written in modern Java 8. The authors are obviously experts in the language, the JVM and the ecosystem. Since it is an MPP SQL engine, performance is very important. The authors have been able to strike a good balance between performance and clean abstractions. I have also learned a lot about how to evolve a product. Large features are added iteratively. In my own code I often found myself going from Feature 1.0 -> Feature 2.0. Following Presto PRs, I have seen how for large features they go from Feature 1.0 -> Feature 1.1 -> Feature 1.2 -> ... Feature 2.0 very quickly. This is much more difficult than it sounds. How can I implement 10% of a feature, still have it provide benefits and still be able to ship it? I have seen how this technique allows code to make it into production quickly, where it is validated and hardened. In some ways it reminds me of this: https://storify.com/jrauser/on-the-big-rewrite-and-bezos-as-.... You shouldn't be asking for a rewrite. Know where you want to go and carefully plan small steps from here to there.
GStreamer. The pipeline is a very powerful model for software, and its potential is tremendous. Unfortunately, I find the level of development and maintenance of the GStreamer project itself quite poor: the code is horribly complicated for questionable reasons (it's said to be non-blocking everywhere; I find that a bad excuse for being ridden with subtle bugs and for failing to support custom pipelines as blocks for higher-level pipelines).
I find projects such as FFmpeg and the Linux kernel quite well engineered, but have nothing special to say about them except that they are reasonably well organized and get better day by day.
For user-interface apps where high user productivity matters, I find that software such as readline, tmux, mutt and a bunch of others follows the wise pattern of extensible and scriptable software: if you want hotkeys, you need a domain-specific language, and bindings must be
key: action[, action...]
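As an illustration of how little machinery that binding format needs, here is a rough sketch of a parser for one such line. It is not how readline, tmux or mutt do it; parse_binding, struct binding and MAX_ACTIONS are names made up for this example, and it assumes the key itself contains no ':' character.

```c
#include <ctype.h>
#include <string.h>

#define MAX_ACTIONS 8 /* hypothetical limit on actions per key */

struct binding {
    char *key;                  /* points into the parsed line */
    char *actions[MAX_ACTIONS]; /* each points into the parsed line */
    size_t action_count;
};

/* Strips leading and trailing whitespace in place. */
static char *trim(char *s)
{
    while (isspace((unsigned char)*s)) {
        s++;
    }
    char *end = s + strlen(s);
    while (end > s && isspace((unsigned char)end[-1])) {
        end--;
    }
    *end = '\0';
    return s;
}

/* Parses "key: action[, action...]" in place (the line is modified).
 * Returns 0 on success, -1 on malformed input. */
int parse_binding(char *line, struct binding *out)
{
    char *colon = strchr(line, ':');
    if (colon == NULL) {
        return -1;
    }
    *colon = '\0';
    out->key = trim(line);
    out->action_count = 0;
    if (*out->key == '\0') {
        return -1;
    }

    for (char *tok = strtok(colon + 1, ","); tok != NULL; tok = strtok(NULL, ",")) {
        char *action = trim(tok);
        if (*action == '\0' || out->action_count == MAX_ACTIONS) {
            return -1;
        }
        out->actions[out->action_count++] = action;
    }
    return out->action_count > 0 ? 0 : -1;
}
```

Given a line such as "C-x: save-buffer, kill-buffer" (made-up action names), this yields the key "C-x" and two actions; mapping action names to functions is then a table lookup.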
Asterisk's code base is a pile of crap.
It's been getting a bit better over the years, but it is still terrible: tons of conceptual blunders, protocol implementations that are only loosely inspired by the specification, system APIs used incorrectly, lots of code that doesn't bother with dynamic string lengths but instead simply truncates strings arbitrarily if they don't fit into some fixed-size buffer, ...
The only reason it kind of works is that bugs that happen often enough do end up being fixed at some point, but that's about it. If you know your C and POSIX APIs and you don't believe me, just go and have a look at the code; I promise you'll find a bug in less than an hour.
What is still amazing to me is the set of core design concepts I've listed: channels, applications... I have a case for comparison here, where the project is of comparable complexity but all features are bolted on ad hoc, without the compartmentalization of complexity that Asterisk has.
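For reference, this is roughly what "bothering with dynamic string lengths" looks like in plain C99. It is a sketch, not code from Asterisk; format_address and format_address_fixed are hypothetical names. The first function sizes the buffer from the data, the second keeps a fixed-size buffer but at least reports truncation instead of hiding it.

```c
#include <stdio.h>
#include <stdlib.h>

/* Returns a newly allocated "user@host" string, or NULL on error.
 * The caller frees the result. */
char *format_address(const char *user, const char *host)
{
    /* With a NULL buffer and size 0, snprintf only reports the length needed. */
    int needed = snprintf(NULL, 0, "%s@%s", user, host);
    if (needed < 0) {
        return NULL;
    }
    size_t size = (size_t)needed + 1; /* +1 for the terminating NUL */
    char *buf = malloc(size);
    if (buf == NULL) {
        return NULL;
    }
    snprintf(buf, size, "%s@%s", user, host);
    return buf;
}

/* Fixed-size variant: returns 0 on success, -1 if the result did not fit. */
int format_address_fixed(char *buf, size_t size, const char *user, const char *host)
{
    int n = snprintf(buf, size, "%s@%s", user, host);
    if (n < 0 || (size_t)n >= size) {
        return -1; /* output was truncated (or formatting failed) */
    }
    return 0;
}
```

The silent-truncation pattern complained about above is the second function with the return value of snprintf ignored.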
FreeSWITCH is an alternative to Asterisk.
edit for details: The authors are quite meticulous (notoriously, every line in a multi-line comment is 3 characters shorter than the previous one) and stick to the "convention over configuration" mantra, no doubt inspired by Ruby on Rails. It's interesting to see how they create abstractions to simplify so many common web dev tasks.
I learned a ridiculous amount from reading the source code to TeX (https://www.amazon.com/Computers-Typesetting-B-TeX-Program/d...) but it is written in a very 1970s style.
The thing I like most about it, though, is the examples, which exist for every feature. The person who wrote it actually understands what I want out of an example: I want code I can look at and immediately understand what is going on and why. I want examples I can refer to when mine does not work, so I can compare and see what I did wrong. Take the GUI example, for instance: anything that happens that is specific to that example has a comment. It makes no assumptions about your prior knowledge other than that you understand C++.
So far, it's the cleanest code I've ever worked with while still being very self-contained.
Having an annotated guide for each piece of software would be difficult, but all of us have to start somewhere.
It should be helpful for learning object-oriented design in C with GObject.
* Varnish Cache
* Mercury Programming Language
Watch out for biases based on how much people like the end product vs. how well it's actually implemented, though.
Especially the kernel in src/os
I assume most "standard library" type stuff is where you will find the cleanest code.
I ended up refactoring it again, and the code is still not as clear as Guava.
For instance, I noticed they were using an enum for functions and I was like, WTF, who does that? Later I decided to make my library serializable so we could save to disk. Well, it turns out that's exactly why they used an enum. My solution was to make a utility class to wrap the non-serializable objects, but their solution was much clearer and less code.
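The enum trick is not Java-specific. The underlying idea is that a function identified by a small named constant can be written to disk, while a function pointer or an anonymous object cannot. Here is a sketch of the same idea in C rather than Java, and not Guava's actual code; enum op, op_table, save_op and load_op are all hypothetical names.

```c
#include <stdio.h>

/* Each function gets a stable tag; the tag is what gets saved. */
enum op { OP_NEGATE = 0, OP_DOUBLE = 1, OP_COUNT };

static int negate(int x) { return -x; }
static int twice(int x)  { return 2 * x; }

/* Dispatch table from tag to implementation. */
static int (*const op_table[OP_COUNT])(int) = {
    [OP_NEGATE] = negate,
    [OP_DOUBLE] = twice,
};

/* Applies the function named by op; op must be a valid tag. */
int apply_op(enum op op, int x)
{
    return op_table[op](x);
}

/* Writes the tag as a single byte. Returns 0 on success, -1 on error. */
int save_op(FILE *f, enum op op)
{
    return fputc((int)op, f) == EOF ? -1 : 0;
}

/* Reads a tag back and validates it. Returns 0 on success, -1 on error. */
int load_op(FILE *f, enum op *out)
{
    int c = fgetc(f);
    if (c == EOF || c >= OP_COUNT) {
        return -1; /* end of file or unknown tag */
    }
    *out = (enum op)c;
    return 0;
}
```

Saving OP_DOUBLE, loading it back and calling apply_op on the result behaves the same as calling twice directly, with no wrapper objects involved.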