They're actually quite decently commented. The functions describe what they do clearly, and the bodies show how they do it tersely, with comments around the sticky bits.
Have you read through the original Unix kernel sources? Claiming that they're uncommented and difficult to work with is surprising. (On the other hand, the directory structure could be better.)
Without tracing the backing kernel structures and the code that relies on them, please describe why sleep(chan, pri) works, and as a maintainer, what I need to watch out for when modifying that code.
What does swtch() do? What does issig() do? Does that mean it's checking for a signal during sleep? What signals could be generated? Under what circumstances do I need to check for those signals? Are there any race conditions? Does ordering matter? What happens if I move the call to issig?
Beyond the invariants, this code is NOT READABLE. I can't just glance at the comments for an atomic unit of 4-5 LoC and see what it does -- I have to examine the code in depth, running the logic in my head, and explore the workings that way.
Reading code without comments is like tracing out a circuit without a schematic or documentation. First you have to manually establish the what, and only then can you even start to spend your time determining the why.
I work on modern BSD code. It's better than this old stuff, and I still have to dig to figure out how/why things are supposed to work. It's a headache compared to properly commented and documented code, where I can just skim standalone units and know what they do without having to trace everything myself.
Of course you need context to understand code. You can't get away from that without essentially translating the rest of the code into prose and attaching it to each line. At which point there are enough comments that you need context to figure out which parts of them you care about, reading them becomes a chore, and they essentially become noise. I've seen codebases with insane levels of commenting. I found myself ignoring the comments and tracing through the code.
If the model is simple and the code is clear, learning the context becomes easy, and it fits into your head. Following code becomes easy.
Comments decrease the amount of code you must personally read and understand, and define invariants that can't be expressed purely through the code.
> At which point there are enough comments that you need context to figure out which parts of them you care about, reading them becomes a chore, and they essentially become noise.
I've never seen this outside of contrived examples from lazy developers that think they're too smart to need to comment their code.
> If the model is simple and the code is clear, learning the context becomes easy, and it fits into your head.
In other words, you must trace the entire system to understand it and then fit it into your head. This is not advantageous to maintainers.
No, just the part you care about. You do need a high level mental model of the system. I've never seen a system where this is not the case, regardless of the number of comments. Even literate programming -- or at least, the examples of it that I've seen -- suffered from this. (Amusingly, I found literate programming examples were often easier to understand by mostly ignoring the prose and looking at the code.)
> You do need a high level mental model of the system.
That is extremely time-consuming to build without code comments and documentation, regardless of how "literate" the code is, because code alone cannot express sufficiently detailed invariants, and simple logical/atomic operations involve non-trivial amounts of code (especially when writing in C).
There's almost nothing I hate more than inheriting a complex uncommented code base and spending hours or days tracing out the code to build a high-level mental model, when instead, with reasonable comments, I could have had that model nearly immediately.
You want some sort of design document. Comments are not sufficient to describe the way that the concepts interconnect. They are almost always detail oriented, scattered and do not give the big picture.
Comments embedded in source code are singularly unsuited to giving the sort of interconnection of concepts that allows you to quickly and efficiently build a model of the system. They are a poor substitute for real documentation.
Something like this should always be present to describe the overarching structure of the system: ftp://gcc.gnu.org/pub/gcc/summit/2003/GENERIC%20and%20GIMPLE.pdf
http://gcc.gnu.org/onlinedocs/gccint/RTL.html
Incidentally, I'm not sure if you're familiar with literate programming. That's where the code is almost an afterthought to the comments. This is an example of a literate program: http://tug.org/texlive/devsrc/Build/source/texk/web2c/tex.we.... You may be familiar with it -- it's tex, the core engine used by LaTeX. Compiled to PDF, the source looks like this: http://eigenstate.org/tmp/tex.pdf.
Design documents are necessary, but are no replacement for comments. When working in GCC briefly, I found the design documents to not be nearly so valuable as the (few, poor) comments that existed in the code (objc).
The problem with comments is that they decay into irrelevance and worse, lies (that chapter of Clean Code will live with me forever). You can't write a unit test to ensure comment correctness, but you can for code correctness. Hence the code is the only reliable source of the truth.
Comments only decay into irrelevance if lazy developers don't do their job. Circularly, it's developers that don't write comments that claim comments decay into irrelevance.
However, for maintenance programmers to do their job, if the comments mostly describe the code, they have to ignore the comments. Why make the maintenance programmer's job any worse?
Good comments are golden, but bad comments (those that tell you what the code is doing or duplicate coding logic in the comments) are worse than useless because they make it harder to spot and read the good comments.
I think the only thing worse than an uncommented codebase is a codebase full of comments that tell you what the code is trying to do in full detail, every few lines...
I work at a shop where we use comments sparingly if at all. Our primary product has 14m LOC and 400k unit tests. We enjoy over 75% global market share in our extremely lucrative industry. We didn't get where we are today by being "lazy" as you put it. Comments are by their very definition a redundancy when the code is perfectly descriptive and self-documenting. Thus developers can get on with their job and be more productive when they don't have to duplicate their efforts for dubious benefit.
> We didn't get where we are today by being "lazy" as you put it.
The two are hardly correlated.
Have you ever worked with some of the Mac OS X code written by poorer teams? For instance, the security framework? The internals of that code are a disaster at best, and yet, the core OS enjoys tremendous success and market share.
> Thus developers can get on with their job and be more productive when they don't have to duplicate their efforts for dubious benefit.
How do you determine the invariants of your APIs when writing code against a module? I don't trust arguments that boil down to "we're much too important to waste our time documenting, code is perfectly expressive!"
It's not. You're just wasting your time somewhere else, in little tiny increments, every time you have to trace code a few steps down just to figure out what it probably is supposed to do.
Or, sometimes in BIG increments, when somebody new has to learn the code base.
> the core OS enjoys tremendous success and market share.
How is that relevant? OSX has at best around 10% global market share? I'm talking market dominance here.
> How do you determine the invariants of your APIs when writing code against a module?
We flick through the well-documented code just as you would largely ignore the code and flick through well-documented comments. Here is a simple example for you:
    public class EntitySynchroniser : IEntitySynchroniser
    {
        public EntitySynchroniser(BusinessObjectFactory factory, IDirectorySearcher directorySearcher)
        {
            Argument.NotNull(factory, "factory");
            Argument.NotNull(directorySearcher, "directorySearcher");
            this.factory = factory;
            this.directorySearcher = directorySearcher;
        }
        ...
The arguments here are invariant - they must not be null. I don't need a comment that may or may not be written in precisely the same format between 100 developers maintaining the codebase. I simply look for the one call to Argument.NotNull that all developers use in this case. The code is MUCH more readable than if the constructor had a comment explaining that each should not be null, mixed in with explaining what each should do. I can also determine by file searching or through the IDE which methods have these requirements.
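For readers who haven't seen this pattern: the thread only shows calls to Argument.NotNull, never its body, so the following is a minimal sketch of what such a guard helper might look like. The implementation of `Argument` and the `Greeter` class that uses it are assumptions made up for illustration, not the poster's actual code.

```csharp
using System;

// Hypothetical guard helper. Only the call sites appear in the thread,
// so this body is an assumption about how such a helper could work.
public static class Argument
{
    // Throws ArgumentNullException, naming the offending parameter,
    // when value is null. Otherwise does nothing.
    public static void NotNull(object value, string parameterName)
    {
        if (value == null)
        {
            throw new ArgumentNullException(parameterName);
        }
    }
}

// Hypothetical consumer standing in for the constructor shown above.
public sealed class Greeter
{
    private readonly string name;

    public Greeter(string name)
    {
        // The guard doubles as a searchable marker for the non-null requirement.
        Argument.NotNull(name, "name");
        this.name = name;
    }

    public string Greet()
    {
        return "Hello, " + name;
    }
}
```

A guard like this enforces the requirement at run time and is easy to search for, but a caller only learns about it by reading the constructor body or by hitting the exception.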
If readers want to know what this API is and what it can do they need only look to the interface, or the implementation of its methods in the class. Each method explains what it does by maintaining the same level of abstraction within the method. Like:
    public void MethodA()
    {
        DoFirstHighLevelThing();
        DoSecondHighLevelThing();
    }

    void DoFirstHighLevelThing()
    {
        // do less high-level things
    }
Each method call would maintain a constant level of abstraction so that the code is easily reused, easily tested and easily maintained.
> You're just wasting your time somewhere else, in little tiny increments, every time you have to trace code a few steps down just to figure out what it probably is supposed to do.
If I needed to figure out why code wasn't working the way it should, what's to say a comment would be more forthcoming in explaining the reasons for the defect? Surely if MethodA did "thing A" but was really doing "thing B" the comment would tell me "thing A", therefore I'd be forced to examine the code further anyway, only I'd be hampered by the deceitful comment also? This is the nature of a defect - something that operates outside the documented behaviour of the system or module.
Not to mention that your stated opinion is that comments should explain the inner workings of things so that consumers can be well informed. This has the potential for exponential maintenance as really low level changes are made. Imagine that you change the conditions under which data access-level exceptions are raised. If anything that ever touched your data source explained what happened in exceptional circumstances you'd have to change hundreds or thousands of comments, or face the sort of comment rot I mentioned earlier. This is the real time waster!
> ...when somebody new has to learn the code base.
This is always going to be the case. Comments don't make this task any easier than a well structured codebase. A well structured codebase comes from developers making the conscious decision that any reliance on the crutch of a comment is a failure to write proper self-documenting code. A well-structured self-documenting codebase by definition requires no comments (except for the "why" not the "how", which I fully support doing by the way).
> How is that relevant? OSX has at best around 10% global market share? I'm talking market dominance here.
Fine, I'm familiar with OS X, but I'm sure there's equivalent stupidity in Windows.
> The arguments here are invariant - they must not be null. I don't need a comment that may or may not be written in precisely the same format between 100 developers maintaining the codebase. I simply look for the one call to Argument.NotNull that all developers use in this case. The code is MUCH more readable than if the constructor had a comment explaining that each should not be null, mixed in with explaining what each should do.
No, it's not much more readable, because if properly commented, I WOULDN'T HAVE TO READ THIS CODE AT ALL. Instead, my IDE or documentation browser would tell me, inline, exactly what I needed to know.
> If readers want to know what this API is and what it can do they need only look to the interface, or the implementation of its methods in the class.
If you have to look at code's implementation, you've failed. Why isn't that code a black box to me? Why do I care at all how it's implemented?
Why on earth would I want to waste time doing that instead of instant access to high-level API documentation?
Moreover, it's a contrived example, because nullable/not-nullable is the least of an API.
> Not to mention that your stated opinion is that comments should explain the inner workings of things so that consumers can be well informed. This has the potential for exponential maintenance as really low level changes are made.
No, my stated opinion is that comments that are externally visible should document externally relevant invariants.
Comments that are internally visible should document internally relevant invariants (if the code does not adequately express those, as it often does not).
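To make the distinction concrete, here is a rough sketch of an externally visible comment that documents invariants the code itself does not express. It uses C# XML documentation comments, which editors such as Visual Studio can show inline at the call site. The `Stats` class and its `Percentile` method are hypothetical, invented purely for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical utility class, invented for illustration only.
public static class Stats
{
    /// <summary>
    /// Returns the nearest-rank percentile of <paramref name="sortedValues"/>.
    /// </summary>
    /// <param name="sortedValues">
    /// Must be non-empty and sorted ascending. Sortedness is NOT checked
    /// (that would cost O(n) per call); unsorted input gives a meaningless result.
    /// </param>
    /// <param name="percentile">Must be greater than 0 and at most 100.</param>
    /// <returns>
    /// An element of <paramref name="sortedValues"/>. The list is not modified.
    /// </returns>
    /// <exception cref="ArgumentNullException">The list is null.</exception>
    /// <exception cref="ArgumentException">The list is empty.</exception>
    /// <exception cref="ArgumentOutOfRangeException">The percentile is out of range.</exception>
    public static double Percentile(IReadOnlyList<double> sortedValues, double percentile)
    {
        if (sortedValues == null) throw new ArgumentNullException("sortedValues");
        if (sortedValues.Count == 0) throw new ArgumentException("List must not be empty.", "sortedValues");
        if (!(percentile > 0 && percentile <= 100)) throw new ArgumentOutOfRangeException("percentile");

        // Nearest-rank method: 1-based rank = ceil(p / 100 * n).
        int rank = (int)Math.Ceiling(percentile / 100.0 * sortedValues.Count);
        return sortedValues[rank - 1];
    }
}
```

The null, empty, and range checks are visible in the code, but the "must be sorted" precondition and the "input is not modified" guarantee exist only in the comment. Those are the kind of invariants a caller cannot recover from a guard clause.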
> A well structured codebase comes from developers making the conscious decision that any reliance on the crutch of a comment is a failure to write proper self-documenting code.
A well-structured code base comes from writing good code. Comments are part of writing good code. Self-documenting code isn't fully documented code, unless it's literally a literate programming language. Claiming otherwise is just an excuse for you to be lazy and not write comments under the misguided auspices of being much too smart to need them.
Sure, if I were publishing a public API I'd want to document the functions available, but the fact is, the vast majority of the code we write at my shop is not consumed outside our own codebase. Why duplicate the effort of writing documentation when everyone has access to the code? There is no benefit for us so we don't do it. We made that decision and it's the right one for us. Sure, you have a different situation and I'm sure using comments has helped and empowered your developers, but for us it makes no sense and is a waste of time.
I'd suggest you stop trying to tell the world that senior developers must conform to your narrow views (narrow, by the sheer size of the response against your point of view here). Calling people things like "lazy" and insinuating they inflate their own abilities simply because they don't do what you think they should do is a pretty trollish thing to do. You obviously missed the first point in the original link.
The beauty of literate programming, quite honestly, is that because of the way that the documentation and code are interwoven, it requires fewer code comments, and encourages both clear code and clear documentation.
Yes. It's some of the easiest to understand code I've read. For example, http://unixarchive.cn-k.de/PDP-11/Trees/V6/usr/sys/ken/slp.c