
Writing system software: code comments - indy
http://antirez.com/news/124
======
mlthoughts2018
When I interview for jobs, one of the questions I ask is where the team and
engineering org stand on code comments and what state their documentation for
legacy code is in. It's a good question because people will use excuses
like IP protection to avoid talking about the ugly truth of the state of their
actual code, but if you're just asking about comments and documentation,
there's no reasonable way they can deflect the question. If they won't answer
it, well, you've got your answer.

Any place that takes an attitude of "code should be self-documenting" I just
walk away from. That is a type of dysfunction you're not gonna be able to
solve. Places that say, "we're always evaluating new ways to improve
documentation practices..." call for further questions to get the specifics
out of them.

For most jobs, I know I'm going to have to deal with bad legacy code. But if
there is also a culture of not documenting things, arguing against comments or
formatted API and function documentation, and it results in documentation
spaghetti of Google docs, Markdown files, spreadsheets, emails, confluence
pages, etc., then there's no point. I can't help you.

~~~
ozim
I don't care about comments; I care about unit tests, and I need to run through
the code with a debugger. If code is testable I can run a piece of it in
isolation and understand how it works. Most of the time people write comments like:

 _a = 4 //assign value 4 to a_

Of course that is an exaggeration, but usually they restate obvious things and
think it is a good comment.

Also, I don't trust comments. I have to see the commit history, and if I see
that a piece of code was changed where the comments are, I don't trust that the
code does what the comment says without stepping through it in a debugger.

I do not know how other people write code, but I have to run code locally,
ideally in a unit test, or click through the interface with test data. Maybe
others can just run code in their heads and account for what is written in
the comments. But unfortunately I am not a human compiler/runtime/processor.

~~~
crazygringo
> _Most of the time people do comments: a = 4 //assign value 4 to a_

When I see people arguing against comments, I see this strawman _again_ and
_again_ and _again_.

In my multiple decades of programming I have _never_ seen anybody write a
blatantly redundant comment like that. It's not "most of the time" that people
comment like that, in my experience it's _never_.

Programmers have a job to do, and in my experience generally do not waste
their time using comments to "restate obvious things and think it's a good
comment". Rather, they comment when necessary to explain the _why_, or to
provide a concise 2-sentence description that would be too long for a function
name but lets you skip reading the next 30 lines of code which would take much
longer.
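
A small sketch of that distinction, under stated assumptions: the function, the constant, and the retry scenario are invented for illustration and do not come from the article or this thread. The commented-out line shows the redundant style; the comment that follows it tries to explain the _why_ instead.

```python
import random

MAX_BACKOFF_SECONDS = 30.0  # hypothetical cap chosen for this example


def retry_delay(attempt: int, rng: random.Random) -> float:
    """Return how long to wait before retry number `attempt` (0-based).

    Exponential backoff with full jitter, capped at MAX_BACKOFF_SECONDS.
    """
    # Redundant style (restates the code, adds nothing):
    #   multiply 0.5 by 2 to the power of attempt
    #
    # "Why" style:
    # The delay is randomized because many clients that fail at the same
    # moment would otherwise retry in lockstep and overload the server again.
    ceiling = min(MAX_BACKOFF_SECONDS, 0.5 * (2 ** attempt))
    return rng.uniform(0.0, ceiling)
```

The docstring plays the role of the concise description that saves reading the body; the inline comment records a reason the code itself cannot express.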

I really wish people would stop using this strawman of "most comments are
redundant/obvious anyways, therefore don't comment". I mean, maybe you
personally did have the terrible luck to work with a codebase that was
actually like that... but I personally have never seen it.

~~~
ozim
So you are working with experienced people. It looks like I don't have such
luck; I have to look at lots of junior code.

Juniors think that they should write comments and then they waste time writing
epic descriptions of trivial things. They waste time writing the whole story of
creating a simple piece of code in one commit message. They waste time
deliberating if tabs are better than spaces.

~~~
mcny
My personal anecdotal evidence is that they are not junior in the
organizational chart. I'm just a contractor. They are a part of the company.
Who am I to tell them they shouldn't comment "what" and "how" after every line
they write?

This job made me hate inline comments. If I were to ever create a new
language, I'd insist on there being no way to write comments on the same line
as executable code.

------
rocqua
A type of comment I miss here is "section headers".

I tend to organize my files into paragraphs or even sections of paragraphs. I
often use "section headers" to distinguish these. It is essentially a weaker
version of splitting a file across multiple files.

For example, I am working on a very simple pre-processor, and I have split it
into 4 sections. The first is unmarked and presents the public interface.
After that is a section marked simply by the comment "Lexer", after that is a
section marked by the comment "Parser", and finally there is a section marked
"TURN BLOCK BACK INTO TEXT". That last one should probably be changed to
something like "Process result of parsing".

Sometimes, my 'paragraphs' of code also get a short line comment explaining
what the entire paragraph does. I guess that would generally fall under the
'guide comment' category.
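
A rough sketch of what such a file could look like, assuming a toy pre-processor that expands `{{name}}` placeholders. The placeholder syntax and every function name here are invented for illustration; only the idea of an unmarked public interface followed by "Lexer", "Parser", and a final processing section comes from the description above.

```python
import re
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (kind, text), where kind is "TEXT" or "VAR"


def preprocess(source: str, variables: Dict[str, str]) -> str:
    """Public interface: expand {{name}} placeholders in `source`."""
    return render(parse(lex(source)), variables)


# ---------------------------------------------------------------- Lexer

_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def lex(source: str) -> List[Token]:
    tokens: List[Token] = []
    pos = 0
    for match in _PLACEHOLDER.finditer(source):
        if match.start() > pos:
            tokens.append(("TEXT", source[pos:match.start()]))
        tokens.append(("VAR", match.group(1)))
        pos = match.end()
    if pos < len(source):
        tokens.append(("TEXT", source[pos:]))
    return tokens


# --------------------------------------------------------------- Parser

def parse(tokens: List[Token]) -> List[Token]:
    # Merge adjacent TEXT tokens so rendering sees one node per run of text.
    # This toy lexer never emits adjacent TEXT tokens, so the pass is only
    # defensive; a real parser would build a syntax tree here.
    nodes: List[Token] = []
    for kind, text in tokens:
        if kind == "TEXT" and nodes and nodes[-1][0] == "TEXT":
            nodes[-1] = ("TEXT", nodes[-1][1] + text)
        else:
            nodes.append((kind, text))
    return nodes


# ------------------------------------------- Process result of parsing

def render(nodes: List[Token], variables: Dict[str, str]) -> str:
    # Unknown variables expand to the empty string (an arbitrary choice).
    return "".join(
        text if kind == "TEXT" else variables.get(text, "")
        for kind, text in nodes
    )
```

With these definitions, `preprocess("Hello, {{ name }}!", {"name": "world"})` returns `"Hello, world!"`.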

~~~
steve_musk
Why not have each of those sections as a separate file?

~~~
rocqua
It's a balancing act, of course. The total file is ~200 lines at the moment;
breaking it up would create some very small files and generate a lot of
'including' boilerplate.

Moreover, it would make it more difficult to understand the whole thing. The
lexer and parser are closely coupled (by the data type for the tokens) and the
parser and 'processor' are closely coupled by the data type of the syntax
tree.

I could still see it happen that it does get split, and there are certainly
coding styles that would have split this without question.

------
apo
I'll take well-named variables, functions, and classes over code comments any
day.

It's a chore and a half to plow through code in which the author has, for
whatever misguided reasons, shortened variable names to inscrutable
abbreviations, acronyms, and single letter head-scratchers.

Context-switching between comments and code may be easy for the author, but
isn't good for the reader.

More often than not, the short-variable tendency points to an even worse
problem: the author hasn't thought very deeply about the software at a high
level and therefore hasn't developed a good vocabulary or conceptual framework
around the problem space.

~~~
antirez
The better the code, the more and better the comments, usually... No mutual
exclusivity.

------
threeseed
Thank you so much for this article.

Just because a developer thinks their code is a gift from the gods and
understandable by all doesn’t make it true.

Code is the what. Comments are the why. You can’t expect the developer’s
intent to come out through the code alone.

~~~
vbezhenar
Writing good comments is not necessarily an easy task. When you're writing code,
everything seems obvious. You can write obvious comments, but they won't help.
I think the best comments would be written by someone who came later and had
to understand the code from scratch.

~~~
crdoconnor
I do this:

#1 If anything in a pull request isn't blindingly obvious, ask a "why are we
doing this here?" style question.

#2 Encourage all developers to do this on all pull requests.

#3 Get developers to answer those questions in the form of a pushed code
comment rather than (or as well as) answering the comment. Make that an iron
rule.

IMHO this gives the best comments. I'd usually discourage comments that don't
come about like this.

------
tralarpa
Very nice article and good job on the classification of comment types.

For small tools that I have written myself I prefer the guide comments because
I usually know the general design of the program but need some help to
remember the order of the things done inside a function/procedure/method. But
the more complex the application, the more higher-level comments are needed.

Note that the guide comments can also appear in the form of function comments.
For example, you sometimes see a networking function written like this:

    
    
        ...
        logLinkDisconnectionWithReplica();
        freeQueryBuffer();
        deallocateBlockingStructures();
        ...
    

In that case the function comments of the called functions are actually
the guide comments (edit). I have the impression that this style of
programming is popular among Smalltalkers (please correct me if I am wrong).

It is also interesting to see how the comments depend on the expressiveness of
the programming language. For example, you can see a lot of "trivial" comments
in assembly code because even simple things (like incrementing a counter)
sometimes need code that is not easily readable. In Java, you see fewer
comments like "This method throws an IllegalArgumentException if ..." because
the programmer thinks that the list of checked exceptions in the method
definition is sufficient. In a language that I just made up for this comment,
a procedure with signature

    
    
        getPerson : (name : String not empty) -> List(Person) not empty | Nothing

needs fewer function comments than

    
    
        struct list *getPerson(char* name)
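
The same idea can be approximated in a mainstream language. The following is only a sketch in Python: `NonEmptyStr`, `Person`, and the in-memory `_PEOPLE` list are hypothetical names made up for this example, and the type hints are checked by external tools rather than by the interpreter.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Person:
    name: str


class NonEmptyStr(str):
    """A str that is guaranteed, at construction time, not to be empty."""

    def __new__(cls, value: str) -> "NonEmptyStr":
        if not value:
            raise ValueError("expected a non-empty string")
        return super().__new__(cls, value)


# Hypothetical in-memory data standing in for a real lookup.
_PEOPLE = [Person("Ada"), Person("Ada"), Person("Linus")]


def get_person(name: NonEmptyStr) -> Optional[List[Person]]:
    """Return every matching Person, or None when nobody matches.

    The signature says most of this already; the one thing it cannot say
    in plain Python is that a returned list is never empty.
    """
    matches = [p for p in _PEOPLE if p.name == name]
    return matches or None
```

The remaining docstring is exactly the part the type system could not carry, which is the trade-off described above.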

~~~
pjmlp
While true, even staying in the realms of straight C that function could have
been improved.

    
    
        typedef struct list *person_data_list;
    
        person_data_list getPerson(const char *name);
    

Yes it still does not provide all the necessary information, but it already
does provide a bit more.

------
soneca
There is a middle way between _code comments_ and _"my code is
self-explanatory"_ though, isn't there?

A problem with code comments is that they very easily become outdated. And if
the same method keeps changing for different reasons, the code comments might
become rather long.

I believe a good practice is to avoid code comments and document the why in
the commit messages and PR descriptions. This way it is easy enough to git blame
and go to the "why" whenever a piece of code is not self-explanatory.

~~~
nemetroid
There's a particularly good example of this risk in the article:

    
    
        /* Unlink the client: this will close the socket, remove the I/O
         * handlers, and remove references of the client from different
         * places where active clients may be referenced. */
        unlinkClient(c);
    

I wouldn't trust a comment as technically detailed as this one to stay in sync
with code changes to unlinkClient.

------
js8
Great article, comments are unfortunately too underappreciated.

I sometimes write in (zSeries) assembly and I use guide comments a lot, on
almost every line, to say what is the value that the resulting register will
hold.

I also sometimes use backup comments for code that I wrote at some point and
that turned out not to be strictly needed, but that I still believe might
become useful in the future. You could keep it in Git, but then the
information that this code exists in Git is lost.

------
keithnz
Really nice article. While I'd quibble about some of the details, this
categorization of comments is really good. Really good discussion around each
category too.

Not sure if it is a separate category or not, but I often write "quirks"
comments, especially when dealing with library APIs, with links to the relevant
documentation (even though those links may go dead at some point, I'll
capture the essence of the quirk in the comment and leave the link).

~~~
mkingston
Get a browser plugin to single-click archive the documentation with
archive.org, then include that link in your comments. Yes, I do exactly this,
I'm not just sniping!

------
therealdrag0
I really appreciated this taxonomy! It lines up with my own experience; but
I've never thought about it this thoroughly.

One specific type of comment that I've found valuable is a sort of BDD in my
tests.

    
    
      testMethod() {
        // GIVEN a <specific type of thing>
        <setup code>
        
        // WHEN the <specific operation is done>
        <do the thing code>
        
        // EXPECT <specific effect>
        <assertion code for the effect>
      }
    

These are like "guide comments" but with more structure. I find that they help
write tests and also make reading tests transparent.
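
A runnable version of that template, as a sketch only: `ShoppingCart` is a hypothetical class defined here so the test has something to exercise, and Python's standard `unittest` module is one arbitrary choice of framework.

```python
import unittest


class ShoppingCart:
    """Hypothetical class under test, defined here so the example runs."""

    def __init__(self) -> None:
        self._prices = {}

    def add(self, item: str, price: float) -> None:
        self._prices[item] = price

    def remove(self, item: str) -> None:
        del self._prices[item]

    def total(self) -> float:
        return sum(self._prices.values())


class ShoppingCartTest(unittest.TestCase):
    def test_removing_an_item_lowers_the_total(self) -> None:
        # GIVEN a cart holding two items
        cart = ShoppingCart()
        cart.add("book", 12.0)
        cart.add("pen", 3.0)

        # WHEN one of the items is removed
        cart.remove("pen")

        # EXPECT the total to reflect only the remaining item
        self.assertEqual(cart.total(), 12.0)
```

The three comments split the test into setup, action, and assertion, so a reader can skim the capitalized words to see what is being checked.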

------
ridiculous_fish
A lot of disagreement and talking past each other in this thread.

One unacknowledged facet: Code with different purposes requires different
levels of commenting. If you're (say) implementing a UI, you can often avoid
commenting through good naming: `saveDocumentButtonClicked()`. But if you are
implementing the C++11 spec, you should provide liberal references to that
spec in the comments.

Comments become more important as the code becomes more forced and less
natural.

------
pferde
From the article: "For quite some time I’ve wanted to record a new video
talking about code comments for my "writing system software" series on
YouTube. However, after giving it some thought, I realized that the topic was
better suited for a blog post, so here we are."

Kudos! I wish more "content creators" realized this. If you have something
useful to say, it is much easier for your audience to read it at their own
pace, instead of having to listen to someone talking, sometimes with a weird
accent.

If your topic is about something that needs visual presentation, by all
means, video is cool (although often a bunch of screenshots, graphs, or other
imagery is better), but if you just want to talk about something, it's a waste
of everyone's time and resources.

But I guess monetization is better on youtube, huh? _rolls eyes_

------
p0nce
Great article. I like "debt comments" though, no point hiding defects really.

~~~
rocqua
It depends on what kind of code you are working on. Having 'debt comments' in
production code could be taken as a suggestion that code should not be in
production.

What I really like are "This is fine, but with effort it could be done better
like this" comments, which one might also call 'debt comments'. For example,
if you have a brute-force search on something that could be done by bisection.
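
A sketch of such a comment, with a made-up function and an arbitrary `DEBT:` prefix (neither is from the article). The code is correct as written; the comment only records the better approach for later.

```python
from typing import Sequence


def first_at_least(sorted_values: Sequence[int], target: int) -> int:
    """Return the index of the first element >= target, or len(...) if none."""
    # DEBT: this linear scan is correct and fine for the short lists we
    # have today. Because the input is sorted, it could be replaced by a
    # binary search (for example bisect.bisect_left) if lists grow large.
    for index, value in enumerate(sorted_values):
        if value >= target:
            return index
    return len(sorted_values)
```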

~~~
p0nce
Let's rewrite this: there is a point to hiding defects... if you are a
contractor or looking for a promotion.

~~~
rocqua
Alternatively, one should address defects by fixing them not just marking
them.

~~~
p0nce
You can't address them later if you've forgotten where they are. Maybe there
is no time "now".

------
ksri
Great categorization!

Re. guide comments - I usually try to make a smaller function with an
appropriate name and then call the function. So a long function becomes a
series of invocations of smaller functions. This eliminates most guide
comments IMO.

~~~
karmakaze
I came to make a similar observation re guide comments, with the difference
that guide comments with inline code are better in most cases than a sequence
of small function calls.

I've often wondered why my code style is moderate inline blocks with guide
comments when I know about the benefits of good naming and self-documenting
code. In the end, I find it more effective. By encoding in function names,
you've (technically) eliminated comments but now buried what it actually does
somewhere else, meaning the reader must navigate to verify it does exactly and
only what each reader infers from the name. I much prefer having a function of
a small-medium logical size that I can hold in my head all at once and see
relevant details without navigating away from where I'm reading/editing.
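
A sketch of the inline style described here, using an invented settings parser (the format and the function name are assumptions for illustration). Each guide comment sits directly above the few lines it describes, so nothing has to be looked up elsewhere.

```python
from typing import Dict


def parse_settings(text: str) -> Dict[str, str]:
    """Parse 'key = value' lines into a dict; later keys win."""
    settings: Dict[str, str] = {}
    for raw_line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()

        # Skip lines that are blank once comments are removed.
        if not line:
            continue

        # Split on the first '=' only, so values may contain '='.
        key, separator, value = line.partition("=")
        if not separator:
            raise ValueError(f"missing '=' in line: {raw_line!r}")

        # Keys are case-insensitive, so normalize before storing.
        settings[key.strip().lower()] = value.strip()
    return settings
```

The alternative style would turn each commented block into a named helper; whether that reads better is the disagreement in this subthread.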

My experience is that less depth and fewer single-use functions aid overall
comprehension, similarly to how flattening, as in list comprehensions or
filter/map pipelines, buys more headroom. I can imagine building/using a DSL
where everything is well-named and free of guide comments, but I have never
encountered such a thing where there are multiple committers using a
procedural language.

------
lifeisstillgood
I think code comments should play a much greater role in code management - I am
trying to get this right -
https://github.com/mikadosoftware/todoinator/blob/master/todoinator/todoinator.py

the docstring is my goal ... still waiting for time on the commute to dig in
but thoughts welcome

------
johtela
This article sums up all the reasons why I think documenting your code is
important. But unlike the author, I take my comments one step further and
generate documentation from them with my literate programming tool (for C#):
https://johtela.github.io/LiterateCS/

------
mannykannot
If programmers were half as clever at writing self-documenting code as some of
them are at explaining why writing or reading comments is never a good idea,
there would be no use for comments.

------
p0nce
+1

What is this "self-documenting" code and where can you read it?

~~~
DanFeldman
It's either code that is purposefully built to turn into nice documents (using
something akin to Sphinx or Doxygen, potentially hosted on readthedocs.org) or
a tongue-in-cheek excuse for not writing documentation.

------
latchkey
Document the why, not the what.

------
alan_n
Great analysis of comment types. I tend to do all the first six, including the
guide comments. I can't understand people who think code should be completely
self-documenting. There are always so many edge cases. Are they seriously never
puzzled by past code they wrote? Every time I go back to any projects I
haven't touched in a couple of months I'm always a little disoriented, and it
was all code I wrote! The fewer comments I wrote, the more I usually regret it
later. Guide comments in particular just make it so much easier to find my
place again.

And I'd go as far as to say they're almost mandatory for any math related code
with a lot of complicated conditions. It's 100x easier just reading the guide
comments. Not that they should repeat the if statement exactly, that's
pointless. But describing what's happening (the bigger picture that is) really
helps imo, and adding an example is even better. I try to keep them short
though, single sentence, usually 1 comment for every 5-10 lines depending on
the complexity of the code. For example, I was writing a sort of panel/window
manager, it had a ton of conditions that looked like:

    
    
       if (area_left < current_left) {/*...*/}
       if (area_right > current_left && area_right < current_right) {/*...*/}
    

You could probably figure out what was going on if you saw it was inside the
resize handler for a panel (current_*) and looping through all the other
panels/areas, but something like "// finds any areas that will limit our
movement" right above the ~10 line section is so much easier to understand.
For more complex situations I might have added a visual description as well:
"// e.g. given panels [1][2][3] - when dragging a side of #2, find any
limiting panels to the right (#3) and left (#1)", though usually I keep the
examples to function comments.
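
A simplified sketch of that kind of section. It is not the original panel manager: the `Panel` type, the one-dimensional layout, and the function name are assumptions made for this example. The two quoted guide comments appear above the loop they summarize.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Panel:
    left: int
    right: int


def find_limits(
    current: Panel, others: List[Panel]
) -> Tuple[Optional[Panel], Optional[Panel]]:
    """Return the nearest panels to the left and right of `current`."""
    nearest_left: Optional[Panel] = None
    nearest_right: Optional[Panel] = None

    # Find any areas that will limit our movement.
    # e.g. given panels [1][2][3], when dragging a side of #2, the
    # limiting panels are #1 (to the left) and #3 (to the right).
    for area in others:
        if area.right <= current.left:
            if nearest_left is None or area.right > nearest_left.right:
                nearest_left = area
        elif area.left >= current.right:
            if nearest_right is None or area.left < nearest_right.left:
                nearest_right = area
    return nearest_left, nearest_right
```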

I'm torn about debt comments though, especially for prototypes/beta projects
where they're the most common. On the one hand, if they're in the code, it
makes finding where changes for certain features should be added easier if you
add them consistently. On the other hand, it's then hard to see an overview of
them all. Putting them in their own file makes it easier to manage them and
helps me remember what I was working on, but then I have to go hunt down all
the places the functionality needs to be added to. Maybe some combination
would be better. Like naming them something specific: "todo - feature -
somefeature" then having the details of somefeature in the todo file. Or
another solution might be to use headers as someone else has described, then
refer to the filename/headername/s that need to be looked at in the todo list.
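
One way the combination could work, sketched under assumptions: the "todo - feature - somefeature" tag is the convention proposed above, while the exact spacing rules, the allowed characters in a feature name, and the function name are invented here. The scan produces the overview; the tags stay next to the code.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Matches comments such as:  # todo - feature - resize-snap
# The spacing tolerance and the name character set are assumptions.
_TAG = re.compile(r"todo\s*-\s*feature\s*-\s*([\w-]+)", re.IGNORECASE)


def collect_debt(
    files: Iterable[Tuple[str, str]]
) -> Dict[str, List[Tuple[str, int]]]:
    """Map each feature name to the (filename, line number) pairs tagging it.

    `files` yields (filename, source text) pairs, so the caller decides
    how files are discovered and read.
    """
    locations: Dict[str, List[Tuple[str, int]]] = defaultdict(list)
    for filename, text in files:
        for line_number, line in enumerate(text.splitlines(), start=1):
            match = _TAG.search(line)
            if match:
                locations[match.group(1).lower()].append((filename, line_number))
    return dict(locations)
```

The details of each feature would still live in the todo file; this only answers "where in the code does it need to go".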

Usually imo, the more comments the better, so long as none of them are
outdated (working alone though I haven't really found this to be a problem).
You can always remove comments, it's much harder to add them in months after
you wrote some code.

------
majikandy
In my world, the test suite is a better “why” than code comments.

~~~
tralarpa
What was hard to write should be also hard to read! /s

~~~
majikandy
Why read the code when you can read the out of date comments! /s

~~~
raarts
If comments are out of date you're not doing code reviews.

