I partially disagree. Some of the worst code I’ve seen was an attempt to reduce duplication.
It’s easy, especially in UI code, to mistakenly identify duplicate code, and prematurely build a DRY “solution”. Premature DRY is the root of much evil.
I try to follow the rule of three before acting on apparent duplication. Even then, I’ve become much more cautious than in my younger days.
This "rule of three" seems to be fashionable these days, but I think you'd want "the rule of 2" all the way to "the rule of 10" depending on the size of the duplicated code, how hard it is to abstract, the complexity of the resulting abstraction, the distance between uses and abstraction, etc.
I certainly wouldn't want to wait until 3 if I am copying a block of 10+ lines that requires just a single int parameter to abstract into a function very local to the 2 uses. And I may use "the rule of never" on a two-liner that requires a callback to abstract.
In my experience, instances where removing duplication made code worse have usually been because the code is similar in its mechanical behavior but very different in its purpose. I wonder if a rule of thumb could be to judge duplication more on the purpose or function of the code rather than the mechanics of how it is implemented.
For example, suppose you have two promise based functions with error checking; one of which is doing a fetch and the other is performing a complex calculation. The error checking blocks might be the same since they are mechanically performing the same tasks; however, since the errors they are likely to see are so different a generic version could make the code more difficult to understand.
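For illustration, a rough TypeScript sketch (functions and error kinds invented) where the catch blocks look mechanically identical but the failures they handle have nothing in common:
type Profile = { id: string; name: string };
async function fetchUserProfile(id: string): Promise<Profile | null> {
  try {
    const res = await fetch(`/api/users/${id}`); // may fail with network/HTTP errors
    return (await res.json()) as Profile;
  } catch (err) {
    console.error("fetchUserProfile failed", err);
    return null;
  }
}
async function computeRiskScore(samples: number[]): Promise<number | null> {
  try {
    // stand-in for a long-running calculation; may fail with domain/numeric errors
    return samples.reduce((a, b) => a + b, 0) / samples.length;
  } catch (err) {
    console.error("computeRiskScore failed", err);
    return null;
  }
}
A shared error handler would have to be generic enough to cover both, which is exactly where the reader loses track of which errors are actually expected.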
One trick I use is to identify whether the duplication involves something that would ultimately be the domain of different modules. If it creates a dependency across modules, it's a design problem significant enough to probably not result in the abstraction I have in mind right now, so I should let it rest and see how they evolve.
And there are definitely cases of duplication too small to matter - which as you note is often the case for UI and UI-like things. You often end up in a position where the most reusable code is, in fact, the copy-pasted code, because the only thing it does is describe a permutation of abstractions and assets glued together. Sometimes this code is not the optimal code, but that's a problem that can be returned to down the line.
The easy picks for DRY are one-liner functions that describe a preconfigured intent, like "draw a centered bounding box". That's something your glue code will crave, and there's little issue with deduplicating it later. But even there some tension arises, since you can always decouple a little further by having those functions rely on an imperative context where most of the state (e.g. the size or color of the box) was previously configured, versus specifying it at the call site. Past certain thresholds, there's a flip-flop between wanting to code it and wanting to configure it.
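To make that concrete, a hypothetical TypeScript sketch of the two extremes for such a helper (names invented):
function drawRect(x: number, y: number, w: number, h: number, color: string): void {
  console.log(`rect ${w}x${h} at (${x}, ${y}) in ${color}`); // stand-in for a real primitive
}
// Variant 1: everything specified at the call site.
function drawCenteredBox(screenW: number, screenH: number, w: number, h: number, color: string): void {
  drawRect((screenW - w) / 2, (screenH - h) / 2, w, h, color);
}
// Variant 2: rely on an imperative context configured earlier.
const ctx = { screenW: 1920, screenH: 1080, color: "white" };
function drawCenteredBoxFromContext(w: number, h: number): void {
  drawRect((ctx.screenW - w) / 2, (ctx.screenH - h) / 2, w, h, ctx.color);
}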
Yes. I think one of my most important learning experiences was to not be too strict about code duplication. There is a cost for duplication, but there is also one for DRY. The first one is just more obvious. What it really comes down to is to recognize the true costs of each option.
For me personally, this was one of the biggest lessons I learned that made me see large codebases and working in a team in a new light.
In my experience the worst messes I’ve had to deal with are cases where people didn’t DRY enough: it’s easy to “unDRY” a function: you just duplicate it and rename it when you notice that its code serves two different purposes. It’s harder to take duplicate code and deduplicate it, because you have to think really hard about the context and whether or not you’ve encapsulated the environment correctly.
That really depends. A single function is often easy to unDRY, but some abstractions, in particular when stateful objects, hierarchies or layering come into play, are incredibly hard to unDRY. Even just spotting that they can be unDRIED without side effects can be super hard, making it such that most people won't have the courage to do it. Now you are much more stuck with this code than any repeating code.
I'd claim that _on average_ DRY-ing is easier than unDRY-ing. This assumes the repeating code is intentionally kept simple and linear.
The absolute worst code is a mix of repeating code and half-assed abstractions, where the "perfect refactoring" would have to "move" or "redo" abstractions. This to me is even harder than DRY or unDRY.
You need to at least document the duplication when you see it (especially in a big code base), because the rule of three easily becomes the rule of 87 if you are not aware of the 86 other "second versions" that have been written by people who only knew about one of the duplicated versions.
It's hard because you often don't know what you don't need before you don't need it.
Duplication for < n copies is good in principle, until you realize some coworker decided to make their own system, or went way past n, with the end result that it takes more effort to refactor than it would have taken to design something robust initially.
It really depends on how complex the thing is that you have n copies of in your codebase. Also, is it a concrete algorithm? Is it a pattern? A concrete algorithm can usually be factored out more easily than a pattern, because the latter needs to be flexible and extensible. Supporting all the variations of a pattern in a complex codebase in a single place can be challenging, and in the end it may be harder to get right than maintaining the repetitive code.
I vehemently disagree with the idea that returning from a function often is bad and that it should be used "sparingly". I find that code that returns as early as possible is the best. The idea of a nearly mandatory single return point is a very antiquated notion. I'm much more concerned when I see ten levels of indentation for no good reason.
Likewise with continue and break. I think this advice is very bad. All the other recommendations seemed fine.
Second that. I disagree with the premise that such code is hard to understand. On the contrary: when done well, early returns pop your mental stack and free up mental resources when reading code. (I guess I am a bit extreme. I even advocate judicious use of goto's in C code, for instance for cleaning up state after an error or for writing out FSM's without the distracting fluff of loops and switches.)
Returning multiple times can also reduce nesting, since you can add guard-like statements and return early.
Take this contrived example:
function foo(bar)
{
    var result, error;
    if (bar is null)
    {
        error = "{NameOf(bar)} is required";
        result = null;
    }
    if (bar is not null)
    {
        ... Main function logic ...
    }
    return Tuple(result, error);
}
If you can return multiple times instead you'd be able to write:
function foo(bar)
{
    var result, error;
    if (bar is null)
    {
        return Tuple(null, "{NameOf(bar)} is required");
    }
    ... Main function logic ...
    return Tuple(result, error);
}
Assuming "Main function logic" has several levels of nesting of its own (branching, loops, etc.), the first version quickly gets difficult to read.
In the days of Pascal, where multiple returns didn't exist, I used to write a nested function that expected sanitized inputs and the outer function had the input checking logic.
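The same trick still works in languages with nested functions; a hypothetical TypeScript sketch of the pattern (names invented), keeping the single return point:
function parsePort(input: string | null): number {
  function parseSanitized(s: string): number {
    return parseInt(s, 10); // assumes s is non-empty and purely numeric
  }
  let result = -1; // sentinel for "invalid input"
  if (input !== null && /^\d+$/.test(input)) {
    result = parseSanitized(input);
  }
  return result;
}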
When I can't return early, I find that the && operator, because of its short-circuit evaluation, is extremely useful for writing chains of "everything before must successfully execute" without the laborious series of "if/else ladders" or deep nesting that would otherwise be needed. Compare:
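Something like this hypothetical TypeScript sketch (step names invented):
const validate = (form: { name?: string }) => !!form.name; // stand-in checks
const authorize = (_form: object) => true;
const persist = (_form: object) => true;
// Nested version: each step runs only if everything before it succeeded.
function saveNested(form: { name?: string }): boolean {
  if (validate(form)) {
    if (authorize(form)) {
      if (persist(form)) {
        return true;
      }
    }
  }
  return false;
}
// && chain: same semantics, no nesting.
function saveChained(form: { name?: string }): boolean {
  return validate(form) && authorize(form) && persist(form);
}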
Instead of loading up every language with stupid keywords that are meant to simplify monadic code (but only for one domain), language designers should really take a page out of Haskell or Scala's book and think seriously about unifying on a do notation/for comprehension type construct.
That way, regardless of whether you're working with promises or not, you don't have to worry about hacking it with && or nesting 10 layers deep with if/else.
If your functions are short enough and with a low level of cyclomatic complexity, it really shouldn't matter all that much. If it means that you have to wrap 2/3rds of your function in an if just so that the single final return still works and that if then contains 20+ lines, then try to correct that in the first place.
Having said that, I'm with the "guard statements" crowd, too, but it would probably be better if that were handled by other parts of the language syntax. Contracts, for example.
The slide refers to the inversion-of-control principle, but it can easily extend to other important aspects of including only what is needed.
I would like to expand on this for discussion. Consider the challenges of not over-engineering solutions to problems you and others don't fully understand, yet. Unless you work for NASA, you won't fully know what you need until you need it.
How do programmers here account for the unexpected?
> How do programmers here account for the unexpected?
You don't. At least, I don't.
When I'm writing code for an application, I'm doing the following...
"Hmmm, I need a function to perform <certain thing>"
Function is written... tested... debugged until <certain thing> is achieved. Job done.
Whilst writing said code, I do not think to myself "Hmmm... y'know in the future it might need to also do <some future thing>" and proceed to add additional function parameters and write additional code which would only be useful sometime in the future, just in case.
No. I write sufficient code for what is required of said function at that moment in time. It is only if and when required would I then add any additional parameters and code to that function. Or write new functions.
>"Hmmm... y'know in the future it might need to also do <some future thing>"
Those are my trigger words. My experience is that about 97% of the time somebody says those words in the context of writing software (in the past it's been me), it turns out that:
* <some future thing> is never required.
* <some future thing> is required but it ends up being needed in a completely different form, rendering the preparation work pointless.
* <some future thing> was nice to have but it ended up so far down the list of priorities that it might as well not have been.
This applies to every level of the stack from one off functions to grand features, to refactoring, testing and even to things that many people consider "best practice".
I don't think it pays to be an extremist about many things in software but "if there isn't a glaring need for it then don't implement it" is one of them.
There's a difference between not adding bells and whistles you might not need, and shipping asymmetric, broken abstractions. Just because your app doesn't need to support vectors in the negative plane, for example, doesn't mean you should ship a vector library that doesn't support negative numbers. In the same way, I see well-meaning devs "dribble" data into an API, forcing future users to expand the API into a shape that could easily have been anticipated and made cohesive sense. This stinginess with the API is particularly bad because now developers learn not to respect API boundaries at all, because there aren't any. (And additionally it contributes to a lot of unnecessary indirection and typing, because you have to define subset types at both ends of the API.)
So, yeah, DON'T just add arbitrary stuff because you'll think it will be useful to someone; but DO add stuff to maintain the symmetry and understandable abstraction of what you've made.
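For a made-up TypeScript example of that symmetry point:
interface Vec2 { x: number; y: number }
function add(a: Vec2, b: Vec2): Vec2 { return { x: a.x + b.x, y: a.y + b.y }; }
// Completing the obvious counterparts costs little and keeps the boundary intact,
// even if today's app only ever adds vectors:
function negate(a: Vec2): Vec2 { return { x: -a.x, y: -a.y }; }
function subtract(a: Vec2, b: Vec2): Vec2 { return add(a, negate(b)); }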
Including/writing a fully featured Vector library when you need a Vector library is perfectly fine; but if you're writing some code to handle those Vectors you should stop after it's done and not generalise it to also handle a bunch of other types in case you might need it later (because you probably won't).
I wish I was agreeing! But sadly I am seeing more than a little bit of jingoism here, which I think is 80% right, but like anything you can take it too far. "Every generalization is false," etc.
> No. I write sufficient code for what is required of said function at that moment in time. It is only if and when required would I then add any additional parameters and code to that function. Or write new functions.
I'll add that sometimes the more general solution also happens to be the simpler one, and in that case you should always take the simpler one even if it's more general, because then you have both the advantage of simpler code now and flexibility for future change.
To put it succinctly, "increase generality only when it decreases complexity."
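A contrived TypeScript sketch of that principle (names invented):
function sendEmail(recipient: string): void {
  console.log(`sending to ${recipient}`); // stand-in for a real mailer
}
// Special-cased: exactly three recipients, handled by hand.
function notifyThree(a: string, b: string, c: string): void {
  sendEmail(a);
  sendEmail(b);
  sendEmail(c);
}
// General: any number of recipients; no more complex now, and flexible for later.
function notifyAll(recipients: string[]): void {
  for (const r of recipients) sendEmail(r);
}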
When I started as a developer 20 years ago, this was the most difficult thing for me. I would tend to over-think and include WAY too many "what if"-scenarios.
So that is what I try to teach junior developers today: Only write what is required now and don't think too much about what the function might be expanded to do in the future.
EDIT: The way I think about this is that software development has a very large degree of unknown unknowns. More than maybe other domains like architecture/chemistry/...
> Just implement what you need today and do the rest later.
I'd add that, while you shouldn't change the complexity of your implementation very much, you should look at potential future scenarios and migration paths. Sometimes you can make small changes to the plan that make migration paths easier. Optimize for ease of change I suppose.
Obviously the effort you put into this should be proportional to the complexity/importance of the problem.
And here comes the part where experience plays its role. History tends to repeat itself, so if you've been in or seen certain situations, you can account for a high possibility of them happening again.
I think most experienced devs here are talking about one side or the other of balancing benefits/costs. Yes, there are pros/cons of taking it further one way or the other. The best way I've been able to prioritize is to have the current code primarily serve current needs. If there are other likely and soon-expected capabilities, make some accommodations that don't negatively impact the complexity of serving the current function. If they're at odds, merely document the ideas to handle the expanded case.
The situation to avoid is paying a complexity tech debt now for something that may not occur for some time, possibly never, while carrying all the maintenance of current features on top of that complexity.
A common recurrence is generalized methods. There was a PATCH endpoint for the main entity which I'm sure started out innocently enough with maybe just two change sets. Following the intents through the layers of this choke-point implementation is unpleasant. We could achieve the same by composing variants at a higher level so that each concern is separated from the others, but then we introduce this 'machinery' that may not be warranted. If there are many components, or a dynamic set of them, then it may be the best solution, but not until then.
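A rough, hypothetical TypeScript sketch of that "composing variants at a higher level" idea (entity and fields invented):
type Entity = { name: string; status: string };
// Choke point: one handler interprets every possible change set.
function patchEntity(e: Entity, changes: Partial<Entity>): Entity {
  return { ...e, ...changes }; // real versions tend to accumulate per-field special cases here
}
// Composed alternative: each concern is its own tiny function, combined above.
const rename = (name: string) => (e: Entity): Entity => ({ ...e, name });
const setStatus = (status: string) => (e: Entity): Entity => ({ ...e, status });
function applyAll(e: Entity, ops: Array<(e: Entity) => Entity>): Entity {
  return ops.reduce((acc, op) => op(acc), e);
}
// e.g. applyAll(entity, [rename("Widget"), setStatus("archived")])
The composed version is the 'machinery' mentioned above; whether it pays for itself depends on how many variants there really are.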
I recently fell for this myself. On recognizing that a particular service was just a combination of two state machines, I abstracted the state machine operation and applied it to each subproblem. It worked like I expected but the result wasn't right. The state machine machinery was more visible than the core logic. I eliminated the concrete abstraction and created two implicit state machines that invisibly did what was required for each subproblem. If we already had a common state machine form that we were familiar with, things could be different but not when introducing it into a codebase for the first time.
Listen to the user feedback and _selectively_ act on that.
Not all feedback is good, not all of it is applicable, requests for marginal features that benefit only few are exceedingly common. All feedback should be taken strictly in advisory capacity and assessed critically. However if you did miss something useful or dearly needed, it will surface almost immediately.
We've been practicing this for ages and it works extremely well.
>> How do programmers here account for the unexpected?
I work on systems that need to run reliably. However, they depend on data collectors (like sensor data or web scraping) that are not known for their reliability or scalability.
We do lots of retries, log errors and failures, and patch problems until things work as expected. One thing that helps is the use of sentry.io. It alerts us about exceptions and detects regressions (and regression fixes) by checking whether there were any commits that might have caused or fixed issues.
They mostly don't get to take their equipment back to update it. So they'd better know very well what they need. (Yes, there's some amount of remote updating, but they still can't move fast and break things.)
As usual in engineering, needs create capabilities. It is very expensive and risky to fully determine your needs before execution, but they pay for it.
Without understanding why these rules get broken, it's hard to avoid breaking them.
There is a tension between dependency and redundancy. Avoiding redundancy often introduces dependency as code becomes shared for several purposes. Knowing which is preferable (dependency or redundancy) in a given case requires a lot of judgement and changes as the code evolves.
Redundancy is bad when it introduces too much code and too many chances for error or unnecessary divergence of implementation.
Dependency is bad when two things that start out closely related diverge in purpose, straining an implementation to handle both cases.
Suckless is at best satire and if we're being honest just plain idiotic.
Without fail every suckless person I've talked to either has no idea what they're talking about, or in the off chance they do they're incredibly shitty far right wing types who gripe on about women/minorities/CoCs ruining tech.
Yikes, that sounds like a horrible experience. I never got that from hanging around the Arch Linux forum back in the days when their products were moderately popular.
Having said that, my general impression was that there was an inordinate amount of Plan9 cargo-culting (with some DJB fanboys mixed in). Young, idealistic people without much experience but lots of admiration for the more Bauhaus part of their elders.
I used to make this same argument, and then took it overboard by never commenting. After reading some literate codebases, I’ve changed my mind.
Code is (almost?) always obvious to you as you are writing it. So, you never recognize code that is in need of a comment until you come back to it and struggle to understand its purpose and the context that gave it birth.
I now follow a rule: a meaningful comment at the top of each file. A comment on every exported function / value. Often, in the process of writing the comment, I realize I’ve poorly named something or that I’ve failed to handle some case. Comments help guide the reader, but also the writer.
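For what it's worth, a hypothetical TypeScript sketch of that rule (module and contents invented):
// rateLimiter.ts -- sliding-window rate limiting for outbound API calls.
// Exists because (in this made-up example) the payment provider throttles us at 10 req/s.
/** State for one limiter; create with { maxPerSecond: 10, timestamps: [] }. */
export interface Limiter { maxPerSecond: number; timestamps: number[] }
/** Returns true if another call may be made right now, and records it if so. */
export function tryAcquire(limiter: Limiter): boolean {
  const now = Date.now();
  limiter.timestamps = limiter.timestamps.filter(t => now - t < 1000);
  if (limiter.timestamps.length >= limiter.maxPerSecond) return false;
  limiter.timestamps.push(now);
  return true;
}
Writing those two header comments is often the moment you notice a bad name or an unhandled case.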
Can confirm. This is literally what happened in a codebase in the past:
// OLD CODE WITH COMMENTS:
// check ...
4 lines of code
// drop ...
4 lines of code
// read ...
4 lines of code
// do xxx ...
4 lines of code
In total, that method was like 16 lines of code, with some short commented sections. And the sections used variables from before.
Situation: a new coder comes in, sees the comments, and says "comments are bad, they get out of date, yadda yadda". So they proceed to turn the comments into method names.
What the coder forgot was the variables and context. As said before, the total was 16 lines of code with simple variables shared between the sections. Not a big deal. But when it had to be split into functions, it became like this:
// ACTUAL 1: COMMENTS => METHODS
a, b, c = check()
d, e = drop(b)
f, g = read(a, d)
i = doxxx(e, g)
And this was a language that didn't support multiple return types. So the code was actually:
// ACTUAL 2: COMMENTS => METHODS
class X { a, b, c}
class Y { d, e }
class Z { f, g }
X x = check()
Y y = drop(x.b)
Z z = read(x.a, y.d)
i = doxxx(y.e, z.g)
So now the coder has added 3 more classes, with more lines of boilerplate. Then the coder decided to manage this problem better, so they created some more interfaces and did some DI. The result was around a 250-line PR, which I cannot share here anymore. When we pointed out that the same logic was now 250 lines, the answers were: "It's inherent complexity which we didn't know how to manage. His/her solution scales better, and he/she has shown us the way".
All for what? Because comments were considered a smell. Go figure.
Doesn't sound like it was OO'd very well either. OO would move the check, drop, etc. onto methods of the relevant objects. So it's sort of the worst of all worlds here.
> I find it easier to keep in my head when methods are short
I think that's true if "short" is measured in terms of internal complexity. I think each function should try to manage a small bite size amount of complexity that represents a reasonable level of cognitive load for a brand new person to reverse engineer. It could be quite a lot of lines of code if they are very simple ones, or it could be a one liner if it's super complex / obtuse.
> I find it easier to keep in my head when methods are short.
How short? 100 lines? 10 lines? 1 line? "It depends" => i.e. discuss forever in PRs towards a one-way "shorter and shorter" ticket?
More and more coders who join the project say the same thing, and make the methods shorter and shorter and shorter, until it's dozens of classes with one method each, each just a few small lines like "twiddle-doo" and "fiddle-foo". Of course, now each method is easy to memorize, and supposedly it's "clean code" now. But the functional value and flow are completely screwed up and can never fit in anyone's head.
You shouldn't conclude from this that the rule is bad.
Because the original code, while preferable to the version with 3 classes, is not good either. Indeed, as this makes clear:
// ACTUAL 1: COMMENTS => METHODS
a, b, c = check()
d, e = drop(b)
f, g = read(a, d)
i = doxxx(e, g)
There are complex dependencies among the subroutines which can probably be simplified with refactoring, and which should be explicitly called out, but are allowed to "hide" in the original code where, locally, we have 7 global variables shared by 4 inline subroutines -- not a good situation.
I'll agree that one case doesn't make the rule bad, but the rule hasn't been proven yet either.
> There are complex dependencies among the subroutines which can probably be simplified with refactoring, and which should be explicitly called out, but are allowed to "hide" in the original code where, locally, we have 7 global variables shared by 4 inline subroutines -- not a good situation.
What is "locally global" supposed to mean? This piece of code can be written in at least two ways: as a single function with comments, or as a function calling four other functions. Just because splitting it up into multiple functions a certain way requires multiple returns or long argument lists doesn't mean that the original code is necessarily bad; it could be that the splitting points were chosen badly or that this code is simply better off as a single function. These "complex dependencies" you're worried about form a DAG in the example, which seems simple enough to me.
> Anyways, most of this is moot without real code.
Agreed, I hesitated answering because of this, but oh well.
> What is locally global supposed to mean?
It means within this local context, we have 7 global variables. Just because they aren't global to the entire program doesn't make them magically exempt from the problems of shared data. Specifically, it's too difficult to reason about logic and control flow, even in this small example. You cannot have strong confidence the code is bug-free just by inspecting it.
If you're trying to split up a 16 line function into multiple functions and having trouble doing it, instead of claiming the original code is bad, maybe you should consider that the original code is good and you just shouldn't be splitting up the function...
This might be a good interview question to identify net negative programmers like that. Give them the "ACTUAL 2" code, and ask them how to improve it. If they start talking about DI or any other additional abstractions, you have your red flag. And of course if the programmer asks "are X Y Z used anywhere else?" or has the balls to say "just inline it all into a linear flow of code", instant hire! :P
Writing understandable code is a difficult problem. There is no simple trick that will make you just do it overnight.
> Code is (almost?) always obvious to you as you are writing it.
I try hard to go back and reread code as I'm working on it, several times a day, as if I'm someone else, seeing it for the first time. It may help to pretend I'm showing it to a particular coworker.
It's a certain kind of empathy skill that it takes work to develop, but it's very useful both for coding and any other writing.
Perhaps the most common problem is what I call "assumed context". That is, you're so deep into the task that you take the major parts of it for granted and it never occurs to you to explain the most basic and fundamental parts of what you're doing. Instead you explain minor details.
I have found that people who are able to defy conventional wisdom are, paradoxically, usually those who put even greater than normal focus on the core truth that the conventional wisdom is based upon. For example, someone on a team who is a "lovable jerk" is usually the person who, although rough in their presentation, is always willing to help others and put the team first. On the other hand, people who claim they are the exception to the rule usually do it as an excuse for not focusing on the core truth behind the rule.
In the same way, I think the problem with "prefer code to comments" is that people often miss that it actually requires even greater focus on communicating your intention to others (e.g. via methods like declarative programming and meaningful names).
I usually just consider my opensource commits to be 'learning opportunities' for others; so I comment on every other line. It's actually pretty nice as a way of 'inlined duck debugging' that helps catch flawed reasoning, even when the code is 'obvious'.
// Have the window rendering in the highest allowed state.
I'd argue that comments like these are fluff. It says basically the same thing as the code, but doesn't say why the code does that.
In general I find comments like that suffer from bit rot faster than anything else (perhaps with the exception of commented out code) as it's tempting to leave the comment changes until you've got something working. And a wrong comment can lead you down the wrong path very quickly.
Well, ye. I mentioned that it was mainly for less experienced users to learn what the code does; and in this case it also tells the reader what the thread is responsible for.
I get where you're coming from with the 'inlined duck debugging' observation, however I'm of the opinion that explicitly commenting out "obvious" code chunks is usually fairly detrimental for a few reasons:
(0. The debugging approach can depend on a person's mental model)
1. Readability - even for a beginner, the method name should give a fair indication of what the code does. If the method name doesn't give a good indication, then the name should be changed. Otherwise, the comment is redundant.
2. Staleness - imagine a commit which changes the behavior of Engine::Compositing::onFrame(Deltatime) to only notify some of the components about the new frame (comment currently states "// Notify all components about the new frame"). Presumably, this commit should only change the internals of the onFrame method. However, because of a redundant comment, the person who changed the onFrame method now also has to find all its invocations and possibly update many comments.
My personal preference is commenting only domain-specific code chunks (example: [1]). These code chunks are usually put as close to the implementation as possible.
This way, whenever someone wishes to change the code, the person (1) immediately notices reasons behind the implementation (and may decide against modifying the code if he/she was unaware of these details) and (2) can modify the comment right away in case domain-specifics have changed in the meantime.
I'm curious to hear your or someone else's opinion against this argument?
It depends on who you're trying to reach. Since the GP wants his open-source commits to be "learning opportunities for others," it might make sense to comment in that level of detail.
That particular comment does contain semantic information which is not expressed in the code: that the thread is being prioritized for rendering purposes. It's possible you could communicate that through variable names, though.
It's also bad to have overly detailed comments like this because it tends to result in code reviewers glossing over the code instead of checking what it does themselves.
Generally, comments should focus on the "why", not the "what".
I think that’s fine as a choice for a personal project, especially since you’re committed to it. It’s a totally different story when working with a team, though. Comments become outdated, files get larger, and it becomes more of a hassle to maintain while providing little to no value most of the time.
Comments becoming stale is a real problem. But it is one that is solved by code review and culture, not by fewer comments.
In a team setting, comments like these are even more important, as they give context and higher-level meaning to the code without forcing you to jump all over the place through small functions (which would be an alternative way to make this self-documenting).
> Comments becoming stale is a real problem. But it is one that is solved by code review and culture, not by fewer comments.
Well, it can be both. Less code is easier to maintain than more code, and that applies to comments too. Finding the "goldilocks" point is the challenge of much code design, and it applies to comments too. There's such a thing as more comments than you need (increasing maintenance cost and the risk of outdated comments without providing enough value to justify them), as well as fewer comments than you need (increasing the cost of understanding the codebase).
I strongly disagree specifically about comments like these throughout an active code base. A well named variable or method can act as a much better descriptor of what’s happening and doesn’t have the same maintenance cost.
I think we both agree that comments are valuable, just not the scope. Comments are valuable when you’re doing something unexpected or where the code fails to explain what’s happening.
For what it’s worth I mostly work with Ruby, JavaScript, and TypeScript which definitely color my views.
I comment my code because I've run into elegant, clean, self-documenting code that did the exact opposite of what it was supposed to do. Comments let me, and other coders, know what I was trying to do, which I think is better than reading what I actually did and assuming it was correct.
I think that conventions and a set of patterns explained (or at least listed) in the wiki/readme of a codebase go a long way.
When I'm reading some API, then of course I want a detailed comment for each function and value, including valid/invalid parameters, maybe even some runtime behavior stuff if it's important (like whether some async method may contain some CPU-bound parts), and definitely anything unexpected or deviating from conventions.
The conventions part being important because usually I don't have the time to read all the comments and all the documentation, but instead I'm looking for some simple concepts I can learn once and then use to understand most things just by reading the identifiers.
If the project is written in a consistent way, it can be pretty large but require very few comments to be understandable. And anyone modifying it will need to know some set of rules for it anyway, so they should be made explicit.
The rules in your last paragraph make perfect sense; documenting the purpose of the code is probably the most important.
I think what is intended in general are comments in the form of:
// Start the loop.
while ( have_posts() ) : the_post();
[..]
// End the loop.
endwhile;
And yes, that is an actual example from a real codebase.
This is an extreme example, but quite a few of the comments I see are very redundant. Sometimes comments are useful to split stuff up, but stuff like "start loop", yeh nah. It's just a random stray comment, probably an artefact from when the author was gathering their thoughts and/or took a break.
Even worse are comments that are just confusing, or don't match the code.
I've done some literate programming, and I find it interesting for certain types of programs and/or if you're struggling with something. But I think that for a lot of stuff it's redundant.
This is actually a bad example. "The Loop" is a concrete concept within WordPress: it's a global loop (shared across scripts) that represents the portion of the request processing spent looping through the posts that need to get rendered to the page. There are a number of WordPress API calls that only make sense when called from "in the loop", since they rely on global variables that get set in each iteration. Whether or not this is actually a good request handling model is much more debatable, but the comments in your example are meant to communicate that any code which needs to be called from The Loop should get called between them.
Not really, because there could be any number of "whiles" in between unrelated to the main capital-L Loop (e.g. to loop over the tags of each single post).
> It's just a random stray comment, probably as an artefact from when the author was gathering their thoughts and/or took a break.
In my high school introduction to programming class, we were for a long time required to comment every line of code. Examples given by the teacher were all of this type of comment.
I've no doubt it put the wrong idea into some students' heads.
So often when I see this, I wish people would have used another function. It's hard to argue that a function which requires splitting things up using comments is "doing one thing".
What I don't like is when companies adopt a policy that is checked before code check-in, such as in Java, where there are many tools to perform automatic analysis of the code.
When one of these rules is "thou shall comment all declarations", it results in the developers polluting the code base with verbose drivel, to avoid the review-loop.
One of the reasons for this sort of policy often is that IDEs for untyped languages parse special comments for type information so they can provide autocomplete. (E.g. I’ve worked on PHP codebases where javadoc-style comments were required)
It's the same with C#, and that naturally gives birth to tools like GhostDoc, which will automatically generate comments to satisfy those idiotic checks, even if the comments are completely useless to actual humans. (IMHO any comment that can be automatically generated has no value.) Thus the proliferation of comments like this:
// FooBar - Foos the Bar
// @param baz The baz.
// @param quux The quux.
You would not be allowed to do any of this if you worked on my team. Do not put documentation in code; put it in documentation. Comments break up code, forcing the reader to constantly flip between reading the language and reading English.
Comments are failures on the part of the programmer to describe the implementation clearly in the language being used. The lesson you should have learned is to recognize that it's okay to fail in this way, but to still consider it a failure.
I work on a team with the same policy, but personally I think it's a bit silly. There's nothing that guarantees that a version of the code that explains everything only in code is actually easier to understand, but that's the assumption being made.
In a lot of cases I think it tends to be the opposite - adding more code just for the sake of 'explanation' bloats the code to much larger lengths and ends up making it harder to understand than if you just added a few lines of comments that explained the exact same thing. In that situation, adding comments is not a 'failure', it's just another option for making your code easy to understand.
You wouldn’t be able to contribute to the Redis or SQLite codebases, then, as they are heavily and helpfully commented and are almost always near the top of the list of answers when people ask the question: “What are some examples of exemplary codebases?”
In fact, it was reading those projects and then returning to my workplace’s uncommented source that convinced me to change my philosophy on commenting.
I just looked up a sample of the SQLite codebase[0] and it seems to align with what most people are saying here: keep a minimal amount of comments, and the ones that are there should say what the code can't tell you.
The SQLite project is a good example of a codebase despite this decision, not because of it. SQLite is nearly 20 years old, and their position on this topic is reflected in their policy.
And Redis makes no such requirement. You are wrong.
Indeed. Apparently the author didn't think that minimalism included simplicity of language ("Baroquecratic", really?) or not adding tangentially related and unnecessary quotations on every page.
Some recommendations in this pdf are sooo vague and lack proper context. For example, what the hell does this mean: "Let the Code Make the Decisions: There is no need to spell everything out in minute detail".