Hacker News new | past | comments | ask | show | jobs | submit login
Avoid Else, Return Early (2013) (timoxley.com)
682 points by signa11 on March 26, 2018 | hide | past | favorite | 578 comments

Programmers with many hours of maintaining code eventually evolve to return early, sorting exit conditions at the top and the meat of the method at the bottom.

Same way you evolve out of one liners.

Same way comments are extra weight that should only be in public or algorithm/need to know areas.

Same way braces go on the end of the method/class name to reduce LOC.

Same way you move on from heavy OO to dicts/lists.

Same way you go more composition instead of inheritance.

Same way while/do-while usually fades away, replaced by exit conditions where needed.

Same way you move on from single condition bracket-less ifs. (debatable but more merge friendly and OP hasn't yet)

Same way you get joy deleting large swaths of code.

and many others on and on.

Usually these come from hours of writing/maintaining code and styles that lead to bugs.
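The first point can be sketched in Python (a toy example; the names are illustrative, not from the original comment):

```python
class Order:
    def __init__(self, paid):
        self.paid = paid

def dispatch(order):
    # Stand-in for the real work of the function.
    return "shipped"

def ship_order(order):
    # Exit conditions sorted at the top...
    if order is None:
        raise ValueError("no order")
    if not order.paid:
        raise ValueError("order not paid")
    # ...meat of the method at the bottom, at zero extra indentation.
    return dispatch(order)
```

The nested-else version of the same function buries `dispatch(order)` two levels deep; the guard-clause version keeps the happy path flat.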

> Same way you get joy deleting large swaths of code.

This is the true sign of a programmer's transcendence. Specifically the irrational joy of seeing net negative LOC diffs.

It's not about how much you can add. It's about how much you can remove without sacrificing correctness, functionality, and readability.

"Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away."

This is my favorite research quote from a Civilization game (Civ IV, narrated by Leonard Nimoy). I use this as my mantra for all design-oriented aspects of my life now. Or... I try to. Sometimes it's hard not to want to add more haha.

FYI that quote is from Antoine de Saint-Exupéry, author (and pilot) famous for "The Little Prince".

Do you know what the context of that quote is? I have used it as a sort of mantra specifically when programming, but I'm guessing it probably was intended for writing.

https://en.wikiquote.org/wiki/Antoine_de_Saint_Exup%C3%A9ry has the preceding paragraphs. It’s about the design of airplanes.

I don't, but it seems like it would find good use in both writing and aviation.

A story about that from the early Apple team: https://www.folklore.org/StoryView.py?project=Macintosh&stor...

I always tell my team that deleted code is the best code. Obviously less code is often more maintainable but there is also the element of being willing to throw away stuff you did earlier and not being attached to it.

I have a similar take on this, but expand it a bit by arguing that any line of code added – no matter how innocuous – is a liability. It's a line of code that has to be maintained, with all of the responsibilities that come with that. It may be a line that's responsible for adding more value than it costs, but it's still a liability. It won't add value forever, most likely, and when it stops it'll just be debt. It may be cheap debt, but debt nonetheless.

Deleted code on the other hand, is never a liability. The act of deleting it may be, but due diligence should ensure it is not.

Yes! Every addition carries some risk. The real world example I give to clients/managers of the risk of adding a feature is this:

Once I was using an industrial camera that had a mode that allowed it to mirror or “flip” the image output. Unfortunately, after running for a few hours, sometimes it would decide to flip the image on its own. Had that “flip” feature not existed, the bug could never have occurred.

The point isn’t that you shouldn’t add useful features, but that even easy to add features aren’t as “free” as they might seem and the risk of adding them should be recognized and weighed against their utility.

I remember teaching a guy about YAGNI - You Ain't Gonna Need It.

He would write helper functions that he thought would be useful before writing actual code and often ended up wasting time. Half the functions he wrote, he never actually used, but he'd still spend time writing them and unit tests for them.

It's interesting that university puts an emphasis on reusability, when in reality it often makes code a lot more difficult to understand.

Sometimes deleting code isn't enough and you have to do a rewrite. I once rewrote 2,650 lines of C as a seven line shell script. The previous programming team had written a data transfer program with its own implementation of ftp. I just used the one that was already on the computer.

"The only good diff is a red diff"

Nothing irrational about feeling joy over what is hopefully more succinct and efficient code.

Yes, as long as you recognize where that line of obfuscation (LOO) is and stay just short of it.

The less code there is, the less opportunity there is for bugs.

It's really code tetris

This is like the 10 commandments. Good ideas mixed with stuff that is a matter of taste :)

If it was this simple we wouldn't be still arguing about coding styles decades after they were invented. I know a few old programmers that I respect very much, and they don't agree about the perfect coding style, not even on simple stuff, like braces on separate lines or not.

> If it was this simple we wouldn't be still arguing about coding styles decades after they were invented.

Money quote.

> If it was this simple we wouldn't be still arguing about coding styles decades after they were invented.

I disagree with the logic in this sentence. New coding styles are invented all the time. The guard statements in the article were only formalized in the late 90s, for example. These arguments about coding styles "decades after they were invented" are the only way we know which work and which don't.

It's not a matter of taste. It's how you separate the wheat from the chaff. Except for the brackets one. ;)

Late 90s were 2 decades ago :)

Guards were brand new in the 90s. Unless you're going to argue programming started in the 90s, your point is irrelevant.

>braces go on the end of the method/class name to reduce LOC

You could argue the placement of braces with lots of valid arguments both ways, but this... to reduce LOC? That doesn't feel like a valid reason in any language using braces...

Oh but it is, independent of the language. Fewer LOC means more code on the screen, which means you can more easily grasp the functionality of some piece of code, which makes it easier to maintain.

> Fewer LOC means more code on the screen, which means you can more easily grasp the functionality of some piece of code,

Huh? For me, the less code on each line, the easier it is to read.

When it comes to bracket placement, we're talking about empty lines. They don't affect the amount of code on a line.

That depends. Python's PEP 8 guidelines have a 79-character line limit, but I personally find that a lot harder to read than one long line (most of the time).

I find the 80 chars too short but, and this is highly subjective, I notice I am more relaxed and usually have more fun when working with shorter lines. The best theory I have as to why is that most lines in code are short, so long lines make my eyes deviate from the code structure. If there is a long line, it will probably be out of place. So my personal rule is to avoid the one or two lines that jut out beyond the lines above and below. If all lines in a block are long, they are often similar commands/expressions, which makes them orderly and not so discomforting.

It's pretty sweet when you need to read code in a console that's limited to 80 characters. Also, it makes it easy to have two files opened side-by-side.

Eh, all things being equal, the fewer the lines, the more readable.

I disagree. Consider the absurd example where every single statement is crammed into a single line. Not exactly readable to me?

Sometimes more verbose code is more readable, sometimes it is the presentation, and sometimes it is superfluous lines that increase LOC.

That's covered by "all things equal".

The argument against placing braces on their own line is that this:

    if (a)
    {
conveys exactly as much information as this:

    if (a) {
while taking up more space. The thing the brace tells you is already told by the indentation, so the brace is on a superfluous line.

Maybe it's just me, but I find the first version easier to parse visually. When the coding convention forces me to put the brace on the same line, I end up doing:

    if (user.exists()) {


> The thing the brace tells you is already told by the indentation

Which is true, and of course begs the question: why do you even need the braces?

1) I frequently tell my diffs to ignore whitespace so I can see my structural changes without getting drowned in a sea of indentation changes when introducing new scopes. Not viable when diffing python.

2) I semi-frequently make indentation mistakes when resolving merge conflicts. In braced languages this is fixed by an autoreformat. In unbraced languages, I have to take a lot more care with merging, lest I end up in yet another lengthy debugging session ending in facepalms - or worse, check it in.

3) Redundancy in the face of typos and whitespace destroying mediums such as many pieces of web technology - comments sections, forums, etc.

In this specific case

    if (a) {
        print(a)
    }
and

    if (a) { print(a) }
could just be

    if (a) print(a)

Except in perl, where it's

    print (a) if (a)
And if you want to add a second line you have to change it, thus I got into the habit of

    if (a) { print(a) }
But I don't code for a living, any more than I run network cables for a living or screw things into bays for a living. I code as a tool to get the job done. Most recently that was writing some Perl to parse the output of a tcpdump, printing RTP sequence number discontinuities to see where packet loss was occurring, what jitter was going on, etc. Fast, to the point, isolate the problem (crappy Juniper SRX), fix the problem (replace with MikroTik CCR), job done, drink beer.

I do like one-liner conditionals (at least short ones for early bailout) but sadly my coworkers and their coding standards do not.

Off-topic and pedantic pet peeve, would you mind replacing "begs the question" with "raises the question". The former might not mean what you intended.

Let's say we can see 30 lines on our screen and the average function takes 5 lines with OTBS. That means we can see 6 functions at once with OTBS and 5 with Allman. That makes a difference at least in my case.

Not being a pest here, but how often are you comparing more than 2 functions at a time? I can't say that in over 20 years I have ever had to deal with more than 2 functions at once.

Maybe I am mentally disabled, but I can only focus on one logical task at a time. Can other programmers multi-task their coding?

All the time, but to be honest, it's mainly side by side when I do. This is useful when building one function that acts on some other part of the code and I don't want to scroll / swap files back and forth.

I more meant it in the sense that it's nice to be able to soak in more of the code at once and avoid unnecessary scrolling.

Refactoring might have you adding a parameter to a function, and then feeding that additional parameter to the function at its call sites. Even when you only have one call site and thus are in the "2 functions" case, these two functions are often separated by multiple "irrelevant" functions.

The less scrolling, tab management, and other context switching I have to do to look at those "2" functions, the easier it is on my memory.

>The less scrolling, tab management, and other context switching I have to do to look at those "2" functions, the easier it is on my memory.

I don't disagree, but the sentiment seems to be "fit everything on the screen at all times", which seems like one of those rules that may take more time to implement than it saves in actual time.

I work on projects with upwards of 100k lines of code consistently, the very idea that anything relevant will fit on one screen is absurd, so I am always curious what kinds of projects people work on where the relevant code is literally within a half screen of each other.

I tend to work on gamedev codebases roughly around 1 MLOC (both above and below). In one recent refactoring I managed to get a class declaration (that was previously local to a single .cpp and had all functions defined inline in one go) to fit on a screen. I use small fonts, 4k resolution monitors, portrait orientation, and I still had to cull some "GetFoo gets a foo" style worthless doc comments to accomplish it. (I kept the comments that actually added anything of meaning.)

Given that above description, you can probably guess I don't have everything on screen at all times, not even close. But I'll refactor off small files of related functionality where I can, subdivide things that are getting unwieldy, etc. - put private/static functions near the public/exported functions that call them, break things up into (sub)modules, etc. and get good grouping.

> I don't disagree, but the sentiment seems to be "fit everything on the screen at all times", which seems like one of those rules that may take more time to implement than it saves in actual time.

Taken to the extreme, any such rule absolutely will. I certainly don't have that as a hard and fast rule. Or even as a guideline per se - it's more a faint but distant dream, or perhaps a feverish hallucination, a reminder of codebases past. But there's often plenty of cruft and low hanging fruit that can be fixed up in these codebases. Everyone has their "just one small change" to check in, so things build up and they don't jump to "okay, this is one line too many - time to refactor."

I practice pain driven development - when it's getting to be a pain to keep enough of something in my head to do whatever sweeping changes or overhauls I might need to do, I see if some pure or near-pure refactoring changes that. No expected behavior changes, often not even class or structural changes - often literally just regrouping code by topic and category via cut and paste. Horrific diffs, but low-risk checkins, often a lot easier to reason about afterwards, and suddenly related code is often on the same screen.

(Although far from always.)

It sounds like we are on the same page. It's hard to tell with brief forum comments what people mean. I am just used to the reality that you _must_ switch contexts all the time, and it's _normal_ simply because the code bases are too big.

Yes, it's nice when things are close together, but in my case it's so rare as to not be a point worth trying to achieve, and more like a mental joke like "Hey, Bob. Come here and check this out! Two functions and the template I am working on all fit on the same screen!" And Bob's like "wooooah, dude". And then we take a group selfie with my monitor and have a laugh, print out a poster of the selfie for the bulletin board for a few months, then throw it away during a cleanup and move on...

But to have people say "It's important to save whitespace so stuff can all be on the screen to avoid context switching" seems completely and utterly pointless.

I've always wondered why no C or Java code formatter tool allows me to use Lisp-style formatting. It's clearly superior, and the popularity of Python should prove that people are OK with not having closing braces on their own lines:

    void foo(int x) {
        if(x == 0) {
        bar(); }}

After having seen a meme, I gave a teacher a project formatted like this:

    public class Main                                        {
        public static void main(String[] args)               {
            System.out.println("Hello world!")             ;}}
We had a good laugh. (that was fine with him, of course as long as it's a joke)

PS: getting back to the original topic, imo, formatting style should be project-wide rules set at the beginning or during refactoring. Don't get too hung up on style as long as it stays consistent.

I've written a bit of code formatted rather like this:

    void foo(int x)
    { if(x == 0)
      { bar(); } }
That is workable. I have adhered to it (mostly) in this Yacc grammar file: http://www.kylheku.com/cgit/txr/tree/parser.y

Just in the grammar; not in the supporting functions.

It's an attractive style for C statements embedded in another language, because they look more like capsules somehow.

> this... to reduce LOC ? Doesn't feel like a valid reason in any language using braces...

It's not really a valid reason. Unfortunately, there are still a lot of people who think LOC is a valid metric.

Here's an extreme, yet very real example: mandatory Doxygen comments on accessors:

    /**
     * Get the foo
     * @return the foo
     */
    int getFoo();

    /**
     * Get the bar
     * @return the bar
     */
    float getBar();

    /**
     * Set the foo
     * @param foo the foo
     */
    void setFoo(int foo);

    /**
     * Set the bar
     * @param bar the bar
     */
    void setBar(float bar);
Compare with what any sane project would do:

  int   getFoo();
  float getBar();

  void setFoo(int   foo);
  void setBar(float bar);
Multiply that by a dozen such attributes, and what should fit on a single screen spans now hundreds of lines. What you could see at a glance, you have to search for. There's simply no way those utterly redundant comments increase readability. (Yes, the real comments in the real code were just as redundant.)

LOC is more valid as a metric than most care to admit.

Or you use Scala:

case class Baz(foo: Int, bar: Float)

Gives you a free hashCode, equals and copy as well.

Also: Why use get and set accessors for private fields at all, when you can just use:

int foo; float bar;

That can later be changed, when the need arises.

Meh. It's noise but what's the harm? It was generated by an IDE so nobody had to physically type that. And usually no developer will be dwelling on this object or this area of the object. It just sits there doing no harm to anyone.

>LOC is more valid as a metric than most care to admit.

It's not.

> It was generated by an IDE

Not all of it, actually.

> nobody had to physically type that

We totally had to read that.

> And usually no developer will be dwelling on this object or this area of the object.

Well… yes, usually. Sometimes however we did want to pay more attention than usual. At that point the original dev would have made a real comment. And that comment was lost in the noise, so you couldn't even distinguish between real comments and useless boilerplate.

I seem to recall that LOC correlated just as heavily with bugs and other problems as any other measure it was put up against. And, just like every other measure, gaming it leads to a degenerative case. (Can't remember where I saw this, or I would provide a citation.)

I'll repeat my current gut feeling. Intrinsic measures of software quality are terrible as guides for how to write code. They are decent at predicting quality, but are not usable in a prescriptive way to write quality code.

For whatever it's worth, I insisted on single return for a long time when I was an intermediate programmer. 30 years in, I strongly prefer return early (as well as continue early in loops) and I cannot go back. The immensely reduced indentation is marvelous. Having that 'happy path' consolidated is more readable.

I find it interesting how most of the comments about return early are based on experience.

I had the same thing happen about 10 years in. I just got sick of the extra coding and long indented blocks. And just switched over one day.

Why is this not taught from day one?

The problem is that for any rule you come up with there are cases where that rule is just too rigid and too inflexible.

There are times where early return makes a ton of sense. Usually that's the case when you can derive the return value from a shallow interpretation of some input values (i.e. check preconditions) but still require more complex processing for other input values. In that case, return early, but constrain the rest of the method to one return point.

On the flip-side, there are cases where an early return makes it incredibly hard to reason about the function. This usually occurs in dense, nested conditional logic. In those cases, I found sticking to rule that a return value should be initialized once and only once, and return from a single point, beneficial in even reasoning about the problem.
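The "initialized once and only once, returned from a single point" rule described above might look like this in Python (a sketch; the function and thresholds are made up for illustration):

```python
def classify_reading(value, calibrated, sensor_ok):
    # Dense conditional logic: the result is assigned exactly once
    # per path, and the function returns from a single point at the
    # bottom, which keeps every exit easy to trace.
    if not sensor_ok:
        result = "error"
    elif not calibrated:
        result = "uncalibrated"
    elif value > 100:
        result = "high"
    else:
        result = "normal"
    return result
```

Compare with sprinkling `return` statements inside each branch: the behavior is identical, but with a single exit point you can set a breakpoint on one line and see every outcome.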

Can't that be taught as well? (both rules, and exceptions to rules?)


But as a novice you should adhere to the rules. As you gain experience you learn when to break them.

I think you read my comment backwards.

I think the "rule" should be "exit early", and the exception "multi-indent exit later".

I sold my first program aged 15 and I'm 53 now. My experience is that the benefit of early return is to keep indentation to a minimum. To me, every time code indents it is a "context switch".

Having said that, early return should be for error/ignore situations - there should only be one return for a result and that should be at the very end of the function.

We were taught "one entry, one exit".

I now think of the early part of a function as a filter for bad params, invalid state & whatnot, and return as early as possible.

Code just feels less complex that way.

I'd like to believe that this is a result of the concept of functions being first taught in the context of mathematics. Since there's no concept of flow control in mathematical functions, their entire purpose boils down to "Given parameters, do processing, return value".

Habits based on what was fastest on slow machines where every cycle mattered and compilers weren’t as good at optimizing, I’d bet.

I am on board with returning at the beginning while checking arguments. But I hate it when code returns somewhere in the middle for some reason. I have spent countless hours debugging a problem with unexpected behavior only to find a return statement in the middle of a huge block of code.

That's because the return statement is a kind of goto statement. Goto-like statements reduce the value of structured programming. In a pure structured program, each statement sequence is either completely executed or not executed. There are also pure structured languages, like Oberon, which doesn't allow impurely structured programs.

When return is used in the middle of a complex code sequence I much prefer a goto so at least I know where all paths exit.

> Same way you move on from heavy OO to dicts/lists.

OO is better if the objects tell a story, in a way that ProviderStrategyDataFetchers fail to do when they are effectively just wrappers for data structures. If you go pure data structure without a story you just end up having to add comments to explain purpose. The comments are the classes.

I agree with both you and the OP. OO where necessary, lists and dicts where things are simple. You can always take a list of strings and turn it into a list of objects. But going the other way is harder because you have to refactor anything that depends on them.
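The "list of strings now, list of objects later" path mentioned above can be sketched in Python (names are illustrative, not from the original comment):

```python
from dataclasses import dataclass

# Step 1: plain data, good enough while things are simple.
users = ["alice", "bob"]

# Step 2: the same data upgraded to objects once behavior is needed.
@dataclass
class User:
    name: str

    def greeting(self):
        return f"Hello, {self.name}!"

user_objects = [User(name) for name in users]
```

Going this direction is a local change; going from objects back to strings would require touching every caller that depends on the `User` type.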

The concern is when you encounter a list of very dense transforms between different data structures without anything informing you as to the purpose or even reason for these operations. Is it _just_ to build the output, or are there critical side effects in the process? Are the APIs just so bad that all these operations _need_ to be performed, or is it pure 100% required processing? Method names help somewhat if people are averse to objects. What concerns me is that much modern functional code often becomes very dense transforms with little narrative to inform the reader.

> Same way comments are extra weight that should only be in public or algorithm/need to know areas.

The best thing I learned about commenting is: comment WHY you are doing it, not WHAT. The code is the WHAT; the comment is the WHY.

If you feel the need to specify in a comment that the code does some potentially unexpected thing or that the way it does what it does has certain consequences, that's fine. And obviously the comment/doc of a function/class/whatever should say what it does if it's not 100% obvious and unambiguous from the name. I don't think this kind of rule is really helpful. Just comment if you feel a comment is needed, and if other people who interact with your code think you have too many or too few comments then you can adjust from there.

People do need to be taught this. I inherited a script from a co-worker who had left. My team leader had gone over it and commented it for me.

    # open file 
    with open(my_file, "rb") as binary_file:

No mention of what was at byte 8, or what the hell it was doing, but he commented the only obvious part of the code.

> Same way you move on from single condition bracket-less ifs.

This is one of those things where, whether you drop it or not, you recognize the times it caused you severe pain: you didn't notice the if was single-line when adding a statement, such as a debugging one, and all of a sudden the conditional part wasn't what you were expecting.

This is actually one of the things I love about Perl. Single-statement conditionals were changed so that they look different and only allow a single statement. Instead of the C-style

    if ( condition ) statement;
Perl has the postfix form:

    statement if condition;
Regular if blocks still work as expected, but there is no braceless version of a regular if condition. I understand many people find it jarring at first, but it does allow for clear and concise precondition and return-early statements, especially when grouped together. e.g.

    return undef if $param1 < 0;
    return undef if $param1 > 100;
    die "Not a number" if not looks_like_number($param1);

This can be horrible in JavaScript. It can make code really ugly. I have had to deal with code that looks like this recently:

    if (condition)
    doSomethingConfusing()  // is this part of the if or not??

Perl got many things right. It's a shame it seems to have fallen away as a language.

> Same way you move on from heavy OO to dicts/lists.

Is that a thing? No question that many concepts of OOP are in heavy need of reform, but I didn't know the basic idea of a struct was one of them.

I can sort of imagine this for quick-and-dirty things in untyped languages - but if you have types, passing dicts around everywhere seems like needlessly throwing away type safety - while it's also cumbersome to code and less efficient at runtime.

I think this might partially be related to the move towards streaming applications and microservices. You could create objects for everything you expect to receive/send and then when an API changes you lose the new fields, you get parse errors, etc. If you did it right maybe you can avoid some of those things, but the bottom line is that type safety is a little bit too strong a constraint for these use cases.

Even if I'm getting from a DB which won't be changing out from under me, many times I'd prefer just getting a simple Map rather than spending time creating a "safe" object. Do I create a User and UserWithDetails? Or always return UserWithDetails but sometimes fields are empty because I don't care about them? Dicts are light, objects are cumbersome.

You can of course create a User and a UserDetails, the former embedding the latter when necessary.

I also see getting parse errors when APIs change as a positive thing. When you change an API, things do break. Learning what breaks as early as possible is good. It's not even about objects vs. dictionaries, you could use any structured composite datatype and the overhead you associate with it depends very much on the language.

Of course, for a quick hack I'd happily dump data into a dictionary and call it a day. But if I want to learn why it breaks 5 years later, having a clear specification of the expected data is much better even if it cost me an extra few hours.

Right, dictionaries everywhere is a reliable way to slow down every language to the speed of dynamic languages like Python or JavaScript.

My understanding is, by “heavy OO” they mean custom container-like classes implemented by encapsulating a list or dictionary, but exposing non-standard APIs for get/set/erase/replace.

By the OO books that’s the way to go, because object state encapsulation. Practically, exposing raw lists/dictionaries/vectors/iterators at the class API boundary is more flexible because it makes the collection compatible with lots of other code, both in standard runtime and third-party libraries.

Another thing, if the only state that container holds is the collection of items, maybe you don’t need any container class at all. Just write global functions / static class / static methods that directly process lists or dictionaries of these items.

I'm a fan of TypeScript's structural typing for this reason. I can define an interface that describes the shape of a "plain old object" (its fields and their types), and then write functions that accept/return an object of that interface. What's great about structural typing is I don't have to explicitly say "create a new object of this type"; any object that satisfies the shape of the interface is valid (object literals, for example).

So you end up with the best of both worlds (in my opinion), where your state is made up of simple plain objects and you behaviors are just functions that accept simple plain objects, but you get all of the benefits of compile time static type checking because they are checked against the interface.
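For readers outside TypeScript, Python offers a comparable structural check via `typing.Protocol` (this analogue is my addition, not from the original comment):

```python
from typing import Protocol

class Named(Protocol):
    # Any object with a `name: str` attribute satisfies this
    # protocol structurally; no explicit inheritance is needed.
    name: str

def greet(thing: Named) -> str:
    return f"Hello, {thing.name}"

class Dog:
    def __init__(self, name: str):
        self.name = name

print(greet(Dog("Rex")))  # prints "Hello, Rex"
```

As in TypeScript, the check is by shape rather than by declared type, so plain objects and unrelated classes interoperate freely with the typed functions.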

Don't you get problems with equal signatures but different meaning or different scale? Example:

    interface SomethingWithAge { age(); }

    class Person { age(); }  // age in years

    function classify(item: SomethingWithAge) {
        if (item.age() < 20) { print('young'); }
        else { print('old'); }
    }

    p = new Person(33); classify(p);  // prints 'old'

    class Message { age(); }  // age in milliseconds

    c = new Message(231443);
    classify(c);  // prints 'old'


It's in the roadmap, but discussions about it are still going on four years after the ticket was opened...

Using dicts/lists lends itself better to flexible serialization/deserialization, versioning, and changing data structures, and they are great for data/API prototypes that move into MVPs and eventually alpha/beta, if not the final form.

OO can run into versioning issues with fields/structures changing, needing many versions of those over app iterations, more so than plain dicts/lists. Adding and removing database fields while still supporting old versions is more easily solved with basic dicts/lists that map to JSON/XML and to the base types of any language. Dicts/lists are very interchangeable across systems, are the basis of API output, and are easy to use in JavaScript, Python, and other dynamic languages.

There should be a reason you are using structured OO; many times it becomes dead weight, though there are good times to use it and most projects have some of it. Good reasons are hardened/non-changing codebases, or possibly native APIs/libs where it helps understanding. But when it comes to REST/HTTP APIs, you usually end up building OO that consumes data only to turn it back into dicts/lists on output/input to the APIs/client side/etc.

Classes/OO/types can also be built on top of dicts/lists, with some internals and helper methods that get/set keys and values and perform actions, or that wrap the dicts/lists so the serialization/deserialization and versioning issues go away: the flexibility of dicts/lists and the simplicity they bring are still there, but with more structure where needed.

> Programmers with lots of hours of maintaining code eventually evolve to return early, [..]

I agree with all your other points and I‘d even agree if you wrote return early makes code more readable but not that it‘s something experienced programmers do.

Programming is still - to some degree - resource management. Inexperienced devs often miss that fact, because they are focused on memory management and believe they can rely on the garbage collector for clean-up. In real projects you quickly end up managing file descriptors, sockets, handles, GUI resources, etc. Early return is terrible for that, because your coding style starts to depend on the kind of resource to manage.

That's why the more experienced folks end up writing single-exit-point code sooner or later.

EDIT: I'm not arguing that early return and proper resource management are incompatible, just that most experienced programmers have been burnt enough to avoid it.

Oh, come on. This might be true for a very specific type of codebase, but it isn't true for programming in general. If I'm coding a library of standard statistical functions, I'm exiting early. If I'm coding a Rails webapp, I'm exiting early. If I'm coding a shell script, even one used on tens of thousands of servers, I'm exiting early.

The resource almost all programmers are managing is human time: human time to code the project, to review the code, to alter the code after the fact, to port the code to different environments. If you're writing code to run AI at Google then fine, you're managing resources at a level where performance starts to hit limits that warrant a style change, but you generally know that ahead of time and choose tools suited to a project of that scope. Generally speaking, exiting early isn't what impacts your code's performance; Big O mistakes are.
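For reference, the early-return (guard clause) style under discussion might look like this in Python (a made-up validation function):

    def discount(price, pct):
        # Guard clauses: reject bad input up front,
        # then the meat of the function sits flat at the bottom.
        if price < 0:
            raise ValueError("price must be non-negative")
        if not 0 <= pct <= 100:
            raise ValueError("pct must be between 0 and 100")
        return price * (1 - pct / 100)

The alternative is nesting the happy path inside an ever-deeper chain of else blocks.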

Don't forget the human time to debug the code (especially the rare or only-happens-under-load issues, which can be related to resource exhaustion, among other things).

Also, some of your examples (like altering or reviewing code) can become easier and faster if resource management is a first-class citizen in your code base or language (i.e. is either more explicit in the code, and you know to look for it and manage it, or it is somehow taken care of automatically by your language, so you would have a hard time forgetting it, even if it is handled implicitly).

Yeah that is true.

I guess I would agree with the concept of managing “human time”, but I believe you are focusing on the wrong human. The developer will certainly spend a lot of time writing and maintaining a code base, but that’s minuscule compared to the time spent by the end user. What software design principles will save the user the most “human time”?

Cleaning up resources after use is not a question of performance, it is a question of correctness.

Choosing an environment that requires you to clean up resources instead of handling it for you is mostly a question of performance.

Not really, inlined RAII is just as fast as doing it yourself.

That's one of the reasons I love RAII; all resources, memory and otherwise, are treated the same.

That's what I was thinking when writing my edit. The somewhat ironic thing is that C++ people could use early exits safely, but often don't, while Java folks typically write early-exit code when they really shouldn't.

It's not just about in-app resources; this could also affect external resources, like not cleaning up temporary files because a future maintainer returned early and left some critical data behind.

RAII can handle such resources, and good use of RAII does handle such resources. Languages without time-deterministic GC even tend to add extra language constructs to aid this, such as `using` in C# and Java (7?) with its new `try` block.

The downside is that now your cleanup code is spilled all over the codebase. And where it's implemented, it often lacks the necessary context (from the location of use). So it's often wrong or incomplete, or suffers from the type-for-single-usage syndrome.

Yes, RAII works for the casual std::vector, but it's not a maintainable solution for general resources.

It's right next to the create code, not 'spilled all over the codebase'. And if you really want different behavior on destruction (which is _really_ rare IMO), you can pass a destruction strategy at object construction. If you don't know how you want to destroy something when you construct it, you need to take a step back and reexamine your architecture.

True, for those who don't agree, read about rule-of-zero for some expansions on this. If you have to write the same type of clean up code in multiple destructors you are doing it wrong, wrap that resource in a class that has the responsibility of managing that type of resource and forget about the destructors elsewhere.

And the create code is spilled over the codebase, just as well :-). It's not where actual work is done. This approach of encapsulating functionality in a class may work in some instances such as std::vector but more often than not it's just a lot of (typical OOP) boilerplate for things that you do only once in a codebase anyway. Giving one-off things a type and a name and decoupling them from normal control flow (ripping them out of context) makes things just harder to understand. Better just make them global data, maybe even define the initialization/cleanup routines for that global data in-line with the rest of the control flow.

I mean, having global resource managers is orthogonal to generating the events that cause resource destruction. And you're not 'ripping them out of context', since the guard objects are still there where you would be doing the resource management manually.

EDIT: and I was the lead for a high availability C++ RTOS. I know that not all patterns fit for writing quality, available code. This fits remarkably well though. Even though we were totally async and didn't rely on the stack for context lifetimes.

> And you're not 'ripping them out of context', since the guard objects are still there where you would be doing the resource management manually.

The guard object is there (e.g. "on the stack"), but the destructor is not defined in-line; it lives in a separate class method definition. Classes have always had that implicit ideal of being "isolated". This is wishful thinking, of course. In the end all the parts of the code need to contribute to the program. That's why OOP codebases often end up as a terrible mess: classes can't decide if they want to be isolated or involved, and the compromise is all this terrible implicit state. (Do you know the quote by Joe Armstrong about the banana, the monkey, and the jungle?)

Guard objects don't have to be on the stack. They work beautifully inside a context object that's registered with a global manager if you're in an async codebase. std::move is a glorious thing.

> starts to depend on the kind of resource to manage.

Exactly: simple, but not too simple. If the condition doesn't need an early return, or can't have one, you don't use one.

> quickly end up managing file descriptors, sockets, handles, GUI resources

In some of the cases you describe you wouldn't return early, depending on the need, but if you can, you should, to minimize work.

Some examples: with file descriptors, you'd return early if the file doesn't exist; with sockets, you'd return if the connection is dropped or compromised and you have performed cleanup; with handles, if null. GUI resources can be lots of things, but if you need to clean something up, you can clean up and then return. In C#, for instance, you might wrap some of these (files, sockets, streams, etc.) in a `using`. In most of your examples you'd already be in the meat of the early-return flow, and usually there is still a return at the very end, in/after the meat.

Returning early doesn't mean doing a dirty break or asserting out; you don't return early before cleaning up if you are inside some resource.

Single exit point is not compatible with exceptions which most GC languages use extensively. Resource management in such environments is done with finally/using/with.

Even C has auto cleanup with __cleanup__ attribute in both gcc and clang.

I frequently use `goto` in C for this reason. A very common pattern is that I acquire some resources, do something with them, and have a section at the end of the function where I release them in LIFO order. This is labeled so that failure to acquire any resource will jump to a label just before the resources that were acquired before it are freed.

Other languages have constructs that obviate this pattern; for example, Python has `with` blocks that acquire resources and free them upon leaving the block, and Go has a `defer` statement which adds code to a LIFO structure that always gets executed upon returning.
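A quick Python sketch of how a `with` block keeps early returns safe (the config-loading function and its behavior are made up for illustration):

    import json

    def load_config(path):
        with open(path) as f:          # closed no matter how we leave the block
            first = f.readline()
            if not first.strip():
                return None            # early return: f still gets closed
            f.seek(0)
            return json.load(f)

Both exit paths release the file descriptor, so the early return doesn't depend on remembering any cleanup.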

I have doubts about whether "real" projects generally have these kinds of concerns. But even if they do, you can clean them up in a `finally`, or your language's equivalent.

Most languages with try/catch/finally don't have checked exceptions, so handling resource management in those schemes is really brittle.

`finally`/RAII doesn’t depend on checked exceptions. It always runs (short of something that aborts the program without unwinding the call stack; and in those cases single-exit won’t save you).

Mixing finally and RAII into the same category doesn't really make sense here.

And I'm not saying that finally depends on checked exceptions; I'm saying that pattern is very brittle. You change lower code to throw a new exception, and you change the code above to catch it like you're supposed to, but now the middle code has no idea that there's this new exception and leaks resources. So you end up either having brittle code whose correctness depends on implicit choices of the code around it, or wrapping pretty much every function body in try-catch-finally-rethrow.

I find it a good rule of thumb to allocate and deallocate a resource at the same level in my design, and to maintain any associated handle at that level as well. I have encountered few situations where this didn’t work well, and almost all of them arose because an external module introduced some form of unavoidable asymmetry at its interface.

If you’re able to follow that pattern then safe handling of unusual control flows tends to follow naturally. You just adopt whatever mechanism your language provides for scheduling the code to clean up and release resources at the same time as you allocate those resources. Almost all mainstream languages at a higher level than C provide a suitable mechanism today: RAII, some variation of `with` or `using` block, try-finally, Lispy macros, Haskelly monad infrastructure, etc.

That way, the only places you need to write any extra logic are the layers where you allocate/deallocate and possibly lower layers that manage side effects on those resources in the unusual case that they require a specific form of recovery beyond the default behaviour in the event of aborting early.

Hopefully if you do have to work at low level like C or assembly language and you’re using forms of control flow that could bypass clean-up logic then you already have at least some coding standards established for how to manage resources safely as well.
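As a Python sketch of that "allocate and schedule cleanup at the same level" idea (the resource here is a stand-in, tracked via a log list):

    from contextlib import contextmanager

    @contextmanager
    def acquired(log):
        # Allocation and the scheduled cleanup live at the same level,
        # so any exit path (return, exception) releases the resource.
        log.append("open")
        try:
            yield "handle"
        finally:
            log.append("close")

Callers just use `with acquired(log) as h:` and never write cleanup logic themselves; the release runs even when the block is left via an exception.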

> middle code has no idea that there's this new exception and leaks resources

You misunderstand how `finally` works. The whole point is that it runs regardless of exception type. So the resulting code in the middle layer doesn’t need to know anything about the code it calls, and doesn’t leak resources, even when calling or called code changes: It’s not brittle.

> or you're wrapping pretty much all function bodies with try-catch-finally-rethrow.

No need for `catch` and rethrowing. And in the case of C# and modern Java, no need for the rest either: You only need to wrap resources that you allocate into `using` (C#) or `try` (Java; but not `try…finally`! [1]). Sure, it’s more verbose than RAII in other languages with automatic storage duration (like C++). But it effectively performs the same.

[1] https://docs.oracle.com/javase/tutorial/essential/exceptions...

I just had a brain fart.

`finally` doesn't depend on catching any particular errors. If control flow exits its block for any reason, then the `finally` block executes.

Wrong. You need to exit before resource allocations :-). And in general if you allocate resources in most functions you're doing it wrong. Need global resource managers. Stack variables should only very rarely be resource owners.

RAII works just as well for heap allocated objects as it does for stack allocated ones.

Sure, but allocating resources on the stack is not good for maintainability. Software is easier to understand if its state is not hidden in semi-permanent call stack frames.

In general I agree with you, but that seems orthogonal to the subject at hand.

Parent's statement was that early returns are hard because you need to clean up allocated resources. I don't think so, because there shouldn't (usually) be resources "on the stack".

But RAII is orthogonal to auto, static, or dynamic allocation.

Is this true? C# and Go both have a nice way to clean up disposables on early return. I'd imagine most languages are similar these days?

"Programmers with lots of hours of maintaining code eventually evolve to return early, sorting exit conditions at top and meat of the methods at the bottom."

I generally don't like it when one person declares how experienced people act, but in this case I have to confirm it for myself: I also "evolved" to it, and I tend to program in isolation, so I am not influenced much by others.

I've actually gone from avoiding else to requiring else, which is sort of a necessity when you're in a functional immutable environment where the result of a logical expression is a value (and often assigned as such), since it avoids undefined situations.

(This example comes from Elixir.)

value = if boolean, do: this_value, else: other_value

is undefined in some circumstances unless an "else" is included.

This pattern might seem weird to OOP folks (the assignment, not the ternary-operator-style logic), but in Elixir-land, it is not only considered superior to assigning inside if/then logic, you actually get a compiler warning if you assign/bind to a variable inside a logical expression because (again) that value becomes undefined as soon as you leave local scope (or re-acquires the value it had in the outer scope, thanks to immutability).

And if you think about it, values can only travel in one direction using this scheme, which further simplifies reasoning about information flows.

I would argue that this is also superior to early exit, as having one entry point and one exit point is much easier to reason about. The extra logical complexity you're ostensibly avoiding by early-exiting is actually still there; it's just obfuscated, and that complexity should be made obvious. (If it's extensive, it's a code smell: consider a function with 25 early exits that "looks flat" visually but actually has 25 different branches.)

That's just a ternary operator. I use that in Java all the time.

    value = boolean ? this_value : other_value;
If the branching logic gets too complicated, I usually move it into a function (private method) with a return in each branch.

And in Javascript, once do-expressions go from a proposal to a common part of the language, this will be a thing:

  let value = do {if (boolean) { this_value } else { other_value }}
...which is even closer to the Elixir style than JS's ternary operator (which is identical to Java's).

What's the advantage over JS's ternary operator?

I guess it depends on whether "if" is treated as an expression which returns a value, which I don't think is the case in JS, since JS has no implicit value return (unlike Ruby, where the final line in any scope is the value that scope returns; same in Elixir).

The same pattern should be usable for things like switch statements.

Exactly, I did all of the above. Spent too many hours fixing broken code (of my own) and finally I started to learn.

What about error handling? No mention here. What I do now is let the library/framework handle most of it. With a recent server library I wrote, I can throw anywhere and it will be caught and displayed as expected without extra try/catch.

Errors are the exception, not the rule :)

Can't tell if a logical point or a pun...

The feeling is mutual; not exclusively :)

Wow, you captured so many things I saw in my own education. let me add one:

Same way you learn to let some code fail instead of handling every exception.

In my case, half the time I don't know how to handle the exception usefully, and the other half of the time hard failure is the better option. But that's mostly for data analysis.
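In Python terms, "letting it fail" often just means not wrapping everything in try/except (a hypothetical data-analysis helper; both function names are made up):

    # Swallowing the error hides bugs:
    def parse_row_badly(row):
        try:
            return int(row["count"]) / float(row["total"])
        except Exception:
            return 0  # silently wrong on malformed input

    # Letting it fail surfaces the problem immediately, with a traceback
    # that points at the bad data:
    def parse_row(row):
        return int(row["count"]) / float(row["total"])

For one-off analysis, the hard failure in the second version is usually the better option.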

> Same way comments are extra weight that should only be in public or algorithm/need to know areas.

This! Comments should convey programmer's intention and explain the thought process, not the code itself. Also, too many comments usually indicate that code should be refactored.

I've been coding about 20 years now, and I've seen all of these changes in my own coding style.

> Same way you evolve out of one liners.

> Same way braces go on the end of the method/class name to reduce LOC.

For me, these are exact opposites; the reason to not use one-liners is the same as the reason to put braces in separate line: both improve readability.

> Same way braces go on the end of the method/class name to reduce LOC.

Screw that. I've been writing code for 15 years, and Allman style braces make it so much easier to mentally parse code into blocks that they're worth every single LOC. I can't speak for anyone else but I'm not working on an 80x24 terminal anymore.

Totally disagree with you.... it's funny though that I read LOC as "level of complexity" not "lines of code". I've been writing code for 30 years and I think it's jarring when the braces are on the next line, so much easier for me to parse that when it's on the same line. But everyone is entitled to their own opinion.

In last 20+ years I've changed the coding style many times, usually to fit the current team style. And in my experience it takes a few weeks for the brain to rewire to the new style. First you hate it, and then you get used to it, and then you're like: wow it's great. Then you switch the standards for a new project and again, the same steps. It's just a matter of habit...

Just give me an opinionated ${LANG}fmt utility to do it for me so I don't have to care and don't have to waste time arguing with other zealots. Let the tool be the zealot.

This is exactly how I feel, as someone who has been coding for 22 years (professionally, closer to 32 if you count when I started hacking in BASIC and 6510 assembly on the C64).

I just defer to the code formatter. That way everything is consistent looking internally, which is all I care about. I couldn't care less whether the braces are on the same or a different line or other things like that outside of the fact that I do like internal consistency within projects.

All that aside, as soon as I saw the OP title I knew (via years of exposure to bikeshedding programmers) that this thread would end up having a ton of comments on it and wasn't surprised to see it was up to 500-something and still rapidly climbing.

> Just give me an opinionated ${LANG}fmt utility to do it for me so I don't have to care and don't have to waste time arguing with other zealots. Let the tool be the zealot.

So totally this. I despise indenting with tabs with a passion, but when I heard gofmt did that I was actually pleased. There's no arguing with the official formatting utility: https://golang.org/doc/effective_go.html#formatting

Pragmatic programming has lost. If you aren't a zealot about being pedantic, just move into the old-folks home.

Not at all, I'd argue this whole Tabs/Spaces, bracket placement, single/double quote discussion should be getting killed by code formatters.

At the end of the day you can do whatever you want with the code you're writing, and some pre-commit hook can run go fmt/prettier/whatever and then it'll all get standardized.

I've become a very heavy proponent of code formatters (I added prettier to the entire team) and I truly believe that talk about code formatting will never really die off, but at the end of the day you can do whatever you want on your side but still have a standardized looking codebase, and that feels extremely liberating to me.

As a PS: If you are a diehard tab proponent/different bracket placement/whatever but your team formatter configuration uses spaces instead then you can also use code formatters to do spaces -> tabs locally, say, on file open. Once you commit your file your precommit falls back to the team configuration, which removes your tabs. Everyone is happy.

I think the person you are responding to was being sarcastic...

I'd say it's the opposite. Code formatters are winning. It's becoming increasingly trite to bikeshed over formatting when projects are using code formatters.

As Andy Lester recently tweeted (but said it wasn't his): Should "bikeshedding" be hyphenated?

I should see if he remembers where he got it.

I don't know. If I remembered I'd have attributed it. I noted that it wasn't mine as the next-best-thing.

Well, I'm a pedant about being zealous, and I am afraid we are now doomed to war eternally.

I enjoy zealous pedantry. I have strong personal opinions about something that I do a lot (especially the pedantic parts), but most of it is relatively lighthearted.

I always adapt to the style used in the project I land. I can understand people liking one over another but to reach holy wars levels of disagreement about this kind of subject is something I don't get.

Same goes to spaces vs braces, editor wars, etc.

Downside: my projects are usually a bit of a mess with mixed indentation styles :D

I've also been coding 30 years. I have to use same line at work, and I use next line in my (very large) side projects. Honestly, I don't know what the difference is.

> (very large) side projects

How big once you remove all the extra newlines?

LOL. This is the part of the Silicon Valley episode-bar-scene where the fight breaks out.

Thank god OC didn't mention "Just like you move from tabs to spaces."

I never understood this part of the show. I'd think that the genius inventor of a breakthrough compression algorithm whose WHOLE POINT was high quality lossless compression would prefer spaces over tabs. After all, the compression algorithm can deal with it... without the issues that result from embedding hard tabs in a source file.

Well, he shouldn't have, considering it's the other way around.. :-)

I've actually evolved a hybrid style in my personal projects - next line for functions/classes, same line for blocks.

To me, it looks weird when functions don't have that extra line of space to set off their definition, but if/while/whatever aren't special enough to need that call out.

That said, the most important factor is simple consistency. In my personal projects, I have the hybrid style. At work, I use same-line-only. If I'm on a project that has next-line, we all use next-line.

I think the Linux kernel style is similar. That was the first coding standard I somewhat adopted, but naturally have adapted to many others on different teams over time.

I think this style dates back to K&R C where you (maybe) declared argument types after the parens.

  int max(a, b)
  double a, b;
  {
      return a > b ? a : b;
  }

Python fans are going like "what are braces?"

The irony is that Python actually has open-braces, but they're spelled ":" instead of "{". And the syntax effectively enforces K&R style.

When I write Python I end every block with a "pass" statement so that emacs can auto-indent my code properly. The "pass" statement thus effectively becomes a close-brace. It drives Pythonistas into conniptions, but I never have to worry about reverse-engineering a block of code to figure out how to restore the proper block boundaries after a cut-and-paste has screwed up the indentation.
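For illustration, that style looks roughly like this in ordinary Python (the trailing `pass` is a no-op that marks where each block ends, so an editor can re-derive the indentation):

    def classify(n):
        if n < 0:
            result = "negative"
            pass          # acts as a close-brace for the if-block
        else:
            result = "non-negative"
            pass
        return result

The `pass` statements change nothing about what the function does; they only make block boundaries explicit.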

That's the bad part of not having braces.

If boundaries exist they need to be clear. One shouldn't have to count the tabs that make up the level of indentation.

It's already a challenge reading code. Counting invisible tabs makes it even worse.

Well, you don't have to literally count them, you use the spatial reasoning hardware built in to your retina and visual cortex to just see that this bit of code is further over to the right than that bit of code. But of course that only works for pieces of code that are small enough to fit in your field of view.

All code pieces should be small enough to fit in your field of view :)

Try to tell that to my current client :-(

On the other hand, high counts of invisible tabs may be a hint that the code needs refactoring.

Enforces K&R style? I don't know what you mean...

    def foo(bar)\
        if (bar > 5)\
            return bar + 1;
            return bar - 1;

I'm not sure whether to laugh or cry

Python-specific IDEs will smartly indent code upon copy and paste.

Sure, but if you somehow manage to screw up the indentation then no IDE can possibly restore it for you because the information needed to do it is lost. And there are a LOT of ways that indentation can get screwed up. If your code ever leaves the Python ecosystem (e.g. gets published on the web) then all bets are off because the entire digital world outside of the Python-sphere is built on the assumption that whitespace is fungible.

I don't know about that. I've never screwed up indentation so badly that I couldn't correct it simply by selecting everything I just pasted and hitting tab a couple of times in my IDE (PyCharm).

Similarly, copying code from StackOverflow, Github and random blogs have all worked without any issues.

So it can't be as easy as you imply.

What can I say? It happened to me a lot before I started using the pass trick.

Nope. Just try inserting code formatted with 8 spaces into code formatted with 4, or 2. The IDE will make something that compiles; that doesn't mean it works properly.

Just tried that: typing like mad in WordPad, with 12 spaces of indentation, then copying into PyCharm, and it worked.

Mixed spaces and tabs as a next go, and it worked too, converting the pasted tabs into spaces of my style.

Smart-tabbing is pretty smart.

Granted, I knew that so long as the indents were proper (i.e. it compiles), pasting at that point would give valid code, but... I can't see how using braces would have been different. Just because it compiles doesn't mean it works. I've pasted JavaScript incorrectly multiple times, producing valid but incorrect-for-my-needs code.

How does an IDE know if a given line belongs in an if block or not?

It'll indent to the most relevant valid block possible, given where your cursor is at the moment. If you click within an if block, it'll indent into that if block.

The indentation strategy also looks to produce syntactically valid code, so long as what you are pasting lacked indentation errors.

That is clever. Ugly, yes. But clever.

That's fine -- the Python folks have their own share of religious wars (starting with: tabs, or spaces?) :-)

Do they? All the Python folks I've ever seen express an opinion on style have said "follow PEP 8".

Unsurprisingly, PEP 8 does have a rule for tabs vs. spaces: https://www.python.org/dev/peps/pep-0008/#tabs-or-spaces (spaces, of course)

I love Python but as a long time C programmer I can't understand the preference for spaces in PEP 8.

Tabs are semantic and only take one key press for movement back and forth and to delete.

If you see one tab, you know it means one indentation level. With spaces you have to think.

Plus with spaces you are stuck with 2/4/8 spacing(unless you reformat), with tabs you can configure your editor to your preferences.

> with tabs you can configure your editor to your preferences.

Configure 1 tab to be 4 spaces.

Sometimes it's easier to go with the flow. I like tabs for the reasons you mentioned, but fixed-width spaces are a bit better for some reasons too. IDE's can do the heavy lifting of re-formatting indentation levels and converting tabs to spaces for me, and it means if I cat a file on a remote server regardless of the bash tab width settings or if I'm in your code or the stdlib it will all make sense.

> with tabs you can configure your editor to your preferences

Except that you can't because people using tabs will invariably start mixing tabs and spaces because they can't separate indentation from layout. So the code will be messed up unless you configure your editor to someone else's preferences. Also, I'm sure there is some obscure git setting to de-uglify tab users' diffs, but I'd rather not find out.

Tabs are semantic. But when you start down that route, why not dictate the record-separator character as a replacement for closing braces/blocks, etc.?

The PEP 8 myth is quite funny. Almost all the people I've met who repeat "follow PEP 8" like a mantra have never read it. The main idea behind PEP 8 is: be consistent... yet even the Python standard library is not consistent. And it looks like no one cares. There was a great moment to make it nice and consistent: creating Python 3, where many incompatible changes were introduced. Instead, the mess remains as it was, and people repeat the mantra "follow PEP 8".

Before you say that I'm crazy... please, go and read pep8.

> please, go and read pep8

OK. What am I looking for?

As a data point, I don't follow (all of) PEP-8.

But tabs vs. spaces is indeed a bad analogy, because of course it's 4 spaces. ;-P

pycodestyle, the Python style checker, has a few checks disabled by default (because there's no consensus they're good ideas), and even some mutually exclusive ones.


As a C++ programmer I can assure you, I know plenty of individuals with very strong opinions on tabs vs spaces debate.

And the spaces faction has sub-factions with opinions on 2 vs. 4 spaces.

The correct answer is whatever the project is already using, unless it is new and then you luckily get to choose.

These religious wars can largely be swept away with one simple, practical question: given that nowadays everyone uses a different editor with different configuration defaults, what is going to result in less noise in pull requests and the commit history, and less time in pull requests squabbling over these kinds of formatting issues?

Tabs vs. Spaces has a fairly good answer that flows out of this way of looking at things. There is a counter-argument to that answer, but it is invalidated if you move toward conventions for argument lists and chained method calls that are also designed to limit noise in the source control system.

IMO, the argument about bracing also has a practical, non-religious answer once you start looking at your coding conventions this way.

Tabs, everyone can configure them however they want. :-)

I hate being able to configure stuff, I want other people to boss me around from anywhere in the world! Their opinion matters on my computer!

The war is over. Py3 doesn't allow mixed spaces and tabs in the same file. You choose your side and never get to skirmish with the other one.


    >>> from __future__ import braces
    SyntaxError: not a chance

But biased fans of both would argue that opening and closing a block at the same indent level (Allman style) is closer to Python.

Tell them braces are visual indicators of blocks of logic, enabling reasoning more easily on a multi-statement level.

Lambdas shouldn't be more than one line anyway.

I remember in my programming classes at school, my teacher was always like: Guys! There is space on the right! Use it!

PS: he worked on Hercules graphics cards back in the day, before he got into teaching.

What is "Hercules the graphic cards" (I only found GPU's when trying Google)

The Hercules Graphics Card (HGC) was essentially an MDA-compatible monochrome card with an added all-points-addressable graphics mode.


I've done a number of informal tests on friends and family over the years regarding brace placement, and it's always been the same: for someone with NO experience programming (i.e. looking at what to them seems like a bunch of gobbledygook, and what's a "text editor"?), adjacent braces make it appear more readable than same-line braces.

It's only people who cut their teeth on adjacent bracing that find it more readable.

Why optimize for the lowest common denominator?

I've seen both, and the reduction in vertical whitespace matters more to me than the readability to an untrained individual.

I'll be flipping the point around by saying that if you are concerned about vertical whitespace reduction, you should probably look at the granularity of your files (debatable based on programming environment). Also we could be catering to the lowest common denominator by worrying about people with untrained symmetric sensibility.

>Why optimize for the lowest common denominator?

You seem to be erroneously equating a coincidence with a preference.

Tip: Tilt your wide screen so you get more vertical real estate.

'I've done a number of informal tests on friends and family'

You should have a pretty cool family!

The level of complexity obviously doesn't change with a line break.

I agree. I'm not looking to increase code density. I leave empty lines between logical sections within functions to allow the code "breathe". It helps me parse my own code.

I've used Python enough that braces are completely meaningless to me in languages with them (i.e., everyone else). Indentation is my mental delineation of code blocks. I just have to write these {} things all over the place because every other language is a grammatically bloated mess.

The advent of automatic code formatting (and Python pioneered this in many ways with pep8, but even huge C++ projects are realizing the value in clang-format, clazy, etc) put the nail in the coffin of arguments against whitespace blocking. You can have an enforced uniform style throughout a project now, with no ambiguity, and in such contexts using whitespace as a block delimiter also has no ambiguity. It just takes advantage of formatting already being done to avoid redundant glyph use.
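A toy illustration of that point (not a real formatter; tools like clang-format and black do this with far more care): round-tripping Python source through the standard library's own parser already erases most stylistic variation.

```python
import ast

# Hypothetical messy input: inconsistent spacing, two statements on one line.
messy = "x=1;  y = [1 ,2,   3]\nif x:\n    y.append( 4 )"

# Parsing and re-emitting the code produces one canonical surface form,
# which is the same idea a project-wide enforced formatter applies.
print(ast.unparse(ast.parse(messy)))
```

Once every diff goes through a step like this, brace and whitespace debates stop showing up in code review at all.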

Let's see. The Unix indent utility was what, mid-1980s? GNU indent came out in 1989. Python was released in 1991. PEP-8 was released in 2001.

Yup, Python clearly pioneered automatic code formatting. Nobody had done it before them...

Lisp pretty printing was available in February 1973 (http://www.softwarepreservation.org/projects/LISP/MIT/AIM-27...).

That document states Bill Gosper wrote one of the first pretty printers, but doesn’t give a date for it.

I would think it is much older, as writing a simple lisp pretty-printer is easy and reading lisp without autoformatting is “less than ideal”.

> I just have to write these {} things all over the place because every other language is a grammatically bloated mess.

In what way are context-free languages "a grammatically bloated mess?" Whitespace delimited languages like Python have context-sensitive grammars. Now that is a mess.

Most bracket languages aren't strictly context-free either (the dangling else problem [1] alone somewhat guarantees that most programming languages are not strict, unambiguous CFGs in a mathematical sense). There's no right/wrong/superiority in context-free versus context-sensitive; they are tools in the toolbox, not "levels in a superiority hierarchy".

[1] https://en.wikipedia.org/wiki/Dangling_else

> There's no right/wrong/superiority in context-free versus context-sensitive

Dismissing the whole field of formal language theory with that statement is so bogus, it "is not even wrong." There are tools and techniques for parsing context-free languages that are impossible with context-sensitive languages. Parser generators, structured editing, metacompilers, composable grammars; those are all impossible with context-sensitive languages. Context-free languages are faster to parse, easier to write compilers/interpreters for, and much easier to write tooling for (editors, linters, etc).

Ambiguous context-free grammars are a very different thing from context-sensitive grammars. You can't point to the former and say "so whitespace sensitivity is ok."

I'm not dismissing anything. I'm pointing out the boundaries are fuzzier than hard cut-offs and it's weird to assign moral superiority to one side of the grass or the other. Most parser generators/composable grammar tools, etc, have tools for dealing with context sensitive sections of a language, just as they use regular expressions for tokenizing and allow regular language sub-languages.

I can point to ambiguous context-free grammars and say whitespace sensitivity is ok, because the tools to fix the one are often the same to fix the other. (Ambiguous context-free grammars do turn out to make context sensitive languages, because it's a blurry line in the grass between them.)

Whitespace sensitivity is one of the easier context sensitivity challenges to embed as a sub-language in an otherwise (mostly) context-free language, because you can represent it entirely as pseudo tokens from a simply modified lexing phase in a traditional CFG parser. Python is an easy and clear proof, as that is exactly what it did (Python is also not purely a CFG after whitespace tokenization because of also how it handles the dangling else, but we've already mentioned those weeds).

Even if you don't find the boundaries between the classes of languages fuzzy (and Parser Combinators and PEG grammars have a lot to say here), context-sensitive languages are a tool in a growing toolbelt, not a "complex" evil to be demonized.

> it's weird to assign moral superiority to one side of the grass or the other

Algorithmic complexity is not "moral superiority." The whole point of formal language theory is to show how the different languages differ in terms of how difficult they are to work with.

> just as they use regular expressions for tokenizing and allow regular language sub-languages

That is because regular languages are a subset of context-free languages. There is nothing special to "allow" there.

> I can point to ambiguous context-free grammars and say whitespace sensitivity is ok, because the tools to fix the one are often the same to fix the other.

No they are not. Precedence rules and extra lookahead will help with ambiguous context-free grammars but not with context-sensitive ones. There is also a whole class of algorithms, such as GLR, specialized to handle ambiguous CFGs efficiently.

> Whitespace sensitivity is one of the easier context sensitivity challenges to embed as a sub-language in an otherwise (mostly) context-free language, because you can represent it entirely as pseudo tokens from a simply modified lexing phase in a traditional CFG parser.

That statement is nonsense. There is no way to "embed" a context-sensitive language in a CFG. What you are saying is that it is "easy" to make a whole context-sensitive parser just to produce a context-free language so you can pass that to another CFG parser. That is not "easy," that is bolting crap onto other crap.

> Algorithmic complexity is not "moral superiority." The whole point of formal language theory is to show how the different languages differ in terms of how difficult they are to work with.

Chomsky's hierarchy was designed around the production rules that generate categories of languages. That it doubles as a useful rule of thumb for algorithmic complexity in parsing is a fascinating dualism in mathematics. That those same rules of thumb reflect basic automaton abstractions is even more fascinating. This is also where the clearest analogy lies for why I find your appeals to "complexity" so unpersuasive. It sounds to me like a strange steampunk form of the "640k is all anyone will ever need" fallacy: "why use Turing machines when pushdown automata will do just fine?"

Yes, there are a lot of great tools for working with CFGs; just as we've pushed Regular Expressions far past the boundaries of formal Regular Languages, we push these same tools past the boundaries of formal CFGs. We aren't actually constrained by the limits of only using Deterministic Finite Automata or Pushdown Automata in our programming; we have Turing-complete machines at our disposal, completely and easily capable of tracking contexts/states.

The "context" in a whitespace-sensitive language is "how many spaces have you seen recently?". This is not a hard question, this is not some mystical "context sensitivity" that needs an entire "context-sensitive language parser". It's a handful of counters. That is the very definition of easy, that's the built-in basics of your average Church-Turing machine, keep a number on the tape and update it as necessary.

Python's specification for the language defines the language in a CFG BNF just as the majority of brackets languages. The one concession to its whitespace sensitivity is that it expects from its lexer INDENT and DEDENT tokens. Those tokens are added simply by counting whitespace between lines, seeing if there is a < or > relationship. After those tokens are in the stream (along with the rest of the tokens defined in their associated (Mostly) Regular Languages) it is parsed by whatever CFG automaton/algorithm the Python team wants (LR, GLR, LALR, PEG, etc). That's not "bolting crap onto other crap" by most stretches of the imagination, that's using a pretty simple stack of tools that any programmer should be able to (re-)build, and not one at all more complicated than (and in fact built entirely inside) the classic tokenizer/lexer feeding a parser stack.
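That INDENT/DEDENT trick is small enough to sketch. The toy lexer below (a simplification: it ignores tabs, continuation lines, and the mismatched-dedent errors a real lexer must raise) keeps a stack of indentation widths and emits pseudo tokens that a plain CFG parser can consume downstream.

```python
def indent_tokens(lines):
    """Emit INDENT/DEDENT pseudo tokens by counting leading spaces."""
    stack = [0]          # indentation widths currently open
    tokens = []
    for line in lines:
        if not line.strip():          # blank lines don't change nesting
            continue
        width = len(line) - len(line.lstrip(" "))
        if width > stack[-1]:         # deeper: open one block
            stack.append(width)
            tokens.append("INDENT")
        while width < stack[-1]:      # shallower: close blocks until we match
            stack.pop()
            tokens.append("DEDENT")
        tokens.append(("LINE", line.strip()))
    while len(stack) > 1:             # close anything still open at EOF
        stack.pop()
        tokens.append("DEDENT")
    return tokens

print(indent_tokens(["if x:", "    y = 1", "z = 2"]))
```

The "context" really is just a handful of counters on a stack; everything after this stage is an ordinary token stream.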

> This is also where it seems the clearest analogy lies to why I find your arguments to "complexity" so useless.

Honestly, from everything you have posted so far, I really get the impression that you have never worked on programming language parsing, or with large code bases. The difference in algorithmic complexity is not "useless," it is the difference between being able to compile millions of lines of code in a few seconds on modest hardware, and needing a cluster just to get your builds done:


Not saying algorithmic complexity is useless, only that it is a starting point in a discussion, not the end of the discussion. I've repeatedly used the analogy "tools in the toolbelt" as my return point in the discussion, so maybe you are missing plenty of my actual words in your "impression" that swings wildly toward an ad hominem attack.

If we want to fight anecdote for anecdote, I've certainly seen millions of lines of Python analyzed ("compiled") in seconds on modest hardware, rather than a cluster. The differences there have more to do with static versus dynamic typing, and ahead-of-time compilation versus interpretation. Maybe a better anecdote is millions of lines of Haskell? Seen that too. Whitespace-sensitive languages scale just fine.

I can understand it if you want to admit that you just don't like whitespace-sensitive languages for personal reasons, but there aren't very good technical reasons and you seem unhappy with my explanations of why that is the case. I'm not sure how much more I can try to explain it, and given you seem close to resorting to personal attacks I'm afraid this is likely where the conversation ends.

I agree. This is especially good when you have long method names and the method declaration might need more than one line. Having the opening brace on a new line clearly separates where the function/method declaration ends and the body begins.

As someone who started out with C# and learned Java later, let me tell you: This is nonsense. You can get used to both easily, but only one of them saves space.

The One True Brace Style:


The name tells you that it won.

That link's favorite response actually refutes that brace style matters at all. No correlation with bug frequency is detectable. So it's all religion.

I wish this sentiment were expressed more often, but also expressed less judgmentally and given more depth than it sometimes is.

To say it is religion isn't to say "your preference is dumb". In my own code preferences I try to maintain a split between "have a reason I have confidence in" and "personal preference". And "easier to read" is almost always in the latter. (If you can say WHY, that is the actual reason, but your actual reason still has to be provable.)

Given the difficulty of finding good research and the inherent difficulty of the research (if familiarity is a big component to preference, and a proven ability to mentally parse code required to judge, then good luck running a control group) I have a lot of items I have confidence in being provable without having actual proof, but the lengthy preference list also means I'm willing to accept that each of those can switch columns.

Way too many of our "easier to read" defenses are really "it is easier to read because I'm familiar with it." I personally think camelCase is terrible - language has spaces for reasons! - but there is no denying that it is very common and that the vast majority of coders learned it first, an unfortunate self-perpetuating cycle.

Nonetheless, shown research that said I was wrong (assuming said research had taken familiarity into account) I would change my stance rather than dig in my heels...even if that change were to only add "but that's just me" to the end of it.

It's not "religion" if there's a technical reason to choose one method over another.

A later comment on the SO post described how K&R can have maintenance costs because when moving things around it's more difficult to tell where a statement ends, and can accidentally cause side effects.

Not reflected in the bug statistics, according to the post. So if the effect exists, it's negligible.

>So if the effect exists, it's negligible.

The only thing being considered was "bugs in the final code", not "time and effort in maintenance."

I'm with you on Allman style braces, prefer it and find them much easier to read.

As everyone else seems to use K&R I've just had to adapt over time :-(

What's interesting to me is, anecdotally, the people I know who are the biggest K&R haters are people who started with K&R, then had to switch to Allman for a significant length of time, and then had to switch back to K&R.

That probably tells you more about the haters than the brace style. That was my career path, and while I've gotten used to reading Allman braces, I've heavily relied on tooling (especially Visual Studio) to auto-format my K&R (or 1TBS) braces for whatever project I'm in. Same thing with tabs/spaces.

There are lots of sane ways to read code, but consistency is key. It's hard to change muscle memory in how I type, though, so I think auto format is the way to go.

A pro maintains consistency though so that is good.

I like One True Brace Style (1TBS) with an uncuddled 'else': else/else-if goes on its own line, with the closing brace on the line before it (similar to Stroustrup-style K&R, with nothing else on the brace line: one thing per line, no bracket-less statements). I set up the standard that way if I am designing it, but use Allman or whatever variant the codebase already follows.

I’m with you on this, and all that other stuff you said. But really, on the topic of braces, brackets and parentheses, any decent editor will highlight the matching brace for you.

I don't work on an 80/24 terminal either, but I find that if there's too much text on the screen at a time, my eyes glaze over and don't read it all. So I work with larger fonts and use a terser style with a 40-line rule for functions that aren't switch statements.

I'm often coding on a server that has 4-6 tmux windows along with another terminal window for my local machine. I rarely have only one wall of text open, so I effectively work on mini-terminals often.

Yeah, I agree with the list above except for that one. It always bugged me that idiomatic Javascript was like this. I guess Javascript just bugs me in general, though, too.

I see it the same way. It offers symmetry. The method name can be as long or as short as needed. At least the start and end of the code block are easily visible, since the indentation level remains the same. Gives a sense of Python readability.

Huh? My emacs auto-indents both styles in exactly the same way:

    void foo() {
    void foo()

The start of the code block is signified by the opening brace, which starts at the end of the method name in the first case, hence breaking symmetry.

To my eye, the start of the code block is signified by the indentation, i.e.:

So I read C and Python (and Lisp) code the same way. A naked open brace looks jarring and ugly to me. It also increases the separation of other related parts of the code, i.e.

    if (...) {
      while(...) {
        do(...) {

    if (...)
    {
      while(...)
      {
        do(...)
        {

The latter seems unnecessarily wasteful to me.

K&R isn't that terrible when code is neat and clean, but it starts to have trouble IMO particularly when declarations or conditions wrap to multiple lines. Compare the following examples... I personally have to stop and read the code to find the blocks with K&R braces, vs being able to see them at a glance with Allman.

    void MyLongMethodName(SomeLongParamType param1, SomeOtherLongParamType param2,
        YetAnotherLongParamType param3) {
        if (longContrivedVariableName1 == longContrivedVariableName2 && 
            longContrivedVariableName1 != longContrivedVariableName3) {
            // do stuff

    void MyLongMethodName(SomeLongParamType param1, SomeOtherLongParamType param2,
        YetAnotherLongParamType param3)
    {
        if (longContrivedVariableName1 == longContrivedVariableName2 &&
            longContrivedVariableName1 != longContrivedVariableName3)
        {
            // do stuff

Emacs autoindent to the rescue:

    void MyLongMethodName(SomeLongParamType param1, SomeOtherLongParamType param2,
                          YetAnotherLongParamType param3) {
      if (longContrivedVariableName1 == longContrivedVariableName2 &&
          longContrivedVariableName1 != longContrivedVariableName3) {
        // do stuff

IME the more complex your code formatting, the less likely people are to do it, especially after your code has been touched by dozens of different people with a dozen different editors.

This is why emacs is the One True Editor ;-)

I treat the parentheses the same way I treat the braces:

    void MyLongMethodName(
        SomeLongParamType param1,
        SomeOtherLongParamType param2,
        YetAnotherLongParamType param3
    ) {
        if (
            longContrivedVariableName1 == longContrivedVariableName2 &&
            longContrivedVariableName1 != longContrivedVariableName3
        ) {
            // do stuff

I also make liberal use of temporary variables:

    void MyLongMethodName(
        SomeLongParamType param1,
        SomeOtherLongParamType param2,
        YetAnotherLongParamType param3
    ) {
        auto condition1 = longContrivedVariableName1 == longContrivedVariableName2;
        auto condition2 = longContrivedVariableName1 != longContrivedVariableName3;
        if (condition1 && condition2) {
            // do stuff

Vertical space is allocated to the actual parameters and conditions, and the extra syntax-only lines exist only to separate chunks. This works well with "paragraphs":

    void MyLongMethodName(
        SomeLongParamType param1,
        SomeOtherLongParamType param2,
        YetAnotherLongParamType param3
    ) {
        auto condition1 = longContrivedVariableName1 == longContrivedVariableName2;
        auto condition2 = longContrivedVariableName1 != longContrivedVariableName3;
        if (condition1 && condition2) {
            // do stuff

        auto someData = buildSomeData();
        while (someCondition) {
            // loop

        return someData;

Great for you. I guess you never have to share your code with other people, especially those who aren't using emacs.

I agree with you 100%. It's all about being able to skim through code to find what you want and understand structure quickly.

Allman is great for this. Though I'm a Whitesmiths guy, myself. Still, same idea.

/36 years of commercial coding here.

> I'm not working on an 80x24 terminal anymore.

Even with 80x24 (don't ask), I fully agree with Allman style braces.

Ok. I've been writing code for 25+ years, and disagree.

For a vi user, visually selecting a block to its matching end for cutting/pasting is the coding equivalent of Photoshop.

v%y works well enough, seems like.

NoooOoooooOOOooooOOOooo! I hate Allman braces! My hatred burns with the fire of a thousand suns!

I agree! The secret to code readability isn't squeezing out all whitespace, vertical or horizontal.

> Programmers with lots of hours of maintaining code eventually evolve to return early

Indeed, I've evolved to it myself, but always wondered if it was the right thing to do. Now I even have a name for it.

>programmers with lots of hours of maintaining code eventually evolve to return early, sorting exit conditions at top and meat of the methods at the bottom.

That's not true.

Sometimes, the code begs for an early return so you can focus on the meat of the method. Most times, that's not the case. A single return point should be the default, and you should have a good reason to deviate from it. In Java, for example, if you also stick a 'final' on the uninitialized variable, you have a very nice compiler check enforcing that every code path initializes the variable once and only once. Sometimes that is also too strict for what you want to do, so you can break the rule, but you have to have a good reason.

>Same way braces go on the end of the method/class name to reduce LOC.

No. Just no. Not for that reason.

>Same way you move on from single condition bracket-less ifs. (debatable but more merge friendly and OP hasn't yet)

It's only debatable by people who are just used to it, or who want to minimize LOC for insane reasons. It's non-debatable in that it is a very common vector for bugs to sneak in. Over the last 5 years, bracket-less ifs were probably responsible for as many bugs in our codebase as any other stylistic choice.

>Same way you move on from heavy OO to dicts/lists.

You start introducing another common vector of bugs for a little flexibility. Sometimes that may be worth it. As a general rule, it's not worth it.

Can you give some examples of replacing OO with dicts and lists? I've made classes or structs to make it easier to manage what a few associated lists/dicts were representing, but I'm not sure I've done the opposite. Maybe using lambdas instead of subclasses?
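For concreteness, a hypothetical sketch (invented names, not from any codebase mentioned here) of the kind of replacement being asked about: a small class collapsed into a plain dict plus a free function.

```python
# Class version: data and behavior bundled together.
class Employee:
    def __init__(self, name, salary):
        self.name = name
        self.salary = salary

    def give_raise(self, amount):
        self.salary += amount

# Dict version: the data is just data, and behavior is a free function.
# Serialization, diffing, and ad-hoc queries become trivial, at the cost
# of the type checking and encapsulation the class gave you.
def give_raise(employee, amount):
    return {**employee, "salary": employee["salary"] + amount}

alice = {"name": "Alice", "salary": 50000}
alice = give_raise(alice, 5000)   # plain dict in, new plain dict out
```

Note the dict version also gets immutability for free: `give_raise` returns a new dict instead of mutating in place.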

After twenty-five years doing this, the only thing I would say is that the style points you mention above are completely irrelevant, while the points that refer to actually writing code are mostly correct. Programmers who have been programming a long time and are skilled should know the difference between the two kinds of points and not treat them as equally important, as you do.

Also, not writing comments because the code seems obvious when you're first developing it is totally a rookie mistake. Codifying it in rules is a sign that the programmer simply doesn't understand how the human mind works over long periods of time. No code is self-documenting after six months away from it. None.

> Programmers with lots of hours of maintaining code eventually evolve to return early, sorting exit conditions at top and meat of the methods at the bottom.

Except that I see soooooo many people try to write state machine code with early return/function encapsulation just to avoid indentation.

And then they get pounded when the state machine needs to evolve--and it always needs to evolve.

I really wish programmers had to design a VLSI chip, get it manufactured and debug it just once before they get out of school. The "So THAT'S why they have a global clock that synchronizes all the parallel operations in lock step" moment tends to be blinding.

My experience, on the contrary, has left me thinking that none of these are absolute. You apply certain constructs and styles where they make sense—which depends on what you're writing and who you're writing it with—and become increasingly suspicious that people who appeal to absolutes simply haven't yet encountered situations where the trade-offs ultimately balanced out in favor.

I find myself doing the same for many of these points. I wonder, then, why OOP is hit so hard in CS majors if at the end of the day production code isn't written like that. I guess my question is really about the validity of seemingly esoteric concepts when practical programming evolves toward the list in the post by drawkbox.

> Same way comments are extra weight that should only be in public or algorithm/need to know areas.

Unless you are writing completely brilliant code your future self will hate you if you skimp on comments. Also self-documenting code is great, but intentions are not always clear to another person trying to figure out your code.

Only "brilliant" code ends up needing comments.

Plain code organized into understandable methods (usually no more than half a page of code), with good naming for variables & method names reduces the need for comments.

It's also easier to scan/read code if there's a minimum of comments in the way.

There is a ton of code that only exists because of issues elsewhere in the code. This is the opposite of brilliant code: these are the dirty patchworks, the hacks glueing the whole thing together. Yet often these hacks are necessary, at least until some bug is fixed elsewhere.

Clear, self-documenting code is great but it can't capture that holistic insight into what the whole program is doing. It can't capture context, because the whole point of clean refactored code is to be as context-free as possible.

Completely agreed. Put another way: the sentiment that code documents itself is only possible in purely logical systems with no edge cases or surprising consequences.

As an example, our code base has a comment which refers to our internal issue tracker which itself refers to this HN comment: https://news.ycombinator.com/item?id=9048947

It's also only possible if the code is thought of as a text and is proofread/edited rigorously for reading. My process is: 1) write the code quickly, adding comments; 2) revise for clarity and efficiency (usually modularity), removing comments as confidence allows. This is where I get the variable and method names right for self-documenting code. Usually it takes several passes. But most often, "good enough is good enough" and I leave comments in. Just in case.

> which itself refers to this HN comment

I was secretly hoping you actually meant your own comment, creating a recursive loop. But a surprise John Nagle is even better

Also, any non-trivial software project is going to end up needing workarounds for bugs in other software. And you really need to document those. Because otherwise someone is going to remove that second refresh() statement, not knowing that this works around a bug in OS version xyz.

Basically, "IE8 fix"-comments

> There is a ton of code that only exists because of issues elsewhere in the code.

Or sometimes, not even that: issues elsewhere in the system, possibly completely outside of your control (e.g. the language/framework/library you're using, the OS, business requirements, etc).

> Only "brilliant" code ends up needing comments.

Code shows what is being done. Comments should explain why it's being done.

Code shows how it is being done. Comments should explain what is being done.

In some cases it seems that the name of a function takes care of the "what".

That's a really nice approach. Thanks for sharing, I might borrow it sometime!

> Only "brilliant" code ends up needing comments.

Often you have horrible complexity imposed on you from outside -- business rules you're implementing, etc, which are _not simple_. You can write code that encapsulates a lot of that, and even refactor it so you can see _what_ the code is doing pretty easily ... but it's often _very_ valuable to document in a code comment (docstring, JSdoc, etc) WHY it's like that.

Arguably, the comment is not to explain the code, in this case, but rather to explain the twisted bureaucratic logic you're having to implement, so maybe that still counts as "brilliant" code. ;)

Sometimes things should be commented. Like, for example, every time I have to do pointer arithmetic, I make a comment explaining why, because pointer arithmetic is opaque and error prone.

I agree with this. When dealing with a modern programming language, it's just easier to name your functions and variables in such as way that they're readable. It makes sense to document public functions using the language's documentation syntax, if you're writing an API.

I look back at a lot of my old code and even without comments, it's easy to follow the logic because I used sensible names and constructs.

I have the exact opposite experience. If you find yourself annotating your variables and methods with extra adjectives and nouns, your code isn't simple enough. Those explanations go in code comments.

Often it means you can refactor so that there is only one of any "thing", and there is no need to clarify which version or role a "fooBarThing" is serving to disambiguate it from some other "bazBarThing".

Functional composition and well structured data is the key to this. It's basically halfway to point-free style.

The variable name can just be fooBar, and bazBar, and organized/namespaced such that you could have Thing.fooBar and Thing.bazBar (assuming you want those exposed to the rest of the application).

Readable variable names don't have to mean wordy names. I would avoid using more than 2 words in a variable name.

Good, clear code, with nice names and such shows what the code is doing and how it's doing it. It doesn't tell you anything about why it's being done, or the context of the function within the code.

You're lucky if you're dealing with code that's either small enough or simple enough not to need the added context...or if you only need to read your old code.

I find the better my types are, the less I need to explain. Canonical morphisms generally don't need comments.

Sometimes requirements or quirks of a third-party library need a comment.

For glue/app/ux code that's mostly autocompleted, sure. But for domain-specific, algorithmic, magic-formula type stuff, comments are sanity made manifest.

For an alternate perspective you might find persuasive, I recommend you read through the following slide deck: http://www.sqlite.org/talks/wroclaw-20090310.pdf. (Relevant part starts on slide 80, but I recommend you start from the beginning for full context.)

Comments are not documentation.

Documentation explains HOW to use a thing. Good comments explain WHY a thing is strange. Bad comments explain WHAT a thing does and must be made redundant by extracting and naming the thing.

> Comments are not documentation.

Probably overly generalized to be pithy, but no.

Comments are by very definition documentation, which can and should cover of all of the what/why/who/how/where.

Documentation that occasionally explains how to use code can actually be useful!

Unless you actually get around to writing a user manual (which gets out of date), please do consider documenting how something should be used, particularly in libraries.

As a Perl developer I mean this literally, and transfer it to other languages as well. POD (or a block comment above a function) describes the API and use cases of the function. Inline comments inside the function exist only to provide explanation for the next maintainer when they see a strange construct. Any other type of comment needs to be refactored.

Also, I never said documentation isn't useful. HOW and WHY are useful. WHAT in documentation and comments isn't, because it should be in the variable/function/method/class/instance name.

(Who/Where/When are covered by the source repository and the blame function.)

IME it's the "completely brilliant" code that your future self will hate you for, and needs the most comments.

Moreover, when the weight of those comments trends towards 50% being justifications for why you did it that way, it's an excellent sign that your code has gone from "brilliant" to "super-genius", as in "Wile E. Coyote, Super-Genius".

"Completely dumb" code can be read by your future self without comments. Even, sometimes, by other people!

Sometimes, I have seen (and, on at least one occasion, written) code that was brilliant in its simplicity. Very clever, but at the same time completely obvious. (In the sense that one needed to be clever to come up with it, but it was easy to understand if somebody showed it to you.)

But for the general case, I do wholeheartedly agree with you.

The problem with this is maintaining comments. If you are using a compiled language, or even linting, code in the codebase is checked by a machine in at least a cursory way, and the code that is there is what runs. The comments are not guaranteed in any way to pertain to the code in the repo, and there is no way for a computer to check a comment's meaning to make sure it was updated to match the code. Comments can get out of date, which may make them do the opposite of what they are intended for.

A while back I tried to create a commenting system that explicitly tied comments to blocks of code. The idea was that if you tried to change the code without also changing the comment, or vice versa, a git pre-commit hook would yell at you. Unfortunately it wound up being less tractable than I anticipated, but I think the basic idea has some promise.

That's what code reviews are for, to help developers check each other for things that cannot be automatically linted.

> The problem with this is maintaining comments.

That's like saying "the problem with healthcare is that it costs money". Of course comments can get out of date. The solution isn't to throw them out!

"The problem with installing a roof on your house is one day it can be blown off."

There's no alternative to healthcare.

There is an alternative to in-code comments answering the question "why": commit messages. They can't go out of date.

But commit messages can easily get plowed under in some annoying "reformatted, I hate tabs/spaces" commit. I do like to see some "why" sprinkled in here and there that will survive those accidents. But I will nonetheless prefer blame output if it is still meaningful. Also, I enjoy a commit that removes redundant comments almost as much as one that removes redundant code.

You can either pass a commit id to git blame to see changes from before the reformat happened, or you can use git log like this (line ranges are 1-based):

git log -L1,10:file.txt

What? So in my case I'm supposed to browse literally millions of commit messages in order to understand the code?

Comments that answer "what" are redundant. Choose your identifiers better.

Write good commit messages and comments that answer "why" are redundant too. Commit messages by their nature refer to the exact code that they referred to when they were written. Comments answering "why" are misleading after a few years anyway, because the code has changed around them.

Commenting public api etc is obvious, and most people do it.

When you read through a part of your system that you've either never seen before or have forgotten how it works, do you also read all commit messages for all of that code? I'm asking because that's the only way I can imagine one can learn about the edge cases and surprising consequences in non-trivial systems if you have a rule about not using comments to explain them.

No, I do a git blame, and see all the relevant commit messages.

If there's not 1 line of code from that commit remaining why is it relevant?

If something looks weird still - I go back to the commit that created this part of code and git blame that. I don't remember a case where I had to do 2 steps like that.

BTW we have a rule of putting JIRA ticket numbers in commits, that makes it even easier to find out. You can see the whole discussion that resulted in the code you try to understand, test cases that you don't want to break with your new changes, etc.

>No, I do a git blame, and see all the relevant commit messages.

sounds great until there's a refactor (including moving code around), then all the "comments" get buried.

It's a bit humorous that often the same people who insist the code should be enough documentation follow that up by clarifying that git, of course, supplies the rest.

So something entirely outside the codebase, which may or may not be available to the person who needs the information, and which may or may not need to be extensively searched through years of history to find the relevant commit message, which may or may not be sufficient to explain the code... is better than having a comment in the code.

Git blame is great. But having to look at all commit messages over, say, a few hundred lines of code just to know about standard pitfalls does not sound efficient. Edge cases are not just in the places where code is obviously doing something odd. Sometimes they hide in seemingly normal looking code - the kind of code you may not think it's worth diving into the commit history for.

If I feel the need to add a comment to explain what I'm doing and why, then I usually take a step back and rethink what I'm doing and why. That urge tells me I'm trying to be too clever and that I need to simplify. I still add comments when I need them, but it's a rarity these days. Usually when I'm either on a time crunch or I can't think of a better way to do whatever it is I'm doing.

> Unless you are writing completely brilliant code your future self will hate you if you skimp on comments.

The vast majority of comments I usually see can be rolled into variable names or function names. If you're writing lots of comments, that's a good indication your code is hard to understand, your variables are badly named and your functions are too long in my opinion. I think people that say "your code is bad if you don't have any comments" have things backwards personally.

Pretty much the only time I use comments is when I'm forced to write weird code to workaround an API bug, to explain an unintuitive optimisation or to give high-level architecture documentation.

I've honestly returned to code I've written myself maybe 5 years later and rarely had an issue that would have been helped with more comments.
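A small invented before/after of rolling a comment into names (Python, all identifiers made up):

```python
# Before: the comment narrates WHAT the code does.
def total_before(items):
    # sum the prices of items that are not cancelled
    return sum(i["price"] for i in items if not i["cancelled"])

# After: a named helper makes the comment redundant.
def active_items(items):
    return [i for i in items if not i["cancelled"]]

def total_after(items):
    return sum(i["price"] for i in active_items(items))
```

Same behavior, but the second version documents itself at every call site.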

I understand that as "document every class/module/public method with javadoc/pydoc/whatever and also explain the (few!) counter-intuitive bits of your code with inline comments. (You might find those deep within an algorithm, but usually not in plumbing or business logic code)."

That should leave you with a solid documentation of your application's components and APIs and a small amount of inline comments.

If you find you have more comments, either your code should be simplified or you're commenting trivialities.

As everything, it's a guideline with exceptions, not an absolute rule.

> your future self will hate you if you skimp on comments.

Likewise, your future self will hate you when comments fall out of sync with the code.

If the code is readable you don't need much in the way of comments.

Code explains what the program does. Comments explain why a program does what it does. Code and comments are orthogonal, not coincident.

My rule of thumb is, if I have to ask a question during a code review, the answer needs to be in a code comment.

Source control, eg the "comment" when committing code can also be used for comments.

All good arguments except for the end braces/LOC one. The visual cue is worth far more than the one line it takes up.

Same as we now use TL;DR

Exit or get out quick... make it readable... then the meat is left for those that want it below...

Anybody has an example for "Same way you move on from heavy OO to dicts/lists"?

this is so true, and my experience exactly.

One thing I do disagree with is commenting. There is a time and a place, and it's hard to derive a rule on exactly where, when and how, but well-placed comments are VERY important and can save developers a ton of time.

I've never thought about this before but I nodded with agreement with each point.

Some of these are debatable, and others, like set operations, are missing.

You guys just made me feel better about myself! Thanks, HackerNews.

> Same way you go more composition instead of inheritance.

What does that mean?

Basically instead of Subtype extends Type inheritance, you just give your Type class a SubtypeHolder field. Makes it easier to add different behavior as things' types become less similar than you expected.
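A minimal sketch of the difference (Python here, with invented class names):

```python
# Inheritance: each new behavior needs its own subclass.
class Animal:
    def speak(self):
        raise NotImplementedError

class Dog(Animal):
    def speak(self):
        return "woof"

# Composition: the type holds the varying behavior in a field
# (the "SubtypeHolder" idea), so it can be swapped freely.
class ComposedAnimal:
    def __init__(self, make_sound):
        self.make_sound = make_sound

    def speak(self):
        return self.make_sound()
```

`ComposedAnimal(lambda: "woof")` behaves like `Dog()`, but a new behavior is just a new function, not a new class in the hierarchy.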

I'm waiting on the jury for Almost Always Auto...

>Same way while/do/while is usually fades away, and if needed exit conditions.

I couldn't interpret this sentence.

Edited. Basically, do not use while loops; if you do, tread carefully and provide yourself an exit. You don't want to be the one that nukes the server.

Dijkstra's approach towards provably correct programs has changed my opinion on loops somewhat.

Among all types a 'while' loop gives the strongest mathematical guarantees. In particular it ensures a condition is true at the start of each iteration, and is false when exiting the while loop. Combining this with loop invariants leads to precise and straightforward to verify code.

Avoiding while loops to ensure an exit just seems to be sacrificing a lot of power in exchange for some pretty shaky guarantees that your loop will finish.
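The guarantee being described can be sketched like this (Python; the integer-square-root example is my own illustration, not Dijkstra's):

```python
def integer_sqrt(n):
    """Largest r such that r*r <= n, for n >= 0."""
    r = 0
    # Invariant: r*r <= n holds at the start of every iteration.
    while (r + 1) * (r + 1) <= n:
        r += 1
    # On exit the condition is false, so (r+1)*(r+1) > n; combined
    # with the invariant, this proves r is the integer square root.
    return r
```

The while condition does the proof work for free: inside the body it is known true, after the loop it is known false.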

Although in the majority of cases where a while loop could be used you're simply iterating over some (implicit) data structure. In which case it's obviously better to make this explicit using a "for ... in" loop.

Did you mean to use for with no counter instead? If so - I strongly disagree. I much prefer while(condition) {} to for (;condition;) {}

For one thing you can't misplace ";" in a while loop.

I agree that "do while" loops are unintuitive and rarely used, and thus the place to first check for bugs. Also they save very little so I just implement them as do(); while usually.

But I know people who disagree about that, too, and shout at me for avoiding them:) It seems what's unintuitive for me isn't so for others.
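For languages without a do/while construct at all, that "do(); while" rewrite looks roughly like this (Python sketch, names invented):

```python
def read_all(read):
    """do/while emulation: run the body once, then loop on the condition."""
    chunks = []
    chunk = read()        # the do() part: execute the body once up front
    while chunk:          # then an ordinary while repeats it
        chunks.append(chunk)
        chunk = read()
    return chunks
```

The duplicated `read()` call is the small cost; the payoff is that only one loop form remains to reason about.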

Basically what I am saying is while should be mostly off limits.

But yes, if you end up writing a loop, have some sort of counter/null check or other way out in case you end up in an infinite or recursive loop.

Never leave open the possibility of locking up from being stuck in a loop.
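One common way to honor that rule (the cap value here is invented) is to give every potentially unbounded loop a hard limit:

```python
MAX_ATTEMPTS = 100  # hypothetical safety cap

def poll_until(check):
    """Repeatedly call check(); fail loudly instead of spinning forever."""
    for attempt in range(MAX_ATTEMPTS):
        if check():
            return attempt
    raise TimeoutError("gave up after %d attempts" % MAX_ATTEMPTS)
```

The worst case becomes a loud error after a bounded number of iterations rather than a hung server.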

You want to write "cat" equivalent. How do you do the loop?

I think for loops should be limited to iterating over collections or repeating something a known number of times.

There are cases when you need to decide on the fly whether to continue iterating, and a while loop is the perfect tool for that. Also, a while loop is arguably less error-prone than a for loop: you can't swap clauses because there's just one, and you can't write "," instead of ";" by mistake.

I learned about while loops at the beginning of my career, and after a few scary errors, I never used them again. 20 years of no while loops anywhere... never going back.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact