The 80/24 Rule (ploeh.dk)
170 points by nreece on Nov 15, 2019 | 191 comments



I have started to move in the other direction for C#. A long sequence of code that flows in one direction should be kept in one method if possible. Good reasons to break up a method are not readability but other things, such as reuse and testability. If the methods are only called in one place and aren’t performing a piece of work that is testable, then they shouldn’t be methods.

But

   DoAll() {
     DoA();
     DoB();
     DoC();
   }
Is simply not better in any way than

    DoAll() {
      // A
      ...
      // B
      ...
      // C
      ...
    }
Private methods will blur this somewhat. The key to readability is not having to jump around. Ideally, of course, methods are both short and reusable, readable without jumping, etc., but given the choice between a 100-line method or the same code broken into 10 methods of 15 lines each (yes, more lines), where each method is only called once from a main method, I’d rather read the first. The 100 lines might indicate some other problem with the API (a chatty Dx12/win32-like API, for example).

This obviously varies from case to case; I think one of the worst things a team can do is enforce hard rules that invariably lead to “style rule-induced damage”.


My two cents: write long procedures, write short functions.

It is not very useful to break up long sequences of imperative commands (procedures) because chances are the procedure chunks you end up with are completely ad-hoc and non-reusable anyway. Also, as John Carmack puts it, "you should be made constantly aware of the full horror of what you are doing".

For pure functions, it's often a bit of a smell if they're very long; usually you can decompose them into smaller functions which are actually reusable, and you end up with small units that can be tested in isolation and are very easy to understand.

Generally speaking (not a hard rule), it's a win to have more of your program written as pure functions. But it cannot always be done in a sane way; in that case, write procedures and write them long.
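
For the pure-function side, e.g. (a toy C# sketch; Item and its fields are made up):

    // each piece is pure, reusable, and testable in isolation
    static decimal Subtotal(IEnumerable<Item> items) =>
        items.Sum(i => i.Price * i.Quantity);

    static decimal Tax(decimal subtotal, decimal rate) =>
        subtotal * rate;

    static decimal Total(IEnumerable<Item> items, decimal taxRate)
    {
        var subtotal = Subtotal(items);
        return subtotal + Tax(subtotal, taxRate);
    }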


Bertrand Meyer, one of the original OOAD people, suggested going a step further: split decisions from actions.

I really came to appreciate this while ramping up on test writing. Decisions are often trivial to decompose, reuse, and test. Actions not so much, and are sometimes defined by procedures in the literal sense.

Still, if you have facility with the English language or aren’t ashamed to use a thesaurus, meaningful method names and boundaries can help break a problem down and let someone zero in on a bug.


I like your distinction here. That fits well with how I find myself structuring code.


I am entirely in this camp as well. I have an extremely hard time dealing with the concept of a hard line/column number limit, because everything is a grey area to me. There are cases that are obvious to most, but when you say that 100% of your codebase must comply with some arbitrary restrictions, you start to cause way more harm than good.

The approach of putting all of the logic inline into the same method is much preferable to me over the alternatives. The point at which I would create a new method is when a block of code is executed the same way in 2 or more contexts within the same class. Otherwise, I see no reason to break out the code, unless it is making readability difficult in a fairly obvious way - e.g. an inline new List<MyParameter>() which is 100 lines long should really be under a GetMyParameters() private convenience method. Other candidates could be similar points of control (e.g. make it easier to find where to control parameter listings by having them in their own named methods), or managing wildly different levels of abstraction (mapping code vs business logic, maybe create separate mapping methods).
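
For instance (a hypothetical C# sketch; the names are made up):

    // before: ~100 lines of new List<MyParameter> { ... } inline in the main flow
    // after: the noise lives behind one descriptive name
    private static List<MyParameter> GetMyParameters() =>
        new List<MyParameter>
        {
            new MyParameter("timeout", "30"),
            new MyParameter("retries", "3"),
            // ... the remaining entries, out of the way of the business logic
        };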

I've taken this a step further and started collapsing entire layers. For instance, we'd have a Repository and Service abstraction, but what ultimately happened is there was only ever 1 concrete implementation of each, so these ended up combined. Something like MyBusinessService+MyBusinessRepository were reduced to a combined MyBusinessService. This has the advantage of allowing you to see exactly how the business logic is manipulating the underlying datastore by reading a single consolidated method. The combined implementation is certainly larger overall, but taking out the context switch far outweighs this minor downside in practice.
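
Roughly (a sketch with made-up names, assuming Dapper-style helpers on IDbConnection):

    public class MyBusinessService
    {
        private readonly IDbConnection _db;
        public MyBusinessService(IDbConnection db) => _db = db;

        public void Approve(int orderId)
        {
            // the business rule and the datastore access read as one story
            var order = _db.QuerySingle<Order>(
                "select * from orders where id = @orderId", new { orderId });
            if (order.Total > 0) // made-up rule
                _db.Execute(
                    "update orders set status = 'Approved' where id = @orderId",
                    new { orderId });
        }
    }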

Productivity is king. I have given up on trying to have code that looks clean simply for the sake of aesthetics.


Upvoted for collapsing spurious layers. I've started doing this, too. Why am I writing code to allow an alternative database engine, when I'm never going to move this thing off Postgres?


One of the nice things about having a repository/store interface is that it’s easy to make an in-memory fake for testing, so you can have at least some subset of your tests that don’t need to spin up a database container/vm in order to run
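
Something like this (a minimal sketch; Order and the member names are made up):

    public interface IOrderStore
    {
        Order GetById(int id);
        void Save(Order order);
    }

    // the in-memory fake for tests: no database container required
    public sealed class InMemoryOrderStore : IOrderStore
    {
        private readonly Dictionary<int, Order> _orders = new Dictionary<int, Order>();
        public Order GetById(int id) => _orders[id];
        public void Save(Order order) => _orders[order.Id] = order;
    }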


Yes this is certainly a solid counterargument, but if you decompose it into its aspects I think someone could make contrary points:

1. I would always advocate for testing full-stack with the datastore in the loop. E.g. an actual API request from a test runner all the way down into a test instance of Postgres. Isolating your testing to specific layers is clever, but I find 99% of my errors come from forgetting to adjust SQL queries after migrations were performed, or other things that are not going to get caught in the BL layer. If you did your BL correctly using the latest C# 8.0 primitives, you get a lot of compile-time guarantees of correctness (e.g. null enforcement throughout), so the value of unit testing these layers is diminished from my perspective.

2. The value of architectural abstractions changes over time. I would certainly advocate for a temporary IRepository injected into a service implementation if it makes the initial 100 iterations of the code faster by way of using a LINQ-to-objects shim in place of Postgres. But, once this code has been running stable in production for months without concern, perhaps it's time to take out these abstractions so that the people responsible for maintaining the code on a daily basis have less complexity to deal with.


I very much disagree. The in-lined version can easily be confusing, harder to change, harder to test and even error prone.

By splitting code into functions you have a more well defined scope for each of them and they can be understood in isolation.

You also get the benefit of getting a quicker overview (if you care to name your functions well) in your calling function (DoAll).

Then there is the issue of maintainability. When I have to change something in a big (in-lined) function then I'm much more anxious about the whole context. My instinct is often to refactor it into smaller pieces so I can narrow down the changes and isolate them correctly and exactly from the context.

This might very well be a thing of preference?


I don't think it's a preference issue. The problem you have when you split up functions is you're making decisions that have consequences in a fairly arbitrary way (not backed by an understanding of the system, just how you feel things should be broken up).

On top of arbitrarily pushing the system in directions, you yourself note that you hide the actual context of the code when you split it up. It might make you feel better in the moment but I don't think it's the right response to feel emboldened by making decisions with less context. Premature abstraction is at the heart of a lot of bad design and complexity.

With even primitive dev tools you can get a lot of the benefits (grouping, naming) with comments and braces. More could definitely be done on this front but dev tool progress is sadly pretty slow. Going this route you can have organization while not throwing away the all-important context of what the code is actually meant to accomplish.


I suspect this is mostly preference.

Personally, I really dislike pulling everything out into small methods - it literally forces you to remember a crap load of names. I find remembering names to be extremely high cognitive load.

Worse, the names WILL lie to you. Especially if you weren't the one doing the naming.

So not only are you forced to remember a bunch of labels that better fit someone else's mental model, you're still not off the hook for understanding what the code in those tiny methods is doing, and considering how it might impact the task at hand.

So now you're stuck with twice the work - Remembering which name goes with which functional piece of the task at hand, and then remembering how that name actually accomplishes the task (the actual code).

Worse, because functions are moved out of line, I find it much harder to jump between relevant bits of actual "doing things" code.

Basically - The ONLY time I want to split code off into named chunks is when the alternative is copying/pasting code somewhere. I look for code which is getting reused and break that out.

The in-lined version is faster to read, faster to understand (and by understand I mean REALLY understand, not some hand-wavey "I trust this name" understanding, but as in you actually know the operations and changes to the system that the call will result in).

The downside to inline code is you have to actually read it. I find a lot of the folks who really like short methods struggle to parse the language they're working in and fall back on the name without actually understanding what the chunk of code does. But I think they're also the folks who have a better memory for names.


Reducing cognitive load by breaking things into digestible pieces can be hard, and at times it's hard to say if option A is better than option B, but it is not preference.

If `extractId` is an eight-step process that's only used once, but I step over it and check locals to find that the variable was correct before the call, and wrong after, then I just found my problem. That would be faster than stepping over each step to find the problem -- basically the debugging version of divide and conquer (for finding the solution, optimal would be binary search, but for readability and maintainability, that should not be the goal).

Another example is if I have a boolean condition that incorporates extremely complex logic but is only used once, if I assign it to a variable called `nameWasPreviouslySet`, that is easier to read and understand, and if I'm debugging and expect `nameWasPreviouslySet == true` at this point, and it's not, then I've figured out the bug is in setting `nameWasPreviouslySet`. So DRY shouldn't be the only reason to refactor -- refactor for readability also.
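
E.g. (the condition itself is invented here):

    // name the decision once, instead of re-deriving it at every read
    bool nameWasPreviouslySet =
        history.Any(e => e.Field == "Name")       // `history`, `current`, and
        && !string.IsNullOrEmpty(current.Name);   // `fallbackName` are made-up locals

    if (!nameWasPreviouslySet)
        current.Name = fallbackName;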


Ok, so let's take your example at face value. You have an extremely complex chunk of code that you only call once.

Already, you have invariants that are easy to break. You're assuming it's only called once. That's relatively safe if you're inline; that's bogus as soon as you've broken it out into a named function.

Some other dev WILL come along and re-use your helpfully broken out code somewhere, and that invariant no longer holds.

Even assuming no one else has re-used it, what ensures that the problem is actually in that method? Particularly if that method itself calls out to many small named methods.

So you have some complex code that does something, but the "things" it does are call out to other small chunks of code.

so now you have

    DoComplexThings => {
      DoSubThingA
      DoSubThingB
      DoSubThingC
      DoSubThingD
      DoSubThingE
      return
    }

So you're back to binary search - The problem is in there somewhere, but "in there" is actually calls out to many other functions again. Which one is broken?

And wait! DoSubThingD got changed at some point and will now fail if you run it before DoSubThingC, but there's nothing in the names that tells you that.

So you have hidden deps between chunks of code that are all trying to accomplish a single goal.

If you're debugging that, you almost certainly have to go through all the code in that method again, except you have to page back and forth in your editor to get the meaningful pieces into view. All so some OTHER dev could give it a name that was meaningful to them, but likely not that helpful when you're debugging, because the mental model they have of the system resulted in the bug in the first place.

---

Now, rant aside - I think we agree more than we disagree, I think good names matter, and I think there's a lot of wiggle room around when a thing should be broken out.

But if the code is trying to do a single thing, and it's not re-used, I'll take a single cohesive method over 8 tiny abstractions ANY day.


> if the code is trying to do a single thing

What is a single thing? Login? Initialize db? One line of code, 3 lines of code, ...

If you have 100 lines of code and 6 levels of nesting, even if none of it is, or as far as you can tell ever should be, re-used, you should break that into smaller, digestible chunks of DoSubThingA, etc.

It's kind of an aside, but you mentioned e.g. `DoSubThingD` got changed or "some other dev WILL come along" -- if some other dev comes along and doesn't bother to check where and in what context `DoSubThingD` is being used, and changes it in a way that breaks `DoSubThingC` or `DoSubThingE`, they aren't doing their job. In particular, if `DoSubThingD` wasn't written to be re-used and they just re-use it anyway, especially in a way that breaks its original invariants... I would be having strong words with that developer.


Oh, and I'll add in a separate comment because it's sort of unrelated. Code structure should have very little to do with how you narrow down failing code.

You can do a binary search by just shoving some log statements into the application and seeing what prints out (or hell, use the debugger for your toolset). That doesn't change if I have 100 lines in a single method, or 500 lines divided out into 100 different "tiny" methods.

I think it's actually harder when the methods are all split, because again - more paging back and forth in the editor to add the required debugging info.


You are talking about what is (or is equivalent to) printf-style debugging; if you use an actual debugger, you can step over function calls, you’ve got to do more work manually setting breakpoints to do the equivalent with undifferentiated streams of code in a long function.


So, you add a command to your debugger that lets you step over a "block" (a sequence of statements surrounded by braces -- the analog to Lisp's PROGN and Scheme's BEGIN) and before you debug the undifferentiated stream of many statements, you factor it into a handful of statements -- without introducing any new function names -- some of which are blocks.


No, what I'm talking about is how to narrow a problem down. The medium you use to accomplish that is flexible.

I'm a little confused as to why you think clicking step-over 20 times to get out of the std::lib is better than just moving the mouse down 20 lines and adding another breakpoint.

So again, this feels like we're back at "this is a matter of preference".


I kind of agree with you. If fold markers were universal, I'd say that they would have been the best of both worlds. Longer linear functions with logical blocks foldable.

Folds get a bad rap because they aren't that widespread outside of Emacs/Vim and, amusingly, the .NET world, and because some people abuse them to write God-classes and such.


I understand what you mean. I think a nice middleground between our preferences would be scoped blocks with explicitly passed vars. This would keep the readability you want but still have the other benefits I mentioned. Plus it would be simpler to refactor when needed.


This is a good point, and nicely matches a pattern I find myself using in JS where I just define the sub methods inline in the larger method.

You get the niceties of a name for the folks who want it, but the code is still grouped nicely and it's easy to walk through the whole operation.

I see this much less often in languages where scoping is less flexible though.


Personally I find all those function calls harder to debug/review. I am perfectly capable of interpreting most non-complex small code blocks at a glance. If someone takes that and replaces it with a function, it usually forces me to step into the function to understand what it really does. The function name won't be as meaningful as the actual code.

Forcing stuff into sub-functions just hides information away from the people who read the code.


If you’re talking about scopes of variables then if language supports it, the best of both worlds would be to have an inlined version where each block has its own scope so that you can easily track which variables are actually shared among those scopes and which only belong to certain scopes.


As long as the sub-procedures don't need 10+ input parameters and 3+ output parameters/return values, I can agree. Otherwise it's a sign that the code/logic is simply too interrelated to be broken up well.


Just my two cents. If your A, B, C are neatly separated, then good. But it's way too easy (especially for the "quick hack" type of programmer) to introduce hidden dependencies, like a goto from the middle of A to the middle of B, or C depending on a variable that was implied to be internal to B. Having separate functions helps to avoid some of such accidental dependencies. But, of course, the functions can still "communicate" via global state or a shared state in a mutable object.


How many times has it turned into this:

DoA(DoAllContext ctx);


That tells me that DoAllContext should be a class itself that DoA belongs to, or that DoAllContext should be on the class itself, and the method itself should be the class.

This is, of course, when we're talking about programming in Objects.


When I see a 100 line function it almost invariably is doing stuff at very different levels of abstraction. It's making DB calls here and API calls there; it's handling flags to indicate the end of iteration; it's dealing with some user-set configuration there. Teasing these things out of the code can make it much clearer what's going on.


Pretty sure VBprogrammer knows what they're talking about here....


Meh. I've used the same handle online for more than 20 years. Why stop now.

For most of the last 10 I've been working in Python.


I wasn't being sarcastic, I've seen exactly what you're talking about a lot in VB. Re-read it just now, and even I thought what I wrote kind of came off sarcastic.


I get you! Apologies if what I said came across negatively.


I think it was the italics that did it.



But only because if you're writing imperative code and doing lots of mutation "you should be made constantly aware of the full horror of what you are doing."


Also, applying the mantra "Don't repeat yourself" (DRY) can have the opposite effect: when you are optimizing a hot loop, like a game render loop, you inline everything and discover there are repetitions that can be vectorized or removed/optimized. Maybe because you applied DRY and abstracted too early or whatnot.


Yeah... DRY is probably the most over-used principle (likely because it's so easy to identify). I like "AHA" (Avoid Hasty Abstractions) as a counter-balance.


DAMP (descriptive and meaningful phrases) is especially useful for maintaining tests, which is where the bulk of the time investment goes.

Literally any testing strategy can be forced to work for about 18 months. Anyone working with shorter lifetimes than that is inexperienced.


I love that quote, and that article as a whole!


I am entirely in agreement contingent upon there being some language construct which makes it possible to explicitly limit which variables from one block can be seen in subsequent blocks. For example

    DoAll() {
      // A
      a1 = ...
      a2 = ...
      a3 = ...

      withOnly a1, a3:
      // B
      b1 = f(a1)
      b2 = g(a3)
      
      withOnly b2:
      // C
      c1 = h(b2)
    }
Since this doesn't exist in any language, as far as I know, I guess I'll stick to using functions for now.


C-style languages tend to have squiggly brackets which can limit scope somewhat, but not as totally as you describe.

For example:

   doThing() {
      int x = 7;        // x visible in whole function
      {
         int y = x * 2; // y only visible in this block
      }
      // x still visible, y no longer visible
   }
Explicitly removing things from scope doesn't seem to be a feature though.


The application i am working on right now has a 766-line setup method written like this. Each block corresponds to the setup of one component. If there is something from that block that needs to be reused (typically, the channels that the component published data to), it is declared immediately above the block.

Many small bits of logic are broken out into their own methods (checking if today is a public holiday, configuring the HTTP server, configuring the KV store client, etc), but the setup of the domain logic components is all in one method.

There's no way to split the setup of the components into methods without either introducing a comparable volume of boilerplate, or having to write some kind of framework to make it easier. Neither would be a win.

A caveat: this codebase uses a language which doesn't have out parameters or multiple-value returns. If it could use those, the boilerplate involved in breaking up the setup method would be considerably smaller. But still probably not worth it.


It does exist, through closures. The syntax in most languages doesn't improve the readability of the code though:

    DoAll() {
      // A
      a1 = ...
      a2 = ...
      a3 = ...

      // B
      b2 = lambda(x, y) {
        b1 = f(x)
        return g(y)   // only this value escapes the block
      }(a1, a3);
      
      // C
      lambda(x) {
        c1 = h(x)
      }(b2);
    }
I've added explicit intermediates to better visualize where the scope binding happens, but most languages allow the compiler to bind these variables implicitly whenever an anonymous function references variables from outer scope. That helps with readability, but not with scoping.


But that doubly defeats the purpose.

1. If you have good reasons to restrict variable access in subsections of your long procedure, then it DOES make sense to factor them out as separate functions.

2. Closures automatically close over variables from the surrounding scope (duh), so they don't protect you from accidentally using a variable in a section that you're not supposed to use it in.


The lambda approach avoids the "I can't think of what I should call this, I'll call it `foo` for now" problem...

Which isn't a problem we should be avoiding. Naming things in a way that other coders will understand is an important skill, and can be difficult.


How does it avoid that problem?

The lambda doesn't help me understand what that part actually does any more than calling the function "foo_helper" or "foo_partN" or something like that would do.


> Which isn't a problem we should be avoiding

I believe we're actually arguing the same thing.

Anonymous functions aren't the end of the world, but self-documenting code is better.

In the JS/jQuery world, breaking away from anonymous functions would represent such a fundamental shift in practice that I'm not going to do it on my own, but in Angular, I stay away from anonymous functions.


IDEs could make it readable and encourage splitting code without splitting perception. A solution: allow calls to be ▶expanded inline in the source view, as if the code were written right there.* Renaming arguments to their caller-site counterparts would also help.

* Code folding is not the same thing, as it doesn't eject scope+args out of the "caller" and cannot be used in an expression.

Upd: I'd also add that source code lacks the usual text formatting. //-section comments should stand out, but one can only make a comment bigger (2-3 lines with === filler), not the text itself. It would be nice to have some RTF in code and CSS in the IDE. Books are written like that, and it's a well-acknowledged way to structure instructions and descriptions.


> allow calls to be ▶expanded inline in the source view,

If I understand what you mean, Visual Studio Code allows this. I guess Visual Studio does as well.

The implementation feels a little clumsy to me, but hey, I'm the guy who thinks NetBeans and Eclipse were the best IDEs I've used, so I guess a bunch of you guys will think the solution is perfect.


That’s interesting, I thought no one does that yet. Can you please link to this feature? (Can’t figure out what keywords to search)


Right-click the method call, "Peek Definition". Or Alt-F12.


Thanks! It is close, but sadly that overlapping view is not much different from just "going to" the definition, as it hides the code behind it. Still may be better than nothing though.


It doesn't hide the code behind it actually, it pushes it down. I agree it's not perfect though.


What would a perfect solution look like, and what would it be worth?


IMO, things evolve when a function grows.

First, you have “a hundred lines of code”

Then, you add white space to separate somewhat independent parts.

Then, you add comments that describe those parts.

Then, you notice some of these comments aren’t good, and refactor the code a bit and tweak the comments to make them good comments.

Once you’re there, refactoring the parts into decent functions is just a further iteration. It may be hard work, though, as it often means introducing new types that must be given good names.

Where you stop with this is a bit of a matter of taste, but it also depends on project size, development team size, (expected) longevity of the code base, etc.


Keeping your procedures short still has value, even if they just do one thing after another and are only used once.

That's because separating them out into separate procedures enforces a simple interface: communication only happens via arguments and return values.

When a reader sees a block like

    DoAll() {
      // A
      ...
      // B
      ...
      // C
      ...
    }
then they have to manually verify that the variables in A don't interfere with what's happening in B or C. (Either accidentally or by design.)

In C-style languages you can use {}-delimited blocks to get most of this benefit of the short-method style in the long-method style.

In e.g. Python you can use nested functions to achieve your readability goals of not having to jump around while still keeping your state from interacting.
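
In C#, local functions give you much the same without leaving the method (a sketch, names made up):

    void DoAll()
    {
        var a = DoA();     // A's temporaries stay inside DoA
        var b = DoB(a);    // B can only touch what is passed to it
        DoC(b);

        int DoA() { var tmp = 41; return tmp + 1; } // tmp is invisible out here
        int DoB(int x) => x * 2;
        void DoC(int x) => Console.WriteLine(x);
    }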


I prefer the first one if the functions are well named. Especially if there is any potential for A, B or C to be reused by another part of code. Even with no reuse potential function names are more likely to be kept up to date than comments.


Me too. I often do that to compartmentalize. I'm not sure why anyone would prefer the second one. Now if A/B/C are only a few lines each, then sure, go for it. AKA they fit in 24 lines ;) . I don't like hard rules, but if it starts being more than 1 page long I get antsy and start refactoring my function/method. It might be 80 lines exactly but it's a demarcation point. So I'm probably more of a 100/40 person :). I would say that I use a medium size font; I'm not a tiny font person either.


I'd say the first is always better than the second, provided the DoA..DoC aren't sharing data via global variables.

The reasoning isn't so bad. Complexity is a function of length and communication. Complexity is the enemy. When you optimise for speed, the other side of the equation is always complexity. Or: if you can decrease complexity and increase speed it's always a win.

Splitting the code up as in your example makes it obvious the steps are utterly independent. In fact, if there are no global variables those steps can be done in any order - or indeed in parallel. Since splitting it up makes that fact self-evident, whereas before you had to read the code to learn it, it has reduced complexity.

But of course you almost never split it up as you have done, because there is always a tradeoff between function length and communication. Reduce one you almost always increase the other.

Worse, communication is a multi-faceted beast, so measuring its contribution to complexity is hard. Communicating by passing parameters to pure functions increases complexity less than communicating by global variables, for instance. And you can't check communication complexity using code length - if you pass a single value "a" that is in fact just a copy of the local variables, you've achieved nothing! Using length as a measure of complexity is equally fraught. If DoA() is a single line of code, I doubt anybody would say you've improved things much. But if DoA()'s lines were, unlike in your illustration, intermixed with lines from DoB() and DoC(), then splitting it out makes a huge impact.

Put it all together and there is rarely a simple answer to the "should I split this code" question. The only time there is an unambiguous "yes" is when you reduce both code length and communication.


I have this for Big Ol' Switch Statements. It's actually easier to have all the cases inline and let the function get huge than it is to have the cases call subroutines (because you then have to go find the subroutine, and you can't compare two subroutines in situ easily).

Like any rule, there are exceptions. Mostly I agree with 80x24, but I wouldn't implement it in a linter, because I need to break it every now and again (and Big Ol' Switch Statements are a common cause of breakage).


It depends on the switch statement, but I often use this style for state machines, where it's particularly handy to see all the cases at once.
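
E.g. (a toy sketch; the states and flags are made up):

    // a state machine as one big switch: every transition visible at a glance
    switch (state)
    {
        case State.Idle:
            if (startRequested) state = State.Running;
            break;
        case State.Running:
            if (workDone) state = State.Draining;
            break;
        case State.Draining:
            if (queueEmpty) state = State.Idle;
            break;
    }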


"Inlining" A, B, C is OK when they go down in abstraction about the same depth. If not, you have a mix of simple and low level code. Using A, B, and C masks this.


From a testing standpoint the former is often superior. It will also tend to focus your attention on organizing your corner cases better.

But as my peer alludes to, there are a lot of people who happily write a lot of functions full of side effects and string them together. And in those hands a great mess can be made. I used to be more tolerant of this sort of code, but I've heard enough complaints to scrutinize this pattern, and it does seem to result in quite a few more errors.


In theory yes. But the first approach makes it easy to reorder steps without making silly copy-paste mistakes. It also allows you to easily add testing for each of the steps. With the second approach, before you can start testing each of the substeps you'd have to put them in separate functions anyway.


I would like an IDE that can virtually inline code. That way we can write code like your top example and avoid jumping around like the second example.

There could be a + to the left and maybe a red box around the fake inline-ing.


I’ve wanted this too. Probably for most of my career. Particularly for code in other files.

Although I’ve also found that memorizing the advanced navigation features of the IDE is a pretty good way to get around without upsetting your working memory.


Microsoft Visual Studio has this. You can right-click on a function name and select "Peek Definition", and it will open an inline window with the code for that function.


Nice, it actually renders the definition inline without placing it in a floating window, which is the normal way this feature is implemented: https://docs.microsoft.com/en-us/visualstudio/ide/media/peek...

In gnome-shell-mode [1] we can evaluate arbitrary expressions and paste the content (virtually) into the buffer. Since javascript functions evaluate to their source it makes for a poor man's peek-definition: https://imgur.com/a/7Ot7KFy

It would maybe (sometimes) be nice to even intelligently "macro expand" the call:

    function helper(a, b) {
      let c = a + b
      return a + c
    }
    
    function p(y) {
      let x = helper(1+y, 2)
    }
    
    --> 
    
    function p(y) {
      // inlined x = helper(1+y, 2)
      let x
      {    
        let c = (1+y) + 2
        x = (1+y) + c
      }
    }
Not the best example, but you get the idea. Expansion can be tricky when the arguments are non-trivial expressions. Could fall back to assignment to a variable named after the corresponding parameter.

[1]: https://github.com/paperwm/gnome-shell-mode


VS Code has this in some of the more developed language plugins, and so does Eclipse (at least it does for the C++ and Python plugins).


   DoA();
Is an API-like thing.

   // A
   ...
Is not an API-like thing.

APIs are hard to get right. APIs create dependencies. Most internal APIs are hot garbage.


It's not an API if the methods are private or declared inside the method. I just rewrote a portion of code the other day with the discussed pattern because it made the high level step very clear:

  DoAll() {
    DoA();
    DoB();

    DoA() { ... }
    DoB() { ... }
  }


One thing I like about the "straight code" version is that it is clear that things are executed sequentially, and only one time. With named functions it is possible that they could call each other too.


I find this interesting. Your point would align with the thread a while ago about the giant kubernetes function (https://news.ycombinator.com/item?id=18772873).

I have a theory that SRP (the Single Responsibility Principle) is the driver behind a lot of this, and that the way it's commonly understood is incorrect (in part because the idea itself is somewhat ambiguous and possibly incoherent). I'll try to go into more detail in a blog post.


I prefer the Python rule of 79 characters tops per line, but I lean more towards 120 characters. I think helper methods (private methods) are useful, but having a bunch of methods called one after the other like you point out seems pointless. My other rule of thumb is: DRY, if I'm repeating code, it should be a helper method. Otherwise if it's a one off method, it shouldn't be broken out unless it helps me follow the flow better.


About a year back I moved from 120 back down to 100. The rationale was that, even on a large monitor, it was difficult to fit multiple source files side-by-side at a font size I could read comfortably. On a laptop monitor, it was hopeless.

I also noticed, during code reviews, that people read and digest statements that are broken up into multiple lines much more carefully than they do compound statements that live on a single line. Leaving me to think that line width is not just an aesthetic matter. It's really a very impactful decision that can have an important, if subtle, influence on overall software quality.


> About a year back I moved from 120 back down to 100. The rationale was that, even on a large monitor, it was difficult to fit multiple source files side-by-side at a font size I could read comfortably. On a laptop monitor, it was hopeless.

It very much depends on the screen. With a 32 inch screen you can fit three 120-column views at the equivalent of 100 DPI at default zoom. Or even better, six 120x40 views.


Yeah, that's one argument I have heard. I guess I try to go 79, and if it slips past that and I'm at 120 I am not gonna cry about it too much, but anything over 120 characters is pushing it. But yeah, some super long lines I try to bring back. I do know about the split screen thing since PEP-8 outlines some of these things for Python code and I've read a lot of articles about why PEP-8 makes sense.


I feel the first is better, mainly because if the functions are named appropriately, from a glance you can glean the general idea of what the function does. And with modern ide’s, I think navigating that structure is easier than scrolling through the latter.

Of course I just got stuck writing jupyter notebooks for a month so I’m against all the scrolling crud.


'A long sequence of code that flows in one direction'. I guess I agree in that case but how often does that happen? Generally a 100 line method will have loops, ifs and be quite deeply nested. In that case it should be split. The most important benefit is that you get to give the smaller blocks a, hopefully, descriptive name. That way it becomes much easier to follow what is happening.


Came here to say exactly this.


The question is, why didn't the author write his article as a tree of smaller 50-word blocks, conveniently connected by a-hrefs and/or separated into multiple pages, all links well named.

Instead, it is a straight wall of text that is hard to follow and uses implicit references all across.


Actually, the more technical a text is the more likely it is to contain references and (sub)headers. Also, I certainly would not describe this text as a 'straight wall of text'. A straight wall of text would be a text without paragraphs or headers.


Breaking text up like that is the equivalent of chucking some line breaks in a function and a few comment headings, not creating a bunch of pointless functions and scattering them around the file (or codebase).


Oh but maybe he did. It's just that you're looking at the compiled version.


1: a very good

2: That's

4: question

Comment: 2 1 4.


<3


I've seen this idea cargo-culted way too often. There's nothing wrong with long functions. In fact, long functions are often easier to grok than ones split apart into dozens of pieces.

Create useful abstractions. Don't create abstractions for the sake of creating abstractions.


I’m Danish, like I assume the author is, and what he preaches here is very typical of meta-programming in our C#-crazed nation. I get it too; it’s a lot easier to have a lot of short functions with single responsibilities in a neat structure while you’re working on a C# project. As long as you know what you did and where you put it, it’s absolute bliss. It’s also really easy to sell your clients a lot of useless automated tests for this approach.

Five years later, when I come in to debug the monster you’ve lost track of, however, I’ll curse you to death as I go through endless loops of “go to definition”.


What specifically about C# makes this approach appealing?


I guess it's one of the few languages (Java is another) with proper IDE support. Most other language IDEs mostly provide syntax highlighting and basic support for debugging and go-to-definition, which also tends to break.


Visual Studio is amazing at juggling things that are wonderful while you have a mental map of your code and nightmarish once you don’t.


I share your sentiment. Some of the programmers I’ve worked with would build functions for every thing within a larger function.

Need to chunk the data? Function. API Call? Function. Parse the response? Function.

So I asked: “Are you ever going to use these internal functions for anything outside of this function?” And their response was: “No, but it makes debugging easier.”

It does not. It makes it exponentially more difficult. Maybe I just work with too many terrible programmers...


Depends. In complicated code with lots of dependencies, I've always found it easier to understand functions that are written as:

    function do_task(a, b, c) {
        d = subtask1(a, b);
        e = subtask2(b, c, d);
        f = subtask3(a, d);
        return f;
    }
instead of one long function. Of course, this only makes sense if the subtasks are indeed different conceptual steps in the computation.

As an example, I remember writing finite elements code that was structured something like this:

     function solve_fem_problem(problem_description) {
         basis_functions = generate_basisfunctions(basisfunctions_type.b_spline);
         quadrature_rule = generate_quadrature(quadrature_type.gaussian_quadrature);
         A = assemble_matrix(problem_description.differential_equation, basis_functions, quadrature_rule);
         b = assemble_righthandside(problem_description.differential_equation, basis_functions, quadrature_rule);
         apply_boundary_conditions(problem_description.boundary_conditions, A, b);
         solution = linsolve(A, b);
    }
which I think is very pleasing to read, and preferable to a single huge function. This also makes it easy to use another linear solver, for example. On the other hand, the functions that are called inside solve_fem_problem are probably larger.

Note: During work, I haven't actually encountered a lot of code that is structured in this way.


I'm hijacking this since your function shows the kind of case where I like to make an exception to the line length rule.

It looks much better, and is more understandable due to the repeating pattern, as you have it, but some people I've collaborated on open source code with would insist upon this:

     function solve_fem_problem(problem_description) {
         basis_functions = generate_basisfunctions(basisfunctions_type.b_spline);
         quadrature_rule = generate_quadrature(quadrature_type.gaussian_quadrature);
         A = assemble_matrix(problem_description.differential_equation, basis_functions, quadrature_rule);
         b = assemble_righthandside(problem_description.differential_equation, basis_functions,
             quadrature_rule);
         apply_boundary_conditions(problem_description.boundary_conditions, A, b);
         solution = linsolve(A, b);
    }
(For this example, I'm assuming an arbitrary maximum line length of between the A = and b = lines.)


Woah, bransonf? I met you at OpenSTL a few weeks ago! What a funny coincidence.

And yeah, I agree. Even Carmack had something to say on this: http://number-none.com/blow/john_carmack_on_inlined_code.htm...


At the risk of offending both sides of the argument: both long functions and split functions are fine. Functions within functions are great for async code, as you can make use of the closure/lexical scope. E.g. when the network request comes back you still have access to variables declared in the parent function. And you can also freeze the state/value of variables in time - by making them into function parameters copied by value.
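
For example (a C# sketch; Request, TrySend and Report are made up):

    void SendAll(List<Request> requests)
    {
        var failures = new List<Request>(); // declared in the parent function

        foreach (var request in requests)
            Send(request);

        Report(failures);

        // the inner function sees `failures` through lexical scope,
        // with no extra parameter plumbing
        void Send(Request request)
        {
            if (!TrySend(request))
                failures.Add(request);
        }
    }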


One middleground option if you have a long function but also want to break it into chunks for easy readability: in C# and some other languages you can put braces around a block of code without actually having any condition on it.

Or just throw some big comment headers in.
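
E.g. (made-up steps):

    void DoAll()
    {
        {   // step A
            var total = ComputeTotal();    // scoped to this block only
            Console.WriteLine(total);
        }
        {   // step B
            var total = ComputeDiscount(); // no clash with A's `total`
            Console.WriteLine(total);
        }
    }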


VS Code has regions, which look like preprocessor instructions, but I don't think they actually have any effect on the code. They do, however, collapse while providing a nice label.
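
In C# they look like this (the compiler ignores them; editors fold on the label):

    #region Configure HTTP server
    // ... setup code folds away behind the label above ...
    #endregion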


JetBrains IDEs have those too


Thanks, I've been using PyCharm for years and didn't know that.


In Swift I use locally-scoped functions for this. It has the advantage of ensuring that this single-use function can't be called from outside the intended scope, but I can still chunk the work into easy-to-reason-about pieces.



> It does not. It makes it exponentially more difficult.

You should realise you are falling victim to your own assertions here. You're cargo-culting the idea that abstracting functions makes debugging exponentially more difficult, without having provided any substance to back it up.


The assertion, which I believed was implied, is that the habit of creating many sub functions will make it more difficult to trace a bug.

Imagine having a function with 5 sub functions, and since this is your style, those functions have 5 internal functions.

If I want to trace an error, first I have to find it in the main function, then the sub function, and maybe even within its sub functions. Then I have to deal with several different scopes as well.

I don’t mind this if a function is a useful abstraction, that is it does something well and is used repeatedly. My gripe is with functions that are written just for the sake of writing functions.


> It does not. It makes it exponentially more difficult.

is not implicit. That is very much an explicit assertion.

I'm simply pointing out that in your pointing out of cargo-cult, you've done the exact same thing of drawing a conclusion with no basis.

> Imagine having a function with 5 sub functions, and since this is your style, those functions have 5 internal functions.

is unrealistic. There's plenty of room in between "no private functions at all" and "5^n functions for every layer". It's hyperbole, bordering on the rhetorical.

As for tracing bugs in the way you mention - nobody in my sphere of influence has needed to do that in years, at least not on a regular basis. Tests demonstrating IPO are far more valuable, and they narrow the scope of the problem far more quickly than tracing, to the point that you'll be reading one small function rather than a potentially much larger one.


I have the opposite philosophy. The ideal function is only called once. If it is called more than once, I have a "call diamond" instead of a "tree" and this is a code smell.


Harsh constraint. Probably difficult to come up with new sorting algorithms or substitutes for basic arithmetic functions after a while.


I mean for functions that are called in the same context. Obviously external and well-abstracted functions can be called several times.


I'm sooooo tired of this carve things up into functions cargo cult.

Go grab a Vulkan tutorial, and try to understand the logical flow. Good luck.

All the Vulkan tutorials atomize things into functions because it "encapsulates" actions into "easier to understand" pieces.

And then creates a mutable uber-object that holds a whole bunch of Vulkan handles that needs to get passed to every function so you don't know who is updating what and when.

But, hey, writing that as a 1500 line single function where you only expose the variables as you need them is far too complex for us mere mortals to read. <rolls eyes>


So, they're advising recreating a custom version of the old OpenGL state machine?


That's the result, but it's not their intent.

The problem is that Vulkan requires about 1500 lines of boilerplate to put a simple triangle on the screen. The issue is that you have to wire as much as possible up statically so the graphics hardware can fly once you start feeding it vertex information. However, for someone who doesn't need maximum performance, the setup boilerplate is a nuisance as Vulkan doesn't do the idea of "sensible default".

So, everybody tries to "encapsulate" this boilerplate.

The problem is that this obscures the fact that you are setting up things which have distinct dependency orders. And sometimes those orders are different depending upon what you are trying to do.

If you already understand advanced graphics card hardware, dependency encapsulation is great. If you really don't yet have the ideas of things like subpasses, pipelines, graphics queues, and queue synchronization really crystal clear, you don't want things abstracted as you are still experimenting and prodding the code in weird ways.


Indeed, it is a fundamentally flawed principle. Code needs to be considered as a graph. All indirections within the code create an edge. Linear blocks of atomic instructions make up a node.

The first fallacy is only considering the size of the nodes and ignoring the number of edges. This is like micro-optimising without paying attention to the Big-O factor.

The second fallacy is failing to consider the difference between what I will call static branches and runtime branches. An if statement or a for loop are static branches. Other forms of indirection happen at runtime (with "goto" being famously difficult to analyse). Static indirection is easy to analyse, runtime indirection is hard.

The code graph (I use this term to mean something slightly simpler than the call-graph, which includes things like recursion) tends to get considerably more complex at runtime. One of the best ways to make your runtime code graph more complex is to add more functions, especially ones that are not private to the current file.

As soon as this happens, you have no immediate way of knowing how many runtime edges there will be going in and out of each one of those functions. So hypothetically, let's say you take a 21-line function and break it into three 7-line functions. While each of those is easy to "read", you still have to understand 21 lines of interdependent code, but now the total complexity has potentially increased exponentially.

How bad this problem is depends on the scope of the functions (can dramatically increase how many call-sites can potentially form new runtime edges - something you cannot tell from looking at the code in front of you) and whether that code has any side-effects. Certain types of runtime indirection like DI frameworks can completely thwart attempts to statically analyse this runtime graph.

If you are only dealing with pure functions then it can seem like this isn't a problem because of referential transparency. However, this ignores two problems: the name of the function must precisely describe what it does, otherwise it will mislead "distant code readers" (people reading the call but not the source). That must also never change in the future, otherwise you will still have to deal with all that extra complexity if you ever want to refactor safely.

This is not to say that abstractions are bad or that long functions are good. Premature abstraction however tends to produce bad abstractions, which are worse by far than no abstraction.


> I've seen this idea cargo-culted way too often

maybe, but it should still be more widespread


The basic ideas, sure, but not the specifics. The ideas as I see them are:

- keep functions simple and single purpose, which makes them easy to understand quickly

- avoid long lines of code since they don't fit uniformly on one screen

- make functions as testable as is reasonable

Whenever you introduce hard rules, you get weird workarounds for cases that don't fit nicely. Dogma is always bad, concepts are immortal.

So yeah, if this helps teach a concept, use it, but don't force everything to adhere to a strict code style.


> Novice Programmer: "Putting all this code in one function would be easier to write."

> Master Programmer: "You fool! You can't just stuff all your code into one function because it's easier to write!"

So the Novice Programmer separated all the code out into different functions and showed the results to the Master Programmer.

> Master Programmer: "You fool! Why have you broken this out into different functions when it all belongs together in the same function?"

And in that moment the Novice Programmer achieved enlightenment.


> achieved enlightenment

I haven't. What are you saying?


"And at that moment the novice achieved enlightenment" is the usual punchline for this story setup.

The message is that neither of the alternatives is correct all the time.


I believe that it means that some things aren't always very clear cut, and one could make a reasonable argument to do it either way.


The best I could interpret it is to have it both ways; break it down into small private functions, then nest (edit: encapsulate) these in a larger one where they can't be seen by anyone else.


Both and neither.

It depends on you and your team, the project requirements, the rest of the codebase, and so many other things. The choice you make is less important than _why_ you make it.


Maybe the western approach of actually giving useful information beats the zen approach of confusing the fuck out of people.


You’re going to hate reading this: https://www.mit.edu/~xela/tao.html


One should group functionality into a single function not because it's "easier to write", but because "it belongs together".


Maybe that, no matter what you do, there will always be somebody telling you there's a better implementation for it.


I always took it to mean, "Don't put all the code into one function because it's easy, do it because it's logical."


I would argue exactly the contrary: https://redbeardlab.com/2019/02/07/write-long-function/

Very seldom is the problem understanding one function; most of the time the complexity is in understanding how everything fits together. Having dozens of little functions does not help at all.

On the contrary, having a few well designed, well interfaced functions that take care of all corner cases and just work helps tremendously. Those functions are almost always long.


I find this an artificial metric. Normally code just gets split organically as you notice that you can reuse this bit here or that bit there. Some functions won't split and that's OK. (Reference to the 2k-plus-line main interpreter loop in Python.)

Actual reuse is the primary driver for this. Trying to adhere to some other concept instead is dangerous. I've lost countless hours trying to make code "properly object-oriented", or "isolated", or "pythonic" until I realized it's all a delusion.


It wasn't always artificial. When you really were working on a screen that showed 24 lines of 80 characters, it was a good practice to be able to view your entire function at once. That was especially important since at the time we lacked tools like debuggers, and so there was more focus on being able to find bugs by reading.

Today the hardware improvements make those numbers obsolete, but the numbers were picked in the first place because a "screenful" was considered a reasonable amount of information to handle at one time -- not just for code, but for text in general.

I was never religious about the rule, but it still feels like a good code smell to keep an eye on. If a function gets too big, I start wondering if it's really still doing one thing.


I prefer 120 characters of width; modern monitors are wide-screen, and I think it's OK to adjust a little to accommodate this reality.


If you haven’t already heard the counter argument, then it’s less about screen width and more about being able to view multiple pieces of code side by side (split window editing, and side by side diffs for code review.)

It’s also important to accommodate people wanting to do this with large type sizes, which becomes important as eyesight deteriorates.


All the editors I use have supported soft wrapping, so line length isn’t such an issue. For me it’s mostly about being able to grok each line of code without frying my brain like Perl’s leaning toothpick syndrome does.


I hate soft wrapping. It makes the code much harder to read and reason about. Especially if someone else wrote it.

I rather choose to scroll to the right to see long lines when needed.


I find wrapped code much harder to read, unless it's indented properly. The wrapping breaks the flow.




On most monitors today you can still have 120 characters and see 3 or 4 side-by-side tracks, including the filesystem in another track.


How big is "modern" these days? I have multiple 22" 1920x1080 screens instead of one huge one, and after making the font as small as is still comfortable at my viewing distance I get 202 visible columns in maximized VS Code. At a still-workable but small enough size that I'll start unconsciously hunching at my desk I can get up to 271. Common laptop screen sizes are even more constrained.

I also like to have one monitor turned to 1080x1920 for text editing and consoles. I'd never try to do a side-by-side diff on it, but it really makes you see just how many GUIs are downright wasteful of horizontal space.


2? sure. 3 or 4? Either you're using a monitor that's far outside the realm of "most" or your font size is set to a value far below the mean.


In my experience the majority have 27-inch monitors.

I have 3 of those at my desk, 2 vertical and one horizontal. It's a popular setup from what I can see.


Oh... you mean 3 or 4 across two or more monitors? That makes a bit more sense, but I find it an odd way to compare things. Obviously if you have n monitors you can fit at least n windows "side by side".

(Also disagree about most people having 27" monitors, but I don't think that makes much of a difference)


You don't even need a wide-screen monitor for that. I like to turn my coding screen 90 degrees, to portrait mode, to get more vertical space and still can see the line at 120 characters.

I call it the TIE-fighter setup, the screen in the middle is in landscape mode and the two others are in portrait mode, gives a lot of screen-space without having to turn the head much to see it all


I also prefer the TIE-fighter setup, and didn't have a name for it until now. However, I find that 120 characters is too wide as soon as you want to do anything other than code on that screen.


> I call it the TIE-fighter setup,

I have the exact opposite setup, with the vertical screen in front of me. How would you call that setup?


Maybe Dumbo setup? The outer screens would be his ears. Not happy with the name, but best I could think of today. I'm a developer, so by definition bad at naming stuff. TIE-fighter setup is the lucky exception


Satellite setup! (side monitors are solar panels?)


not as awesome as "TIE fighter setup", but it does the trick


Inverse TIE-Fighter?


This is my favorite monitor configuration and now I have a handy name for it, thanks!


I'm using a 30" widescreen monitor and 8pt font. I get four tmux panes side by side, each with a little over 100 characters.

I think getting an extra pane is worth way more than getting extra 20 characters of width per pane for the occasional long line.

If it were up to me, I'd be happy to add a fifth pane and go for 80 columns. Unfortunately people at work picked long naming conventions and they're also writing massive functions with deep nesting :(

I'd actually encourage an 80-character limit and deep indents (at least 4, preferably 8 spaces) if only to force people to give their code better structure. I don't have a problem with long functions as long as they read straightforwardly, like a recipe, from top to bottom, without too much branching and looping and jumping around. Unfortunately these massive deeply nested functions tend not to read straightforwardly at all.
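
To make that concrete, here's a minimal C# sketch (all names invented): with 8-space indents, the nested version blows past 80 columns almost immediately, which nudges you toward guard clauses.

    // Before: deep nesting eats the 80 columns fast with 8-space indents.
    void Process(Order order) {
            if (order != null) {
                    if (order.Items.Count > 0) {
                            if (!order.IsShipped) {
                                    Ship(order);
                            }
                    }
            }
    }

    // After: the same method flattened with guard clauses.
    void Process(Order order) {
            if (order == null) return;
            if (order.Items.Count == 0) return;
            if (order.IsShipped) return;
            Ship(order);
    }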


Same here; 80 characters just feels too narrow to me in most cases. I'd rather see 100 or 120 characters of width and be less constrained, at the expense of maybe having to scroll if I've got multiple files open in my editor.

I also usually run my terminals at 120 characters wide, FWIW.


It's not like 80 is a hard width limit. The rule exists mostly to limit nesting, not text. More (non-redundant) text is better to a high degree. Likewise, the 24-line height limit should not include proper comments.


My monitor may be widescreen, but my editor typically isn't.


I prefer 80/120: I try to write 80-character lines, but occasional 120-character lines are accepted.


This is one of those things that is older than the printing press. We learned a long time ago that there is a more or less optimal column width for text and it's based on the physiology of the eye. I haven't been able to find a great reference online in a brief search, but this[1] has the basics.

[1] http://maxdesign.com.au/articles/em/


I first saw this in the Linux kernel style guide back in the 90s. The idea was that you should be able to read a function without scrolling. So that meant it would fit on screen in a standard terminal. It's a bit extreme for practical use, but the concept is important to keep in mind.

I was helping a junior programmer with a code review a few months back. He'd been told that a particular chunk of code was "unclear", with specific emphasis on a 4-line bit of logic in a 30-ish line function. He had no idea how he could make the code "more clear". It did what it did; the logic was a bit complex, but the code was exact and correct.

The solution: factor out the 4 lines of code into a function. It was only ever called from that one place, but by making it a function you give the code a name, with clear inputs and outputs. Nothing changed about what the code did, but it became a lot easier to reason about.
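
In C# terms, the refactoring looked something like this (illustrative names only; the real logic was domain-specific, and maxRetries/backoff/circuitOpen are assumed to be fields):

    // Before: four exact-but-opaque lines inline in the larger method.
    if ((flags & RetryFlag) != 0
            && retries < maxRetries
            && now - lastAttempt > backoff
            && !circuitOpen) {
        Send(message);
    }

    // After: the same check, but with a name, inputs and an output.
    if (ShouldRetrySend(flags, retries, now, lastAttempt)) {
        Send(message);
    }

    bool ShouldRetrySend(int flags, int retries,
            DateTime now, DateTime lastAttempt) {
        return (flags & RetryFlag) != 0
                && retries < maxRetries        // field
                && now - lastAttempt > backoff // field (a TimeSpan)
                && !circuitOpen;               // field
    }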

It's a helpful lesson that generally doesn't really sink in for the first few years.


I'm also conflicted about this. Did those 4 lines of code correspond to something that was a natural concept in the problem domain? In that case, outlining it makes sense.

Otherwise, you're probably better served by keeping it where it is and adding an explanatory comment. There are times when code really is inherently complex, and hiding that may do more harm than good by making the code harder to read for everybody who comes later: they'll have to jump back and forth between the main code and the outlined function in order to follow what the code does.
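
For example (a hedged sketch, all names invented), the inline-plus-comment alternative looks like this; the comment carries the intent, and the complexity stays visible in context:

    // Decorrelated-jitter backoff: next = min(cap, rand(base, prev * 3)).
    // Looks arbitrary, but the constants matter; resist "simplifying" it.
    previousDelayMs = Math.Min(maxDelayMs,
            random.Next(baseDelayMs, previousDelayMs * 3));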

Point is, what you did may have been the right choice in that particular case, but I find that it often isn't.


Do you think that's a good practice? I have always been conflicted.

For one, writing a new function isn't just for vanity--it changes the actual machine code generated (unless an optimizer is smart enough to fix it). I don't like changing the actual operation just to make a program more readable.

But sometimes it just is easier to move a bunch of crap to a function and abstract it away.


> For one, writing a new function isn't just for vanity--it changes the actual machine code generated (unless an optimizer is smart enough to fix it). I don't like changing the actual operation just to make a program more readable.

Inlining should prevent this, but doesn't always, because compilers have all kinds of limitations on when they are able to inline. From what I've read I get the feeling it has gotten better in recent years, but I haven't studied the issue.

That said, for >99% of the code written, a function call has no measurable performance impact. Few of us write code in a hotspot of an operating system or in a very tight loop of some number-crunching application. A bit more than 1% might possibly write vectorized code, but still the huge majority never does.

Programming suffers much more from bugs than performance issues. And if there are performance issues, they are typically because of improper algorithms, wrong I/O patterns or high memory consumption, not because of a couple of additional function calls.


I'd bet that the most commonly used compilers would inline a function that is only called once. If not, many languages provide an `inline` keyword or hint to tell the compiler to inline the function directly.

Even if there was no way to inline a function, I tend to prefer code that is easy to understand over code that manages to shave off a few extra instructions. Code that is hard to understand will be hard to maintain and can lead to bugs in the future.
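
C#, for instance, has no `inline` keyword, but it does expose a JIT hint. A minimal sketch (the attribute is real; the Pricing/IsDiscounted names are made up):

    using System.Runtime.CompilerServices;

    static class Pricing {
        // Asks the JIT to inline call sites; it's a hint, not a demand.
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public static bool IsDiscounted(decimal price, decimal threshold)
            => price >= threshold;
    }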


Terrible advice. A method should perform one coherent action... if that action takes 1000 lines of code due to some wacky business logic, then the function should be 1000 lines, not split into multiple functions just for the sake of some weird rule.


I don't agree. Wacky business logic can be broken up into smaller functions or smaller pieces of business logic. Keeping track of everything in a huge function will make your brain struggle and second-guess itself halfway through. A good engineer needs to find out where to make the breaks. We need to write code that is readable and still achieves the business logic.


Or, to understand this one level deeper: Functions/methods should do one thing, and do that one thing well. In general, each of those one things should be smallish things that you use to compose bigger things, which also do one thing and do it well.

24 lines is as good a rule of thumb as any, as long as you're not actually counting the lines. Your goal is flow and readability (ultimately: maintainability), not some countable metric.


John Carmack's take on this seems a lot more sensible to me: http://number-none.com/blow/blog/programming/2014/09/26/carm...


There is a lot of disagreement here with the ideas presented, but I think most of the arguments take the ideas too literally and to extremes... as with any advice, there are no hard and fast rules.

I have some guiding principles similar to the author's which make me end up doing essentially the same thing (I also like to keep under 80 wide and roughly 24 high, but there are many exceptions)... yet my principles are seemingly contradictory, e.g.: 1. minimalism, 2. holism.

Notice that they are not imperatives; they are merely principles / desirable attributes. By minimalism I usually mean what the author is talking about, but in a more general way: keeping code blocks small is one such desirable. But this does not mean splitting everything up into micro-functions that cause a horrible nest of layered calls, because that goes against the principle of holism... this, I believe, is also similar to what the author wants: holism is part of legibility, whereas reducing your function scope and interface only improves the immediate legibility of the local code.

Ultimately what I'm talking about is balance, which is why I think it's better to present these ideas to people as principles rather than as imperative rules, which can be followed rigidly (and wrongly).


If his point is that you should dogmatically reduce long functions into a bunch of small functions, I think he is completely wrong. If his point is that needing a single long function, rather than several small ones, to implement some functionality clearly is likely indicative of poorly architected code and/or data models, I probably agree with him.

If you have some poorly architected code that results in you writing a 500-line function, breaking it down into 20 small functions will only make things worse. However, if you can re-architect the code and data models in such a way that functions don't need to be large to be clear, that is probably a good thing.

So please don't go breaking a long function into a bunch of smaller ones where the small functions' names are effectively code comments. But if you can rearchitect the code so that a set of smaller functions accomplishes the same task in a clearer way, I am all for it.
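
As a hedged C# sketch of what that rearchitecting can look like (hypothetical names throughout): a long dispatch function often collapses into a table of small, genuinely independent handlers, rather than 20 helpers that exist only to shorten it.

    using System;
    using System.Collections.Generic;

    public record Order(string Id); // stand-in domain type

    static class OrderDispatch {
        // One small handler per event kind instead of a giant switch.
        static readonly Dictionary<string, Action<Order>> Handlers =
            new() {
                ["created"]   = Reserve,
                ["paid"]      = Ship,
                ["cancelled"] = Release,
            };

        public static void Handle(string kind, Order order) {
            if (Handlers.TryGetValue(kind, out var handler))
                handler(order);
            else
                throw new ArgumentException($"Unknown kind: {kind}");
        }

        static void Reserve(Order o) { /* reserve stock... */ }
        static void Ship(Order o) { /* ship it... */ }
        static void Release(Order o) { /* release stock... */ }
    }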


My coworkers and I work with multiple widescreen monitors. Even in a crowded IDE with larger fonts, these can display hundreds of characters in a line. Oh, and they are perfectly capable of wrapping text.

Yet a lot of the purported best practices, and a lot of the frequently used linters, complain about that 80-character limit.

Now, when I'm reading Python that follows the actual best practices of named arguments and descriptive variable names, I might have to follow a function call so far down that I forget what its name even is.

Fun!

I've also had the experience of trying to follow code that follows the other best practice of breaking everything out into separate functions. It's always a joy to follow that trail out of the call stack when trying to debug, backing out of potentially tens of files while navigating unfamiliar code and losing the rest of the context on every single jump.


> wrapping text

Oh my. Who uses wrapping text for coding?


Why wouldn't you?

You do need some sort of indication in your editor that the line is soft-wrapped. If you have line numbers in the gutter, then you already have that.


Someone who wants an extra large font for whatever reason, perhaps due to sight problems.

The point is that this is an option.


> You've probably heard about the 80/20 rule, also known as the Pareto principle. Perhaps the title lead you to believe that this article was a misunderstanding. I admit that I went for an arresting title; perhaps a more proper name is the 80x24 rule.

You got me there.

The 24-line convention is interesting. It made me realize that linting rules exist for function length, e.g. for JavaScript: https://eslint.org/docs/rules/max-lines-per-function (I can't find the equivalent for Python, though).


Sorry, this sounds a lot like motivational "feel good" advice that is essentially empty and confuses more than it helps.

"write small functions" sure, how does it help managing complexity? Now you have several tiny functions and need to jump across files and functions to understand how everything works (+ the overhead each function adds - parameters, return values, etc)

Sure, small functions have their place, but sometimes it is easier to have a big block of code that does what you need. Or as another comment said, write small functions but big procedures.

Now for my favourite pet peeve: the idea that 80 columns is some kind of "best practice". In reality it has been cargo-culted since the days of the initial IBM PC, and there is no reason for such an arbitrary number to exist. But the church of 80 columns keeps repeating its mantra, mindlessly uglifying lines of code that go only a couple of characters beyond 80 columns and wasting coder time, instead of actually asking the question: is the code easier to read now or not?

Because to me, calling a function with one parameter on each line is extremely ugly (especially under-indented, as in the examples presented in the article).

Is a line that's 85 characters inherently harder to read than an 80-character one, especially now, with multiple screen sizes and widths? No.

I'm not advocating for writing 200 character lines of code, but to not be absolute about it.
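
To make the trade-off concrete, a made-up C# call (all names invented): the first form is 85-ish columns and reads fine; the second is what a strict 80-column linter typically forces.

    // ~85 columns, still perfectly scannable on one line:
    var report = ReportGenerator.Create(customer, start, end, includeDetails: true);

    // The linter-appeasing version, one argument per line:
    var report = ReportGenerator.Create(
        customer,
        start,
        end,
        includeDetails: true);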


Old Forths with block-based I/O used 64/16, so a definition takes ≤ 1024 bytes and fits into one disk block.

Makes no sense in the modern world of course, but, in the context of block editors, it creates a development environment where making complex and bloaty word definitions is actively discouraged -- an idea that Aaron Hsu briefly mentions in his APL talks [1].

[1]: https://www.youtube.com/watch?v=gcUWTa16Jc0


Of course, in the Forth environment, the more any word definition bloats, the harder the stack tracing you have to do in order to understand the operation flow, so keeping things neat and to the point is even more important.

It might be a failing of my own, but I've always wanted a concatenative language IDE to have a stack window, with manual labeling, so I could see what all is on the stack and follow the information flow better.


I prefer another rule for short functions. You should be able to explain what the function is doing in one sentence. If you need two sentences, you should break it into more functions.

On the other hand, if your descriptive sentence is basically just duplicating the statements in the function, then your function is most likely too small or too granular. Depending on the situation, you may be able to fix that.
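
A hedged illustration (invented code, with ParseHeader, ValidateChecksum and Header assumed to exist): if the one-sentence summary needs an "and", that "and" usually marks the seam.

    // "Parses the frame header AND validates its checksum": two
    // sentences' worth of work, so split along the "and".
    static Header ReadHeader(byte[] frame) {
        var header = ParseHeader(frame);   // "Parses the frame header."
        ValidateChecksum(frame, header);   // "Validates its checksum."
        return header;
    }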


I tend to have a rule/aspiration that I could frame as "80 || 24": don't force me to read too much along both axes. So a longer routine that does maths, maybe even with nested loops, but uses short variables and few function calls, is quite easy to follow.

As are short routines that have long identifiers. Like your average OOP or Lisp code.

I even think that multiple statements per line are unfairly maligned. Wirth used them quite a lot, and not just in code meant for publication. If you've got everything on a few lines, it's not that hard to follow.

Interesting sidenotes for this would be syntactic elements beyond functions/methods. Starting with nested functions, where I actually found the function-hoisted JavaScript variant better than Lisp's flets/labels or modern JS arrow functions.

Other areas that don't seem to be investigated a lot would be inner-procedural refinements (ELAN, some C preprocessors), folding/structured editors and Knuth-style Literate Programming.


This is really good advice. There are a ton of different reasons why you can end up breaking this rule and almost all of them are things you want to avoid: multi-purpose functions, adding to old functions that should be refactored, adding an extra case to an if statement that should be broken down, etc.


I love this. For as long as code needs to be maintained, code will always be just as much about communicating with people as it is about people communicating with machines.

One thing I still enjoy about message passing OO (by which I really mean Ruby) is how it’s really easy to separate concerns to the extent that each file contains one module/class with one public and a few private methods in it. I went a bit bonkers this year and found myself doing one method per module, which is the extreme version of this.

Languages with mixins are great for this too. Mixins are, amongst other things, a nice shortcut around composition that lets you put individual facets of functionality in separate files, albeit with fewer guarantees about code remaining uncoupled.

It’s a shame Ruby’s module system doesn’t better support private for modules though.


According to SICP, "Programs must be written for people to read, and only incidentally for machines to execute."


I don't see how splitting code into multiple modules makes it easier to communicate with others. It forces you to constantly switch places, which, given the brain's limitations around short-term memory, makes it harder to keep a mental model. And with mixins you don't even know where a particular function comes from; it must be a nightmare following a trail when you don't know the code.


> Most mainstream languages, however, seem to be verbose to approximately the same order of magnitude. My experience is mostly with C#, so I'll use that (and similar languages like Java) as a starting point.

I see this, too. For a couple of decades, all mainstream languages had roughly the same order-of-magnitude verbosity; then, when GC languages arrived, expressiveness took a big leap. Now we've plateaued here for a while. When closures became mainstream, that was a smaller leap.

This leaves me optimistic for the future. I find Lisp to still be much more compact than mainstream languages. Maybe once this generation has fully absorbed the current level of functionality, and gotten tired or frustrated with it, we'll see another jump in mainstream language expressivity.


A reasonable line width contributes so much to readability.

While I'm fortunate enough not to experience much of it in the code I read, I do see a lot of websites that grossly violate sane line-width rules by stretching the body text to the full width of the browser (looking at you, Wikipedia). Take https://en.wikipedia.org/wiki/Cyclomatic_complexity as an example.

Edit: Martin Fowler's blog (https://martinfowler.com/) is probably a good counterexample to this.


> Like all other programmers, other people's code annoys me. The most common annoyance is that people write too wide code.

> This is probably because most programmers have drunk the Cool Aid that bigger screens make you more productive. When you code on a big screen, you don't notice how wide your lines become.

This is part of why I only ever edit code in a half-screen-width window; the other half of the screen is always filled with either another editor, or a terminal. This ends up coming out to more like 100 characters per line rather than 80, but the principle is the same.


I disagree with the article. For me, a function is an abstraction of reusable code to achieve a single goal (or intended state). Think of the list of functions in a system as the list of APIs for a service: the more you create for unintended purposes, the more likely they are to become tech debt.

So I would say: write functions with a goal/intent in mind, and abstract only generic logic into small functions whose behavior is guessable from their names. Long functions are good if they need to be.


Here's a case for the 80 rule: our vision is (mostly) two-dimensional, and research shows reading comprehension drops as lines get longer; that's why books and newspapers mostly use fairly narrow columns. Comprehending code is less about reading and more about scanning and picking up context, which is much easier to do when the perimeter is minimized.

Lots of gut instinct here, not enough citations, maybe one day I’ll compile a more substantiated post about this.


I agree fully, and as mentioned in the article:

> I tend to stay within it, but I occasionally decide to make my code a little wider.

No coding rule should be religiously held. If you're rewriting an 81-column line to fit the rule, even though it's concise and more readable at 81 columns, you're doing it wrong.


I don't know how everyone uses such small font sizes. I think mine is set to 16pt.

The one thing that frequently makes me ignore line lengths is URLs. They seem to be most readable when on one line, even if it's 200 characters long.

Also, what kind of variable name is `maîtreD`? I doubt many regions have î on their keyboard.


Their eyes must be much better than mine. I think I would probably have eyestrain and be even more tired at the end of the day if I had to squint at 8pt font all day.


HiDPI displays make it much easier.


I had some issues when sending 80-character-wide code to print. 70 worked.

Just a small FYI :)


I cannot overstate how important this concept is. Write code (and DESIGN SYSTEMS) that fit in your brain, because the faster you can reload it back into your brain, the quicker you can dispatch bugs and new features down the line.

Nothing in my professional career has caused me more anxiety than encountering complex code to solve a simple problem. It's infuriating, and has long lasting deleterious effects for the life of that code.

What's especially insidious about this problem is that folks who value short-term success over sustainability will struggle with it for a long time, with all kinds of reinforcement for being skeptical that any of this is even a problem. Once people get used to struggling with complex code, some start to think it's all that can be written.


Artificially slicing up functions does not make the code fit in my brain, it just means that I have to load in more functions in order to have the required context.


Definitely, which is why doing it just to get small blocks of code won't work. It has to be part of the entire design, and sliced at real, logically separate spacings.

It's hard! But immensely rewarding if you get it right. You have to be thoughtful as you design and build.



