John Carmack on Inlined Code (2014) (number-none.com)
161 points by tosh on Jan 21, 2019 | 105 comments



My personal rule is: if a simple (<=10 LOC) function is used fewer than 3 times in the code, it's not really a function; it should be inlined. Only noobs avoid copy&paste at all costs. There's a time and place for it. Consider that code is written once but read multiple times, and code is a lot harder to read than to write, particularly when there are endless chains of function calls, as you commonly see in e.g. Java and the like.

Edit: I don't even care about the downvotes. This is hard won wisdom. If you don't see it, that's your problem, not mine.


No need to be adversarial, we're talking about coding styles.

I sometimes write very short helper functions that are meant to improve readability by abstracting some details and making the code more self-documenting.

For instance I have this code in my current application (a device driver):

    static unsigned get_machine_id(struct manager *mgr)
    {
        u32 conf = readl(mgr->regs + OFF_MACHINE_ID);
    
        return conf & 0xff;
    }
It's called twice in the entire code. By your rule I ought to manually inline it, but I think that would actually make the code worse, not better. It's always a matter of balance; writing good code is an art.

>Only noobs avoid copy&paste at all costs

Only noobs rely on rules of thumb indiscriminately.


Of course that should be

    const uint32_t conf = readl(mgr->regs + OFF_MACHINE_ID);
and the 'mgr' argument should be const:ed, too. :) That makes it much clearer that this kind of helper is just reading something out, not poking stuff around. Always const all the things.

Have you even bothered checking if your compiler inlines this? It sure looks like a very likely candidate.


I agree that the pointer to mgr should've been const, that's a mistake on my part. On the other hand, "const"ing the local integer is a bit overkill IMO (I do think that languages like Rust are right to default to const, but doing the same thing in C adds a lot of noise).


From the article (and it matches my experience):

"Const parameters and const functions are helpful in avoiding side effect related bugs, but the functions are still susceptible to changes in the global execution environment. Trying to make more parameters and functions const is a good exercise, and often ends in casting it away in frustration at some point. That frustration is usually due to finding all sorts of places that state could be modified that weren't immediately obvious -- places for bugs to breed."


Where did I say anything about relying on anything "indiscriminately"? Avoiding repetition at all cost is definitely a mistake. Avoiding repetition where it's justifiable could be a sensible choice. Though in your case I'd just say:

  u32 machine_id = readl(mgr->regs + OFF_MACHINE_ID) & 0xff;
and be done with it.


>Where did I say anything about relying on anything "indiscriminately"?

Well maybe I read a little too much into your comment but I interpreted it as "if you have small functions called only a small number of times in your code then you're a noob". I attempted to provide a counter-example.

I'm sure that you didn't mean it quite as literally but there probably are a bunch of young, inexperienced coders browsing this thread and I think these rules of thumb can be pretty counterproductive when they reach impressionable minds. Then you end up with people cargo-culting weird ad-hoc rules without even understanding the reasoning behind them.

>Avoiding repetition at all cost is definitely a mistake. Avoiding repetition where it's justifiable could be a sensible choice.

I completely agree on that.


   #define machine_id() (readl(mgr->regs + OFF_MACHINE_ID) & 0xff)


Please no. C-style macros are too leaky.


So your example isn't germane at all to the comment you replied to. The reason is that the function you quoted is an API call: it crosses a boundary in the code base. As such it separates two concerns and actually reduces dependency. (Whatever function calls get_machine_id() shouldn't know anything about how that bit of sausage gets made.)

OP is talking about tying disparate parts of code together via a function call. This creates dependency. Creating dependencies without a really good reason is BAD.


>It's always a matter of balance

Depends on your language and environment, too.

I'm more comfortable inlining C# web code than unsafe device driver code.

But remember when you inline, you can comment around code and use more expressive variable names too. It's the same logical abstraction and similar levels of readability, just accomplished with different language features.


It’s a lot easier to read one function that calls 10 functions named for what they do than to read one long 300-line function.

Inlining functions is not a readability thing. It’s for performance when needed. In C, you can give the compiler a hint to inline a function and get the best of both worlds.


Hard disagree on this. I read quite a bit of code for a living, and one of my single biggest complaints about most codebases is trivial functions (or useless objects).

If your code does n operations in sequence, just list those operations. Don't hide them off somewhere.

Code that reads linearly is by far the easiest to read, at least when you are familiar with the language and idioms. It's like the difference between a novel and a choose-your-own-adventure book.

Modern IDEs help, but there's only so much they can do.


My favorite rule is orthogonal to length. Functions should ideally be at one level of abstraction. If you are doing 100 things at a single level of abstraction, and it flows logically, then by all means keep it together. But if in a high-level protocol specification you, say, concatenate strings and describe the details of fetching from files etc., then it might be better to isolate those.

Of course, what exactly a level of abstraction is depends on how you model, and on personal opinion. But I find it a more useful starting point than "line count".


Yes, absolutely! A hundred-line function can be trivial, and a one-liner can be complex. It has nothing to do with function length but with how much complexity and how many dependencies they isolate.

What you say about specifications certainly rings true. In those cases the reader's preferred abstraction level is pretty much given.


> If your code does n operations in sequence, just list those operations. Don't hide them off somewhere.

In principle I agree. In practice, if your function needs to do more than, say, five things one after another, then something has gone wrong. If you were sending a human a set of instructions on how to do something, that's the point where you'd start grouping them into logical "meta-steps" (e.g. prepare the ingredients, make the broth, prepare the meat, finish cooking the dish).

Chopping a 12-step function into 3 4-step functions without regard for what each step actually does will make it less readable, just like doing the same to a recipe, or putting paragraph breaks and full stops into a passage at random. But a single function that does a long sequence of things is a code smell in the same way as an overly long sentence; it's well worth looking for a way to logically group these steps into a structure that will help the reader understand.


Here's a counterargument to that: you have to read and understand those "trivial" functions only once, and can just assume they are correct and do what their names say when you keep encountering calls to them.

> Modern IDEs help, but there's only so much they can do.

Exactly. In my experience (YMMV) 100% of the time the complaints about using "too many functions" (or similar) come from people who use poor tools or don't really know how to use them.


Speaking of IDEs, a cool feature would be automatic read-only inlining for short functions. Could probably help understanding what's going on in code with lots of calls.


Usually I’m not just reading other people’s code to understand it, I’m stepping through it while running it. In that case, if it is broken up by function, I get to put watches on interesting variables and step over functions and see the data change. If I see something interesting I can step into the function. With one big blob of code, I have to set breakpoints to skip over the non interesting parts.


"Inlining functions is not a readability thing."

Sometimes it is for readability. For example, we have code that serializes commands sent to a device. It used to be built with a command pattern, which was hard to debug. We have replaced it with a huge switch statement and now the code is much easier to understand and refactor. Obviously this approach often doesn't work, but there are cases where inlining makes code easier to read.


Even with using a switch statement, how would it be less readable to put each part of the statement in a function?


Imagine there are two or three key variables, and a bunch of repeated blocks of code that modify some or all of the variables. (That sounds messy but it’s the kind of thing that could easily come up in an embedded controller.)

If you turn all those code blocks into functions, it’s no longer easy to visually inspect the code to see all the places where variable X is modified. It might not be a net win for readability.

Personally, I wouldn’t define a function purely to contain a repeated block of code; I only do it if the code is related, i.e. all operating on the same data.

In my first example, before trying to wrap things in functions, I’d try to figure out if some of those variables can be tied together into a single object with a nice clear interface.


If you turn all those code blocks into functions, it’s no longer easy to visually inspect the code to see all the places where variable X is modified. It might not be a net win for readability.

This goes back to not modifying global (or class-global) variables and using a functional style. If you need to pass four or five variables, put them in your language's equivalent of a pass-by-value struct.

In my favorite IDE, Extract Method and creating a struct/class from a parameter list are automated, guaranteed-safe refactors.


Good explanation. Thanks!


> Inlining functions is not a readability thing.

Inlining for readability is precisely what the linked article is about.


If you are changing global state, and each function has an order dependency, I could see that. But it seems like even then his general takeaway is not to inline functions, but rather to avoid changing global state and use a more functional style of programming.


I think the takeaway is both. Avoid state changes and use a functional style wherever possible. But, when you do need to make state changes, keep it inline rather than hiding it behind function calls. Better to have one long function, sectioned off by comments, than to have state changes with subtle order dependencies in several different functions.


I would go one step further. Don’t section them off with just comments, use nested scoping blocks. At least most modern IDEs then can collapse the sections and you still have the concept of local variables.


Or use a state monad to distinguish between steps that mutate state and functions that don't, making your order dependencies visible and easy to work with...


And this is how we end up with classes like AbstractSingletonProxyFactoryBean


No, just the opposite. Languages with inadequate composition facilities forcing overly specialised single-purpose classes lead to AbstractSingletonProxyFactoryBean. Small composable abstractions are how you avoid it.

Zygohistomorphic prepromorphisms have the same kind of meme status as AbstractSingletonProxyFactoryBean, but the difference is that a zygohistomorphic prepromorphism is a one-liner you can compose together yourself out of the functions that the library in question makes available, whereas AbstractSingletonProxyFactoryBean is a class in its own right that the library in question has to define.


If it calls 10 functions, then the equivalent "inlined" function written under my rule is at most 100 lines long, which means it fits on one page easily, and it's almost certainly easier to read than one where you have to follow 10 different function calls. Function naming usually sucks; as they say, it's one of the hardest problems in computer science. If your code is 10 lines or less, it should be self-evident (perhaps with a short comment) what it does.

Also, as an aside, marking a function "inline" for performance doesn't guarantee it actually gets inlined. You have to use a compiler-specific attribute such as GCC's always_inline to force it.


> If it calls 10 functions, then the equivalent "inlined" function written under my rule is at most 100 lines long

Only if those 10 functions weren't already inlined themselves.


But then they won't be 10 lines or less anymore.


Yeah, but if you have A() calls B() calls C(), and they're all a few lines long, how do you decide which ones to inline? If you just do it as you're writing, don't you end up missing inlining opportunities further ahead just because you've already inlined before, and therefore the resulting function is too big to be further inlined?


>> how do you decide which ones to inline

Use whichever setup maximizes readability. Lift code out into functions if it's a logical, reasonably self contained block used 3 or more times. Use your best judgment, in other words. If you require me to jump around the file (or worse, files, plural) to read your code, make sure there's a good reason for all this cognitive overhead.


Here's a counterargument to that: a developer wants to do exactly what's done in those 10 LOCs, but there's no function encapsulating that, so the dev in question can't intuitively, or through the documentation, find a reusable piece of code that will save him/her 9 LOCs. So the dev goes ahead and writes the exact same logic one more time. And because it's almost impossible to find an identical piece of logic in a large codebase, now you have 4 places where that logic is being used, and it will likely never be replaced with a single function call.

Had you put that logic into a nice little function the first time, maybe with a bit of documentation, it would be reusable, better tested (because more people use it) and will save 9 LOCs every single time that logic will be used.

But, hey, I'm a noob so what do I know.


Just because you have some helper function somewhere in your codebase doesn't mean somebody else is going to find it. It's not uncommon that you end up with multiple versions of the same helper function.

And even if you did reuse that code, you might end up with more work, because while you can use it in two places, the third place that also uses it now has slightly different requirements. Do you add a bool parameter to that function? Now you have two code paths in the function; is it still tested as well as you thought it would be? Or you could inline it...

You can't predict the future. It's not worth thinking along the lines of "but maybe somebody might use it later". Only if they really do is it time to act. Don't go looking for abstractions. Let them come to you.


> doesn't mean somebody else is going to find it

You either failed to document it / name it properly or the other dev was too lazy to find it. Making 10 different variations of the same thing is purely a mistake, not a good reason to justify not encapsulating code into reusable chunks.

> Now you have two code paths in the function

But the whole point here is that you don't need to know how those code paths are implemented, only that they work the way they are supposed to.

> You can't predict the future.

Just because you can't predict the future doesn't mean you shouldn't prepare for it.


Actually curious: in a codebase worked on by multiple devs, how do you get the hint that those few lines you are about to write already exist as a reusable function somewhere? Is there some naming convention, a fuzzy search, or some other technical thing? Or is it an email/standup "look at my functions" social thing? And what about functions that satisfy 95% but not exactly all of your requirements? What is a practical way to sync up multiple devs on small helper functions?


Documentation, Ctrl+Shift+F (on most editors) or Ctrl+T (symbol-based search in VSCode), and you try looking for something that "looks like it". It's not hard; it's just that most don't do it because they are lazy and think their use case is "too special".

It's the exact same thing as not bothering to look for a decent library that does what you want and instead writing it from scratch. Laziness. Pure laziness. (Unless there's a very valid reason not to do it.)

Since I use Django Rest Framework an example comes to mind: finding out that serializers give you the chance to define `validate_{field_name}` methods rather than put all the logic into `validate`. This allows the API to return field based errors rather than a generic error message. But even experienced devs often don't know about this feature. They go by intuition and can't be bothered to spend a few more mins looking at the docs.


You put them into a "common" or "helpers" file or module and you do code reviews in the team.

Even if you're not completely up to date on what's already there you'll probably bump into it when trying to write your helper in the same place.

It's not a silver bullet, of course, but don't let perfect be the enemy of the good.


I mostly follow the 0, 1, N rule myself. The second time the code is needed I start looking into making the snippet a function, and by the third occurrence there's no doubt about doing it.

The reason I do it is having several small functions gives the program a custom language. So I can effectively start programming in my own semantic terms even if I was writing C. Building high-level constructs is easier in high-level languages, of course, but you'll always have to start with the lowest building blocks and even that helps regardless of the language!

The small functions are marked for inlining and in practice get inlined anyway, so there's no performance cost. But it's immensely more practical to hypothetically write:

    const size_t len = get_data_size(fragment);
rather than:

    const size_t len = ((struct packet_header *)&fragment->databuf[4])->sz + fragment->offset;
The latter is easy too while you know the code, but you will have forgotten it a year later.


Obviously people are fairly opinionated about this.

I agree with you 100% - code reuse is far too overvalued and copy and pasting has a bad name.

If something "fits" a function then yes, by all means lift it into a function. But not all long snippets of code make sense when lifted out of their scope; there's no reason to wrap those.

Most of all, forcing all code to fit a "merely calling functions" style is an antipattern that can make the code less understandable.


Good code is like a good book, it's written to be read. A lot of people don't get that. Small single-use functions are like footnotes.


... and maximizing entropy makes for a terrible read. Good code, like children's poetry, should be allowed to rhyme.


Fully agree with you.

The problem is many people don't know how to read code properly: they try to read all the details at once, instead of surfing at the top level(s) and going deeper only when they actually need to. Such people are the bad developers who write bad code because they cannot read.

My personal mantra with copy&paste while programming is: if your mouse gave you a little electric shock on pasting, would you paste it anyway?


>code is written once but read multiple times

I'm familiar with the expression, but it's not really written once: it's written and iterated upon. If I change something in a function that's called twice, I can be confident it will be the same change in both places it's called. There's no programmatic way of verifying the change is the same if I copied and pasted; I have to compare text, and humans aren't good at that.

So in Carmack's example, what if someone adds side effects to the pure, inlined code which don't break anything at the time? They could cause issues when someone later assumes the inlined code is pure. Whereas, if you have pure function calls, this is all done for you by a machine designed for it.


> Only noobs avoid copy&paste at all costs.

If only that were true. I've seen plenty of people with 10-20 years of experience do this too. You know those coding standards that absolutely forbid functions longer than X lines? They weren't written by noobs. They were written by old-timers who assume authority based on mere longevity.

The fact is that there are tradeoffs. The "giant function" approach does create risk that code will be copied and tweaked incorrectly, or violate the assumptions under which the original was correct. This risk can be ameliorated with asserts and/or good tests, but it never goes away entirely. OTOH, the "splatter" approach can be disastrous for subsequent navigation and reading of the code, and the kicker is that it doesn't eliminate the copy-pasta problem at all. I've had to fix many bugs caused by copying between many small functions, followed by either not updating them consistently or someone calling the wrong one. Generally I don't like it, but that's also a bit language-specific. In languages that have explicit guard/context constructs or objects with RAII there's little excuse. In Plain Old C function nesting is a valid way to deal with the cascading-error-cleanup problem. It's still not the approach I prefer, but I'm not enough of a jerk to deride others as noobs for using it.


I think the mistake that people make with DRY is that they apply it at the textual level. They think that if they see the same particular sequence of characters in multiple places, they've violated the principle and need to refactor.

I rather see it at the behavioural level. Important domain logic ought to have a single implementation, to avoid discrepancies that can put the system in an inconsistent state. For example, your e-commerce platform really ought to have just one way of calculating the total cost of an order.

But, if you've copy pasted the boilerplate code that sets up your logging subsystem in a dozen different places, who cares?


The ops team cares. When they want you to switch to another logging output config, you'll answer one hour later saying "done". Two days later, you'll fix another two or three of those configs. One week later, you'll fix the final one.

It's annoying to everyone.

If it's not config, it's library versions, it's a change in a "boilerplate" argument, ...

Code duplication makes it hard to update all the code that should be "the same". Everyone involved in actually having code updated every now and then cares about it.


Rate of change of a particular piece of code is a factor too, granted. In my experience logging boilerplate has a very low rate of change, so I figured it was a good example, but I guess everyone's experience is different.


That you took the time to edit your comment to say you don't care about downvotes is proof that you do care.

It is ok to care, btw. You don't need to dismiss other people who don't agree with you though.


The counterpoint to this is that if you find a bug, it has to be fixed in multiple different places, and you have to remember where those are, and that they exist.

I write Java (or try to) in such a way that the major function calls logical, sensible smaller steps, and in such a way that it is readable. If there are, generally speaking, three things a method needs to do before it returns, I don't do two of them in separate methods and then dive into the minutiae of the third right there.

So I'm afraid I don't really see it. Modularisation and encapsulation can help comprehension.


I am not sure the downvotes are from people who disagree with you. I think more likely they are from people who take issue with the way you said it. Just food for thought...


Unless, of course, those 10 lines are invariant-preserving data manipulation code. If you inline that, then good luck if you ever need to change that invariant.


Amen. I wish I could upvote this 100 times. A modern compiler shouldn't need you to inline a function for speed.


couldn't you just "static inline ...." and let the compiler inline it? He is talking about C....


ah, scarface74 said the same thing :)


I feel lucky that I learned about functions late in the game. I used to write PHP in long, long blocks of code.

When I learned about functions I was very reticent to use them and very seldom did.


One has to keep in mind that these thoughts are very language- and application-dependent. In other languages, programming environments (IDEs, static analyzers) or even fields (e.g. less performance-critical ones), priorities change. Generally, less code is better. Testable functions are better than copy/pasted inline code. Readability is important. Thread-safety is important (if your functions mutate some global state, good luck with that, whether inlined or not...). But it's also hard to argue against the C++ and game programming god himself.

As stated in the beginning of the article, these thoughts were from 2007 (not 2014).


I think you're missing his point; you're on two sides of the same coin. You're both saying "it depends", but he's taking the difficult route and highlighting the benefits of the non-dogmatic approach. I think why you should use a function is obvious to most of us. But an analysis of when/why you shouldn't, with a concrete list of things to look out for, is more significant. Also, I'm not sure that his general goal (improved productivity, fewer bugs in large codebases) isn't a priority in all languages, environments and fields. I also think it's hard to argue that his specific recommendations aren't globally applicable, especially when he uses the word "consider" in many of them. Is there really a language/env/field where you shouldn't consider inlining a function that's only ever called once?

Aside, but can we all agree that this is what good technical leadership looks like? It's not some arbitrary law passed down about superficial stuff; it's a thought-provoking _discussion_ about code quality. It specifically doesn't argue for a one-size-fits-all approach, but rather encourages people to consider the situation on a case-by-case basis. It also includes a concrete list of actions.


> Is there really a language/env/field where you shouldn't consider inlining a function that's only ever called once?

Readability IMHO.

Contrast -

  // This function does three high level things
  // doHighlevelThing3 is only called once
  func () {
    doHighlevelThing1()
    doHighlevelThing2()
    doHighlevelThing3()
  }

  func doHighlevelThing3 () {
    doLowlevelStuff1()
    doLowlevelStuff2()
    doLowlevelStuff3()
    doLowlevelStuff4()
    doSomeBitshifting()
    writeSomeMemoryLocations()
    doLowlevelStuff5()
    doLowlevelStuff6()
  }
With -

  // This function does three high level things
  func () {
    doHighlevelThing1()
    doHighlevelThing2()
    doLowlevelStuff1()
    doLowlevelStuff2()
    doLowlevelStuff3()
    doLowlevelStuff4()
    doSomeBitshifting()
    writeSomeMemoryLocations()
    doLowlevelStuff5()
    doLowlevelStuff6()
 }
To me, the second one loses the readability of the first. There are three main actions func takes, but in the second example we lose that clarity. If we inlined all three HighLevel funcs like this, we could end up with something hard to follow and digest, and the intent gets lost.

Whether that gets inlined for performance reasons is entirely up to the compiler, IMHO, and if I have very strong feelings I can probably hint to the compiler (making it static in C, private in Java, whatever mechanism in other languages) that it can do what it likes in terms of optimisation.


To me, the second one loses the readability of the first.

The second one makes it clear that this function changes the state of something by writing to certain memory locations, the first one doesn't. To me that makes the second one more readable.


> The second one makes it clear that this function changes the state of something by writing to certain memory locations

Seriously? So all state changes need to be made in 'main'?

Pretend there was no explicit state change in there, does that change your mind?

What if doHighlevelThing3 is actually called "changeTheThing", and the actual write to the memory location is simply the low-level implementation of that?

That's a really weird objection (to me, I acknowledge this is all style and I am opinionated).


Carmack has been consistent for a long time that state changes are a major source of bugs. It's true that video games might be more susceptible to this.

I think all things considered, readability should generally take a back seat to correctness. I mean, the code is readable, yay!, but the game crashes randomly...not so yay...

Also, he does offer a suggestion to help address your concern:

"Using large comment blocks inside the major function to delimit the minor functions is a good idea for quick scanning, and often enclosing it in a bare braced section to scope the local variables and allow editor collapsing of the section is useful."

Finally, note that he's only suggesting that you CONSIDER it, which was my original point. It's hard to say his suggestions are wrong when his suggestions are merely that you consider it.


> I think all things considered, readability should generally take a back seat to correctness.

I think that lack of readability is a sure path to lack of correctness, personally. Carmack may have had the luxury of working with consistently great engineers... I do not :)

I'm not saying his suggestions are wrong, merely that there are times when encapsulation makes sense. Code can be readable and correct.


I think it's interesting how, as we push more into "this code absolutely must be correct" territory, code starts looking more and more like PLC logic than like high level code. Run everything every time you go through the main loop, don't overwrite values, calculate your results by choosing which intermediate values to use.


John's observation, that having state-mutating functions that get called only once is actually less readable and more bug-prone, flies in the face of many companies' coding style guidelines that place an arbitrary upper limit on the size of functions. So you have engineers needlessly refactoring 250-line BigFunction() into:

    BigFunction()
    {
        LittleFunction1(); // 50 lines
        LittleFunction2(); // 50 lines
        LittleFunction3(); // 50 lines
        LittleFunction4(); // 50 lines
        LittleFunction5(); // 50 lines
    }
Which complies with the ill-advised policy, but is often not any more readable, and could be more bug prone. Does LittleFunction4() change some state variable that LittleFunction5() depends on? Who knows, better go find the source and check it. Yuck!


If you named them like that, then sure, that's less readable. But I feel there are a lot of straw-man arguments in this thread; the main reason it's generally considered good practice to split up big functions is that humans aren't good at keeping a lot of context in their heads, so if you can provide a higher-level abstraction then you can make it easier to follow.

If we pick a more realistic

    RenderThePage()
    {
        FetchStateFromBackend()
        ProcessState()
        RenderTemplate()
        ReturnHTMLToUser()
    }
I would find that a lot easier to follow than jamming all of that code into a single massive function that did all those different things. (Not to mention, I'd find it easier to test all those things in isolation, too).

To take a completely contrarian position to that being put forth in this thread, I often find it clearer to pull a single hard-to-follow line of code into a function, just to encapsulate it behind a simpler named interface.


Anecdata: Years ago, I refactored some code written by an inexperienced programmer. It was a multiple-thousand-line Perl CGI script where everything was at top-level scope, i.e. there were not really any functions (except for a few one-liners). When this was called out by the lead architect, he improved the script by splitting everything into two functions:

  ($a, $b, $c, $d, ..., $x, $y, $z) = GetData();
  PrintResult($a, $b, $c, $d, ..., $x, $y, $z);
The variables actually had reasonable names, but there were at least 20 of them. He had just split the code at some random point, and all variables that needed to be used in the second half became return values and arguments, respectively.


I'd argue that even that can be a small step in the right direction if there were more than 20 variables at the start - at least now you have an upper bound on how much state from the top half of the code can be used in the bottom half.


He just split the code at some random point and all variables that needed to be used in the second half became return values and arguments, respectively.

I've done basically the same. It's as good a place as any to start getting a handle on the problem, as long as you don't leave it there.


Add a code comment to that single line of code.


It is precisely because humans aren't good at keeping a lot of context in their heads that it is a good idea to keep that context together on screen.

In this example it is not clear whether the reader may need to see the content of multiple functions simultaneously to understand what's going on. Or it may depend on the circumstances whether they do. So if you have to make the choice between inlining or not there will always be pros and cons.

An optimal solution gives you the best of both worlds: have an IDE that lets you inline-expand function calls (read-only), optionally replacing parameters with arguments, and simple commands to 1) actually inline the code to make locally unique changes, 2) edit the function in-place to make global changes.


> In this example it is not clear whether the reader may need to see the content of multiple functions simultaneously to understand what's going on.

My argument is that if you've named the functions correctly, and abstracted at the appropriate level, then you shouldn't need to know what's going on in each function to understand the top level. And, if you've separated your concerns well, then the implementation of one function should not depend on state in the other function.

If on the other hand you have global state being manipulated in each of those functions, then sure, it's going to be confusing. Don't do that?

Note that my position is actually closer to what Carmack is arguing for;

> The real enemy addressed by inlining is unexpected dependency and mutation of state, which functional programming solves more directly and completely. However, if you are going to make a lot of state changes, having them all happen inline does have advantages; you should be made constantly aware of the full horror of what you are doing. When it gets to be too much to take, figure out how to factor blocks out into pure functions (and don't let them slide back into impurity!).

He's saying that pure functions are better than inlining, because they get the readability benefits of splitting your code up, while also avoiding the problems of mutating state that's shared between multiple peer functions.


That’s assuming the functions use global state. In the second half of the article, John advocates keeping things as pure-functional as possible.


Don’t use global mutable variables, it’s almost always a bad idea.


When I've done gamedev stuff in the past I have used the "single big function, collapsible scoped blocks rather than calling separate functions" pattern a fair bit. I do find it makes the code clearer, especially in the root level game loop or world state setup stuff where the code reads (or at least should read imo) a lot like a flat script.


I feel like it made up for the overengineered bits that read like enterprise java instead :)


Is it just me, or does anyone else appreciate the tone of this letter? I've dealt with many a senior dev and architect who would have thrown a mandate down without even spending the time to convince or sell the idea. To me, the mutual respect the letter demonstrates (given Carmack's role and talent) is refreshing. I don't know Carmack on a personal level and have never worked in any capacity that would put me in contact with him, so maybe I'm off base, but the message he wrote wasn't as cocky, line-in-the-sand, or my-way-or-the-highway as some letters and messages I've read from far less accomplished/respected individuals.

Some gems from the article: "I'm not going to issue any mandates, but I want everyone to seriously consider some of these issues"; "[what I'm proposing is different] so I am going to try and make a clear case for it"; and, at the very end, "Discussion?"


A fun thing is that using a function at most once has a strong relationship with linear/unique type systems.

Of course it would have been even closer to such a type system if he had written that variables, too, should be used only once, but Rust has since shown us how many advantages that can have.

It would also be interesting if a function in Rust had to be cloned in order to be used multiple times (mirroring the code-size cost of inlining it at each call site), but of course that would have UX disadvantages in the programming language.


(Edit: Wrong in the sense of disagreement.) It’s different, calling a function from one place, or an enumerated set of places, than using a value once. It’s more related to effect systems. For example, if you only load a database config in one place, then you only connect to the database from stuff connected to that source location, and so on up through callers, which helps you track down how a certain class of database transaction might be performed by your program.


If you look at lambda calculus where all values are encoded as functions, the difference between using a value multiple times and using a function multiple times goes away.

I'm not a type system expert, but I'm sure some of the theories that work on unique types work on non-function values and function values at the same time.


We're not talking about using a function value once, we're talking about using a function value repeatedly, from one callsite in the source code. So the function is not getting consumed.

But you were correct. There is a sense of uniqueness that could be captured in a linear type system. If we want the function's computation to run exactly once per frame, you might imagine a linear value created at the top of the game loop, threaded down to the callsite where the function consumes it. Alternatively, the function could emit the linear value, which then gets returned all the way back out to something that explicitly consumes it at the end of the loop. Or maybe you encode this mechanism into the type system or as a language feature some other way, but the point is, it's not the function itself that gets used once -- it's an input or output parameter, combined with some sort of access control that guarantees only the outer game loop code can create or destroy such a value.


This is one of those articles that completely went in one ear and out the other when I first ran into it, fresh out of college. A few years later, after completing my first difficult and performance-intensive project, I saw it again, and this time it was as if someone was reading my mind. Carmack’s thoughts lined up point-by-point with ideas about “development-optimized” programming I was coming around to on my own. A great bit of timeless wisdom!

(By the way, Swift is especially great for writing “Style C”:

    label: do { /* code here */ }

It’s self-commenting, and you can write "break label" at any time, which skips the rest of the block.)


Refreshing to hear a point of view where it makes sense to have large functions to guarantee sequence and avoid reusing functions. For some time I thought it was _always_ good practice to split functions up into smaller functions where reasonable. On the other hand I've always hated linters that complain about method length or cyclomatic complexity, as they don't really measure how easy it is for people to understand what is happening. After all, code is for people, not machines.


Cyclomatic complexity might be a textbook example of Goodhart's law. Of course few-paths code is easier to read, but that's often because it's doing something that's fundamentally simpler. And refactoring just to hit some arbitrary CC target seems very unlikely to be a route to happiness.


For the unfamiliar:

Goodhart's law is an adage named after economist Charles Goodhart, which has been phrased by Marilyn Strathern as: "When a measure becomes a target, it ceases to be a good measure."

https://en.wikipedia.org/wiki/Goodhart's_law


From the article:

"To sum up:

If a function is only called from a single place, consider inlining it.

If a function is called from multiple places, see if it is possible to arrange for the work to be done in a single place, perhaps with flags, and inline that.

If there are multiple versions of a function, consider making a single function with more, possibly defaulted, parameters.

If the work is close to purely functional, with few references to global state, try to make it completely functional.

Try to use const on both parameters and functions when the function really must be used in multiple places.

Minimize control flow complexity and "area under ifs", favoring consistent execution paths and times over "optimally" avoiding unnecessary work."

What I found particularly interesting was: "I now strongly encourage explicit loops for everything, and hope the compiler unrolls it properly". I suppose looking at the generated listing code will always be the final arbiter in whether the compiler does so or not for tiny loops.


One overlooked aspect of the Gripen example is not one function vs. many, but the fact that the control flow was expressed as a form of state machine. As John mentions, the real problem is often that people don't fully understand the context in which a piece of (copied) code might be called. State machines make this more explicit, which makes expectations clearer and easier to verify or even enforce. You can implement a state machine within one function or across many, so it's really a bit orthogonal to the issue of inlining/copying and I think it's a bit of a shame that the OP kind of conflates the two. The more important lesson IMO is to manage the execution state, and how to break that down into functions is secondary.


The only valid counterargument to this is testability. You can't unit-test a code block buried deep inside another function, at least not in today's languages. If you don't care about testability, a 5k-LoC main() is fine.


Well, it's always a trade-off. When you inline stuff that is in separate functions, you certainly increase the debuggability and readability, and maybe even the performance, of the one use case you're currently thinking about. If your goal is rather straightforward and your code is unlikely to be heavily reused in later developments (i.e. if you are coding a game in C++), then it might be a really good idea.

But if you work, for instance, on MS Word, your goals will change 2-4x a year, you will have a list of hundreds of goals, your code might be reused in MS Excel without people even telling you, etc. In that kind of situation it is much, MUCH more important to encapsulate everything in classes, methods, and functions (methods changing state, functions not) in a way that each piece can be treated as a black box. You are building lego blocks and hoping they can be connected easily enough to build the house that the little child, who is your boss, wants. In that case your lego blocks each need to be very testable and very composable.

And the truth is that most of us live somewhere in the middle. We have a clear, highest-priority goal that needs to be achieved asap, while at the same time having loads of competing medium-priority goals that change all the time. So, while John's insights might be eye-opening if you haven't thought about this topic before, please don't go full steam in that direction for the next few years. The best result is usually a little unclear and somewhere in the middle between the two extremes.


> A large fraction of the flaws in software development are due to programmers not fully understanding all the possible states their code may execute in.

I keep that quote in the back of my mind every day while working. I think it has the largest influence on how I write code now.


Long before the endless Windows vs. Linux debate there was the "object-oriented vs. functional" debate. But even before that, there was a debate on Usenet about subroutines vs. inlining in Fortran. It is weird to see this ancient debate resurface.


Am I the only one wondering how the discussion to his email thread went back then?


> if you are going to make a lot of state changes, having them all happen inline does have advantages; you should be made constantly aware of the full horror of what you are doing.

That's perhaps the main issue: crucial and unexpected state changes hidden inside functions.

Functions should do a single thing and should have explicit names.

We've all encountered functions with innocuous names a la "printSomething" that actually also changed internal state by stealth...


Not one word about Duff's device?


What does Duff's Device have do with the article?


Loop unrolling is obviously a form of inlining. Perhaps OP is considering the case of "no backwards branches."

For a long loop, having no backward branch at all is clearly silly. But is unrolling, and inlining part of the loop, worthwhile? How does it help or hinder readability, identification of unnecessary operations (e.g. repeated initialisation), and bug identification? Is there a tension with execution time? These answers would likely be different in 2019. SIMD intrinsics, anyone, since compilers still can't do that well at all? A function with a descriptive name that is called, or just write it all right there, probably with a comment where the function call would have been and where it would have returned..?

We're talking low-level optimisation, or he wouldn't be having the discussion in terms of C++ etc. Duff's device is a frequent touchstone in such a discussion, although not mandatory by any means.

(He said trying to get a little of such discussion going ;-)


This article is not about low level optimization.


Try to be more constructive. I think it is. I explained why I think so.

You've not explained anything, just 2 really negative posts of little productive use to anyone. No need to even post that kind of thing, right? Contributes nothing to a discussion. Explaining your point of view, might.


What does loop unrolling have to do with inlining code to avoid procedure call overhead? Well, apart from being the base case, nothing, I guess. I just expected that if you want to talk about inlining procedure calls, you would also talk about other kinds of inlining.


Neither loop unrolling nor avoidance of call overhead have anything to do with this article, with the exception of a small segment actually arguing against manual loop unrolls/structure unpacking. The article is about code structure for clarity, defect reduction, and predictability.


Right here: "I now strongly encourage explicit loops for everything, and hope the compiler unrolls it properly."



