I have always liked much longer functions than the average programmer. The main reason is that we read sequentially. It is much easier to follow something sequential than something that jumps around. A lot of bugs happen when you see a function call, think you know what it does, and it doesn't.
There is a reason regular front-to-back books are more popular than choose-your-own-adventure books. You know where you are, and you don't spend all day flipping back and forth.
I also find that when you start splitting your functions down to 5 lines, it becomes almost impossible to clearly differentiate what the different functions do, and that makes good naming almost impossible.
EDIT: Also, I don't understand why people complain that functions longer than 5 lines make code hard to navigate, and then turn around and write header-only files with 10000 lines of code....
On the opposite end, reading long functions can be hard when they mix levels of abstraction, or when their lack of abstraction buries the overall logic in details.
If I'm reading a novel, it would be jarring to read "and then he took 45 steps of average length to reach the kitchen" instead of "he went to the kitchen".
I guess the real problem is that writing good short functions is hard, so some people might find it better to just write long functions. By good short functions I mean functions that, as you say, don't mix levels of abstraction, use proper names, and follow a consistent pattern so their behavior can be safely assumed most of the time. And sometimes you can't have such a clear view of the problem because it already got too messy at another stage of the process, and you want to murder the system designer plus maybe some of your coworkers too.
Of course, for certain tasks long functions are still reasonable, but they tend to be the exception rather than the rule, and it takes quite some practice to recognize which is which.
Best practical advice: do more design and planning on paper.
I’m always much more proud of a good clean function I’ve written than I am of a larger, less clean one. The shorter one takes longer to write and requires more understanding and thought; it’s usually the second or third pass of refactoring the long, less clean one, but it’s always satisfying when I get there.
I'm dealing with functions thousands of lines long with switch/case and if/else statements five or six levels deep, and it really is the wrong abstraction. It's massive 'handle request' functions, with a switch over a 'ok what kind of request, fetch or update or restore from backup?', then a switch over 'ok what data do you want to fetch or update or restore from backup'. That kinda thing.
That's a bad example. A better example - and yes, I'm totally biased - is the main() method for the rebuild I'm working on - it's basically 'read config', 'set up logger', 'set up service x', 'set up service y', 'validate runtime dependencies like files and scripts', 'create & start http server', 'wait for shutdown signal', 'gracefully shut down http server on shutdown signal', etc. All in one place, no need to extract methods in my opinion. That said, it really is not the critical path of the application.
It's hard to discuss in the abstract, especially with bad names like MinorFunction1() and MinorFunction2(). In style C, you may have to scan over lower levels of abstraction than you care about:
void MajorFunction(float number, char *input) {
    // fast inverse square root
    float x2 = number * 0.5F;
    float y = number;
    // several more lines of code

    // find decimal point
    char *c = input;
    while (*c && *c != '.')
        ++c;
}
A goal of style A or B is for the names of the minor functions to make the shape more obvious, ideally self-documenting.
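A minimal sketch of what that might look like (FastInverseSqrt and FindDecimalPoint are hypothetical names, not from the original snippet):

float FastInverseSqrt(float number) {
    float x2 = number * 0.5F;
    float y = number;
    // several more lines of code
    return y;
}

char *FindDecimalPoint(char *input) {
    char *c = input;
    while (*c && *c != '.')
        ++c;
    return c;
}

void MajorFunction(float number, char *input) {
    float y = FastInverseSqrt(number);
    char *dot = FindDecimalPoint(input);
    // ... use y and dot
}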
This is the kind of advice you'll find in, say, Code Complete. Although I generally accept the premise, there is a readability cost to indirection. The bigger and more complex MinorFunction() is, the more likely I'm going to have to jump into it and remind myself what it does.
There are two concepts that underlie DRY: coupling and cohesion. There are good expositions on this in old writings on structured programming and design (e.g., Yourdon and Constantine). If MinorFunction() is cohesive, and MajorFunction() is appropriately coupled to MinorFunction(), then style A/B is likely to be superior to style C. One of Carmack's points is that "very little of our real code" ends up that way.
I don't like long functions because they're harder to write unit tests for. I make too many mistakes, and I need the unit tests to verify the code and bring down my error rate.
I agree with this... If a part of a function is a hotspot of unavoidable complexity then it's good to split it out and test it.
But then some people will say that you're testing the internals of the caller and that you shouldn't do that. If we follow that logic to the absurd, there should only be one test for the entire application, one for main(), or whatever your entry point is.
I've never seen an answer to this problem coded as an intelligible, easy-to-follow guideline. Empirically and intuitively I think my "hotspot" approach kinda works in maximizing reliability.
I sometimes find myself refactoring helpers by folding them back into the caller when I'm pretty sure that the helper won't be reused and that the caller won't end up being too long.
But I don't think I'd like writing long functions as a system or philosophy; I think there is a threshold where a function becomes fat enough that it hurts readability.
Function length shouldn't be based on philosophy. Decide on a case-by-case basis. Start small and keep adding functionality until you find a way to divide the function along some sensible boundary. Generally, long functions contain more functionality, and the more functionality you have, the higher the probability that some of it can be extracted into its own function.
As you said, organizing code is an art form. There are no hard and fast rules, only your own judgement and experience.
Long functions are fine when you read sequentially, but when the code is enclosed in a loop you have to scroll back up to recover context about the loop, and that is disruptive. Carmack says it in the essay too:
> Enclosing pages of code inside a conditional or loop statement does have readability and awareness drawbacks
I've seldom found use for editor based code folding, and I've been trying to use it long enough to remember being excited when vim first got it (20+ years ago?). I mean I'm sure there are people who find it useful, but to me it's one of those 'nice on paper' things. I think it's mostly that you don't know how much is hidden away, and you don't get a feel for the balance of say the if then and the else part of a decision. I've been working on refactoring a 3000 line single function Fortran 95 program lately, with 6, 7, 8 levels deep loops and if/then etc; you'd think this sort of work would be the prime candidate for folding. In my experience it hasn't helped my understanding at all though. 'jump to matching brace' and 'search under cursor' along with keeping an eye on the scroll bar work much better for me.
Anecdotally it (psychologically) helps some of my beginners students in Processing (young designers), because they feel less overwhelmed by their code when it has started to grow over their comfort threshold.
As beginners their brain isn’t used to selectively focus and block out the rest, so folding functions helps them focus.
It’s also a good way for me to end the moment when I show them how to extract some code out of the Processing main loop function into a separate function, “now that we invented our own command we don’t need to look at it all the time … and we could also put it in a library…”.
> There is a reason regular front-to-back books are more popular than choose-your-own-adventure books. You know where you are, and you don't spend all day flipping back and forth.
I don't think that has anything to do with it. Those are roleplaying adventures where you need to keep track of things on a piece of paper and use dice for encounters. People who want to read don't want to participate, just consume. Active vs passive.
Regardless, splitting 4-5 lines of code out into a function and giving it a proper name can enhance readability significantly. And modern IDE tools make it simple to read that code without disrupting the flow. Visual Studio and IDEA, for instance, have a peek-definition feature that makes it trivial.
If that function does not do what it says, that is a different problem.
Instead of cramming the description into a NonPunctuatedFunctionName, just lay down some scoping brackets and write a comment.
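A minimal sketch of that style (the request-handling names are invented for illustration, and the parsing assumes well-formed input):

#include <string>

void handleRequest(const std::string &raw) {
    std::string name;
    int value = 0;

    { // parse the "name=value" pair out of the raw request
        auto eq = raw.find('=');
        name = raw.substr(0, eq);
        value = std::stoi(raw.substr(eq + 1));
    }

    { // apply the update
        // ... whatever the handler actually needs to do with name and value
    }
}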
IDEs are good at folding code.
If the function doesn't need to be called many times, it doesn't need to exist.
Code folding is not good for readability, and creates a significant overhead when reading the code. Even to the point where some static analysis tools default to highlighting the use of "regions" as problematic.
Jumping up and down in the file while hitting f12 is not good for readability and creates a significant overhead when reading the code. C# finally introduced local functions in a recent version, which will hopefully remind other enterprise-y languages and IDEs that humans are programmed to read lines on a page in order.
In VS you can press Alt+F12 to peek at the function, no need to jump back and forth. Regardless, programming in that way requires that the functions adhere to SRP and that a clear name can be written.
We're talking about pulling 2-10 lines of code out into its own function. There is rarely going to be multiple levels of that, and if there is, the original function was way too large anyway.
I think it can be good to have functions that are only called in one place if they do really well-defined things. load_settings, or similar, is probably only called in one place, but it still makes sense to have it as a separate function.
I prefer clear naming over comments, because comments don't "travel" everywhere the named function/type/variable does.
If you've got an algorithm or hot loop that's hard to follow because it needs to be, some simplifying functions can keep it from overflowing people's brains, which IMO is the worst overflow error.
Functions are about code reuse, abstracting variables and easily handling their lifetime, and most importantly about invariants and specifying reasonably sized pieces of code (even if you "specify" only in your own head). And then maybe testing, because once you actually have a spec, why not test it. If you need none of that, you don't really need to split. As a bonus, small functions tend to be more naturally and easily programmed with fewer mutable variables, which is then easier to read and maintain.
Especially if you don't need reuse (or only need it in a very limited fashion) nor unit testing, you can handle things using techniques other than splitting into functions, like simply having a succession of blocks in a bigger function, with inline notes about invariants between blocks, etc.
If done in a disciplined way, I don't see much advantage to that approach over just using functions anyway, especially with a good IDE or code editor. Of course nothing is absolute, but if it does not fit on a screen, maybe at least consider splitting it (and improving the docs, or at least just improve the docs). Blindly requiring e.g. 5 lines max strictly everywhere is insane though; never do that. There can be excesses on both sides.
As a more personal rule of thumb, I avoid splitting things too much, especially if the result takes more lines than the inline version, except when not splitting results in a high number of duplications (small ones in that case, but still duplications). A moderate increase in LOC when splitting is OK, but when splitting like crazy you can end up with N-fold increases, and at some point that is counterproductive. On the other hand, some very moderate amount of code duplication is OK, at least if the duplicated parts are not too far away and can be found easily, and not in an unreasonable quantity, modulated by the size of the duplications. For extremely tiny duplications, even in huge quantity, replacing them with a function call does not necessarily make sense: if it's (i + base) % mod repeated 30 times in the same file, replacing that with rotbufindex(i, base, mod) may give no advantage, since the call is no easier to reuse than the inlined formula.
I have a hard time imagining a large monolithic function where I wouldn't want to test any parts, ever. Also, I find that most people can only keep a certain amount of code in their head at the same time, so even if you have a monolithic function, I think you'd still want nested helper functions so that the more complicated portions of the code that are hard to read and understand stay manageable.
I feel similarly, and usually only separate out functions when there's opportunity for code reuse, which usually ends up with one or two very large functions per module. That's been fine in most of my career (fancy algorithms) or hobbies (games), but lately I've been writing a web backend, in which this approach can get very untenable very fast. Some of those stylistic differences are probably well-suited to what other people are working on.
I see it as a tradeoff of where you want your dependencies.
In analysis using the structured program theorem, you only have three tools, really: Sequence, selection, iteration. If you copy-paste, you lean on the sequence more; if you add a loop or a cursor structure, you're iterating; and if you add more names for things, you're selecting on the name. All three create dependency risks, but not in equal measure.
What a really long function mostly indicates is that the inherent dependencies are mostly sequential. This maps with the domains where they appear most naturally: game loops are notoriously single-threaded, embedded code often needs to operate to hard real time constraints.
I think the appeal in splitting out more names has something to do with linguistic comfort zones: Rather than examine long sequences, assume the functions used by the one you are looking at are trustworthy. Then you are "improving" the code each time you factor it out because you can read more names, and because each function is small there is little concern about sequencing errors. It's intuitive, but shows its flaws as soon as you use another lens like the structured program analysis; adding the new function makes it harder to examine the sequence across the function boundaries, causing the "flea-jump" code you describe.
What I've found works is to let the large functions accumulate and mature, then derive a new dependency - a function, class, or other abstraction - that will simplify maintenance. Not every problem is solved with a new function, sometimes it really takes a compiler to get the desired improvement.
Long functions usually do several different things. If you split them into functions by the function (ha), and name those functions accordingly, you will have short and readable functions.
This is advocated in some popular books, but IMO it's the worst reason to break up a large function.
You end up with the same amount of code, but now you have to jump around. In most cases it also forces you to turn local variables into instance variables, globals, or pass state in parameters in order to carry state between the new functions, which further decreases readability.
You might as well add an "index" at the top of the large function and get the same readability benefits from doing it.
What should be done instead is trying to find code that can be abstracted without hurting readability. Most of the time in large functions there are mixed abstraction levels, such as business logic mixed with I/O, generic error handling that could be somewhere else, or object transformation that exists because of different libraries working differently.
When you automatically refactor into smaller functions you carry those problems into the small functions and you get code that is probably harder to follow, generates worse stack traces and only looks good in a shallow inspection.
You shouldn't have to jump around, because the function name tells you exactly what it does, according to reasonable expectations. You can just skip over it if you're not interested in its work.
If your function has an unexpected side effect, that's a problem with the code, not with the idea of writing functions that do a specific thing.
I can't help but think people who complain about this are just bad at creating functions at the correct level of abstraction.
It's also why I'm a functional fan, because it makes it more difficult for functions to do unexpected things.
function name tells you exactly what it does, according to reasonable expectations. You can just skip over it if you're not interested in its work.
I understand the argument to be that by breaking everything out into functions it makes it too easy to hide what is actually going on and you can miss a lot of fine detail. If your code looks like
x = doA();
y = doB(x);
z = doC(y);
then it becomes very hard to spot, for example, that doA and doC both recompute the same intermediate value that you could compute once and reuse, or that doB validates its input even though in this case it's unnecessary, since you know doA can only return data that is valid for doB. If everything were inline, these things would jump out at you a lot quicker.
Of course in many cases this doesn't matter and recomputing that intermediate value twice costs so little in the grand scheme of things that the extra readability is worth it.
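A contrived sketch of the first point (all the names here are invented; doA and doC each quietly redo the same work):

#include <string>

static int expensiveScan(const std::string &raw) { return (int)raw.size(); }

// Split version: both doA and doC call expensiveScan on the same input,
// but you can't see that from the three call sites below.
static int doA(const std::string &raw)        { return expensiveScan(raw) + 1; }
static int doB(int x)                         { return x * 2; }
static int doC(const std::string &raw, int y) { return expensiveScan(raw) + y; }

int process(const std::string &raw) {
    int x = doA(raw);
    int y = doB(x);
    int z = doC(raw, y);
    return z;
    // Inlined, the two expensiveScan(raw) calls would sit a few lines apart,
    // making the duplicate work obvious and easy to hoist into one local.
}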
Yes, but these are micro-optimisations for video games. People here are complaining about this in standard situations, where being able to quickly grasp what a function is doing is far more important than identifying recomputed work.
I think you might have misunderstood my post. I'm not advocating large functions over smaller functions. What I'm advocating is that instead of breaking up functions based purely on visual criteria or by grouping functionality, it's better to use proper abstractions when breaking up the function.
> You can just skip over it
The point is that if I blindly break a larger function up by functionality, I still have to understand the deeper functionality. So I can't just skip it when I'm debugging, deeply reviewing/inspecting the code, doing a rewrite, or trying to understand it.
> I can't help but think people who complain about this are just bad at creating functions at the correct level of abstraction.
My point was exactly that you should create functions at the correct level of abstraction, rather than just splitting large functions purely by functionality, like you would with a cooking recipe.
> If your function has an unexpected side effect, that's a problem with the code
But this is exactly what I'm trying to avoid. If you need side effects, keep them visible by keeping them in the main function, instead of just blindly pushing them into smaller functions.
> The point is that if I blindly break a larger function up by functionality, I still have to understand the deeper functionality. So I can't just skip it when I'm debugging, deeply reviewing/inspecting the code, doing a rewrite, or trying to understand it.
That's exactly what a well abstracted function should let you do.
It only becomes difficult with global state and unexpected side effects.
I think you've put your finger on the vital issue: catering to readability. A function call is great if you can give it a name that allows readers to read past it without navigating to the implementation. If most readers are going to be interrupted by needing to navigate to the function implementation to understand it, it might be better to inline it.
Algebra does not have this problem. A key idea of functional programming is that it is essentially raw math. It is the for loops and array mutations that cause the code to not be math and therefore to need hundreds of lines to express simple dataflows.
Oh, how often I wish that someone would have followed this advice.
Working mostly on Java these days, Carmack's focus on efficiently doing the right thing sounds like fairy-tales of the promised land ;)
Just yesterday, I spent much longer than I believe is reasonable to fix a bug in a Java class whose sole purpose is to do an HTTP(S) call and determine if the URL is online (200 code) or declared offline (40x code).
The first thing that really slowed me down was that instead of having a clear location for the crash, a Java .war inside tomcat tends to vomit 40-50 lines of stack trace onto the console. The reported Exception was "java.io.IOException: Server returned HTTP response code: 400 for URL: ..." in HttpURLConnection.getResponseCode(). But debugging the issue, I noticed that getResponseCode exits just fine without any exception.
A bit of Googling around revealed that the root cause was this: https://stackoverflow.com/a/54837353/525608 Inside getResponseCode(), an exception is thrown, caught, and suppressed. And then later, it is retrieved out of HttpURLConnection.rememberedException and thrown around by a different HttpURLConnection function. But of course the Exception still had the old = wrong stack trace in it. So those 40+ lines of stack trace that Java garbled up? Completely useless.
I kid you not, Java source code with Exceptions is like your own private hell, especially because the control flow tends to be somewhere between "nested inside 30 functions" and "completely random". The latter usually happens when libraries do bytecode patching so that what is actually executed doesn't even match the source code anymore, and that's also the point where stack traces become almost completely useless.
I'd take a 2000+ lines monolithic function every day over this unholy mess that ["Enterprise Java" == stacking random libraries on top of each other] has become.
Just out of curiosity, are there any Java web frameworks with reasonable stack depths? After working on .NET Core codebases for several years (which rarely have more than 4-5 calls within the framework itself) I was shocked to see stack dumps from a Spring application I've been working on. Three to four nested exceptions, with up to 100 nested calls (and sometimes more) within each one. If you don't have a smart IDE folding the less interesting parts (and copious amounts of amphetamine), it's simply unreadable.
Haha, I think the worst part of that unintended joke is that the class description starts with Convenient. I wonder if the author of that text noticed how weird the whole thing is?
"Convenient proxy factory bean superclass for proxy factory beans that create only singletons."
In some future decade, when banks start trying to migrate their "Enterprise Java" systems, anyone who can make sense of that bizarre and rather absurd model of computation is going to be well compensated.
Keep in mind that Spring was originally a reaction to the extreme complexity of Enterprise Java. I still remember a talk where they said everything would just be simple POJOs: plain old Java objects.
It was certainly well marketed. But in reality, Spring is also too complex. It has far too much dynamic dispatch, byte-code weaving magic and runtime failure. This is now causing them problems with GraalVM and fast startup times needed for cloud.
There is an architectural error in Java frameworks that I've noticed. (Other languages have it too, just nowhere near as common.)
Basically, the error is that when some system uses virtual function calls via an interface, what that's really saying is: "This can be changed at runtime."
So if there's some function "void foo( IBar bar )", then what that's really saying is that "bar" can change at runtime. Literally a different implementation from call-to-call.
Is that actually required?
In 99.9% of such function calls, no: the implementation of IBar will always be the same concrete class.
To see a language that has gone to the opposite end of the spectrum, take a look at Rust. It uses a lot of parametric typing, which is very flexible, but by default it tends to use compile-time static types instead of run-time dynamic types.
Java never went down this path because it added parametric types very late in its design.
C# had parametric types in v2.0, which is when it started getting popular.
For some odd reason, Go still doesn't have parametric types (they call it generics), yet it seems to have remained surprisingly sane despite supporting generic interfaces.
My guess is that it's the different idea of what an interface is, and how errors are signaled.
An interface in Go can be defined by the client code and types just need to implement the functions. In Java Interfaces need to be defined before any code can use it. And exception handling makes code hard to reason about because it introduces hidden control flow.
I think generics don't give you that much leverage. Without them you need to duplicate the code for each type or class, but duplicated code is much easier to work with. Yes, it's not really DRY, but copy-paste and search/replace are things that increase developer productivity, while fixing bugs in code with too many generics decreases it.
And one thing to keep in mind: historically a lot of interfaces in Java were necessary because Java had no lambdas. In Go functions are first class citizens and you don’t need an Interface just to dynamically call different functions.
The appeal of generics is in writing data-structure libraries that are primarily abstract containers. With parametric typing you can bind the contained type early, rather than being bound through runtime indirections and checks like a dynamic type system, handles to another container, interface compatibility, or other such tricks.
That's an important role, especially in performance-sensitive domains, but outside of that it never seems to work very well, which makes it an easily misused feature in every language I've used that offers it. Go's original design basically took the form of offering this for the built-in structures but not letting you roll your own, ushering you towards interfaces if you really needed it, and because Go makes interfaces convenient, it mostly works.
I agree, the native array and dictionary support in go makes it easy to forget that they don't have generics.
BTW, as an example of generics abuse, crypto++ comes to mind. It's such a huge pile of templates referencing each other, with exceptions used for control flow, that I was once hired as a freelancer just to link the whole thing into a static library, because the company using it was unwilling to deal with the source code but needed it for FIPS compliance.
It's an incredible engineering feat that they managed to let you mix and match cryptographic functions and containers and storage formats as you see fit, but having a C++ stack trace with 5 abstract templates in it makes it very painful to debug if stuff goes wrong.
Thanks for pointing out lambdas. I hadn't considered them, but I wholeheartedly agree :) My go projects tend to have small helper lambdas and lots of defer calls, which are effectively lambdas, too. Without them, I'd need a lot more source code and/or an IDisposable interface for pretty much every io class.
golang is basically Java pre-generics. Working with large golang codebases is quite annoying and error prone, partly due to the lack of generics. They didn't get it right at all.
Having built an offline-capable distributed database in go, I noticed that even "larger" projects in go tend to be a lot less source code than building something similar in Java.
And maybe that's just my personal style, but to me defer is much more important than generics.
I can't even remember how often I've had to debug Java bugs where someone forgot to close a ResultSet for an SQL query... And then you leak memory so badly that restarting tomcat with a daily cron is still not enough :(
With defer, you can expect people to write the cleanup code immediately after opening something, which makes it much easier to confirm that you're not leaking resources.
Java already solved this with try-with-resources, which the linter or IDE can warn you about (because the instance has to inherit from `AutoCloseable`). Not so with golang's defer, where it is much more likely that the developer can forget to call it.
I get the feeling "hardcoded" repeating code is underrated. I work in a code base right now that has a lot of configs floating in from everywhere and functions taking function arguments, with deep call trees.
I just wish there were long crappy linear functions with inline configs, so that I could search this mess and see line by line what it does without having to keep so much in my head.
Maybe it was easier to write, but it sure as hell is harder to read...
Agree. In C++, I really like the JUCE editor exactly because of this. You have a GUI to configure things and it's stored as an XML config, but it'll also autogenerate a header for you with all the constant things hardcoded.
In my opinion, that provides one huge advantage that Java currently lacks: you can use tools like ReSharper to statically analyze control flow, find uninitialized variables or unused branches, etc.
Agreed. This design deserves a moment of leaning back in your chair with your jaw dropped: as if exception handling wasn't awful enough to begin with, this construct completely undermines the whole purpose of it and on top of it adds another layer of complication.
I feel for the poor guy who had to figure out the bug that was caused by this -- the waste of his time was completely predictable by this idiotic design.
I feel like Java('s libraries) forces this kind of exception inception sometimes. But there's a useful way to do it at least: set the original exception as the cause of any new exception thrown, and make sure your exception logger walks the cause chain. You'll have an insanely long stack trace in your logs, but at least it'll be accurate.
> set the original exception as the cause of any new exception thrown
100%. Not only did this person make a pretty strange design, he has completely misunderstood how exceptions are to be used. Even when he remembers exceptions, he fails to make them the root cause of the new exceptions he throws. Truly a remarkable design, in all the wrong ways.
> this construct completely undermines the whole purpose of it
Absolutely, a bad design that says more about the designer of that piece of code, than the language (Java). Error handling can and should be discussed, but I hope people don't use this as an example to illustrate how bad Java checked exceptions are. This is just some nonsense that can be constructed when you misunderstand underlying concepts, which is possible in any language.
My jaw dropped the first time I learned that JSF was translated (not sure compiled is the right word) into Servlets full of print statements. And of course no JSF-based application has ever managed to produce valid well-formed HTML ever since.
It should be noted that you are now referring to the sanest part of Java Server Faces, i.e. its first template system: Java Server Pages, or JSP. If you think JSP that compiles to a bunch of Servlets filled with print statements is bad, you really have to look at the rest of that stuff, because it gets a lot worse :-)
The stateful design, which fills up templates with a bunch of hidden HTML input fields in order to open web pages in the same "state" you left them, and all sorts of other crap.
You also need to look into Portlets, which "split" up a web page into seemingly stand-alone components. These are then combined into horribly complicated Servlets as you deploy, where HTTP parameters are namespaced with some random _portletId=lsøjfksjo1920 things in order to figure out which parameters go to which "part of the web page". Again, all of this is so bad you'd think it was a joke; but it was Java's offering for web development, and a huge number of people CHOSE that technology in the early 2000s.
Oops, yes, I meant JSP, not JSF. I almost want to learn more about JSF, out of morbid curiosity. I'm guessing one can't bookmark any pages and the back button is broken?
> out of morbid curiosity. I'm guessing one can't bookmark any pages and the back button is broken?
Everything is broken :-)
It really should be taught in schools as how not to design anything. I actually once worked on some extension (at least I think it was) of Java Server Faces which sent serialized Java objects as hidden html-input elements (base64-encoded binary blobs) back and forth with each request(!!). At least I hope that was not part of the core JSF technology. You really should run away from most of the early Java web-tech, it truly was horrid.
It's also strange that a language, which was so "connected" to the early days of the web (even had their own html element <applet>) ended up with such strange things for web-development shortly after.
I've worked with Java for more than a decade and I've seen plenty of stack traces that were helpful. Stack traces aren't the problem; the problem is that programmers silently swallow exceptions without either handling them or passing them up.
> Suddenly the Golang authors ranting about exceptions seems somewhat rational.
Error handling in Golang is nothing to write home about, either.
Frankly the only error handling that I haven’t actively disliked is in ML-derived languages with Either<L, R>, or Nim’s (and the latter really isn’t that far off from what we’re complaining about, but the community seems better at using it without the nested exception hell that Java and in my experience PHP can end up in)
> Isn't swallowing (and passing) exceptions pretty much what they are for?
No. You either pass them up or handle them. Silently swallowing them is a recipe for disaster, because nobody gets any info about what went wrong. In very limited cases -- e.g. "I don't care about this harmless error" -- it's correct to swallow them up, but Java devs tend to overuse it and disaster ensues.
> Should you instead handle every error close to their source [...]
You shouldn't. Handle those errors that it makes sense to handle, pass them up otherwise. Just don't swallow them.
> Should you instead handle every error close to their source
I find that this is very rarely the case. Micro-operations failing means that a MUCH higher level operation has failed.
If there's an error writing to the output file descriptor from deep within the decompression library, do you know what failed? The HTTP request. What can the decompression library, the IDS inspector, the logging system, the tracing system, or anything else do? They can just pass along the error.
Hence so much "if err != nil { return err}" in Go. The vast VAST majority of errors are not handled, they are reported.
It's extremely rare that something can recover from an error. And the things that can be recovered from are not exceptions in C++ (though they seem to be in Java, brrr).
Having worked on large golang code bases, I'd take Java exceptions anytime. golang error handling is not only atrocious (you end up having to manually re-invent exceptions anyway), but also error prone and dangerous.
Exceptions for deeply nested control flow drive me mental in other languages for the same reasons you’ve mentioned. Explicit Either<L, R> or similar have meant slightly more code at times, but much clearer to understand where data is going and why, in my humble opinion.
Exceptions are bad and languages using them should feel bad. Go and Rust got it right, it is a major reason why Rust code Just Works. Yet some Rust folks are moving towards exceptions, and I find it unsettling.
So Go got it right by calling it panic/recover instead of throw/catch? Should we petition other languages to also change the keywords and correct whatever is wrong with throw/catch? Is Python's raise/except safe, or did they miss it when they cleaned up the master/slave mess?
Panic/recover isn't used nearly as much for error handling in Go as in Java. An online resource being unreachable does not cause the net package functions to panic.
Used or not, they ARE exceptions, and you need to write exception-safe code.
fmt.Print can throw. And HTTP handlers throwing exceptions is silently hidden. If you don't write code assuming anything can throw, then your code is broken.
And don't judge exception by how they are in Java. That's just a clusterfuck. No other language I'm aware of gets exceptions so wrong.
Panics are exceptions, but they should be seen as a fatal error. Using them is strongly discouraged, and the official docs insist that you should not use them for error handling. Qualifying a feature as being for exceptional use only does count, just as the fact that goto exists in C++ (and Go) but should be (and is) generally avoided should impact your perception of C++.
> Panics are exceptions, but they should be seen as a fatal error.
I agree. Well, "fatal" needs to be defined. If an HTTP handler throws an exception, is that fatal for the whole webserver?
Java seems crazy about this. Exceptions seem to be treated as just another return value, and that leads to a mess.
But C++? What parts of the C++ standard library have unreasonable exceptions used for errors? (there may be some, I just can't think of any)
And note that you have to throw (no pun intended) away large parts of the language if you remove exceptions. E.g. you can't have constructors without exceptions. How else would you signify "those arguments you gave to the constructor are no bueno".
Go doesn't have constructors, so it's consistent with what it says.
Also see my comment here, about how common or not, discouraged or not, the mere existence of exceptions in a language changes how you must write code to not have it be buggy: https://news.ycombinator.com/item?id=25275580
Yes, in C++ 'goto' is a code smell. It's not in C (greatly used for error handling), but C++ has RAII so `goto` should be rare outside of "clever" code (where "clever" is rarely good).
The mere fact that C++ doesn't have 'finally', and Java does, tells you a lot about how exceptions and RAII differs. If you write a macro for "finally" in C++ then you're doing it wrong.
Pretty much all of my `catch` clauses are in main() (or the root of an event handler, like a top-level HTTP handler), to pretty-print the error and/or log it to a central service. `catch` should be about as common in C++ as `recover` is in Go.
I feel like you're missing the whole point, here. An interface that is extremely hard to use correctly without turning small bugs into major outages is not a good tool.
And one way to make sure this doesn't happen is to write exception-safe code, because Go has exceptions.
You could also argue that your C++ code shouldn't throw, and I agree. It should very rarely throw. But when it does it should be safe.
If Go had simply not had exceptions then this would have been easier.
> It's called panic in Go so panicking should be hopefully rare for the peace of mind
If you write a web service that hits a bug that panics about once per million requests, and you run 1000 qps, that means your Lock();dothing;Unlock() will deadlock the whole webserver once every 15 minutes.
If you write exception safe code, then it does not.
I was replying to the comment above -- Go or Rust did not simply rename exception related keywords, they have a completely different approach.
I don't know of any language that does mudane error handling with exceptions that is not a mess. C++, for instance, is extremely difficult to write exception safe code in.
I find C++ exception safe code to be pretty much trivial. Once you get used to "no naked resources" RAII just makes everything exception safe automatically.
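A minimal sketch of what that looks like in practice (the metric-counter names are made up for illustration):

#include <map>
#include <mutex>
#include <string>

std::mutex mu;
std::map<std::string, int> stats;

void bumpMetric(const std::string &name) {
    // lock_guard is a "non-naked" resource: it unlocks on every exit path,
    // including an exception thrown by anything below it in this scope.
    std::lock_guard<std::mutex> lock(mu);
    stats[name]++;
}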
I'd argue that exceptions are good, because they attempt to illustrate all the ways code can go wrong, and because you typically need to be at least aware that you're ignoring them.
However, I'd argue for allowing no unchecked exceptions that can be thrown at runtime, and instead forcing developers to handle every failure state that can be encountered in the code they call.
If a method that you call can fail in 60 different ways, you should at least handle the 20 of them that you understand in their own ways and the rest in a blanket statement. All of that should be checked at compile-time, of course.
Java, .NET and most other ecosystems (both languages and frameworks) don't really seem to want to bother with that, though.
One of the problems with exceptions is that they bubble up the call stack, and then you've got layering violations: higher level modules should never depend on lower level modules. The "checked exceptions were a failure" meme goes towards this direction, it results either in nasty boilerplate because you've got to handle and re-wrap the implementation-specific exception in something more general to avoid leaking things that should have been hidden (like Hibernate does), or unsound error handling.
Maybe the solution to that is
1. Only allow checked exceptions, and force error handling.
2. Only have 2-3 exception types, not user-extendable: Retryable, Unrecoverable, and Error (for non-user-code errors like OOM). OTOH, how to distinguish between different exceptions thrown by the same method, and how to add additional error information (like status_code), would become a problem. JavaScript doesn't seem to care though?
A Result type à la Rust solves most of that: you might still have to wrap lower layers, but at least the noise & syntax is much more sane (never thought I'd say that about Rust!).
JavaScript's Error type is so loosey-goosey that even TypeScript doesn't try to work anything out about it: e is any; parse it yourself in your catch, haha.
In Java there is Exception (checked) and RuntimeException (unchecked). Depending on which base class you use, the compiler will force you to handle them or not.
There are times that exceptions work really well, I think the problem isn't so much that they exist, as they're too often used wrong. Getting an exception due to a real error is nice, getting them as a generic flow control device, not so much.
Exceptions form a secondary, unseen form of control flow. A program can exit via ‘return’ calls (as normal), or at any time by some internal function way down the stack throwing an exception for something that could have been an error code.
Exceptions are not Structured Programming: they are worse than goto, because at least goto is scope-local. If, maybe, the language enforced that all throwable exceptions are declared by the prototype, then it’d be alright. But I don’t know of any that do. So, given a function call, will it error out? Can you know without inspecting the source, if you even have it?
They will be used for flow control in small, individually innocent ways until they've accumulated into the type of hell the OP of this thread is describing.
A solution is to use languages that make explicit control flow available, instead of pretending that exceptions are a good way of performing non-local jumps.
Have you investigated Common Lisp, its use of BLOCK/RETURN-FROM and TAGBODY/GO, and how it is possible to close over these lexical constructs to achieve non-local jumps to predefined points on the stack ? All of Common Lisp's error handling system is based on this primitive mechanism (and therefore written in Lisp itself).
> You need to write exception-safe code. But also you're not allowed to use exceptions. So it's the worst of both worlds.
I wasn't aware that this is a real problem in Go land, thanks for the explanation. I wouldn't say the language is "great", was just using it as an example of a modern language where there is a consensus that "they got it right" and not using exceptions, even though error handling is still tedious right now.
I think in Rust you have to consciously fuck up the panic handler to cause similar issues, but I'm not sure.
And this gets extra complicated by the fact that in Go defer runs at end of function, not end of scope. This makes every single for-loop that needs to lock anything hard to read and annoying to write.
for _, a := range stuff {
    if err := func() error {
        mu.Lock()
        defer mu.Unlock()
        return a.stuff()
    }(); err != nil {
        return err
    }
}
You use RAII, to put it in C++ terms. To use your variable names, someoneElsesObject is inside mu, and is returned by mu.lock. When it goes out of scope, it will unlock.
You can also use defer if you want, but I've never seen it in real code, it's more error-prone and not as flexible. You can build it on top of RAII.
I'm saying "mu.Unlock()" except when deferred, is essentially always a bug. At the very least it's a bug waiting to happen. You need to prove that everything between Lock and Unlock is exception-safe. And that's rarely possible.
As I've said elsewhere you cannot rely on panic triggering program exit, since e.g. HTTP handlers swallow panics.
The fact that you seem to be saying you never see an Unlock deferred proves my point that saying "Go doesn't have exceptions" hurts Go programming. And Go code is in fact full of bugs because there's all this exception-unsafe code.
C++ is naturally exception safe, because RAII. Where it's not exception safe it's because RAII was not used.
Even something as simple as:
mu.Lock()
stats[metricName]++
mu.Unlock()
will throw if there's a path where stats[] map was not inited, and if called in an HTTP handler will leave the lock in place, leading to probably a deadlock of the server. Not great.
The Rust code would be, roughly (I tried to keep names similar even though Rust would use a slightly different naming scheme)
let someoneElsesObject = mu.lock();
println!("{}", someoneElsesObject);
return;
when someoneElsesObject goes out of scope, the mutex is unlocked. This happens no matter how it goes out of scope. You cannot forget to do it, because the only way to get access to someoneElsesObject in the first place is locking the mutex, because mutexes wrap the data that they are protecting.
https://crates.io/crates/defer exists, but it would be weird to try and use it here, because it can't really be combined directly with this. I guess in theory you could put drop(someoneElsesObject) in the defer block, but like... that already happens for free.
Well, no you can't, because that leaves no way to directly return from the function.
Hence my example where the lambda has to return an error, and the loop has to check for the error. It's A LOT of boilerplate.
Edit: Actually now I don't know what you mean. You clearly replied to my comment that gave a clear example with a return from within the lambda, so what did you think that I didn't know? You took my example and removed extremely commonly needed functionality. So… huh?
golang definitely did not get it right. golang errors are error prone, I've seen several cases in production code where they were either accidentally ignored or overwritten. Makes it terrible for writing safe software.
I've read this before and I agree with it. As an elixir programmer, it frustrates me to be reminded of this article because it reminds me of the only thing I find torturous about programming in elixir.
Without a way to exit a function early, you either end up with deep nesting (which becomes hard to read/follow after 2-3 levels) or a lot of small function calls to continue/break. In many cases you can use a `with <-`, but if the statement only returns a variable, you need to wrap it in a function (or have a hard-to-read guard clause, which will only work in some cases) to support matching:
Say `User.load` is out of your control and it returns nil | user. I'd love to do:
user = User.load(id)
if user == nil do
  return :not_found
end
But I have to either introduce nesting, or decrease the readability with either a guard or a function.
Guard:
# if we just let `nil` flow through to the `else` we won't be able to tell this `nil` from another
with user when user != :not_found <- User.load(id) || :not_found
Function:
with {:ok, user} <- get_user(id)
...
defp get_user(id) do
  case User.load(id) do
    nil -> :not_found
    user -> {:ok, user}
  end
end
I have always wondered why functions have to be declared inline when it is more up to the caller to declare whether to inline the contents of the function at call site.
So instead of
inline int f(int x) { return x * x; }
int main() { f(3); }
you'd want to do
int f(int x) { return x * x; }
int main() { inline f(3); }
I have seen this implemented in Zig for loops (inline for, inline while). Is there support for this in other languages?
In C this is because the inline qualifier has at least as much to do with linking and controlling symbol definitions as it does literal inlining of the code into the caller.
Relatedly, in languages like C that have global mutable state, it's the callee who knows better whether and when it's safe to inline code, not the caller. Ditto in languages (like C) that can have complex linking semantics, multiple function definitions, etc.
Modifiers like inline and register are leftovers from the days when compilers had to work in 64KB or less and needed all the help they could get from developers.
Nowadays most optimisers ignore them anyway and use heuristics to decide if and when they actually care, and also inline when not asked to.
If you really want the original behaviour you need to use non standard modifiers like forceinline.
Sort of. I think CL has the right direction but the details are wrong. This pattern requires declaring the function notinline, and that is a much stronger statement than just “wait I take it back”. Using this pattern, the function can ONLY be inlined in the contexts where you allow it. Similarly, compiler macros are disabled for notinline functions. I would like the compiler to use its normal judgement for inlining and continue using compiler macros for call sites where I don’t provide an inline declaration.
I think you overlook that inlining in Common Lisp is a lot less trivial than in very static languages like C++, because it is perfectly valid to redefine functions at runtime. This obviously does not play well with the compiler randomly inlining things behind your back. You can fix that by keeping track of where you inlined f and then re-compiling all the inlined sites as well, but that obviously has significant pathologies too. Having said that, performance-oriented Common Lisp compilers like SBCL (and its predecessor) will generally allow you to block-compile stuff, which gets rid of the extra indirections at the cost of less runtime flexibility. IIRC that also means the compiler will use its judgement on whether to inline a function.
Yes, the standard allows this, but that doesn't mean inlining is straightforward, in the sense that the compiler can't just magically figure out unaided whether to inline stuff and all is well, as the GP wanted. Whether two functions were defined in the same file has basically zero bearing on the desirability of inlining, so the fact that recompiling a whole file rather than an individual function is not particularly onerous doesn't really help that much. Also, correct me if I'm wrong, but I don't think anything SBCL does if you pass :block-compile t goes beyond what would already be allowed by the standard as a default (by contrast, CMUCL's block compilation supported multiple files and thus essentially also whole-program optimization). The fact that it's not the default (despite likely non-trivial performance benefits) probably indicates that a majority of users would find this behavior problematic as a default, regardless of whether it is standard-conformant.
> Block compilation or whole program compilation are used for delivery of applications.
Mostly, sure, just like LTO in C/C++. But if I were doing any scientific programming with Common Lisp, I'd try to use block compilation in a few select places during interactive development as well. I'd imagine that for some code this would get rid of a lot of boxing and unboxing as well as type checks, in addition to getting rid of indirection. I'd not be surprised if you could get an integer-factor speedup if you have many small functions which mostly operate on doubles or double arrays. I should probably give it a try just to satisfy my curiosity.
Inline in C++ doesn't do what many people think it does, i.e. the keyword is neither necessary nor sufficient for performing the inlining optimization (although the compiler might treat it as a weak hint).
Instead it is a directive to the compiler to relax the One Definition Rule for the function, and it is an artefact of the header-inclusion model of compilation.
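A small illustration (a hypothetical header, not from the comment above):

// square.h -- safe to include from many .cpp files precisely because of 'inline':
// the keyword allows identical definitions of the function to appear in every
// translation unit, while whether the call is actually inlined is still entirely
// up to the optimizer.
#pragma once

inline int square(int x) { return x * x; }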
In a language like Guile Scheme or Racket this could be implemented as a macro. In Guile, (define-inlinable ...) is already defined as a macro, and while doing it in reverse would be harder, it could be done and tested in an hour.
I would probably do it by overriding (define ...) to make it also store the source of the function somewhere and make (inline ...) just insert it as an anonymous function:
Yeah, I believe the point was to inline source code so it's easier to reason about as you read it, not to inline at compilation for some runtime advantage.
Nice food for thought. For those like me that read the comments first:
"To sum up:
If a function is only called from a single place, consider inlining it.
If a function is called from multiple places, see if it is possible to arrange for the work to be done in a single place, perhaps with flags, and inline that.
If there are multiple versions of a function, consider making a single function with more, possibly defaulted, parameters.
If the work is close to purely functional, with few references to global state, try to make it completely functional.
Try to use const on both parameters and functions when the function really must be used in multiple places.
Minimize control flow complexity and "area under ifs", favoring consistent execution paths and times over "optimally" avoiding unnecessary work."
This is one of those pieces of writing which I always tend to re-read every time it pops up on one of my feeds.
What I like about learning from Carmack's approach to programming is that he's always seemed to work in domains where he had to squeeze every ounce of performance out of what he's doing. I think that kind of programming forces you to focus on what's actually true about computers, and about programming concepts. As programmers, when we talk about design patterns and best practices, a lot of what we talk about actually comes down to opinion, philosophy and aesthetics.
When you are really forced to get the right bits to the right place as quickly as possible, it focuses the problem of software design in a very specific way. It's interesting to observe that for Carmack, that appears to be in the direction of removing as much abstraction as possible.
This is an argument that I've found myself making fairly regularly, and often frustratingly after attempting to hunt down a bug through 10 levels of abstraction that didn't need to be there. Add "debugging complexity" when it's warranted, but don't add it by default. More files and functions mean more jumping around and more potential blind spots.
That being said, John Carmack makes the very important point that this decision (whether to inline or not) should be made case-by-case. Someone in these comments used the flow of a novel as an example and I think it's actually the ideal way to describe the balance you want to strike with code as well. Using "The Tortoise and the Hare" as an example:
Inlined too much: "The race began. The Tortoise took a step. The Hare took a step. The Hare took another step. The Hare took another step. The Hare took another step..."
Inlined too little: "The race began. The Tortoise won!"
It's interesting because some authors actually make mistakes in this realm - overdoing it on details, or leaving the reader confused without enough context. Thinking about the flow of code like the flow of a novel is probably a good idea - both should delicately balance complexity with readability. The best examples of both often describe complex and nuanced concepts while remaining surprisingly straightforward.
If you think this example is contrived and it's always obvious where to abstract and where to inline, it's likely you're over-abstracting. The question of "where to cut" varies greatly from one bit of code to the next and often needs some thought to get right. John's list under "To sum up" here is a great set of guidelines to answer that question.
I read this article a few years ago when I was in a "CARMACK IS AN ALL KNOWING GOD" phase. So I started using these concepts at work, for business stuff. It ended up becoming an absolute unreadable, illogical mess that only I could possibly understand. Proceed with caution and don't blindly assume that this is the best approach for all situations like I did just because Carmack did it in the Quake engine.
Maybe I'm oversimplifying here, but game engines are like marble statues. You start with the block of marble, you sculpt it and at some point it is finished.
Ongoing maintenance issues are not really top priority, the game is not going to be a living software project for that long.
Game engines are usually maintained for years if not decades, as they are often enormous investments. However, it's more common to fork (or at least freeze) for a specific project, so in that sense there is more room for ad-hoc solutions.
Out of curiosity, since I haven't touched C/C++ since university: has much changed since 2014? Or even 2007? Has the world of C changed so much that a date is needed?
I (naively) thought that C was pretty much "done". And any articles written recently were just for gilding the lily.
I'm going to disagree with sibling and say that a lot has changed since 2007. The biggest game changers arrived in C++11 like auto and move semantics, but C++17 also introduced some interesting stuff like optionals, destructuring, folding expressions and lambda capturing this by value.
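For a flavor of those, a toy C++17 sketch (not from the thread, just illustrating std::optional, if-with-initializer, and structured bindings):

#include <cstdio>
#include <map>
#include <optional>
#include <string>

std::optional<int> lookup(const std::map<std::string, int> &m, const std::string &k) {
    if (auto it = m.find(k); it != m.end())  // C++17 if-with-initializer
        return it->second;                   // wrapped in C++17 std::optional
    return std::nullopt;
}

int main() {
    std::map<std::string, int> m{{"a", 1}};
    auto [key, value] = *m.begin();          // C++17 structured bindings
    std::printf("%s=%d found=%d\n", key.c_str(), value, (int)lookup(m, "a").has_value());
}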
This quote from him is very telling of how good an engineer he is:
> That was a cold-sweat moment for me: after all of my harping about latency and responsiveness, I almost shipped a title with a completely unnecessary frame of latency.
Excellent article. Haven't read something so opinionated but clear and well argued about programming in a while. His summary rules at the bottom are great:
To sum up:
If a function is only called from a single place, consider inlining it.
If a function is called from multiple places, see if it is possible to arrange for the work to be done in a single place, perhaps with flags, and inline that.
If there are multiple versions of a function, consider making a single function with more, possibly defaulted, parameters.
If the work is close to purely functional, with few references to global state, try to make it completely functional.
Try to use const on both parameters and functions when the function really must be used in multiple places.
Minimize control flow complexity and "area under ifs", favoring consistent execution paths and times over "optimally" avoiding unnecessary work.
I don't think I understand the relationship between inlined functions and functional programming established in the article.
As I see it, whether you inline or split into functions or subroutines, you can have functions be pure, in the sense that they don't mutate state. In my experience with functional programming, splitting code into smaller functions is not a problem but rather encouraged too, for the sake of readability (so no different than other paradigms).
There must be something that escapes me, possibly at the systems level. Can someone please explain?
Carmack agrees with you. He wrote that functional functions are the ideal, but often not possible, and non-functional functions are bad and should be inlined. This is the "functional core, imperative shell" pattern where you have as many small neat functional functions as possible, and one big nasty non-functional main loop for only your mutating state, where every line must be studied carefully in relation to all other lines.
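A tiny sketch of that shape (a made-up example, not from the essay):

#include <vector>

// functional core: pure, easy to name, easy to test
static double averageSpeed(const std::vector<double> &speeds) {
    if (speeds.empty()) return 0.0;
    double sum = 0.0;
    for (double s : speeds) sum += s;
    return sum / speeds.size();
}

// imperative shell: one loop that owns all of the mutating state
int main() {
    std::vector<double> speeds;
    double smoothed = 0.0;
    for (int frame = 0; frame < 3; ++frame) {
        speeds.push_back(frame * 1.5);    // state changes happen here, in one place
        smoothed = averageSpeed(speeds);  // pure call, safe to skim past
    }
    return smoothed > 0.0 ? 0 : 1;
}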
The "functional core, imperative shell" pattern is how I naturally gravitate to writing code. It helps so much with readability and understanding what the code does.
Trying to read other peoples code that jumps around between many functions with side effects, often being nested 3-4 levels deep, hurts my soul.
I guess it makes some sense in a tight game loop. I really don't think most non-game programmers should adopt this style. I have actually seen code where lots of stuff was inlined. This resulted in functions that are many thousands of lines long, where nesting was so deep that at a particular point you would have no idea what 6 levels of control structures you were inside.....
"If there are multiple versions of a function, consider making a single function with more, possibly defaulted, parameters."
I have spent years avoiding doing this, because there is a known code smell that happens when a function's execution changes based on parameters. My personal rule is to be splendid with functions.
From JC: "the exactly predicted off-by-one-frame-of-latency input sampling happened"
I'm assuming this happened because the game, starting with an empty input buffer, queued input for the subsequent frame, rather than using the input as-is and processing it right then and there?