Small functions considered harmful (medium.com)
172 points by grey-area 11 months ago | 115 comments



During my earlier years, I would get into all types of dogmatic debates, such as "DRY-considered-harmful" or "small-functions-considered-harmful". With experience, I've realized that such abstract debates are generally pointless. Any principle can lead to bad results when taken to an extreme or badly implemented. This leads to people declaring that-principle-considered-harmful, swinging the pendulum to the opposite extreme, and beginning the cycle all over again.

Now, I find such discussions valuable, but only in the context of concrete examples. Devoid of concrete and realistic examples, the discussion often devolves into attacking strawmen and airy philosophizing. If this article had presented realistic examples of small functions that should have been duplicated and inlined, I think we could then have a much better discussion around it.

That said, I do have to offer a word of warning. It's possible that the author is a good programmer who knows how to incorporate long functions in a way that is still clear and readable. Unfortunately, I've had the misfortune of working with horrendous programmers who write functions that are hundreds of lines long, duplicated all over the place, and are a pain to understand and maintain. Having short-functions and DRYness is indeed prone to abuse, but it still works as a general guideline. Great programmers may be able to ignore these guidelines, but at least they prevent the mediocre ones from shooting themselves (and others) in the foot.


I think it's more likely that your "great programmers" simply understand the difference between the same functionality and accidentally similar functionality. The latter is where you have two use cases that are very similar, so you spend all this time deduplicating. Then one of the use cases changes... The correct response would be to duplicate the code again, because the two use cases are no longer similar. In reality, they should have never been combined in the first place. They weren't the same; they were only accidentally similar.

But instead what you usually see is minor tweaks to the common functions. Pass in a flag here, tweak the inputs there, add an if statement over yonder... And before you know it, it's all a terrible tangled mess that is full of branches and technical debt. The two use cases have the same functions, but don't even follow the same branches within the functions.
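A contrived sketch of how that usually plays out (all names made up):

    // Two use cases that merely looked alike got merged into one "shared" helper...
    function formatName(user, opts = {}) {
      let name = user.firstName + ' ' + user.lastName;
      if (opts.forInvoice) name = name.toUpperCase();                        // tweak #1
      if (opts.includeTitle && user.title) name = user.title + ' ' + name;   // tweak #2
      if (opts.lastNameFirst) name = user.lastName + ', ' + user.firstName;  // tweak #3, quietly discards the above
      return name;
    }

    // ...when the honest fix, once the use cases diverged, would have been to split it
    // back into formatInvoiceName(user) and formatGreetingName(user), each a few lines long.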


Yeah, I've been saying that removing accidental similarity isn't improving code but compressing it. "Huffman coding", if you will...


Thank you for putting that into words, and rather few too. I have only felt that anti-pattern before, not had words for it. Does it have a name? If not, we should give it a name!


I believe this pattern is referred to as accidental-duplication. Some existing discussions on this:

https://softwareengineering.stackexchange.com/questions/3000...

http://www.informit.com/articles/article.aspx?p=1313447


I also have a bias against '...considered harmful' titles, but that does not invalidate the points being made.

The author is responding to specific examples of respected and influential people advocating a very extreme and dogmatic approach to the issue. As this article does not advocate for large functions, but against going to extremes, it is saying the very things that you say you stand for.

I don't have a good example on hand, but imagine a complicated mathematical expression. Up to a point, writing functions for parts of the expression might well make it easier to understand, but encapsulating each and every operator in a function will not. That, of course, is an absurd idea, but that is the point: it is no more absurd than the idea that anything you might want to see explained in a comment would be better done by creating a function.
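For instance, a made-up illustration of where I'd expect the line to fall:

    // Naming a meaningful sub-expression helps:
    function discriminant(a, b, c) {
      return b * b - 4 * a * c;
    }
    function quadraticRoot(a, b, c) {
      return (-b + Math.sqrt(discriminant(a, b, c))) / (2 * a);
    }

    // Wrapping every operator in its own function does not:
    // quadraticRoot = divide(add(negate(b), squareRoot(discriminant(a, b, c))), multiply(2, a))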

It is possible, I believe, to write functions that do less than one thing, and I will be looking for examples - and also for examples where a desire for small functions has been accompanied by the creation of hidden dependencies.


"Considered harmful" essays considered harmful

http://meyerweb.com/eric/comment/chech.html


Author of the article here.

A few things. First, I'm not sure you actually bothered reading the article, because the conclusion very obviously states what you're suggesting here:

"This post’s intention was neither to argue that DRY or small functions are inherently bad (even if the title disingenuously suggested so). Only that they aren’t inherently good either."

As for the lack of examples, I'd imagine most people can extrapolate and draw analogies from their long and storied programming careers of the sort they stake a claim to. If there's something you'd like explained in more detail, let me know and I can try cooking up a contrived example, but it will remain somewhat contrived all the same.

This article isn't arguing for duplication either - as a matter of fact it's against absolutes and blanket generalizations like "Code smells if your functions are longer than 3 lines", which is something I often come across.

I think what you call "mediocre programmers" -- personally though, I'd like to be more charitable and think of this demographic as the average programmer or the vast majority of programmers -- are also the ones most likely to cargo cult a piece of advice that's sold as "programming wisdom". Any "advice" needs to be taken with a grain of salt. The article goes on to state how important this is, as well:

"As with most other things, “the ideal” lies somewhere in between. There is no one-size-fits-all happy medium. The “ideal” also varies depending on a vast number of factors — both programmatic and interpersonal — and the hallmark of good engineering is to be able to identify where in the spectrum this “ideal” lies for any given context, as well as to constantly reevaluate and recalibrate this ideal."


Hi. I'm not sure you actually bothered to understand my comment, because it's making 2 main points, which you have glossed over.

1st: The importance of concrete examples in illustrating your point. Sure, we can all think of and agree on extreme examples of short-functions or long-functions which are bad, but what about more realistic examples? In Clean Code, Robert Martin presents many realistic and reasonable code snippets, featuring long-functions, which he then refactors into a form (i.e., short functions) which he claims is more readable and maintainable. For those specific examples, do you agree or disagree that his changes are an improvement? If you happen to agree, can you present other realistic examples of your own to illustrate your point? Doing so would allow us to have a more grounded discussion, comparing and contrasting two realistic alternatives. It would also allow us to better understand where you draw the line between too short and just-right.

2nd: Yes, too-short-functions and too-long-functions are both bad, but I disagree with your false equivalence between them. In my experience, the latter is a bigger problem than the former. I say this because this is the mistake that most mediocre programmers make. In my intro CS classes, I invariably see most people default towards writing their entire code in a single main function, with lots of code duplication, and this tendency sticks around even afterwards. At my recent companies, I've even seen senior developers, people with fancy degrees and 200k+ salaries, write code that's horrendously hard to understand and maintain, because it's endlessly duplicated and squashed into a single long function. I don't doubt that there are those who take short-functions to an extreme as well, but in my experience, long-duplicated-functions are a much more prevalent problem, with greater downsides as well. Hence my point that short-functions and DRY work as a much better guideline than the converse.

Granted, my 2nd point above is my subjective opinion, and I agree with you that it's better to aim for the ideal, than to settle for erring on either side. So if you don't want to go down that rabbithole, I understand. However, I do think that presenting concrete realistic examples will go a long way towards enhancing this discussion. Looking forward to your follow-up blog post.


>At my recent companies, I've even seen senior developers, people with fancy degrees and 200k+ salaries, write code that's horrendously hard to understand and maintain, because it's endlessly duplicated and squashed into a single long function.

Ever wonder whether they might be senior and commanding those salaries because they think and program a certain way? Have you ever discussed this with them? Many senior developers tend to shy away from abstractions in my experience, and they do it for a reason.


We're going far off into a tangent but to answer your question:

- the code in question constantly produced production bugs

- the code in question was extremely hard to debug when said production bugs surfaced

- everyone complained about that code being problematic, including other more senior engineers with 300k+ salaries

- all the above problems went away and everyone was pleased when I broke it up into smaller pieces


Having short-functions and DRYness is indeed prone to abuse, but it still works as a general guideline

The author disagrees. She's saying people tend to use too much abstraction, so the DRY principle is actively harmful as it encourages more of a bad thing. The trouble with abstraction is it hides things from view.

To make this concrete, consider a codebase with lots of 1-10 line functions spread amongst lots of files, vs one with half the functions in half the files, which are 10-20 lines instead. The work has to get done somewhere, and lots of short functions of just a few lines, kept DRY (no repetition), tend to lead to lots of classes/files/modules which don't do much on their own calling each other, making it much harder to reason about execution or read the code.

Rather than functions with just a few lines or with hundreds of lines, I imagine she's talking about functions 10-100 lines long on average, with exceptions where reasonable. There is a middle ground here where a function is more readable and crucially doesn't force you to jump around much to find out what work it does.

There's a good quote from Sandi Metz in the article: "duplication is far cheaper than the wrong abstraction"


I work primarily on lower level components in biotech and I see this whenever I start to dig into the "Enterprise" code base. It takes me _far_ too much time to find the code that, you know... actually does something. While searching I'm just wading through file after file which contains nothing more than types and small wrappers around other small wrappers (around other small wrappers...) It drives me nuts. So much code and so little actually happening.


But how much of that is small functions, and how much is just people blindly applying pattern after pattern while optimizing for business use-cases that are unlikely to ever emerge?

That doesn't mean the patterns are bad, it doesn't mean small functions are bad, it just means the person who implemented them lacked good judgement. Good judgement is somewhat subjective. Fortunately, we can combat this subjectivity by looking at the long-term cost of a decision (programmer time, onboarding time, resources used...), how a particular decision advances an organization's objectives, etc.

We need functionality, not code. That functionality exists to support (what should be) clearly defined interests and goals. "Right" code is code that serves the functionality, and therefore the goals, best.


It's the latter. Too much abstraction for no good reason because it's "what you do". I wasn't throwing out an opinion on the primary concern of the topic. In my experience, good devs write good code.


Rules like DRY or keeping line counts low are based on surface metrics and definitely lead people to over-fit the model based on incomplete information.

The underlying principle of maintainable code is "simply" that it is a concise and flexible expression of knowledge. The "simply" part is in quotation marks because, of course, it takes a good amount of self-awareness and meta-consideration to understand why some ways to express knowledge are better than others. There is also a limit to how "perfect" it can get, because it really depends on the audience too.

This is also why it's OK to let it slide sometimes and move on to more interesting topics - there are diminishing returns on splitting hairs, once broad strokes of good encapsulation and basic readability are applied.


DRY is not a surface metric, although it is often misunderstood as one.

The original formulation of DRY was:

"Every piece of knowledge must have a single, unambiguous, authoritative representation within a system."

Elimination of syntactic repetition is not an application of DRY, although application of DRY will often incidentally eliminate syntactic repetition.


> Rules like DRY or keeping line counts low are based on surface metrics and definitely lead people to over-fit the model based on incomplete information.

The principles are actually sound, but letting metrics drive the design process is a problem all on its own.

https://en.wikipedia.org/wiki/Goodhart%27s_law


Unfortunately, I've had the misfortune of working with horrendous programmers who write functions that are hundreds of lines long, duplicated all over the place, and are a pain to understand and maintain.

As a maintenance programmer I had that misfortune very often. But I don't believe it's really so relevant to the article.

Having short-functions and DRYness is indeed prone to abuse, but it still works as a general guideline.

It doesn't, it's more of a heuristic and, as such, a way of detecting that you might be doing it wrong, not a sound rule to apply.

The problem is exactly that one: people that use "write short functions" as something that you must do. I've suffered such a fool as a boss and it was really sad. The man believed that two-line functions (and the calling spaghetti that followed) were a superior system to ten-line functions that accomplished a definite goal.

In general the problem is people that use general rules without understanding them.


I think even most people who believe in short functions would disagree with your boss. The following SO thread does a pretty good job summarizing the short-functions principle.

https://softwareengineering.stackexchange.com/questions/1334...

The whole point of a guideline is that it's not a rule. It's a heuristic that people can selectively apply, based on the specific context and their level of experience.


I haven't yet seen anybody pushing for "short functions" literally that didn't want a hard line on the largest function you should create. Thus, I haven't seen anybody explicitly using it as a heuristic instead of a rule.

People have all kind of heuristics that correlate with short functions, like DRY and single responsibility principle. People do apply those as heuristics. But function length is a number, and it is very hard to keep your conclusions fuzzy when operating over numbers.


This basically boils down to 'bad code is bad'. DRY and short-functions are supposed to be guidelines, not rules, and when you find yourself violating them, you're supposed to question why. Sometimes there's a good reason for it, often not, and it will help newbies get to the point where they can answer these questions themselves.


Author of the article here.

I find it really interesting you mention this:

>>"when you find yourself violating them, you're supposed to question why"

This is essentially what I see a lot of programmers (including myself) who've internalized these rules tend to do. However, I wonder if we've got it backwards: should we be thinking more about how we design our abstractions upfront, optimizing to leave ourselves enough wiggle room, instead of applying the so-called "best practices" right away and only stopping to think something might be wrong when we explicitly violate one of these "best practices" like DRY or small functions or what have you?

I find a lot of us tend to lose sight of the forest for the trees when we focus on cosmetic things like function length. It's a bottom-up view of the abstractions we've built, and maybe we actually need to think about the top-down design more thoroughly?


> Any principle can lead to bad results when taken to an extreme or badly implemented. This leads to people declaring that-principle-considered-harmful, swinging the pendulum to the opposite extreme, and beginning the cycle all over again.

This summarizes perfectly the whole OOP vs FP debate.


Also most of political history.


Isn't what you're saying kind of exactly the point of the article in the first place? The conclusion even explicitly points out that the aim was not to actually claim small functions are harmful(!), but rather that the converse - small functions are good - isn't inherently true either.

And while the article doesn't make fully concrete examples, it does propose the outlines of a few; enough to get across a meaningful message (to me). Also, arguments that lean on concrete examples are more at risk of attacking strawmen precisely because a concrete example can be flawed in irrelevant ways that obscure the underlying principles. Not that I'm opposed to examples - just that most examples are almost necessarily simplifications, and choosing a sourcecode simplification is not categorically different or better than choosing a pseudocode or diagram simplification.


I think oftentimes people want to say "In this particular case, for this particular situation, with my particular constraints, X was useful for me", but they end up asserting it for a much larger context.

The people who don't agree with that, ironically also end up making the same mistake when they assert that X is not at all useful, but do it in the reverse way by using their particular scenario to guide their statements.

I've come to believe that the vast majority of people's opinions on programming are BS. I've come to rely on simple practical things. If you try something, and it works for you, you can safely ignore the people telling you you're doing it wrong.


"Considered Harmful" Essays Considered Harmful: http://meyerweb.com/eric/comment/chech.html


I find a lot to agree with here.

It's all very conceptually neat and (if you're lucky) easy to read from the top down, where you enter one function and read off a list of other functions which are called in order.

But then if you look into any of those other functions they also call more functions and so on, several levels deep. And when you have to debug someone else's code because the data after function 15 of 17 isn't quite right, and you have to unpick all the places it's been passed through in slightly different versions and slightly different lists of parameters, it can be a nightmare.

Same with my linter telling me to close a file within a few lines of opening it. Personally I'd rather keep all the file-munging code in one place rather than scatter it down a rabbithole of nested functions as an exciting Alice In Wonderland story for future developers.

I try to come to a compromise on these things when working in a team, though...


I tend to agree, but not with your particular example; the boundary between how you manage a resource and what you do with it seems like the perfect place to put an abstraction. The file munging can be in one big function, but there are benefits to separating it from the open/close logic.
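Something along these lines, say (a sketch using Node's fs module; the names are made up):

    const fs = require('fs');

    // Resource management lives in one small function...
    function withFile(path, munge) {
      const fd = fs.openSync(path, 'r');
      try {
        return munge(fd);
      } finally {
        fs.closeSync(fd);
      }
    }

    // ...while the munging itself can stay one big, linear function.
    function mungeReport(fd) {
      const header = Buffer.alloc(16);
      fs.readSync(fd, header, 0, 16, 0);
      // ...lots of straightforward, sequential work on the file contents...
      return header.toString('utf8');
    }

    // usage: const summary = withFile('report.dat', mungeReport);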


Wouldn't the compromise be unit testing?


> Wouldn't the compromise be unit testing?

I agree. Unit testing is done to ensure that all components work exactly as they are expected to work. If any component fails to work but developers only notice it "because the data after function 15 of 17 isn't quite right" then it appears that something is very wrong with the way those units are being tested.


This is a very valuable comment. Indeed, pushing the abstraction down into small functions actually removes abstraction in the "system of functions". So there is a tension.

What I find super interesting in your comment is that you solve this as a team.

So, my point is, this demonstrates (a bit) the fact that coding is also a social activity.

Of course, anything a team produces is a reflection of the team itself.

But here you pinpoint the fact that you adapt your solution to the team as well as to the problem itself (anyone else would have said "I choose this solution because it's the best for the problem")

I like that :-)


It seems like the author's idea of the term abstraction is limited to substituting procedures with functions (or worse, class methods). The claims that "all abstractions leak" and that adherence to DRY makes code "hard to follow" are what give me this impression. This line of thinking happens if you think of code in procedural terms.

And if your sense of abstraction is to hide procedural side-effects behind function applications then yes... I can see where you might get the idea.

A real abstraction like lambda doesn't leak. Using Monads to compose side effects doesn't leak. These are mathematical abstractions and we use them all of the time: even in programming and even if you don't identify as someone who's any good at maths. Learning the lambda calculus, algebra, and first-order logic will take you much farther than thinking in procedural steps.

Composing more interesting functions from smaller ones removes so many errors that procedural code has: removal of superfluous bindings, a more declarative style, and it makes code much easier to reason about during refactoring: using familiar tools from mathematics we can manipulate expressions and types. This is where abstractions really shine: you can manipulate higher-level expressions without caring about the details of those at the lower level. This only really happens if you care about purity in languages that don't do it for you and can reason using such tools.


Arguably lambda, monad, etc. are generalizations, not abstractions: http://www.emu.edu.tr/aelci/Courses/D-318/D-318-Files/plbook...

Abstractions that don't leak aren't abstractions, as the essence of abstraction is the simplification and removal of detail. Monad is a generalization of a pattern seen in many places, so it doesn't have to "leak".


I don't think I'd argue with that.

Taking the idea to its conclusion, one can build abstractions on these building blocks. It's typical for software I write to be built up from small functions. The abstractions happen as higher-order functions build up to the domain level language until the implementation meets its specification.

A rather well-defined abstraction I admire is the OSI model [0]. A software system that emulates this model, built from solid generalizations like lambda and higher-order functions, tends to be quite strong in the sense that reasoning about it can be done in isolation from layers below or above.

Procedural code, if not well contained and isolated, easily loses this ability and requires the programmer to enumerate the decision tables in their head and all of the possible effects that could be caused by different inputs... such a waste of time. I've been there, done that. Not for me. I like small functions as long as I have ways of composition by means of higher-order functions.


I call shenanigans. Lambdas can easily lead to leaking callbacks as your implementation.

Similarly, monadic style can leak if you are sloppy with it. Just see code in Haskell where the IO Monday has infected the entire codebase, and not just a boundary.


Fun autocorrect barf: monad <- Monday. That being said, I hate IO Mondays too, they're the worst to start the week.


I kind of agree with you (I think), but I'd present (or think) about it differently to avoid terminological squabbling.

More specifically: I'm skeptical of the distinction that your referenced article attempts to introduce between generalizations and abstractions: that distinction feels after the fact, trying to retcon into existence some classification that just isn't that clearcut. Regardless of what's correct terminology here, emphasizing the distinction simply leads to semantic quibbling, not better programming practices. The criticism that "abstractions always leak" might apply equally to generalizations - or encapsulations. Whatever you're going to call it; the idea at some level is that you're never entirely free of implementation details.

To be clear: I agree the words may well have different connotations and even different meanings, but that does not imply that they are mutually exclusive: what's an abstraction from one angle may well be an encapsulation viewed from another, and a generalization in some other context. And perhaps some aspects of a given construct aren't relevant when you consider it to be an abstraction rather than a generalization; but I'm not sure it matters here.

Instead, I think it should be emphasized that there are two different ways of looking at the statement that "all abstractions leak":

- You can consider that any abstraction in an executing program is merely a way of thinking about something that's ultimately a real physical process; and that that physical process may well have relevant behavior that escapes the confines of your abstraction. For example, the abstraction that your mergesort is O(N log N) is in some sense leaky because more memory needs longer wires means slower access so it's false - in some sense.

- You can consider clearly self-consistent and completely rigorous mathematical concepts that might be used as abstractions as a kind of counterexample. Clearly, there are thus abstractions that do not leak. Mergesort is O(N log N) - not in physical time, but in # of operations. We sort of "define" the problem not to exist.

Frankly: I don't think the "physical process" view (while certainly valid!) holds much value beyond that of a cautionary tale. Rather than give up and say "it's impossible" the lesson should be that it's valuable to choose your goalposts. So, for instance: if you define a correct program that adds N numbers to be one that returns (say) the true sum in at most N milliseconds, you're going to run into trouble: you may not be able to guarantee you won't be swapped out; you may run into trouble if your numbers are absurdly large; and you can't rule out that asteroid impact that's about to destroy the computer. But you can carefully choose those goalposts such that if the program completes, and if (condition XYZ) then it returns the numerical sum.

Does that distinction seem pointless? I don't think so; because as it turns out in the real, physical process world this allows us to separate responsibilities. It's fairly easy to ensure a physical environment that will with high likelihood (albeit not fully 100%) satisfy the requirements you pose. And by separating responsibilities like that, it becomes easier to avoid needlessly leaky abstractions, and those abound too. In short: you may not be able to make an abstraction that's reliably perfect, but you should be able to make abstractions that given certain preconditions are perfect. And that matters because it's easy to compose perfect abstractions, and it's not even hard to keep most of the complexity in the perfect and leak-free world.

In short: "all abstractions leak" is thoughtcrime that can all too easily hide lazily constructed, poorly chosen abstractions that don't compose well :-).

I prefer to think that it's inevitable that some abstractions leak, but if you're clever (and perhaps a little lucky) you can contain the leakiness such that other abstractions are leak-free. And if you want, call those latter abstractions: generalizations. But programming can choose to be almost all about those generalizations, not the leaky abstractions.


Thank you for posting this.

We see this a fair bit in formal methods of software construction. I can write a specification that proves that given a precondition a particular state will eventually result in a consequent state. The model is mathematically sound but it doesn't state when that transition will occur or how long it will take. And for the purposes of the specification it is likely not important!

The level of detail that is important should be deliberately chosen. A specification of a distributed system can choose to abstract away the details of the TCP protocol while still proving that the properties we do care about, consistency perhaps, are maintained.


Author here.

> "The claims that "all abstractions leak" and that adherence to DRY makes code "hard to follow" is what gives me this impression."

That's a simplistic -- and cherry-picked -- interpretation of my post, I'm afraid. DRY doesn't inherently make code harder to follow, but an explosion of ultra-small functions (sometimes done in the name of DRY) as advocated for by Fowler and Martin and their ilk most certainly makes the code a lot harder to read.

I'm afraid I'm not against abstractions either - I'm only questioning whether the bottom up form of thinking we generally tend to use is the best mental model around and whether it's doing us a disservice.


Abstract abstractions, which simplify things that aren't real, don't leak when you design them well. Concrete abstractions, which simplify things you didn't create, always do leak.

Only the second type is essential for programming, and many people do use only them.


The goal is not to make small functions for the sake of making small functions, but it's to compartmentalize some functionality into a nice, easier to reason about thing (function).

Then you compose these easy to reason about things into more complex, but yet still easy to reason about things.

For people like me who struggle to maintain multiple layers of complex abstractions in our minds, being able to see a small function and say, "Ok, I trust this one - it does X." makes it easier to navigate up and down through the abstractions.

Perhaps part of my appreciation comes from living in Clojure and Elixir (after many years of several OOP languages).


Small functions are certainly more pleasant to work with in languages that support first-class function composition. Context switching is less of a problem when you can build large functions with only a couple lines of code, just by stitching together multiple small functions.

One big reason why Unix's "one thing well" philosophy works so well for Unix is the ability to chain commands together cheaply, with pipes and redirects.
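In JS terms, a toy sketch of that stitching (made-up names):

    // Small, single-purpose functions...
    const trim = (s) => s.trim();
    const words = (s) => s.split(/\s+/);
    const count = (xs) => xs.length;

    // ...composed into a "large" function in one line, much like a shell pipeline.
    const pipe = (...fns) => (x) => fns.reduce((acc, fn) => fn(acc), x);
    const wordCount = pipe(trim, words, count);

    // wordCount('  small functions compose cheaply  ') === 4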


Author of the article here.

>The goal is not to make small functions for the sake of making small functions, but it's to compartmentalize some functionality into a nice, easier to reason about thing (function).

This tendency is exactly what the article highlights -- this need to compartmentalize, when taken to extremes, makes code a lot harder to read.

Not everyone does take it to extremes, but many programmers are partial to what I call "the smallest viable function" syndrome, and ergo don't stop compartmentalizing until they've abstracted away every last piece of logic. The article states that:

"Thus, a “single level of abstraction” isn’t just a single level. What I’ve seen happen is that programmers who’ve completely bought in to the idea that a function should do “one thing” tend to find it hard to resist the urge to apply the same principle recursively to every function or method they write."

>For people like me who struggle to maintain multiple layers of complex abstractions in our minds, being able to see a small function and say, "Ok, I trust this one - it does X." makes it easier to navigate up and down through the abstractions.

And that's the problem - needing to maintain what you very aptly call "multiple layers of complex abstractions". This especially hurts programmers new to the codebase (or worse, the language), since they have to juggle so many different layers of complexity. The article calls for reducing this complexity, instead of stacking more and more layers of abstractions in the name of "clean code".


Shortness in a function is correlated with quality in design, but it doesn't cause quality in the design.

When we simply follow formulaic advice (keep all of your functions short) we lose sight of the wisdom behind why this was wanted in the first place.

The goal is to develop the wisdom that makes you a great engineer, not to "follow all of the rules"


> The goal is to develop the wisdom that makes you a great engineer, not to "follow all of the rules"

This is why books like Clean Code can be harmful. It can be extremely dogmatic if blindly followed, which is very common unfortunately.


Clean Code can be harmful? Or the person dogmatically applying the advice?

Martin warns against such blind dogmatism. He even goes so far as to explain the thought process behind most of his recommendations so that one may consider whether the rule applies or not.

On balance, Clean Code does far more good than it does bad.

I have yet to see these dogmatic Clean Coders blindly applying Martin's advice. Not in industry and certainly not in open-source. It's painful to work with most OSS.

Eugene Schwartz, one of the godfathers of advertising, when asked, "How long should an ad be?" liked to say... as long as you can hold the reader's interest.

Much like Martin would say when asked how long a function should be... as long as you're working at the same level of abstraction.


> I have yet to see these dogmatic Clean Coders blindly applying Martin's advice.

You're lucky. I've seen it plenty of times. In one extreme case we even had to let go a perfectly (formerly) competent dev after he read that book and went crazy (yeah, seriously).


I've always found people who read his book to be more dogmatic than Martin and anything in that book. They usually speak in absolutes, and cite the book when challenged. Then I go point out his blog posts where Martin is more nuanced. I believe his book also has a number of disclaimers as well (I have not read it myself).

Overall, I agree with quite a few of his "principles". Just don't apply them blindly.


I'd be interested to see what his code looked like and how the code reviews progressed before firing him. I think these discussions are important, however, we must ground them in a concrete reality. That's what I appreciated about Clean Code, it was grounded in addressing real live code. And the reader is able to choose whether they agree with the conclusions or not.

With that said, all too often, like in this article, we get caught up in cute hypothetical scenarios and examples. There were no real-world examples given, just a bunch of hand-waving.

So in the interest of moving the discussion forward in a concrete way, I'd ask you what you think of these three C lexers; one written in Go by yours truly, GCC's, and LLVM's:

1. https://github.com/denzel-morris/clex/blob/master/lex/lexer....

2. https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/c-family/c-l...

3. https://github.com/llvm-mirror/clang/blob/master/lib/Lex/Lex...

I always find it helpful to look at one problem solved different ways. Of course it's not always apples-to-apples but it's as close as you're going to get. Out of these three codebases, there is probably one you'd feel more comfortable working with. I merely threw mine in there because I wrote it as an exercise on what it'd be like to have a hand-written lexer read more like a story.

It has a focused interface, most functions are descriptively named and operate at a single-level of abstraction, and it's easy to intuit what a C lexer does even if you're not familiar with what they're supposed to do. I'm sure it'd be easy for most people to jump into my code and contribute.

Just like with LLVM's lexer. LLVM is far easier to jump in and contribute to than GCC.

However, and this is just an example that immediately jumps out to me since we're talking about it... doesn't it make sense to have this comment and code abstracted out into an appropriately descriptive function instead:

  void Lexer::InitLexer(const char *BufStart, const char *BufPtr,
                     const char *BufEnd) {
    // ...
    // Check whether we have a BOM in the beginning of the buffer. If yes - act
    // accordingly. Right now we support only UTF-8 with and without BOM, so, just
    // skip the UTF-8 BOM if it's present.
    if (BufferStart == BufferPtr) {
      // Determine the size of the BOM.
      StringRef Buf(BufferStart, BufferEnd - BufferStart);
      size_t BOMLength = llvm::StringSwitch<size_t>(Buf)
        .StartsWith("\xEF\xBB\xBF", 3) // UTF-8 BOM
        .Default(0);
  
      // Skip the BOM.
      BufferPtr += BOMLength;
    }
    // ...
  }
I mean think about it... we're concerned with initializing the lexer, why am I being bothered with this string manipulation minutia to determine whether there's a BOM or not? That could be extracted into a properly named function, the comment eliminated, and then I could trust that that function does exactly what it says on the tin can.

And there's plenty more places where this line of reasoning applies too.

Jekyll (https://github.com/jekyll/jekyll) is another example of a codebase that's very easy to read, work with, modify, etc. It's very well-written. It follows Clean Code like principles where it makes sense.

On the other side of the coin we have Kubernetes (https://github.com/kubernetes/kubernetes) which looks ready to collapse under its own complexity. It's supremely difficult to read and understand what's going on not because of the essential complexity of the problem but because of all the incidental complexity added by the code structure.

I could spend forever citing examples on both sides because I've spent a great deal of time thinking about this and talking with other seasoned developers.

If you (or anyone really) could, please offer up concrete examples grounded in real production code. I'm always interested to see more examples.

Specifically, examples of blind Clean Code dogmatism applied in the wild.


I don't know Go, but your example looks absolutely beautiful.

On the other hand, I'm one of those "smaller than that" people :) I mean, I am ashamed that I left the time-to-string function in [1] too long. It could definitely be split up (and I find value in that, I just haven't had time / a reason to revisit that code).

[1] https://github.com/mdpopescu/public/blob/master/SocialNetwor...


Thank you, I really appreciate that.

I understand exactly how you feel. I think your way of attacking it is the right way -- pragmatism. You realize it can be improved/split but (1) you haven't had time, (2) there's been no reason to revisit the code, and (3) it's serving its purpose. That makes sense to me.

It takes time. Time and many passes to refactor code. The idea being that as a developer becomes more experienced: the amount of time and the number of passes should begin to decrease. At least that's the idea.


That is a blog article I would read.

Doubly so if it was on Coding Horror.


Aside from the clickbait-y title, I find it quite disturbing to base that whole piece on, what even the author describes as, problems in "codebases inherited from folks who’d internalized this idea to such an unholy extent".

Indeed, small functions can be bad if you completely and utterly overdo it. But wait, that's true of nearly everything else.


Yeah, the idea of small functions, and the overuse of DRY are separate.


Author of the article here.

Nope, the article doesn't conflate DRY and small functions. But the quest for DRY can lead to an explosion of small functions, which isn't necessarily a good thing.

The reverse is true as well - many programmers, in their quest to make functions as small as possible, end up DRYing it up to the fullest extent as well.

There's a relationship between the two, but DRY and small functions aren't synonymous themselves, and neither does the article suggest that anywhere.


Obsessive decomposition of the sort Fowler, for example, is cited preferring, very quickly becomes pathological - I suspect that the codebase Fowler describes in that tweet, to the developer as yet unfamiliar with it, reads like one of those old IBM field engineer manuals where a giant circuit diagram is spread across 800 letter-size pages, all bordered with dozens of numbered arrows each referencing the page on which a given trace continues.

But I sort of feel like Sridharan throws the baby out with the bathwater, too. I mean, in the CRUD example, carefully chosen abstractions make the code easier, not harder, to read - if I'm working to comprehend how the application handles UI state changes, I don't want to have that effort complicated by a bunch of user-creation-related database interaction; I'd much rather that be in a method call so I can deal with it when I care about user creation, and ignore it when I don't. Same for email messaging and event log injection.

And I have to say that my experience gives me to think the idea of preferring duplication over abstraction is just completely, wildly off base. I mean, sure - any given abstraction is likely to change over time as feature requests and bug reports come in. That's just the job. But if the stuff that needs to change is abstracted, it only has to change in one place. If it's not abstracted, then it has to change in N places across the entire application, not all of which are guaranteed to be easy to find - after all, you probably don't have distinctive method or function names. Hope you've got good tests! Except you don't. Or maybe you do - I never have, at any point in my career where I've worked with a codebase in which copy-pasted code was prevalent, because such a codebase is a sign of an engineering culture that's far too weak to support investment in automated testing.


    But if the stuff that needs to change is abstracted, it only has to change in one place.
IMO, the point of "prefer duplication over the wrong abstraction" is that "all the stuff that needs to change" is a partially-to-totally unknown quantity. You can either guess at the time you have the minimum possible amount of knowledge (when the code is initially written) or wait and make that decision later with more evidence.

Even good tests won't save you from premature abstractions with the seams in the wrong places when a change request that cuts across them comes in; they will help as you try to disentangle multiple use cases that go through the same code paths that now need to be different, but it's still a mess. Or worse - you might not disentangle them, and wind up with "generic" code littered with 'if special_case' fragments.


Better to make the case against premature abstraction as such, then, without bringing duplication into the mix. "Don't abstract before you know for sure what needs abstracting" conveys the same valuable point without the same confusion of concerns; so, more concisely, does YAGNI.


The right abstraction means it has to be changed in one place. The wrong abstraction means the change cuts across abstraction boundaries, which means changes all over the place, or having to de-abstract and duplicate before making the change anyway. If that happens frequently, it is a smell that you are abstracting too early and thus building bad abstractions.


> The idea that functions should be small is something that is almost considered too sacrosanct to call into question

Errr... Really?! I thought we all agreed that the first rule of programming style is that "it depends"...

When they say "small functions", they mean "not the 5000 loc VBA macro that has 50 Boolean arguments, and 30 side effects".

Breaking a function that does 1 thing into sub functions just so that each of them is smaller is not a good thing. And I think people realize that fairly quickly.


> When they say "small functions", they mean "not the 5000 loc VBA macro that has 50 Boolean arguments, and 30 side effects".

Clearly, though, not all of them do. From the article, for context.

> The first rule of functions is that they should be small. The second rule of functions is that they should be smaller than that.

> In my Ruby code, half of my methods are just one or two lines long.

> Any function more than half-a-dozen lines of code starts to smell to me, and it’s not unusual for me to have functions that are a single line of code

> I’ve worked on codebases inherited from folks who’d internalized this idea to such an unholy extent that the end result was pretty hellish and entirely antithetical to all the good intentions the road to it was paved with.

It seems to me there are a lot of people who espouse making your functions extremely short, and for who a 30 line function counts as long.


I've heard rumors of people operating under a "all functions must be one line" rule, though I've never encountered it in real life.


My linter throws up a warning if there are more than N lines, M private variables, P branches, or Q return calls (or more than 80 characters in a line. Grr).

So, yeah. People have definitely internalized these as the law and no longer question it.


Define "a thing". This is the crux of all of these debates, IMO. Most people agree a function should do one thing, people tend to disagree on the granularity of things.


Which is precisely why debates like these can occur in the first place, it's not an exact science. Which is also why the parent started with "It depends".


Yes, it is, actually. If your function has a logical branch, then it is doing more than one thing. If your function has a logical branch within a logical branch, then it is doing exponentially more things, and this complexity continues to grow exponentially.

Therefore, if you have a logical branch, you ask yourself: does it make sense to create a function here? The answer may be yes or no. If you create a nested logical branch, you ask yourself: does it make sense to create a function here? The answer still may be yes or no, but the weight of the evidence for yes has increased exponentially.
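A toy example of what that weighing looks like (contrived, of course):

    // The nested branch is a candidate for extraction...
    function shippingCost(order) {
      if (order.international) {
        if (order.weightKg > 20) {
          return 80;
        }
        return 40;
      }
      return 10;
    }

    // ...but whether to actually pull it out is still a judgement call; the nesting
    // only raises the weight of the evidence for "yes".
    function internationalShippingCost(order) {
      return order.weightKg > 20 ? 80 : 40;
    }
    function shippingCostSplit(order) {
      return order.international ? internationalShippingCost(order) : 10;
    }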


I blame "Clean Code"; it's recommended reading, but at its core it dumbs down refactoring to mechanically factoring out common code without actually crafting abstractions. I'm still in the process of unlearning this. The prime directive should be "craft sensible abstractions", not "avoid duplication at any cost"; even more so when we're talking about actually modular software, where duplication is much easier to tolerate than blurring responsibility lines. (I'm generally not a big fan of Uncle Bob.)


In a program, there's a lot to optimize for

    - minimize depth of call stack. It's easy to lose track in a deep call stack.
    - minimize function overhead
        - names that have to be invented
        - function calls and arguments that have to be written (and read again).
        - many small functions: "ravioli code" where it's really hard to distinguish functions
          by their "function"
    - minimize/localize state
       - if it's easy to separate significant state into a function, do it.
         (Not making a statement about objects. They are long-lived and their
         state doesn't go away after the first call).
    - DRY
       - Multiple identical or similar code blocks are an opportunity to make
         a function. We can roughly categorize into essentially (conceptually)
         and accidentally similar code, and the latter case is not an indication
         for a new function.
    - obvious thread of control
       - simple code has the nice property that it's easy to understand by
         following or mapping out sequentially. By contrast, highly abstracted
         or callback-heavy code is hard to understand at a global level.
    - consistency
         - ... helps understanding, but can also cause an implementation to be
           5 or 10 times longer if applied dogmatically.
It's good to read all these articles ("avoid state", "avoid long functions", "avoid objects", "avoid functional programming", "avoid abstraction") and to deeply understand them. Which probably means making all the mistakes on one's own.

In the end it's important to know in which situations a particular style works well, and to not be dogmatic when choosing a style.

I would say a good program tends to have both large and small functions, and the large functions tend to be at the top of the abstraction stack.


IIRC the original concept of DRY wasn't "Don't Repeat Yourself" (that's the pithy version that's easy to remember but wrong). I think it was more like "A piece of knowledge should be found once in a code base", which isn't quite as easy to follow or remember but cuts to the issue more deeply.

The problem isn't superficial duplication. I might do a .map(x->x*2) in multiple places, but if they represent different pieces of knowledge (money in one, age in another) then I may not be Code DRY but I'm still Knowledge DRY. An easy way to tell the difference is to think about changing one of your functions. If you'd be happy that it changes the behavior everywhere that function is used, then you're Knowledge DRY. If you're scared, you're just Code DRY.
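A contrived illustration (names made up):

    const donations = [10, 25];
    const weights = [0.2, 0.8];

    // Two superficially identical expressions...
    const matchedDonations = donations.map((x) => x * 2);  // employer matches each donation 1:1
    const scaledWeights = weights.map((x) => x * 2);       // a rebalancing step doubles each weight

    // A shared double() helper would make this Code DRY, but the two lines encode
    // unrelated pieces of knowledge: if the matching policy changes to 1.5x, only the
    // first line should change. They already are Knowledge DRY; merging them would
    // couple knowledge that merely happens to coincide today.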


Nicely put. The problem with DRY is it is expressed in terms of "repeating", so it suggests that code repetition is the problem. The flip side of achieving DRY is breaking isolation between two pieces of code. That's a downside - so DRY is a tradeoff, not a simple one-sided win.

The real questions are things like "Is it likely if I change logic A, that B will also need to change in the same way?" and "Is it likely that if there's a bug in B, that would also be a bug in A?". If those things aren't true, you are shooting yourself in the foot because you will certainly be going in and changing A and B to maintain them and now every time you do that for one you are at risk of breaking the other.

DRY is usually right, but it definitely comes at a cost.


I think the author doesn't understand the small-function-philosophy (at least not the same way I do). Let me clarify how I see it:

- Build a ton of small functions that are reusable across any project. You are essentially making useful concepts. (i.e. a library). Bottom-up.

- Once you have those, a business problem often will only be about 3 or 4 of those powerful, reusable functions.

So something like sending out a newsletter might end up being:

    function sendNewsletter(letter) { database.getAllUserRowsAsIterator().forEach((row) => { sendmail(row.address, letter); }); }

Now if we want the whole newsletter not to fail if there's a single exception, we can make another reusable construct "count exceptions" that's a wrapper function that catches all exceptions and builds a hashmap.
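A rough sketch of what that wrapper might look like (all names are made up):

    // Hypothetical "count exceptions" construct: wraps a per-item function so that one
    // failure doesn't abort the whole run, and tallies the exceptions by message.
    function countingExceptions(fn) {
      const errorCounts = new Map();
      const wrapped = (item) => {
        try {
          fn(item);
        } catch (err) {
          errorCounts.set(err.message, (errorCounts.get(err.message) || 0) + 1);
        }
      };
      wrapped.errorCounts = errorCounts;
      return wrapped;
    }

    // usage:
    // database.getAllUserRowsAsIterator()
    //   .forEach(countingExceptions((row) => sendmail(row.address, letter)));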

If you want this to work in a larger project, this requires having reliably unit-tested code and doc-blocks so that other people can reuse your abstractions, and then having roughly comparable coding skill to you.


Functions have several functions.

* Functions can be like new words in a language. If there is new concept that is used frequently, it should be named or abbreviated and maybe listed in a index.

* Functions can sometimes be like chapters or sections. They are entry points listed in table of contents (API description)

* Functions are sometimes used like paragraphs. Used only once to make very long section easier to read. Not really functions. Way to structure text.

For paragraph use we might want blocks of code with a function-like interface. Code editors could collapse and expand them.

    let a, b, c = 100 
    let p = 0

    paragraph "Intro to foo" (constant a, modify p) {
        let k, l, m, ...

    } assert (p < a)


Function borders should be drawn by defining abstractions, not by counting lines of code. Sometimes, the most useful abstractions involve large functions.


Do you have examples?


The first one that comes to mind is main().


There were no real examples of when long functions helped. I can come up with 30 or so examples of long functions being bad in the codebase I am working in now.

I stopped reading about halfway through, at the contrived DRY twitter post. I did search the article for "test" and a few other keywords and found more flowery words without substance.

All the systems I have seen with long functions have had more bugs and greater difficulty to test. All the systems lacking DRYness have invariably had bugs because things weren't updated in all the places they needed to be. I have seen a clear correlation. I know that correlation is not causation, but the mechanism is simple and easy to understand: long functions have more places for bugs to hide. The mechanism for DRY to help is equally simple: DRY code means fewer things need to be changed to fully make a correct change. Both of these reduce the places for bugs to exist, and my experience matches that.

I understand that my experience is that of only 1 person, but I have held 9 different software development contracts in the last decade and seen the full range of long to short functions and I have seen DRY and non-DRY code. This pattern has followed into open source code I have inspected. If I am wrong it will take strong concrete examples of DRY or short functions hurting, and I mean links to repos like github.


Author of the article here.

That's primarily the reason why there are no "concrete examples", because one person's concrete example is another person's definition of contrived. Splitting hairs over some toy example wasn't something I thought would buttress the ideas presented, though I can imagine why some might need that scaffolding to follow along.

Re testing - here are some points the article makes about how smaller functions can in some cases hurt testing:

1) "Furthermore, when the dependencies aren’t explicit, testing becomes a lot more complicated into the bargain, what with the overhead of setting up and tearing down state before every individual test targeting the itsy-bitsy little functions can be run."

Disambiguating this for you, one of the myths of smaller functions is that they are easier to test. The article makes the claim that this isn't always true, because many who wax lyrical about the beauty of smaller functions also champion fewer arguments being passed to the function. The book Clean Code very explicitly states this (read the book, not as a how-to guide but as a cautionary tale), and many a time what I've seen happen in the wild (especially in Ruby) is that programmers who like small functions also don't like making deps explicit, which then means the code (and ergo tests) rely on setting up shared global state.

Of course one can write smaller funcs with fewer args and not do this, but that's not the point here. The main argument is that making functions smaller doesn't always make it easier to test.

The article also does provide two examples when having the smallest possible function does help in testing. I'm not going to rehash those arguments again here.


I have never read Clean Code, and bringing it up as an argument against me and my points is a straw man. I don't think you meant for that though. Most of my knowledge was hard earned from a couple of decades of cleaning up disgusting long functions.

As for testing, it is trivial to create examples of larger functions that cannot be tested; each line of code introduces another possible thing that gets in the way of testing. I provided a contrived example (which is better than no example). The summary of it was that if a function creates a connection to an external resource, uses it, then disconnects, there is no way to test that function. Simply breaking it into three functions allows testing of the logic that uses the resource. Adding parameters that allow the resource or resource creator to be passed in allows testing of all three functions.
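Roughly what I mean, as a sketch (the helpers are hypothetical):

    // Hard to test: the function creates the connection, uses it, and disconnects.
    async function importUsersAllInOne() {
      const db = await connectToDatabase();               // hypothetical helper
      const rows = await db.query('SELECT * FROM users');
      const active = rows.filter((u) => !u.deletedAt);
      await db.close();
      return active;
    }

    // Split up: the logic becomes a pure function and the resource is passed in, so
    // tests can cover filterActive() directly and importUsers() with a fake db object.
    function filterActive(rows) {
      return rows.filter((u) => !u.deletedAt);
    }

    async function importUsers(db) {
      return filterActive(await db.query('SELECT * FROM users'));
    }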

Skipping examples because someone will complain about how contrived they seem is to throw the baby out with the bathwater. As it stands your point is not falsifiable, because it is possible for people on your side to just say "that's not a good long function", just as with "no true Scotsman". With examples you can at least ignore people who complain about the contrivance of them, just as most successful technical speakers at conventions do.

> The main argument is that making functions smaller doesn't always make it easier to test

Only a Sith argues in absolutes... Sorry, I had to.

Seriously though, no one seriously argues that it "always" makes it better. It just makes it better such a preponderance of the time that arguing against it is foolish. It is like arguing for reasonable uses of goto: they might exist, but we as an industry have moved on to better designs.


I don't think it's really a contest between 100 line functions and single line functions. It's more about 5 - 20 line functions and 1 - 2 line functions. I could come up with lots of examples where being able to see the entire scope of 10 lines of logic is clearer than breaking the same thing into 5 functions.


The loss of locality argument is a very good one. Having to jump around different files whilst holding layers upon layers of abstractions at the back of your mind to figure out the source of a bug is overwhelming and extremely distracting (you literally need to keep a call stack inside your brain to pull that off).

DRY is all fine until you need more information about the function than what the function name and documentation can provide - Sometimes you actually need to peek inside the code itself.

I don't think that code can ever be 100% black box; especially as you move up higher in the chain of abstraction. This is particularly true for dynamically typed languages - These days in JavaScript I often find myself peeking inside the function's code before I invoke it - It's very easy to make false assumptions about the behaviour of the function based on its name alone and often the documentation isn't enough and doesn't tell you anything about the performance of the function (is it O(1), O(n) or O(n^2)? - You wouldn't want to call an O(n) function whilst in a loop over n because then you'd get very crappy O(n^2) performance).


Functions exist to prevent code duplication, not to comment code; that's what comments are for.


I believe most rubyists would disagree with you.

I had a Ruby-loving coworker who made you feel like a moral failure if you commented your code instead of making it 'self-commenting'.


This discussion is predicated on the concept that function size is measured by the number of lines, which is completely wrong.

Function size (function complexity, actually) is measured primarily by indent levels, not length. When there are multiple indent levels with nested branches and loops, that is when you are supposed to create functions. Length is not really an issue in most cases.
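
A rough illustration (a toy example of my own, not from the article): it's the nested branch-and-loop, not the overall length, that earns its own function.

    // Nested version: the inner branch-and-loop is the complex part.
    function orderTotals(orders) {
        const totals = [];
        for (const order of orders) {
            let total = 0;
            for (const line of order.lines) {
                if (line.taxable) {
                    total += line.price * 1.2;
                } else {
                    total += line.price;
                }
            }
            totals.push(total);
        }
        return totals;
    }

    // Extracting the nested logic flattens the indentation;
    // the overall line count barely changes.
    const lineTotal = line => line.taxable ? line.price * 1.2 : line.price;
    const orderTotal = order => order.lines.reduce((s, l) => s + lineTotal(l), 0);
    const orderTotals2 = orders => orders.map(orderTotal);

    console.log(orderTotals2([{ lines: [{ price: 10, taxable: true }] }])); // [ 12 ]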


Most of the arguments in the article are agnostic to the size metric - offhand, I cannot think of any claim that is invalidated by changing to your measure. Furthermore, it is the proponents of short functions quoted in the article who are using line count as a metric, so it is not as if the author of this article has created a straw man by choosing a misleading metric.


Most of the arguments in the article seem agnostic to the size of the function altogether, which is hilariously ironic given the complaint about the "long functions are bad" example at the start of the book.


Most of the arguments concern the side-effects of splitting a given unit of work into a lot of functions, which is an inevitable consequence of the strategy being deprecated.


I feel like an article such as this should be full of code examples. Just talking abstractly about this or that situation where "it depends" makes me sleepy.


Even the highest voted post here is commenting on the value of examples and lack of value in higher order discussion.

I must discard the post, because everyone I have known professionally with similar opinions was grossly incompetent. Some strong examples would have made it foolish for me to disregard it out of hand.


I wrote something myself trying to figure out my thoughts about this a few years ago (note: I was sleep deprived when writing it...):

https://madprof.net/nerdy/refactoring-really/

There has certainly been an over-emphasis in some quarters on 'each function should only do one thing' and even 'most of my functions are only one or two lines long'. Possibly because it makes the code easier to write tests for, and to be certain you've covered all of the possible options. You just then end up writing 4 million tests.

It's all about clarity, I think, but different people find different things clearer in different situations.


Programming is like writing. There are some general rules that will usually make your meaning clearer. There are also exceptions. Begin following the rules, prioritize clarity and semantics, identify when breaking a rule would make your meaning clearer and break away.

At the end of the day, you're still just communicating--in communication clarity is key.

You must also keep your audience in mind. Thus, if your company follows conventions you stick to them, as your audience will be able to digest said conventions quickly.

It's all language.

Programming only has the additional wrinkle that you are also communicating with a computer--which prioritizes very different things than human readers (efficiency, memory management, etc.)


This was interesting:

> an awful lot of times, pragmatism and reason is sacrificed at the altar of a dogmatic subscription to DRY, especially by programmers of the Rails persuasion.

I'm not making this a tribal thing but it does sometimes feel like there's a cultural tendency in the Ruby and Javascript communities to... well... preach a little bit. All advice is flawed and most maxims are only partially true.

If a community bounces from one "one true way" to another all the time then it's probably not a particularly healthy environment for those who are learning - as they tend to lack the experience to put advice into the correct context.


If you want to have an easy time refactoring code later, forgo OO patterns that have properties and methods in the same class. Instead, make classes for your data, and make sure to give each class a deep clone method.

Then, your logic goes in static functions that do not alter the input, but rather spit out new or cloned versions of the data in the output. Then you can reason and refactor at the method layer and not worry about hidden side effects.
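
If I'm reading you right, something roughly like this sketch (the names are mine, invented for illustration):

    // Data-only class with a deep clone.
    class Cart {
        constructor(items = []) {
            this.items = items;
        }
        clone() {
            return new Cart(this.items.map(item => ({ ...item })));
        }
    }

    // Logic lives in static functions that never mutate their input.
    class CartOps {
        static addItem(cart, item) {
            const next = cart.clone();
            next.items.push({ ...item });
            return next;
        }
        static total(cart) {
            return cart.items.reduce((sum, item) => sum + item.price, 0);
        }
    }

    const before = new Cart();
    const after = CartOps.addItem(before, { name: "book", price: 12 });
    console.log(before.items.length, after.items.length); // 0 1 - no hidden side effects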


That seems a bit extreme compared to where many teams are at, but I can see how it could work.

How would you compare that to reducing the size of classes to bound where side effects can happen?


Better Stated: (overly) DRY considered harmful.

This is one thing about Go programming culture and its prevalent idioms I appreciate: low context switching cost.


1. Functions which are "longer than they should be" decrease code quality.

2. Measuring the "number of lines per function" is extremely easy.

Code quality tools have a bad tendency to equate (1) and (2) for the same reason that the proverbial drunk searches for his keys under the streetlight. This has legitimately bad consequences.

This is further compounded by the way that small functions ease mock-based testing. While certainly attractive in the abstract, when a code base is overly influenced by this I find that it is substantially more difficult to understand via inspection.

All that said...I find the whole "X considered harmful" formulation almost unbelievably annoying. Here it doesn't even make any sense.


>>> If you have to spend effort into looking at a fragment of code to figure out what it’s doing, then you should extract it into a function and name the function after that “what”.

Fowler is right about smaller functions and OP misinterpreted his statement.

This is what Fowler means https://gist.github.com/hbt/3e71146454a2d6388338af1d76394a13

Abstract the fragment of code into a function, keep it within the original function until you need it elsewhere, and do not needlessly pollute the API with poorly named functions that are used only once.
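
As I read it, something along these lines (a sketch with hypothetical names, not the gist's actual code): the fragment gets a name, but it stays inside the function until another caller actually needs it.

    function activeSubscriptions(subscriptions, now = new Date()) {
        // Named after the "what"; stays local until another caller needs it,
        // so nothing extra is added to the module's public surface.
        const isExpired = sub => sub.endDate.getTime() < now.getTime();

        return subscriptions.filter(sub => !isExpired(sub));
    }

    console.log(activeSubscriptions([{ endDate: new Date(2099, 0, 1) }]).length); // 1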


My theory is that the small functions motif appeared with practitioners of languages where the function is the only tool introducing a new scope / block for variable definitions. Languages like Python or Javascript. What about Smalltalk?

In languages where blocks can be introduced at will (stricter Algol descendants like Pascal/C/C++/Java) or are introduced by control structures, such a rule is much less necessary, and only the harm done (friction, readability) by fragmenting and obscuring logic remains.
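
A quick JavaScript illustration of the scoping point (my own sketch): old-style "var" ignores blocks, so a tiny function used to be the only fence around a temporary, whereas "let" now does the job.

    function demo() {
        if (true) {
            var leaked = 42;     // `var` ignores the block...
        }
        console.log(leaked);     // ...so 42 is still visible here.

        {
            let scoped = 7;      // `let`/`const` later gave JS block scope,
            console.log(scoped); // so a tiny function is no longer the only
        }                        // way to fence a temporary in.
        // console.log(scoped);  // would throw a ReferenceError
    }
    demo();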


I've never been served wrong by constantly asking these questions:

1. How hard is this code to change?

2. How hard is this code to delete?

3. How hard is this code to test?

Code that is hard to change, delete, and test is bad. Code that is easy to change, easy to delete, and easy to test is good.

(Note: you can't easily change code if you can't understand it)

Long vs short functions don't really matter. Objects vs functions don't really matter. It doesn't matter if you use or don't use a ton of abstractions if they are easy to delete and change.


One problem with the example of functions

    A()

    B()

    C()
called in sequence is that they smell of imperative code. A recipe of some kind, of steps performed in sequence. That kind of code, when it occurs, should probably just be performed in a single function. That is - until either A, B or C can be re-used by other code without creating an unnatural abstraction. If the steps as separate functions can be tested separately - great. But if they are only ever used in this sequence - do you really need to test them individually? What good does breaking it up into functions do, compared to just some comment text in a longer method?

    // Step A: 
    ... 3 lines
    // Step B:
    ... 4 lines     
    // Step C: 
    ... 3 lines
Answer: very little. And it forces the reader to scroll to see the relevant code. Some languages allow the use of local functions - which is basically just some trickery to help with variable scope, and to avoid needing a comment line just to give the sub-step a name. Can be quite useful.

A better example of 3 functions is when you have

    y = A(B(x))
and then turn it into

    y = A(B(C(x)))
If C can have some kind of semantic meaning (e.g. C just fetches the price of item x before the rebate is applied by B), then in this kind of functional code there is usually very little harm in making more and smaller functions. Not sure where I'm going with this, but I suppose it's kind of an argument for avoiding procedural code to begin with, and aiming to write actual functions.
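
For instance, a tiny sketch with invented pricing steps (not from the article): each layer keeps a meaning of its own, so composing them stays readable.

    const basePrice   = item  => item.listPrice;   // C: fetch the raw price
    const applyRebate = price => price - 10;       // B: apply a flat rebate
    const addShipping = price => price + 5;        // A: final adjustment

    const finalPrice  = item  => addShipping(applyRebate(basePrice(item)));

    console.log(finalPrice({ listPrice: 100 })); // 95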


There are several advantages you are disregarding.

Functions have names and you get to name the code in a way that shouldn't expire the way comments can.

In most languages separate functions create separate scopes. This prevents the incidental use of temporaries from one section in another.

Having a smaller scope to look at means that refactoring or changes for new business requirements have fewer places where side effects can occur.

I agree that testing each part on its own isn't required, but if it becomes part of the public API, and A sets up the database connection, C tears down the database connection, and B uses the connection, then as one function B is really hard to test. Breaking it into three functions lets you write unit tests for at least B by mocking A and C.


> Functions have names and you get to name the code in a way that shouldn't expire the way comments can.

Sure, that's a benefit, but oftentimes I think the scrolling is an underrated downside. The code being right in front of you has huge value.

> In most languages separate functions create separate scopes. This prevents the incidental use of temporaries from one section in another

Right. Thought I mentioned that. That's why I think local functions are good. Gives the best of both worlds. It's a proper name instead of a comment, and gives the right scope.

> but if it becomes part of the public API

Yes, obviously API design is a separate (and MUCH harder) question than refactoring to break logic apart, I think.


> but often times I think the scrolling is a downside that is underrated

What editor can't take you directly to a defined symbol, even ones in other files, nowadays?

As for API design, all classes and functions implicitly become APIs for the more abstract code that calls them. Give any surviving piece of code enough time and eventually one part of the code will depend on things provided by what logically seems like entirely different parts of the code. At some point the team will wonder why these aren't separate libraries (or gems, pips, rocks, packages, or whatever), and the API will be whatever the original classes and functions were. And if those things were well encapsulated, that process might be easy.

This is why least responsibility, DRY and the smallest functions possible are important. Eventually everything will be part of an unbroken stack connecting your users to a CPU and to fix problems in the middle you need simplicity.


I don't necessarily disagree with the post, but I will say that in almost 30 years of programming, I can only recall one programmer who took small functions to a harmful extreme. On the other hand, I can recall dozens who took long functions to a harmful extreme. So yeah, it can be bad, but I see the other extreme way more. (That may be because I don't do enterprise programming, though.)


This old article explains it perfectly https://whathecode.wordpress.com/2010/12/07/function-hell/

Over-optimizing for local readability can hurt global readability in the end.


For me, I generally try to break into smaller parts any function whose logic extends beyond one screen in length.

There are of course always exceptions to this, but having logic take up roughly a single screen size makes it easy to reason about.


I have a 4K screen; at some point this pattern stops working.


A function should be X lines long... but no longer

where X is any number of lines necessary for implementing the ONE thing that function should be doing

Any attempt to replace X with a concrete number will invariably sacrifice simplicity for the sake of that number.


I think you can avoid a lot of these problems by using small inner functions.


How about "following a rule blindly without understanding its rationale and limitations considered harmful?". People need to understand what they are doing and why and make judgement calls.


"Considered harmful"?!? Why, this must be both IMPORTANT and LEGITIMATE, and the author has AUTHORITY. When a negative opinion carries such an academic, aloof, and professional sounding headline then I am inclined to give it great credence. This guy didn't just say "here's something I think is bad", he said it in fancy and upmarket words. Let's closely follow what they say.....

Seriously, HN should have code that auto flags anything including a subject line of "considered harmful".


Anything, taken too far, considered harmful.


So, the author thinks this is harmful?:

  var area= ( width, height ) => width * height;
If not, it is just clickbait for me.


If I'm understanding you correctly:

It would be harmful if you were calculating something subject to change, like, say, price.

  price = (cost) => cost * 1.10;
And you sprinkled that throughout your code. The problem arises when the business says they want to be able to change the logic so that it lowers the price for customers who have subscribed for over a year. Now you have to change the logic everywhere it shows up, rather than having a single function where you change it once:

  function price(cost) { /* pricing logic lives in one place */ }
That, in essence, is why DRY is important: code maintenance and refactoring.

Even your example could be dangerous:

  var area= ( width, height ) => width * height;
What if the business was making sandboxes and wanted to be able to add a short wall around each one so cats have a harder time using it as a bathroom? It would then be:

  var area= ( width, height ) => (width + 1.0) * (height + 1.0);
Again, if you had sprinkled that throughout your code, you would have to search and find every instance of that and change it. This is also harder to test, because if you miss one, chances are, it won't show up immediately.



