
Small functions considered harmful - grey-area
https://medium.com/@cindysridharan/small-functions-considered-harmful-91035d316c29
======
whack
During my earlier years, I would get into all types of dogmatic debates, such
as _"DRY-considered-harmful"_ or _"small-functions-considered-harmful"_.
With experience, I've realized that such abstract debates are generally
pointless. Any principle can lead to bad results when taken to an extreme or
badly implemented. That leads people to declare
_that-principle-considered-harmful_, swinging the pendulum to the opposite
extreme and beginning the cycle all over again.

Now, I find such discussions valuable, but only in the context of concrete
examples. Devoid of concrete and realistic examples, the discussion often
devolves into attacking strawmen and airy philosophizing. If this article had
presented realistic examples of small functions that should have been
duplicated and inlined, I think we could have had a much better discussion
around it.

That said, I do have to offer a word of warning. It's possible that the author
is a good programmer who knows how to incorporate long functions in a way that
is still clear and readable. Unfortunately, I've had the misfortune of working
with horrendous programmers who write functions that are hundreds of lines
long, duplicated all over the place, and are a pain to understand and
maintain. Short functions and DRYness are indeed prone to abuse, but they
still work as general guidelines. Great programmers may be able to ignore
these guidelines, but at least they prevent the mediocre ones from shooting
themselves (and others) in the foot.

~~~
jdmichal
I think it's more likely that your "great programmers" simply understand the
difference between the _same_ functionality and _accidentally similar_
functionality. The latter is where you have two use cases that are very
similar, so you spend all this time deduplicating. Then _one_ of the use cases
changes... The correct response would be to duplicate the code again, because
the two use cases are no longer similar. In reality, they should never have
been combined in the first place. They weren't the _same_; they were only
_accidentally similar_.

But instead what you usually see is minor tweaks to the common functions. Pass
in a flag here, tweak the inputs there, add an if statement over yonder... And
before you know it, it's all a terrible tangled mess that is full of branches
and technical debt. The two use cases have the same functions, but don't even
follow the same branches within the functions.
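
A minimal Python sketch of that drift, with made-up names (`format_label`, `invoice_label` and `shipping_label` are hypothetical, not from the article). The first function is the shared helper after a few rounds of "minor tweaks"; the two below it are the same behaviour kept as deliberately separate functions.

```python
# Hypothetical illustration: two use cases that only *looked* the same.

def format_label(name, amount=None, for_invoice=False, uppercase=False):
    """Shared helper after several rounds of 'minor tweaks'."""
    text = name.strip()
    if uppercase:          # only the shipping caller wants this branch
        text = text.upper()
    if for_invoice:        # only the invoice caller wants this branch
        text = f"{text}: {amount:.2f}"
    return text


# The same behaviour as two independent functions, duplicated on purpose.
def invoice_label(name, amount):
    return f"{name.strip()}: {amount:.2f}"


def shipping_label(name):
    return name.strip().upper()
```

Each caller of `format_label` takes a different path through it, which is the tangle described above. The separate versions share only `name.strip()`, and each can change without touching the other.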

~~~
jacobush
Thank you for putting that into words, and rather few too. I have only _felt_
that anti-pattern before, not had words for it. Does it have a name? If not,
we should give it a name!

~~~
whack
I believe this pattern is referred to as accidental-duplication. Some existing
discussions on this:

[https://softwareengineering.stackexchange.com/questions/3000...](https://softwareengineering.stackexchange.com/questions/300043/illusory-code-duplication/300090)

[http://www.informit.com/articles/article.aspx?p=1313447](http://www.informit.com/articles/article.aspx?p=1313447)

------
coroxout
I find a lot to agree with here.

It's all very conceptually neat and (if you're lucky) easy to read from the
top down, where you enter one function and read off a list of other functions
which are called in order.

But then if you look into any of those other functions they also call more
functions and so on, several levels deep. And when you have to debug someone
else's code because the data after function 15 of 17 isn't quite right, and
you have to unpick all the places it's been passed through in slightly
different versions and slightly different lists of parameters, it can be a
nightmare.

Same with my linter telling me to close a file within a few lines of opening
it. Personally I'd rather keep all the file-munging code in one place rather
than scatter it down a rabbithole of nested functions as an exciting Alice In
Wonderland story for future developers.

I try to come to a compromise on these things when working in a team,
though...

~~~
jacalata
Wouldn't the compromise be unit testing?

~~~
geezerjay
> Wouldn't the compromise be unit testing?

I agree. Unit testing is done to ensure that all components work exactly as
they are expected to work. If any component fails to work but developers only
notice it "because the data after function 15 of 17 isn't quite right" then it
appears that something is very wrong with the way those units are being
tested.

------
agentultra
It seems like the author's idea of the term _abstraction_ is limited to
substituting procedures with functions (or worse, class methods). The claims
that "all abstractions leak" and that adherence to DRY makes code "hard to
follow" are what give me this impression. This line of thinking happens if you
think of code in procedural terms.

And if your sense of abstraction is to hide procedural side-effects behind
function applications then yes... I can see where you might get the idea.

A real abstraction like lambda doesn't leak. Using Monads to compose side
effects doesn't leak. These are mathematical abstractions and we use them all
of the time: even in programming and even if you don't identify as someone
who's any good at maths. Learning the lambda calculus, algebra, and first-
order logic will take you much farther than thinking in procedural steps.

Composing more interesting functions from smaller ones removes many of the
errors that procedural code invites: it eliminates superfluous bindings,
encourages a more declarative style, and makes code easier to reason about
during refactoring, because we can manipulate expressions and types using
familiar tools from mathematics.
This is where abstractions really shine: you can manipulate higher-level
expressions without caring about the details of those at the lower level. This
only really happens if you care about purity in languages that don't do it for
you and can reason using such tools.

~~~
jerf
Arguably lambda, monad, etc. are generalizations, not abstractions:
[http://www.emu.edu.tr/aelci/Courses/D-318/D-318-Files/plbook...](http://www.emu.edu.tr/aelci/Courses/D-318/D-318-Files/plbook/absgen.htm)

Abstractions that don't leak aren't abstractions, as the essence of
abstraction is the simplification and removal of detail. Monad is a
generalization of a pattern seen in many places, so it doesn't have to "leak".

~~~
taeric
I call shenanigans. Lambdas can easily lead to leaking call backs as your
implementation.

Similarly, monadic style can leak if you are sloppy with it. Just see code
in Haskell where the IO Monday has infected the entire codebase, and not just a
boundary.

~~~
ronjouch
Fun autocorrect barf: monad <\- Monday. That being said, I hate IO Mondays
too, they're the worst to start the week.

------
blunte
The goal is not to make small functions for the sake of making small
functions, but it's to compartmentalize some functionality into a nice, easier
to reason about thing (function).

Then you compose these easy to reason about things into more complex, but yet
still easy to reason about things.

For people like me who struggle to maintain multiple layers of complex
abstractions in our minds, being able to see a small function and say, "Ok, I
trust this one - it does X." makes it easier to navigate up and down through
the abstractions.

Perhaps part of my appreciation comes from living in Clojure and Elixir (after
many years of several OOP languages).

~~~
yen223
Small functions are certainly more pleasant to work with in languages that
support first-class function composition. Context switching is less of a
problem when you can build large functions with only a couple lines of code,
just by stitching together multiple small functions.

One big reason why Unix's "one thing well" philosophy works so well for Unix
is the ability to chain commands together cheaply, with pipes and redirects.
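
A rough Python sketch of the same idea, assuming a hand-rolled `compose` helper (hypothetical; Python has no built-in composition operator). The three small functions are made up for illustration and each does one thing.

```python
from functools import reduce


def compose(*funcs):
    """Return a function that applies funcs left to right, like a shell pipeline."""
    def pipeline(value):
        return reduce(lambda acc, f: f(acc), funcs, value)
    return pipeline


def strip_blank(lines):
    # Drop lines that are empty or whitespace-only.
    return [ln for ln in lines if ln.strip()]


def lowercase(lines):
    return [ln.lower() for ln in lines]


def unique_sorted(lines):
    return sorted(set(lines))


# Roughly comparable to: grep -v '^$' | tr A-Z a-z | sort -u
normalize = compose(strip_blank, lowercase, unique_sorted)
```

The "large" function `normalize` is a single line, and each stage can be read and tested on its own.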

------
moxious
Shortness in a function is correlated with quality in design, but it doesn't
cause quality in the design.

When we simply follow formulaic advice (keep all of your functions short) we
lose sight of the wisdom behind why this was wanted in the first place.

The goal is to develop the wisdom, that makes you a great engineer, not to
"follow all of the rules"

~~~
chinhodado
> The goal is to develop the wisdom, that makes you a great engineer, not to
> "follow all of the rules"

This is why books like Clean Code can be harmful. Their advice becomes
extremely dogmatic when followed blindly, which is unfortunately very common.

~~~
Denzel
Clean Code can be harmful? Or the person dogmatically applying the advice?

Martin warns against such blind dogmatism. He even goes so far as to explain
the thought process behind most of his recommendations so that one may
consider whether the rule applies or not.

On balance, Clean Code does far more good than it does bad.

I have yet to see these dogmatic Clean Coders blindly applying Martin's
advice. Not in industry and certainly not in open-source. It's painful to work
with most OSS.

Eugene Schwartz, one of the godfathers of advertising, when asked, "How long
should an ad be?" liked to say... as long as you can hold the reader's
interest.

Much like Martin would say when asked how long a function should be... as long
as you're working at the same level of abstraction.

~~~
shados
> I have yet to see these dogmatic Clean Coders blindly applying Martin's
> advice.

You're lucky. I've seen it plenty of times. In one extreme case we even had to
let go a perfectly (formerly) competent dev after he read that book and went
crazy (yeah, seriously).

~~~
Denzel
I'd be interested to see what his code looked like and how the code reviews
progressed before firing him. I think these discussions are important,
however, we must ground them in a concrete reality. That's what I appreciated
about Clean Code, it was grounded in addressing real live code. And the reader
is able to choose whether they agree with the conclusions or not.

With that said, all too often, like in this article, we get caught up in cute
hypothetical scenarios and examples. There were no real-world examples given,
just a bunch of hand-waving.

So in the interest of moving the discussion forward in a concrete way, I'd ask
you what you think of these three C lexers; one written in Go by yours truly,
GCC's, and LLVM's:

1\. [https://github.com/denzel-morris/clex/blob/master/lex/lexer....](https://github.com/denzel-morris/clex/blob/master/lex/lexer.go)

2\. [https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/c-family/c-l...](https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/c-family/c-lex.c;h=3765a800a5799209bca5018dd36c9f3c53525b8f;hb=HEAD)

3\. [https://github.com/llvm-mirror/clang/blob/master/lib/Lex/Lex...](https://github.com/llvm-mirror/clang/blob/master/lib/Lex/Lexer.cpp)

I always find it helpful to look at one problem solved different ways. Of
course it's not always apples-to-apples but it's as close as you're going to
get. Out of these three codebases, there is probably one you'd feel more
comfortable working with. I merely threw mine in there because I wrote it as
an exercise on what it'd be like to have a hand-written lexer read more like a
story.

It has a focused interface, most functions are descriptively named and operate
at a single level of abstraction, and it's easy to intuit what a C lexer does
even if you're not familiar with what they're supposed to do. I'm sure it'd be
easy for most people to jump into my code and contribute.

Just like with LLVM's lexer. LLVM is far easier to jump in and contribute to
than GCC.

However, and this is just an example that immediately jumps out to me since
we're talking about it... doesn't it make sense to have this comment and code
abstracted out into a descriptively named function instead:

    
    
      void Lexer::InitLexer(const char *BufStart, const char *BufPtr,
                         const char *BufEnd) {
        // ...
        // Check whether we have a BOM in the beginning of the buffer. If yes - act
        // accordingly. Right now we support only UTF-8 with and without BOM, so, just
        // skip the UTF-8 BOM if it's present.
        if (BufferStart == BufferPtr) {
          // Determine the size of the BOM.
          StringRef Buf(BufferStart, BufferEnd - BufferStart);
          size_t BOMLength = llvm::StringSwitch<size_t>(Buf)
            .StartsWith("\xEF\xBB\xBF", 3) // UTF-8 BOM
            .Default(0);
      
          // Skip the BOM.
          BufferPtr += BOMLength;
        }
        // ...
      }
    

I mean think about it... we're concerned with initializing the lexer, why am I
being bothered with these string-manipulation minutiae to determine whether
there's a BOM or not? That could be extracted into a properly named function,
the comment eliminated, and then I could trust that that function does exactly
what it says on the tin.
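
To make the shape of that refactoring concrete, here is a sketch in Python rather than C++. It is not LLVM code: `UTF8_BOM`, `bom_length` and `init_lexer` are hypothetical names, and the "lexer" is reduced to a start offset.

```python
UTF8_BOM = b"\xef\xbb\xbf"


def bom_length(buf: bytes) -> int:
    """Return the number of leading bytes taken up by a UTF-8 BOM, or 0."""
    return len(UTF8_BOM) if buf.startswith(UTF8_BOM) else 0


def init_lexer(buf: bytes) -> int:
    """Return the offset at which lexing should start."""
    pos = 0
    # The byte-level detail now lives behind a name; no comment needed here.
    pos += bom_length(buf)
    return pos
```

The initializer reads at one level of abstraction, and the BOM check can be tested without constructing a lexer at all.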

And there are plenty more places where this line of reasoning applies too.

Jekyll ([https://github.com/jekyll/jekyll](https://github.com/jekyll/jekyll))
is another example of a codebase that's very easy to read, work with, modify,
etc. It's very well-written. It follows Clean Code-like principles where it
makes sense.

On the other side of the coin we have Kubernetes
([https://github.com/kubernetes/kubernetes](https://github.com/kubernetes/kubernetes))
which looks ready to collapse under its own complexity. It's supremely
difficult to read and understand what's going on not because of the essential
complexity of the problem but because of all the incidental complexity added
by the code structure.

I could spend forever citing examples on both sides because I've spent a great
deal of time thinking about this and talking with other seasoned developers.

If you (or anyone really) could, please offer up concrete examples grounded in
real production code. I'm always interested to see more examples.

Specifically, examples of blind Clean Code dogmatism applied in the wild.

~~~
mdpopescu
I don't know Go, but your example looks absolutely beautiful.

On the other hand, I'm one of those "smaller than that" people :) I mean, I am
ashamed that I left the time-to-string function in [1] too long. It could
definitely be split up (and I find value in that, I just haven't had time / a
reason to revisit that code).

[1]
[https://github.com/mdpopescu/public/blob/master/SocialNetwor...](https://github.com/mdpopescu/public/blob/master/SocialNetwork/SocialNetwork.Library/Services/TimeFormatter.cs)

~~~
Denzel
Thank you, I really appreciate that.

I understand exactly how you feel. I think your way of attacking it is the right
way -- pragmatism. You realize it can be improved/split but (1) you haven't
had time, (2) there's been no reason to revisit the code, and (3) it's serving
its purpose. That makes sense to me.

It takes time, and many passes, to refactor code. The idea is that as a
developer becomes more experienced, the amount of time and the number of
passes _should_ begin to decrease. At least that's the _idea_.

------
dotdi
Aside from the clickbait-y title, I find it quite disturbing to base that
whole piece on what even the author describes as problems in "codebases
inherited from folks who’d internalized this idea to such an unholy extent".

Indeed, small functions can be bad if you completely and utterly overdo it.
But wait, that's true of nearly everything else.

~~~
d4nc00per
Yeah, the idea of small functions and the overuse of DRY are separate.

~~~
copyconstruct
Author of the article here.

Nope, the article doesn't conflate DRY and small functions. But the quest for
DRY can lead to an explosion of small functions, which isn't necessarily a
good thing.

The converse is true as well - many programmers, in their quest to make
functions as small as possible, end up DRYing it up to the fullest extent as
well.

There's a relationship between the two, but DRY and small functions aren't
synonymous themselves, and neither does the article suggest that anywhere.

------
throwanem
Obsessive decomposition of the sort Fowler, for example, is cited preferring,
very quickly becomes pathological - I suspect that the codebase Fowler
describes in that tweet, to the developer as yet unfamiliar with it, reads
like one of those old IBM field engineer manuals where a giant circuit diagram
is spread across 800 letter-size pages, all bordered with dozens of numbered
arrows each referencing the page on which a given trace continues.

But I sort of feel like Sridharan throws the baby out with the bathwater, too.
I mean, in the CRUD example, carefully chosen abstractions make the code
easier, not harder, to read - if I'm working to comprehend how the application
handles UI state changes, I don't want to have that effort complicated by a
bunch of user-creation-related database interaction; I'd much rather that be
in a method call so I can deal with it when I care about user creation, and
ignore it when I don't. Same for email messaging and event log injection.

And I have to say that my experience gives me to think the idea of preferring
duplication over abstraction is just completely, wildly off base. I mean, sure
- any given abstraction is likely to change over time as feature requests and
bug reports come in. That's just the job. But if the stuff that needs to
change is abstracted, it only has to change in one place. If it's _not_
abstracted, then it has to change in N places across the entire application,
not all of which are guaranteed to be easy to find - after all, you probably
don't have distinctive method or function names. Hope you've got good tests!
Except you don't. Or maybe _you_ do - I never have, at any point in my career
where I've worked with a codebase in which copy-pasted code was prevalent,
because such a codebase is a sign of an engineering culture that's far too
weak to support investment in automated testing.

~~~
al2o3cr

> But if the stuff that needs to change is abstracted, it only has to change
> in one place.

IMO, the point of "prefer duplication over the wrong abstraction" is that "all
the stuff that needs to change" is a partially-to-totally unknown quantity.
You can either guess at the time you have the minimum possible amount of
knowledge (when the code is initially written) or wait and make that decision
later with more evidence.

Even good tests won't save you from premature abstractions with the seams in
the wrong places when a change request that cuts across them comes in; they
will _help_ as you try to disentangle multiple use cases that go through the
same code paths that now need to be different, but it's still a mess. Or worse
- you might _not_ disentangle them, and wind up with "generic" code littered
with 'if special_case' fragments.

~~~
throwanem
Better to make the case against premature abstraction as such, then, without
bringing duplication into the mix. "Don't abstract before you know for sure
what needs abstracting" conveys the same valuable point without the same
confusion of concerns; so, more concisely, does YAGNI.

------
d--b
> The idea that functions should be small is something that is almost
> considered too sacrosanct to call into question

Errr... Really?! I thought we all agreed that the first rule of programming
style is that "it depends"...

When they say "small functions", they mean "not the 5000 loc VBA macro that
has 50 Boolean arguments, and 30 side effects".

Breaking a function that does 1 thing into sub functions just so that each of
them is smaller is not a good thing. And I think people realize that fairly
quickly.

~~~
Veedrac
> When they say "small functions", they mean "not the 5000 loc VBA macro that
> has 50 Boolean arguments, and 30 side effects".

Clearly, though, not all of them do. From the article, for context.

> The first rule of functions is that they should be small. The second rule of
> functions is that they should be smaller than that.

> In my Ruby code, half of my methods are just one or two lines long.

> Any function more than half-a-dozen lines of code starts to smell to me, and
> it’s not unusual for me to have functions that are a single line of code

> I’ve worked on codebases inherited from folks who’d internalized this idea
> to such an unholy extent that the end result was pretty hellish and entirely
> antithetical to all the good intentions the road to it was paved with.

It seems to me there are a lot of people who espouse making your functions
extremely short, and for whom a 30-line function counts as long.

~~~
jerf
I've heard rumors of people operating under an "all functions must be one line"
rule, though I've never encountered it in real life.

------
_pmf_
I blame "Clean Code"; it's recommended reading, but at its core it dumbs down
refactoring to mechanically factoring out common code without actually
crafting abstractions. I'm still in the process of unlearning this. The prime
directive should be "craft sensible abstractions", not "avoid duplication at
any cost"; even more so when we're talking about actually modular software,
where duplication is much easier to tolerate than blurring responsibility
lines. (I'm generally not a big fan of Uncle Bob.)

------
jstimpfle
In a program, there's a lot to optimize for

    
    
        - minimize depth of call stack. It's easy to lose track in a deep call stack.
        - minimize function overhead
            - names that have to be invented
            - function calls and arguments that have to be written (and read again).
            - many small functions: "ravioli code" where it's really hard to distinguish functions
              by their "function"
        - minimize/localize state
           - if it's easy to separate significant state into a function, do it.
             (Not making a statement about objects. They are long-lived and their
             state doesn't go away after the first call).
        - DRY
           - Multiple identical or similar code blocks are an opportunity to make
             a function. We can roughly categorize into essentially (conceptually)
             and accidentally similar code, and the latter case is not an indication
             for a new function.
        - obvious thread of control
           - simple code has the nice property that it's easy to understand by
             following or mapping out sequentially. By contrast, highly abstracted
             or callback-heavy code is hard to understand at a global level.
        - consistency
             - ... helps understanding, but can also cause an implementation to be
               5 or 10 times longer if applied dogmatically.
    

It's good to read all these articles ("avoid state", "avoid long functions",
"avoid objects", "avoid functional programming", "avoid abstraction") and to
deeply understand them. Which probably means making all the mistakes on one's
own.

In the end it's important to know in which situations a particular style works
well, and to not be dogmatic when choosing a style.

I would say a good program tends to have both large and small functions, and
the large functions tend to be at the top of the abstraction stack.

------
novembermike
IIRC the original concept of DRY wasn't "Don't Repeat Yourself" (that's the
pithy version that's easy to remember but wrong). I think it was more like "A
piece of knowledge should be found once in a code base", which isn't quite as
easy to follow or remember but cuts to the issue more deeply.

The problem isn't superficial duplication. I might do a .map(x->x*2) in
multiple places, but if they represent different pieces of knowledge (money in
one, age in another) then I may not be Code DRY but I'm still Knowledge DRY.
An easy way to tell the difference is to think about changing one of your
functions. If you'd be happy that it changes the behavior everywhere the
function is used, then you're Knowledge DRY. If you're scared, you're just
Code DRY.
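
A small Python sketch of the distinction. The two rules are invented for illustration (a "double price" promotion and an age expressed in half-years); the point is only that the bodies match while the knowledge behind them does not.

```python
def double_prices(prices):
    # Pricing rule (assumed for illustration): promotional items cost double.
    return [p * 2 for p in prices]


def ages_in_half_years(ages):
    # Unit conversion: one year is two half-years.
    return [a * 2 for a in ages]
```

Merging these into one `times_two` helper would be Code DRY. If the promotion later changes to 2.5x, only `double_prices` should change, and the age conversion should not move with it.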

~~~
zmmmmm
Nicely put. The problem with DRY is it is expressed in terms of "repeating" so
it suggests that code repetition is the problem. The flip side of achieving
DRY is breaking isolation between two pieces of code. That's a downside - so
DRY is a tradeoff, not a simple one-sided win.

The real questions are things like "Is it likely if I change logic A, that B
will also need to change in the same way?" and "Is it likely that if there's a
bug in B, that would also be a bug in A?". If those things aren't true, you
are shooting yourself in the foot because you will certainly be going in and
changing A and B to maintain them and now every time you do that for one you
are at risk of breaking the other.

DRY is usually right, but it definitely comes at a cost.

------
alexandercrohde
I think the author doesn't understand the small-function-philosophy (at least
not the same way I do). Let me clarify how I see it:

\- Build a ton of small functions that are reusable across any project. You
are essentially making useful concepts. (i.e. a library). Bottom-up.

\- Once you have those, a business problem often will only be about 3 or 4 of
those powerful, reusable functions.

So something like sending out a newsletter might end up being:

    function sendNewsletter(letter) {
      database.getAllUserRowsAsIterator().forEach((row) => {
        sendmail(row.address, letter);
      });
    }

Now if we want the whole newsletter not to fail if there's a single exception,
we can make another reusable construct "count exceptions" that's a wrapper
function that catches all exceptions and builds a hashmap.
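
One way such a "count exceptions" wrapper might look, sketched in Python rather than JavaScript. Everything here is an assumption for illustration: `count_exceptions`, the `errors` attribute, and passing `sendmail` in as a parameter are not from the comment above.

```python
from collections import Counter


def count_exceptions(func):
    """Wrap func so exceptions are tallied by type name instead of propagating."""
    errors = Counter()

    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            errors[type(exc).__name__] += 1
            return None

    wrapper.errors = errors  # expose the tally to the caller
    return wrapper


def send_newsletter(addresses, letter, sendmail):
    """Send to every address; return a dict of exception counts by type."""
    safe_send = count_exceptions(sendmail)
    for address in addresses:
        safe_send(address, letter)
    return dict(safe_send.errors)
```

A single bad address no longer aborts the run, and the wrapper is reusable for any other call that should be best-effort.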

If you want this to work in a larger project, this requires having reliably
unit-tested code and doc-blocks so that other people can reuse your
abstractions, and it requires those people to have coding skill roughly
comparable to yours.

------
nabla9
Functions have several functions.

* Functions can be like new words in a language. If there is a new concept that is used frequently, it should be named or abbreviated and maybe listed in an index.

* Functions can sometimes be like chapters or sections. They are entry points listed in a table of contents (API description).

* Functions are sometimes used like paragraphs: used only once, to make a very long section easier to read. These are not really functions, just a way to structure text.

For paragraph use we might want blocks of code with a function-like interface.
Code editors could collapse and expand them.

    
    
        let a, b, c = 100 
        let p = 0
    
        paragraph "Intro to foo" (constant a, modify p) {
            let k, l, m, ...
    
        } assert (p < a)

------
laythea
Function borders should be drawn by defining abstractions, not by counting
lines of code. Sometimes, the most useful abstractions involve large
functions.

~~~
sqeaky
Do you have examples?

~~~
laythea
The first one that comes to mind is main().

------
sqeaky
There were no real examples of when long functions helped. I can come up with
30 or so examples of long functions being bad in the codebase I am working in
now.

I stopped reading about halfway through, at the contrived DRY Twitter post. I
did search the article for "test" and a few other keywords and found more
flowery words without substance.

All the systems I have seen with long functions have had more bugs and been
harder to test. All the systems lacking DRYness have invariably had bugs
because things weren't updated in all the places they needed to be. I have
seen a clear correlation. I know that correlation is not causation, but the
mechanism is simple and easy to understand: long functions have more places
for bugs to hide. The mechanism by which DRY helps is equally simple: DRY code
means fewer things need to be changed to make a change correctly. Both of
these reduce the places for bugs to exist, and my experience matches that.

I understand that my experience is that of only 1 person, but I have held 9
different software development contracts in the last decade and seen the full
range of long to short functions and I have seen DRY and non-DRY code. This
pattern has followed into open source code I have inspected. If I am wrong, it
will take strong concrete examples of DRY or short functions hurting to
convince me, and I mean links to repos, e.g. on GitHub.

~~~
copyconstruct
Author of the article here.

That's primarily the reason why there are no "concrete examples", because one
person's concrete example is another person's definition of contrived.
Splitting hairs over some toy example wasn't something I thought would
buttress the ideas presented, though I can imagine why some might need that
scaffolding to follow along.

Re testing - here are some points the article makes about how smaller
functions can in some cases hurt testing:

1) "Furthermore, when the dependencies aren’t explicit, testing becomes a lot
more complicated into the bargain, what with the overhead of setting up and
tearing down state before every individual test targeting the itsy-bitsy
little functions can be run."

Disambiguating this for you, one of the myths of smaller functions is that
they are easier to test. The article makes the claim that this isn't always
true, because many who wax lyrical about the beauty of smaller functions also
champion passing fewer arguments to the function. The book Clean Code
very explicitly states this (read the book, not as a how-to guide but as a
cautionary tale), and many a time what I've seen happen in the wild
(especially in Ruby) is that programmers who like small functions _also_ don't
like making deps explicit, which then means the code (and ergo the tests)
relies on setting up shared global state.

 _Of course_ one can write smaller funcs with fewer args and not do this, but
that's not the point here. The main argument is that making functions smaller
doesn't always make it easier to test.

The article also does provide two examples when having the smallest possible
function does help in testing. I'm not going to rehash those arguments again
here.

~~~
sqeaky
I have never read _Clean Code_, and bringing it up as an argument against me
and my points is a straw man. I don't think you meant it that way, though. Most
of my knowledge was hard-earned from a couple of decades of cleaning up
disgusting long functions.

As for testing, it is trivial to create examples of larger functions that
cannot be tested; each line of code introduces another possible thing that gets
in the way of testing. I provided a contrived example (which is better than no
example). The summary of it was that if a function creates a connection to an
external resource, uses it, then disconnects, there is no way to test that
function. Simply breaking it into three functions allows testing of the logic
that uses the resource. Adding parameters that allow the resource or resource
creator to be passed in allows testing of all three functions.
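
A Python sketch of that contrived example, under stated assumptions: `FakeConnection`, `total_active` and `report` are hypothetical names, and the "external resource" is anything with `fetch()` and `close()`.

```python
class FakeConnection:
    """Test stand-in for an external resource; records whether it was closed."""

    def __init__(self, rows):
        self.rows = rows
        self.closed = False

    def fetch(self):
        return list(self.rows)

    def close(self):
        self.closed = True


def total_active(conn):
    """The logic that uses the resource: sum the amounts of active rows."""
    return sum(row["amount"] for row in conn.fetch() if row["active"])


def report(connect):
    """Connect, use, disconnect. The connector is a parameter so tests can fake it."""
    conn = connect()
    try:
        return total_active(conn)
    finally:
        conn.close()
```

With the connector passed in, a test can hand `report` a lambda returning a `FakeConnection` and check both the result and that the connection was closed, without touching a real database.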

Skipping examples because someone will complain about how contrived they
seem is throwing the baby out with the bathwater. As it stands your point is
not falsifiable, because it is possible for people on your side to just say
"that's not a good long function", just as if it weren't a true Scotsman. With
examples you can at least ignore people who complain about the contrivance of
them, just as most successful technical speakers at conventions do.

> The main argument is that making functions smaller doesn't always make it
> easier to test

Only a Sith argues in absolutes... Sorry, I had to.

Seriously though, no one seriously argues that it "always" makes it better.
It just makes it better such a preponderance of the time that arguing against
it is foolish. It is like arguing for reasonable uses of goto: they might exist,
but we as an industry have moved on to better designs.

------
jondubois
The loss of locality argument is a very good one. Having to jump around
different files whilst holding layers upon layers of abstractions at the back
of your mind to figure out the source of a bug is overwhelming and extremely
distracting (you literally need to keep a call stack inside your brain to pull
that off).

DRY is all fine until you need more information about the function than what
the function name and documentation can provide - Sometimes you actually need
to peek inside the code itself.

I don't think that code can ever be 100% black box; especially as you move up
higher in the chain of abstraction. This is particularly true for dynamically
typed languages - These days in JavaScript I often find myself peeking inside
the function's code before I invoke it - It's very easy to make false
assumptions about the behaviour of the function based on its name alone and
often the documentation isn't enough and doesn't tell you anything about the
performance of the function (is it O(1), O(n) or O(n^2)? - You wouldn't want
to call an O(n) function whilst in a loop over n because then you'd get very
crappy O(n^2) performance).
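
A small JavaScript sketch of that trap, added for illustration. The helper
names `isKnown`, `countKnownSlow` and `countKnownFast` are made up; the only
library calls are the standard `Array.prototype.includes` and `Set`.

```javascript
// Looks harmless from the name, but includes() scans the array: O(n) per call.
function isKnown(ids, id) {
  return ids.includes(id);
}

// Calling isKnown inside a loop makes the total cost O(n * m).
function countKnownSlow(ids, candidates) {
  let count = 0;
  for (const c of candidates) {
    if (isKnown(ids, c)) count++;
  }
  return count;
}

// Building a Set once makes each lookup roughly constant time on average.
function countKnownFast(ids, candidates) {
  const known = new Set(ids);
  let count = 0;
  for (const c of candidates) {
    if (known.has(c)) count++;
  }
  return count;
}
```

Both versions return the same count; only a look inside `isKnown` reveals the
difference in cost.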

------
Sergesosio
Functions exist to prevent code duplication, not for commenting code; that's
what comments are for.

~~~
theonething
I believe most rubyists would disagree with you.

I had a ruby loving coworker that made you feel like a moral failure if you
commented your code instead of making it 'self-commenting'

------
calafrax
This discussion is predicated on the concept that function size is calculated
by the number of lines, which is completely wrong.

Function size (function complexity, actually) is measured primarily by indent
levels, not length. When there are multiple indent levels with nested
branches and loops, this is when you are supposed to create functions. Length
is not really an issue in most cases.

~~~
mannykannot
Most of the arguments in the article are agnostic to the size metric -
offhand, I cannot think of any claim that is invalidated by changing to your
measure. Furthermore, it is the proponents of short functions quoted in the
article who are using line count as a metric, so it is not as if the author of
this article has created a straw man by choosing a misleading metric.

~~~
jacalata
Most of the arguments in the article seem agnostic to the size of the function
altogether, which is hilariously ironic given the complaint about the "long
functions are bad" example at the start of the book.

~~~
mannykannot
Most of the arguments concern the side-effects of splitting a given unit of
work into a lot of functions, which is an inevitable consequence of the
strategy being deprecated.

------
moocowtruck
I feel like an article such as this should be full of code examples. Just
talking about this or that, where there are so many situations in which
"it depends", just makes me sleepy.

~~~
sqeaky
Even the highest voted post here is commenting on the value of examples and
lack of value in higher order discussion.

I must discard the post because everyone I have known professionally with
similar opinions was grossly incompetent. Some strong examples would make it
foolish for me to disregard this out of hand.

------
deckiedan
I wrote something myself trying to figure out my thoughts about this a few
years ago (note: I was sleep deprived when writing it...):

[https://madprof.net/nerdy/refactoring-
really/](https://madprof.net/nerdy/refactoring-really/)

There certainly has been an over-emphasis in some quarters on 'each function
should only do one thing' and even 'most of my functions are only one or two
lines long'. Possibly because it makes it easier to write tests for, and be
certain you've covered all of the possible options. You just then end up
writing 4 million tests.

It's all about clarity, I think, but different people find different things
clearer in different situations.

------
voidhorse
Programming is like writing. There are some general rules that will usually
make your meaning clearer. There are also exceptions. Begin following the
rules, prioritize clarity and semantics, identify when breaking a rule would
make your meaning clearer and break away.

At the end of the day, you're still just communicating--in communication
clarity is key.

You must also keep your audience in mind. Thus, if your company follows
conventions you stick to them, as your audience will be able to digest said
conventions quickly.

It's all language.

Programming only has the additional wrinkle that you are also communicating
with a computer--which prioritizes very different things than human readers
(efficiency, memory management, etc.)

------
andybak
This was interesting:

> an awful lot of times, pragmatism and reason is sacrificed at the altar of a
> dogmatic subscription to DRY, especially by programmers of the Rails
> persuasion.

I'm not making this a tribal thing but it does sometimes feel like there's a
cultural tendency in the Ruby and Javascript communities to... well... preach
a little bit. All advice is flawed and most maxims are only partially true.

If a community bounces from one "one true way" to another all the time then
it's probably not a particularly healthy environment for those who are
learning - as they tend to lack the experience to put advice into the correct
context.

------
euroclydon
If you want to have an easy time refactoring code later, forgo OO patterns
that have properties and methods in the same class. Instead, make classes for
your data, and make sure to give each class a deep clone method.

Then, your logic goes in static functions that do not alter the input, but
rather spit out new or cloned versions of the data in the output. Then you can
reason and refactor at the method layer and not worry about hidden side
effects.
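
A minimal JavaScript sketch of this style, added for illustration. The
`Order` and `OrderLogic` classes and their fields are hypothetical examples,
not anything from the article.

```javascript
// Data only: no behaviour beyond producing a deep copy of itself.
class Order {
  constructor(items, discount) {
    this.items = items;       // array of { name, price }
    this.discount = discount; // fraction of the total, e.g. 0.5
  }

  clone() {
    return new Order(this.items.map((i) => ({ ...i })), this.discount);
  }
}

// Logic lives in static functions that never mutate their input.
class OrderLogic {
  static withItem(order, item) {
    const copy = order.clone();
    copy.items.push({ ...item });
    return copy;
  }

  static total(order) {
    const sum = order.items.reduce((acc, i) => acc + i.price, 0);
    return sum * (1 - order.discount);
  }
}
```

Because `withItem` returns a clone, the original `Order` is left untouched
and callers cannot be surprised by a hidden change to it.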

~~~
sqeaky
That seems a bit extreme compared to where many teams are at, but I can see
how it could work.

How would you compare that to reducing the size of classes to bound where side
effects can happen?

------
matttproud
Better Stated: (overly) DRY considered harmful.

This is one thing about Go programming culture and its prevalent idioms I
appreciate: low context switching cost.

------
mnarayan01
1\. Functions which are "longer than they should be" decrease code quality.

2\. Measuring the "number of lines per function" is extremely easy.

Code quality tools have a bad tendency to equate (1) and (2) for the same
reason that the proverbial drunk searches for his keys under the streetlight.
This has legitimately bad consequences.

This is further compounded by the way that small functions ease mock-based
testing. While certainly attractive in the abstract, when a code base is
overly influenced by this I find that it is substantially more difficult to
understand via inspection.

All that said...I find the whole "X considered harmful" formulation almost
unbelievably annoying. Here it doesn't even make any sense.

------
hbt
>>> If you have to spend effort into looking at a fragment of code to figure
out what it’s doing, then you should extract it into a function and name the
function after that “what”.

Fowler is right about smaller functions and OP misinterpreted his statement.

This is what Fowler means
[https://gist.github.com/hbt/3e71146454a2d6388338af1d76394a13](https://gist.github.com/hbt/3e71146454a2d6388338af1d76394a13)

Abstract the fragment of code in a function, keep it within the function until
you need it elsewhere and do not pollute needlessly the API with functions
that are poorly named and used one time only.

------
enqk
My theory is that the small functions motif appeared with practitioners of
languages where the function is the only tool introducing a new scope / block
for variable definitions. Languages like Python or Javascript. What about
Smalltalk?

In languages where blocks can be introduced at will (stricter Algol descendants
like Pascal/C/C++/Java) or are introduced by control structures, such a
_rule_ is much less necessary, and only the harm done (friction,
readability) by fragmenting and obscuring logic remains.

------
joevandyk
I've never been served wrong by constantly asking these questions:

1\. How hard is this code to change?

2\. How hard is this code to delete?

3\. How hard is this code to test?

Code that is hard to change, delete, and test is bad. Code that is easy to
change, easy to delete, and easy to test is good.

(Note: you can't easily change code if you can't understand it)

Long vs short functions don't really matter. Objects vs functions don't really
matter. It doesn't matter if you use or don't use a ton of abstractions if
they are easy to delete and change.

------
alkonaut
one problem with the example of functions

    
    
        A()
    
        B()
    
        C()
    

Called in sequence is that they smell of imperative code. A recipe of some
kind, of steps performed in sequence. That kind of code, when it occurs,
should probably just be performed in a single function. That is - until either
A, B or C can be re-used by other code without creating an unnatural
abstraction. If the steps as separate functions can be tested separately -
great. But if they are only ever used in this sequence - do you really need to
test them individually? What good does breaking them up into functions do,
compared to just some comment text in a longer method?

    
    
        // Step A: 
        ... 3 lines
        // Step B:
        ... 4 lines     
        // Step C: 
        ... 3 lines
    

Answer: very little. And it forces the reader to scroll to see the relevant
code. Some languages allow the use of local functions - which is basically
just some trickery to help with variable scope and not have to use a comment
line to call the sub-step something. Can be quite useful.
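
A rough JavaScript sketch of the local-function variant, added for
illustration. The `checkout` function, its steps and its numbers are all
invented.

```javascript
function checkout(cart) {
  // Step A: local helper, visible only inside checkout.
  const subtotal = () => cart.reduce((sum, item) => sum + item.price, 0);

  // Step B: hypothetical rebate of 10 on amounts above 100.
  const applyRebate = (amount) => (amount > 100 ? amount - 10 : amount);

  // Step C: hypothetical flat shipping fee of 5.
  const addShipping = (amount) => amount + 5;

  return addShipping(applyRebate(subtotal()));
}
```

The steps get names and their own scopes, yet the reader still sees the
whole recipe in one place.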

A better example of 3 functions is when you have

    
    
        y = A(B(x))
    

and then turn it into

    
    
        y = A(B(C(x)))
    

if C can have some kind of semantic meaning (e.g. C just fetches the price
of item x before the rebate is applied by B). In this kind of functional code
there is usually very little harm in making more and smaller functions. Not
sure where I'm going with this, but I assume it's kind of an argument for
avoiding procedural code to begin with, and aiming to make actual _functions_.
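
A tiny JavaScript sketch of that composition, added for illustration. The
catalog, the flat rebate and the function names are hypothetical; prices are
in cents to keep the arithmetic exact.

```javascript
// Hypothetical price list, in cents.
const catalog = { apple: 200, pear: 100 };

// C: fetch the price of item x.
const fetchPrice = (item) => catalog[item];

// B: apply a (made-up) flat rebate of 20 cents.
const applyRebate = (cents) => cents - 20;

// A: format the result for display.
const formatCents = (cents) => (cents / 100).toFixed(2);

// y = A(B(C(x)))
const y = formatCents(applyRebate(fetchPrice("apple")));
```

Each function takes a value and returns a value, so adding `fetchPrice` to
the chain changes nothing about how the other two behave.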

~~~
sqeaky
There are several advantages you are disregarding.

Functions have names and you get to name the code in a way that shouldn't
expire the way comments can.

In most languages separate functions create separate scopes. This prevents
incidental use of temporaries from one section in another.

Having a smaller scope to look at means that refactoring or changes to new
business requirements can have a smaller place for side effects to occur.

I agree that testing each part on its own isn't required, but if it becomes
part of the public API, and A is "set up database connection", C is "tear down
database connection" and B is "use the database connection", then as one
function B is really hard to test. Breaking it into three functions lets you
write unit tests for at least B by mocking A and C.

~~~
alkonaut
> Functions have names and you get to name the code in a way that shouldn't
> expire the way comments can.

Sure that's a benefit, but often times I think the scrolling is a downside
that is underrated. The code being right in front of you has huge value.

> In most languages separate functions create separate scopes. This prevents
> incidental use of temporaries from one section in another.

Right. Thought I mentioned that. That's why I think local functions are good.
Gives the best of both worlds. It's a proper name instead of a comment, and
gives the right scope.

> but if it becomes part of the public API

Yes, obviously API design is a separate (and MUCH harder) question than
refactoring for breaking logic apart, I think.

~~~
sqeaky
> but often times I think the scrolling is a downside that is underrated

What editor can't take you directly to a defined symbol, even ones in other
files, nowadays?

As for API design, all classes and functions implicitly become APIs for the
more abstract code that calls them. Give any surviving piece of code enough
time and eventually one part of the code will depend on things provided by what
logically seem like entirely different parts of the code. At some point the
team will wonder why these aren't separate libraries (or gems, pips, rocks,
packages or whatever), and the API will be whatever the original classes and
functions were. And if those things were well encapsulated, that process might
be easy.

This is why least responsibility, DRY and the smallest functions possible are
important. Eventually everything will be part of an unbroken stack connecting
your users to a CPU and to fix problems in the middle you need simplicity.

------
no89ntoui
I don't necessarily disagree with the post, but I will say that in almost 30
years of programming, I can only recall one programmer who took small
functions to a harmful extreme. On the other hand, I can recall dozens who
took long functions to a harmful extreme. So yeah, it can be bad, but I see
the other extreme way more. (That may be because I don't do enterprise
programming, though.)

------
realharo
This old article explains it perfectly
[https://whathecode.wordpress.com/2010/12/07/function-
hell/](https://whathecode.wordpress.com/2010/12/07/function-hell/)

Over-optimizing for local readability can hurt global readability in the end.

------
imron
For me, I generally try to break into smaller parts any function where the
logic extends over one screen in length.

There are of course always exceptions to this, but having logic take up
roughly a single screen size makes it easy to reason about.

~~~
sqeaky
I have a 4k screen; at some point this pattern stops working.

------
Newtopian
A function should be X Lines long... but no longer

where X is any number of lines necessary for implementing the ONE thing that
function should be doing

Any attempt to replace X with a concrete number will invariably sacrifice
simplicity for the sake of that number.

------
rhinoceraptor
I think you can avoid a lot of these problems by using small inner functions.

------
maxxxxx
How about "following a rule blindly without understanding its rationale and
limitations considered harmful?". People need to understand what they are
doing and why and make judgement calls.

------
blickentwapft
"Considered harmful"?!? Why, this must be both IMPORTANT and LEGITIMATE, and
the author has AUTHORITY. When a negative opinion carries such an academic,
aloof, and professional sounding headline then I am inclined to give it great
credence. This guy didn't just say "here's something I think is bad", he said
it in fancy and upmarket words. Let's closely follow what they say.....

Seriously, HN should have code that auto flags anything including a subject
line of "considered harmful".

------
DanHulton
Anything, taken too far, considered harmful.

------
twii
So, the author thinks this is harmful?:

    
    
      var area= ( width, height ) => width * height;
    

If not, it is just clickbait for me.

~~~
Clubber
If I'm understanding you correctly:

It would be harmful if you were calculating something subject to change, like,
say price.

    
    
      price = (cost) => cost * 1.10;
    

And you sprinkled that throughout your code. The problem arises when the
business says they want to be able to change the logic so that it lowers the
price for customers that have subscribed over a year. Now you have to change
the logic everywhere that shows up, rather than having a single function where
you change it once:

    
    
      function decimal Price(cost) { .. }
    

That in essence, is why DRY is important: code maintenance & refactoring.

Even your example could be dangerous:

    
    
      var area= ( width, height ) => width * height;
    

What if the business was making sandboxes and they wanted to be able to add a
short wall around it so cats have a harder time using it as a bathroom. It
would then be:

    
    
      var area= ( width, height ) => (width + 1.0) * (height + 1.0);
    

Again, if you had sprinkled that throughout your code, you would have to
search and find every instance of that and change it. This is also harder to
test, because if you miss one, chances are, it won't show up immediately.
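
A short JavaScript sketch of the single-function version of the price
example, added for illustration. The 5% subscriber discount, the
`subscribedYears` field and the use of cents are assumptions made up for this
sketch.

```javascript
const MARKUP = 1.10;              // markup from the example above
const SUBSCRIBER_DISCOUNT = 0.95; // hypothetical: 5% off for long-time subscribers

// The one place where pricing logic lives. Amounts are in cents.
function price(cost, customer) {
  const factor = customer.subscribedYears > 1 ? SUBSCRIBER_DISCOUNT : 1;
  return Math.round(cost * MARKUP * factor);
}
```

When the business changes the rule again, only the body of `price` changes,
and a test of `price` covers every caller.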

