
Small Functions considered Harmful - okket
https://medium.com/@copyconstruct/small-functions-considered-harmful-91035d316c29
======
sctb
Previously:
[https://news.ycombinator.com/item?id=14988206](https://news.ycombinator.com/item?id=14988206)

------
manmal
It occurs to me that the author refutes one dogma (make functions as small as
reasonably possible) only to propose another (small functions are harmful). No
thanks, I will keep refining my style so that it matches my abilities, and
hopefully also the abilities of the people reading my code.

Since I'm in a position to use Swift, I often define small functions WITHIN
functions, in order to give the behavior of a code block a name. I also use
LOTS of immediately executed closures (is that the correct term?) to reduce
scope and benefit from slightly simpler control flow. And I don't see how
this would be bad.
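
The commenter is describing Swift; as a rough sketch of the same pattern in
Python (all names and the discount rule are invented), a nested function gives
a block of behavior a name without leaking it into the enclosing scope:

```python
def total_price(items):
    # Nested helper: names a block of behavior, visible only inside
    # total_price. The $10-off-over-$100 rule is purely illustrative.
    def apply_bulk_discount(subtotal):
        return subtotal - 10 if subtotal > 100 else subtotal

    subtotal = sum(price * qty for price, qty in items)
    return apply_bulk_discount(subtotal)

print(total_price([(60, 2)]))  # 120 -> 110 after discount
print(total_price([(30, 1)]))  # 30, no discount
```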

Meta: I thought we were past this dogma thing. I watched TDD become popular
and then fall from grace - why do people still need to pretend they know
better than me, the reader? Do they want to become gurus, or are they
soothing their own feeling of "doing it wrong"?

~~~
falcolas
> why do people still need to pretend they know better than me, the reader

Perhaps they do? Or perhaps they just see a trend starting up in their day-to-
day work and want to try and nip it in the bud?

I see this exact trend forming: in my last code review I saw a function which
contained exactly one line of code - a string format operation. It was called
twice, and was unlikely ever to be called more.

That was the most egregious example, but not the only one.
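
A hypothetical reconstruction of that pattern (all names invented): a wrapper
around a single format expression, called from exactly two sites:

```python
# One-line wrapper around a string format operation.
def format_greeting(name, count):
    return f"{name} has {count} new messages"

# Called exactly twice, with no likely third caller:
line_a = format_greeting("Ada", 3)
line_b = format_greeting("Bob", 0)

# The inline alternative reads just as clearly at each call site:
line_a_inline = f"Ada has {3} new messages"

print(line_a)
```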

> And I don't see how [lots of small functions] would be bad.

Well, I am not sure about Swift's implementation, but function calls are
typically expensive to execute. They require pushing values onto the stack,
transferring control, executing code, popping values off the stack,
transferring control back, etc. That's a lot of CPU operations and memory
manipulation (not to mention cache misses). There's also the possibility of
blowing out your stack if you store too many values on it or nest too deeply.
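
The cost being described can even be made visible from an interpreted
language; here is a rough Python sketch (timings will vary by machine, and a
compiled language may inline the call away entirely):

```python
import timeit

def add(a, b):
    return a + b

def summed_with_calls(n):
    # One function call per iteration: each call sets up and tears
    # down a stack frame, exactly the overhead described above.
    total = 0
    for i in range(n):
        total = add(total, i)
    return total

def summed_inline(n):
    # Same arithmetic with the call "inlined" by hand.
    total = 0
    for i in range(n):
        total = total + i
    return total

assert summed_with_calls(1000) == summed_inline(1000)
print("with calls:", timeit.timeit(lambda: summed_with_calls(1000), number=200))
print("inline:    ", timeit.timeit(lambda: summed_inline(1000), number=200))
```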

From a cognitive point of view, you have to interrupt your current flow of
reading and move to a different spot in the current file (or another file
entirely) to track down the behavior of a given function. As that function's
behavior mutates over time, its name is rarely updated to match, which can
create confusion and require even more time to reason through the corner
cases.

~~~
Tarean
You are not going to blow your stack without recursion, even with a fairly
small stack size. IIRC Swift has huge stacks by default for Objective-C
interop, so even that shouldn't be a problem.

Overhead is a fair point for JIT/interpreted languages, if you want to write
performance-critical code in one for some reason. Tiny functions are almost
certainly going to be inlined if the compiler does its job, though, so it
isn't a huge deal for compiled languages.

And the point is that the function can be so tiny that you would just replace
it with a new function. Then you can reason locally about behavior. This is a
very functional approach, though, and can break horribly in imperative
languages so there definitely is a balance.

~~~
falcolas
"Almost certainly" means that there will be a number of times where they
"almost certainly" will not.

Inlining is hard (especially in languages which allow overriding functions),
and it is not always done perfectly (or at all). The LLVM toolchain in
particular has a number of properties which can inhibit inlining; in some
cases you even have to explicitly tell the compiler when it can inline a
function.

------
chatmasta
The author is really overthinking this. Small functions, big functions, long
variable names, short variable names... none of this matters. The goal of
programming is to express a solution to a domain-specific problem, using a
general programming language. The job of a programmer is to arrange a set of
logical building blocks (the general programming language) into the
domain-specific solution. Good code will naturally express its
domain-specific logic in a way that is both readable and maintainable.

Readability is most strongly related to the ease of following all possible
code paths from a given entry point. That is, can I start reading in main()
and iteratively follow the possible code paths for a specific input?

The answer to this question is more related to separation of scopes than it is
to function size. If a single scope contains many possible branches and code
paths, then it will be difficult for a reader to follow a specific code path
amongst the noise of irrelevant functions (short or long).

All else being equal, the length of a function is primarily a matter of style
and preference. The priority of code should be expressiveness and clarity. If
small functions enhance expressiveness, then they are the tool for the job.
If big functions add clarity or aid in separating scopes, then they are also
a tool for the job. Sometimes you need both.

Worrying about anything other than readability, maintainability and
correctness is needless dogma.

~~~
falcolas
Here's the problem - dogma already exists, and it exists in the name of
readability, maintainability, and correctness. DRY is the particular bit of
dogma which has been taken to the extreme.

And this is the thing about extremes - to get back to the center, you have to
push towards the opposite extreme (and hope your push has the right magnitude
to avoid overshooting). That's why the title is extreme and the article is
not. "Small Functions Considered Harmful" is easy to remember - a sound bite
that can compete at the same level as "Don't Repeat Yourself".

~~~
marcosdumay
That's just too bad, because such pushes into the other extreme practically
always overshoot, and because the author can no longer push the line that
dogmatic programmers are not good programmers.

~~~
majewsky
It's like politics all over again.

------
hyperpallium
One criterion for decomposition into functions is to mimic the structure of
the problem domain. This makes it easier to check against the domain, to
reason about one in terms of the other, and to extend. It's especially helpful
for interactions with problem domain experts. A "domain" includes math.
Efficiency may require a different structure - note, though, that when
efficiency truly is paramount, it licenses outright barbarism.

Another approach is "disproportionate simplicity", which works out to
something like an abstraction, something of a DSL. If you can modularize
functionality such that that aspect becomes much simpler, do it. Note that it
might not solve the whole problem (the core of git is a "stupid content
tracker": it solves the part it does tackle with simplicity disproportionate
to the alternatives, but that simple core doesn't solve everything). It's
about where to draw boundaries (between modules, with functions being one
kind of module).

Yet another is the traditional criterion, based on likelihood of change:
boundaries between modules [functions] should be unlikely to change; the stuff
that is likely to change should be "hidden" within a module [function].
Unfortunately, I've found prediction to be tricky, particularly when it
concerns the future of programs.

------
jswizzy
The main point of small functions is that they tend to be more readable. If
they are too small, you get into code-golf territory and lose readability.
It's a balancing act: you need to follow the Goldilocks principle when
writing algorithms. Also, the author states at the end that small functions
aren't bad, so the title is misleading.

~~~
falcolas
Well, there are small functions (that encapsulate a particular bit of
behavior), and small functions (that encapsulate a repeated bit of code). The
OP wants to get rid of the second, not the first. The problem is, they're both
"small functions".
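
A sketch of the distinction in Python (examples invented): the first kind
names a piece of behavior, the second merely wraps a repeated expression:

```python
# Kind 1: encapsulates a particular bit of behavior. The name carries
# domain meaning beyond the code it contains.
def is_eligible_voter(age, registered):
    return age >= 18 and registered

# Kind 2: encapsulates a repeated bit of code. The name adds little
# over the expression itself, yet every call site must be chased here.
def get_first_word(text):
    return text.split()[0]

print(is_eligible_voter(21, True))    # True
print(get_first_word("hello world"))  # hello
```

Both are "small functions", which is exactly why the slogan lumps them
together.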

------
userbinator
IMHO if you look at the history of how computers were programmed, then you
will see that functions/subroutines/procedures/callable blocks/etc., as well
as other abstractions, were basically invented to serve a very obvious and
utilitarian purpose: to reduce duplication of code; and anything beyond this
is dubious.

I find "straight-line" code, perhaps in one long function, with comments to
provide commentary and "navigation", far easier to read than the dozens-of-
tiny-and-verbosely-named-functions style too. The "readability" argument is
commonly used, but it's focusing on the wrong thing: a single-line function is
certainly going to be easier to understand than a 512-line function, but
understanding a single-line function does not make it any easier to understand
the system/algorithm/etc. as a whole. The latter is extremely important,
because not knowing "the big picture" can lead to very bad decisions overall;
I've seen many cases where bugs or accidental yet severe inefficiencies (e.g.
unnecessary allocations, high polynomial complexity, multiply duplicated
accesses to data, etc.) were created because the author of the code focused
only on a tiny piece and neglected to consider its role in the whole.

There are some very insightful posts by an APL(!) programmer here, discussing
the topic of complexity overall vs. complexity in parts:

[https://news.ycombinator.com/item?id=13565743](https://news.ycombinator.com/item?id=13565743)

[https://news.ycombinator.com/item?id=13797797](https://news.ycombinator.com/item?id=13797797)

I suspect part of the motivation for producing "microfunctions" may have come
from a misunderstanding of the "decompose the problem" principle --- which is
intended to mean that you, as a programmer, decompose the problem into simpler
steps --- but not that each step necessarily warrants a function.

The same problem and principles apply to other levels of organisation:
classes, structures, files, etc. --- they are intended to reduce duplication
and simplify code, but will have the opposite effect if used to excess.

------
rufius
Writing code has many analogs to writing prose. If you were writing a letter
and you only used small sentences, it'd be irritating to read and probably
difficult to follow.

Rather than dogmatically applying "no small functions" or "break functions
down as much as possible" it's more useful to look at how the code
communicates the idea (and achieves the goal).

If the function is calling 30 things before it does the 4 things that
logically map to the function name, maybe consider refactoring out all or
parts of the preceding 30 steps.
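
A hypothetical sketch of that refactor in Python (all names are invented, and
the "30 steps" are collapsed to two): pull the preparatory work into one
named stage so the function body matches its name:

```python
def prepare_invoice(order):
    # Stands in for the preparatory steps that used to clutter the
    # caller's body.
    subtotal = sum(order["items"])
    total = subtotal + subtotal // 5  # add 20% tax (illustrative rule)
    return {"customer": order["customer"], "total": total}

def send_invoice(order, outbox):
    # The steps that logically map to the name now dominate the body.
    invoice = prepare_invoice(order)
    outbox.append(invoice)
    return invoice

outbox = []
send_invoice({"customer": "Ada", "items": [10, 15]}, outbox)
print(outbox)  # [{'customer': 'Ada', 'total': 30}]
```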

~~~
pvg
Functions and other forms of structural and semantic composition are among the
(many) things that make writing prose entirely unanalogous to writing code.
You can't sprinkle prose with "descriptiveParagraph(targets, descriptors)" or
"dialogScene(characters, lines)".

------
olalonde
This would be 10x more useful with some code examples.

------
mto
I haven't read the whole thing, but I'd briefly say: true, there are many
codebases where you have to jump through a million strangely named one-liner
functions (plus 5 lines for parameters, braces, modifiers, etc., sometimes as
both declaration and definition) to find out what that thing actually does.
It pollutes the namespace, fragments the code like hell, fills up a lot of
screen space, fills up the stack and the mind. A crash yields a call
hierarchy that fills a roll of toilet paper.

Hope that's a good tl;dr for the article ;)

------
quadcore
The problem with this industry, software development, is that it hasn't yet
grasped that programmers remain beginners for a very long time. Like a human
child, a programmer takes some 20 years of programming experience to mature,
and before that he's just a child.

Let's assume I'm right here for a minute. If most programmers are
intermediate programmers, what kind of advice most needs to be given, knowing
that businesses need to keep running? The kind that does "damage control".
There is no issue with that; it's OK to do damage control.

Let's take an example: "readable code". Yes, for an intermediate programmer
it is a good thing if the code is not too dense, because dense code tends to
drain all of the programmer's energy as he tries to make sense of it. But as
the programmer gains experience, he can read denser and denser code, until he
can grasp at a glance what looks like obfuscated code to others. For a mature
programmer, the important point is not readability; it is for the code to be
as small as possible, which generally tends to produce very dense code. The
mature programmer draws on his ability to understand obfuscated code to write
even better code.

"Write a test first" (besides the fact that you're asking a beginner to solve
his code problems by writing more code, which is at best controversial to
me): a mature programmer doesn't write a test first; he thinks that way
already - and yes, he delivers code that has less than a one percent failure
rate. Now, some code may require unit tests; I'm not saying this is wrong in
every situation, just that it's not right in every situation.

Anyway, what I'm trying to say is: I wish we had a way to finally make a
clear distinction between good advice for intermediate programmers and what
mature programming actually is, because they are very different. The mature
programmer is going to make a function as big as it makes sense to be; he
isn't going to artificially break its body up into small functions to make it
more readable: he can read the code already.

