
Avoid Indirection in Code - kevlar1818
http://matthewrocklin.com/blog/work/2019/06/23/avoid-indirection
======
gitgud
It's not _indirection_ , it's _bad abstraction_ that's the real issue here.
Consider the example used in the article:

    
    
        if x.startswith("foo"):
          do_something_with(x)
    
        if is_foolike(x):
          do_something_with(x)
    

The problem with both of these variations is that the "if-statement" doesn't
have any meaning behind it. There's no gain in the indirection presented here.
Whereas the following code has meaning:

    
    
        if checkHasPermissions(x):
          do_something_with(x)
    

The reader of the function is now able to understand the _purpose_ of the
indirection and the function has clear responsibilities.

Remember although code is interpreted by machines it's _read_ by humans...

~~~
geocar
In many cases speaking of the foolike nature of a value is not dissimilar to
speaking to the primeness of an integral: Is a value prime? And now here's is
an algorithm for determining whether this value is prime.

The motivation for many splitting out a prime-testing function is that
primality testing is hard to do efficiently, not that there are many kinds of
primes or many kinds of integrals. Indeed, in some languages "primes" is a
builtin (p:), or a trivial derivation {(2=+⌿0=(⍳⍵)∘.|⍳⍵)/⍳⍵} for some domain,
and that's the only domain our application is concerned with so is-prime is a
(⊣∊primes) -- is foolike-ness like that?

Nothing is being abstracted here, and nothing is gained by indirection: Does
this is-prime routine even deserve a name?

Sometimes the foolike nature has nothing to do with the value: Is an apple
foolike? It isn't today, but it might be tomorrow! (That's numberwang!) Is
this account locked (go check the disk!)? Sometimes we encode that the account
is locked by putting an asterisk in the password field -- this won't match any
hashes, so the login will always fail, and a "is_locked" routine might start
exactly this way: def is_locked(x): return x.password.contains(42) but in the
future we want to be sure that this isn't a _copy_ of a locked account or
there's a synchronisation issue, so we go check and then def is_locked(x):
return x.userid in locked_accounts

Nothing is being abstracted here either- we're still only talking about giving
this algorithm a name (or maybe delaying the discussion about _which_
algorithm gets this name). So why don't we just say 'something in x.can
instead of x.can('something')? Because the former means set membership and the
second may be able to not reify the set. Proxies like __in__ might help, but
it's not just the syntax: Our language has let us down (clearly, not enough
laziness! time to haskell all the things!), and so we cheat: The locked button
sets a flag on the user _object_ , instead of adding a user to the real list
of locked users.

But what if our language were smart? Why wouldn't we write x.userid in
locked_users?

> Whereas the following code has meaning:

And yet it is wrong! It has a nasty security bug in it that you will never
find, and _the next guy_ will never find. Fools and children would argue
endlessly that something was wrong with the ticket or user story that led to
this code, or demand that they didn't have enough experts on the team, but no
matter how smart you are (or you think you are), you'll write a piece of code
that is so short it cannot possibly be wrong _and yet it is_.

Code hiding (whether you call it indirection or abstraction) also hides bugs,
and yet few of us have the time or attention to stare at the bits blurry with
a steady hand and a magnetised needle, so there'll always be _some_ amount of
code hiding. The real trick (if there is one) is to do as little of it as is
possible.

~~~
Y_Y
I very much enjoyed this poem.

All the same I think you may have thrown out the abstraction with the bath
water.

> Does this is-prime routine even deserve a name?

Absolutely. If you saw it over and over you might start just reading it as
"is-prime" instead of going through the process of mentally interpreting it
each time. But what about similar-looking functions with similar
functionality? You want a programmer to go and make you a promise that it has
the _meaning_ (if not the implementation) you're thinking about. That promise
is the same as "code-hiding".

~~~
geocar
> But what about similar-looking functions with similar functionality?

Even though we're unlikely to need this construction more than once:

    
    
        def is_prime(x): return x in primes(x)
    

This one (or one just like it) appears quite common:

    
    
        def is_in_set(x,y): return x in y
    

But Guido already noticed that! That's why he made an "in" operator in the
first place! As soon as our routines become too generic, we're just making APL
without arrays, and that's why taste matters _so very much_.

> That promise is the same as "code-hiding".

If I can see the implementation at the same time as its usage, then the code
is not hidden. Ipso facto.

But to expand more on what I mean by this, consider that close() is hiding
code: You get all of the bugs in the C library; in the Kernel; all in code you
cannot see either because it is locked/compiled, or because it's in a
different file you didn't bother to read. And so on. And yet the world didn't
stop spinning, and many people use close() every day without knowing or
dealing with a myriad of bugs and weird corner cases[1]. So it is that even if
code hiding is sometimes bad, it is sometimes less bad than other things we
could be doing.

[1]: Ha ha, only serious.
[http://geocar.sdf1.org/close.html](http://geocar.sdf1.org/close.html)

------
UglyToad
While the example in the article isn't great for making the point I actually
agree with the main idea.

I understand the arguments for 'Uncle Bobifying' code but I think, as the
article says, there's a balance to be struck. It's highly likely the next time
I see your code (or my code if I'm coming back to it a month or two later) is
when I need to fix a problem with it. While it's useful to have it split up
into logically discrete chunks with sensible naming these 2-4 line functions
far increase the mental load of the code.

Given I'm debugging the code because there is presumably a problem with it I
can't trust these sensibly named functions to do what they say, I need to
check their internals to reason about changes in application state. Though
well named functions at least help the maintenance programmer reason about the
why they're equally liable to code rot as the comments they replace.

As in all things there's a balance to be struck but I'd rather see a 20-30
line method than jump between 5 files because someone has read Clean Code
recently.

~~~
JamesBarney
The issue is fundamentally abstractions leak. And if the code is overly "Uncle
Bobified" the abstractions will leak bugs. On the other hand if you under
"Uncle Bobify" the code will be very difficult to read because you won't be
able see the forest for the trees. This is one of the advantages of comments
and local functions. You can inline a function an add a comment. With the
comment providing the abstraction and the line of code providing the details.
Or add a local function which is easy to read but can be easily seen next to
the code that's using it.

For instance you might have a function that is int getOverlap(DateTime start1,
DateTime end1, DateTime start2, DateTime end2). Looks pretty reasonable. But
there are a lot of unanswered questions like

Is the return value in seconds, minutes or ms? What happens if there is no
overlap? Do all the date time have to be in UTC? What happens if we pass in a
start time before the end time? What happens if I pass in a start time equal
to an endtime? If I split a block will the overlap of this block with any
other block match the combined overlap of any two split blocks?

~~~
lmm
> The issue is fundamentally abstractions leak.

Only people who've never used a good type system think this. Your function
example demonstrates this pretty nicely: use a proper interval type for the
two intervals and a proper return type rather than int, and then the answers
to all your questions become obvious.

~~~
JamesBarney
So say you replace it with a DateRange class so you end up with

range1 = new DateRange(start1, end1);

range2 = new DateRange(start2, end2);

DateRange overlapRange = GetOverlap(range1, range2);

You still don't know what happens when start1 is greater than end1?

What happens when they have no overlap? What does overlap return if it doesn't
return null?

What happens if start1 and end1 are in local vs UTC time?

What happens if range1 and range 2 are in local vs UTC time?

When you get overlapRange.Duration.Days is this days in 24 hr increments or
days by date?

And this is a really simple example. Things like ORMs are very leaky
abstractions which make this mcuh worse.

~~~
lmm
> You still don't know what happens when start1 is greater than end1?

That's DateRange's responsibility, it's not a concern for getOverlap. By the
time you get as far as getOverlap you already have a valid DateRange. Small,
reusable, compositional pieces.

> What happens when they have no overlap? What does overlap return if it
> doesn't return null?

It can't "return null", we're talking about using a decent type system here.
The types should express what kind of failures are possible, so perhaps
getOverlap returns a maybe, or an either where the left hand side is some
structured "reason" value that expresses what went wrong.

> What happens if start1 and end1 are in local vs UTC time?

You get a compilation error because that distinction is part of the type.

> What happens if range1 and range 2 are in local vs UTC time?

You get a compilation error because that distinction is part of the type.

> When you get overlapRange.Duration. Days is this days in 24 hr increments or
> days by date?

That's Duration's responsibility, not getOverlap's; again, small, reusable,
compositional pieces. You shouldn't ever be calling overlapRange.duration.days
(obvious law of demeter violation); rather you should be calling something
that can do the right thing with overlapRange.duration (e.g. display it,
compare whether it's longer than some limit).

But to answer the question: check its type, that will tell you. Any value that
you actually pass around and use in business logic should have a type that
tells you what kind of thing it is; primitives should only ever be used within
the very lowest level functions.

------
quasi
The point made by the article (avoid using due to problems for debugging and
code review etc.) does not hold much water. 'Indirection' can be invaluable
while separating 'how you got it' from 'what to do with it' (it=some data),
for example. 'what you do with it' can remain unaffected by changing 'how you
got it'. This will not mess with code review practices but make understanding
the pieces easier. It will also make debugging easier as you can test both
those pieces independently. What the OP calls indirection is just bad code or
meaningless abstraction.

Zed Shaw puts it well: "Abstraction and indirection are very different yet
cooperating concepts with two completely different purposes in software
development. Abstraction is used to reduce complexity. Indirection is used to
reduce coupling or dependence."

~~~
satyenr
Yup, seems like the author had trouble following badly written abstraction
and/or indirection and came up with the idea that they are bad.

Abstractions are invaluable if you are dealing with a large codebase. Even the
original author can’t keep everything in their head.

Indirections are invaluable if you want to be able to modify code in the
future without making a gigantic mess — and potentially breaking compatibility
in the process.

~~~
jrochkind1
Well, good abstractions/indirections are _hard_.

The _wrong_ abstraction can be worse than no abstraction at all. And it can be
non-obvious if you've got the "wrong" abstraction until much later when it's
been in use for a while and had to be maintained -- and sometimes even then,
we aren't good at recognizing that our pain is coming from the use of the
wrong abstractions.

We often are, without necessarily realizing it, reluctant to refactor existing
abstractions once there, preferring to alter them slightly or even more add
new ones on top (which can make things worse in the long-run). And this isn't
necessarily irrational, refactoring/removing existing abstractions is
_expensive_ , and with no guarantee you'll get it "right" this time either.

Sandi Metz says:

> prefer duplication over the wrong abstraction

[https://www.sandimetz.com/blog/2016/1/20/the-wrong-
abstracti...](https://www.sandimetz.com/blog/2016/1/20/the-wrong-abstraction)

Mechanically extracting things as in OP can often lead to "wrong"
abstractions, which is what I get as the takeaway from the OP, which is a good
reminder.

You are right that abstraction is the _main tool we use_ as computer
programmers, one way or another. Thinking in abstractions is the _main thing
we do_ that characterizes computer programming in contrast to other endeavors.
So we certainly can't give it up entirely.

But this fact, I think, from my observations over my career, often leads us to
over-abstraction ("over-engineering"), and the wrong abstractions. Because we
are so taken with abstraction (it is _neat_ , and you probably think so if you
enjoy computer programming). It is good to be reminded that there is such a
thing as too much as well as too little abstraction. Sometimes it's the
"wrong" abstraction, but sometimes the "right" abstraction and the time it
would take to arrive at it were unnecessary, and less/no abstraction would
have served you just fine.

~~~
satyenr
See my top level comment on this post.

------
keyle
That's a pet peeve of mine; I see it all the time when I work with lesser
experienced developers, only I didn't know how to call it.

I call it onion skin development, where the developer keeps hiding stuff in
more layers of the onion, making my eyes water as I have to dig deeper and
deeper to essentially find `a.foo(b)` under 12 layers of abstraction.

They're so focussed on making everything look so purrty, they forgot that it's
about telling the computer to do something, as clearly as possible.

~~~
tsumnia
While I can agree with the sentiment, when is indirection okay vs. when does
it become spaghetti code? Given [1], there needs to be a happy medium between
12 layers and all in a single method. I think with less experienced
developers, the issue is diving too deep because they think they need to prove
they aren't novices. However, its still a learning period for them where they
just need a good mentor to tell them they can pull back a little. Outright
avoiding indirection (I don't believe) does that.

[1] [http://www.cs.kent.edu/~jmaletic/Prog-
Comp/Papers/fix-1993.p...](http://www.cs.kent.edu/~jmaletic/Prog-
Comp/Papers/fix-1993.pdf)

~~~
kenhwang
My rule of thumb is 3 times. Whatever is abstracted away must be used at least
3 times. It's amazing how much preemptive prettification by abstraction it
prevents.

~~~
Double_a_92
Abstraction is not mainly about avoiding repetition though. It's about
(well...) creating abstraction.

Imagine if you have to check permission to do something, and that check is
really complex and long code.

If would be better to have code like: if (hasPermissionToEdit(myUser)){ ...

Than a ton of ANDs and ORs in that if, that nobody would quickly recognize as
a permission check... even if it's just there once.

As a minimum I'd put that in a properly named variable first, if I really
wanted to avoid an extra function.

------
janpot
Ok, to take a real world example instead:

    
    
        if (url.startsWith('http://')) {
    

vs.

    
    
        if (isAbsoluteUrl(url)) {
    

If the next developer comes by in 2 months to fix the case for
[https://](https://) urls (and protocol relative ones in 6 months), they'll
immediately be able to spot the intention of the code and can easily fix it in
the abstraction layer that's already in place. Moreover, the fix will be
applied every other place this faulty assumption about their input was made.

~~~
kixiQu
Seems to me that this is a great example of how, if I'm the next developer, I
will have to go dereference isAbsoluteUrl in order to see that the
[https://](https://) case _hasn 't_ already been handled--whereas it would
have been obvious with the inlined version. Maybe isAbsoluteUrl is imported
from a module I trust, and I won't think to go look at it for the problem. And
"every other place" is a bit of a strawman, since everyone commenting seems to
think if you have three cases of a thing, then the abstraction is warranted...
so it'd have to be "in the other place".

~~~
janpot
I never said you wouldn't have to dereference and check the implementation.
That's debugging. I said that it would make my intent clear. That I wrote the
code below that line with the assumption that url is absolute and not
specifically only the '[http://'](http://') case. The abstraction here is
about encapsulating and communicating my intentions.

------
sudhirj
The argument in the post is a bit of a strawman - for really simple examples
like that indirection makes no sense. Saying "is_something_like" as an alias
for "starts_with" is an unnecessary alias that does nothing for the
readability or duplication of the code. Indirection really isn't necessary in
this case, even if you use "starts_with" in a thousand places.

It might make sense if you're trying to do a comparator, like
"is_email_equivalent" \- in that case the indirection would make sense - you
might be doing a just a lower case check today, but in the future you might
expand that to take into account Gmail specific equivalence rules.

For stuff like "read_json_from_file", indirection makes a lot of sense. No
point doing the whole "create_buffer", "read_file_to_buffer", "parse_json"
deal every single time.

~~~
enedil
In Python it's json.load(file), so no indirection needed whatsoever.

~~~
sudhirj
It doesn't stop being what it is just because the standard library does it.

------
a_t48
I really disagree with just the toy example. If it's just in one place, fine,
but when you need to change the example to also do something else like check
the end of the string `return x.startswith("foo") && !x.endswith("bar");` this
style will end up biting you. Yeah, your IDE has tools that can help, but you
might not catch all examples, especially in other branches.

~~~
addicted
I disagree with the toy example even if it’s just 1 use (although to be
honest, you could construct a different example where I would choose to
abstract only if there were multiple uses).

The big problem I have with the toy example is that both pieces of code convey
different ideas to a human reader, and limiting myself to the situations the
author finds the indirection cumbersome, such as code reviews, or debugging,
both would lead the reader in different directions. The suggested solution
would make a reader think the business requirement of that code is to check if
the input starts with foo, whereas the indirection code suggests to the reader
that the input should be like foo (which in this case happens to be by testing
it starts with foo, but that may be incorrect, or may change).

I think the author’s example really hurts their case. A better example may be
where the intention indeed is to check whether the input starts with the
string foo, but you want to trim the sample first.

I’ve seen code that would abstract that out with a function trimmedStartsWith,
or something, instead of just input.trim().startsWith(foo), where I think the
former only makes sense if you’re doing it a bunch of time.

------
keithnz
I don't think the author made a very good case for this. The case made on
readability. But indirection is very common that if you can't read a piece of
code without having to drill into the implementation of every function, lifes
going to be quite painful. Depending on your language and editor / IDE, seeing
implementation is often trivial. The only case, I see, for inlinig is where
the pieces of code are very cohesive and tied together and will only be used
as a single unit. Maybe that's what the author was trying to get at, don't
break atomic units of code up.

~~~
mberning
I agree. The author actually makes a decent case for using this pattern
though. I think it comes down to the skill and clarity of thought of the
developer. If this pattern is used judiciously and the methods are named well
then it adds very little cognitive overhead.

Also, it is very refreshing to read about and discuss actual development.

~~~
keithnz
For sure, the pattern is good, having atomic units of code fragmented into
pieces is not nice. This is essentially part of the Single Responsibility
Principle, if things are together to suppport a single responsibility, don't
fragment them. The trick is knowing "what IS the responsibility of this
thing?" and not over doing it.

------
gregmac
First of all, one of the biggest reasons to do this (and not mentioned) is
unit testing. If there's several variations of input, it can simply testing
because it's easier to write tests for smaller units of code (fewer or no
mocks, and fewer parameters). This alone is probably enough to override any
negative, in my opinion.

Secondly, and in contrast to the point of the article, it can usually help
readability.

If you're wrapping standard library functions, then I agree, the examples in
the article are bad candidates. If you're dealing with something where magic
values are involved, for example, then something like the following can
_greatly_ help readability:

    
    
        /* Checks if a given user id was generated by the legacy system. These are either 5 digits numeric, or start with "B" or "L". */
        is_legacy_user(userid)
    

This method body can include the details about the strange legacy naming or
links to documentation, and the code consuming this can focus on the logical
implications without caring about these details.

Put together, the code flow is easier to follow, and if you're in doubt about
the method you can check out the unit tests to see what situations exist that
might not be obvious at first glance.

Edit: actually, I take that back. Even if you're just wrapping a standard
library call, then there's really no issue so long as there's a domain reason
(eg, is_legacy_userid() vs is_userid_fivedigits()), and there's unit tests.
The unit tests can prove the standard library function is all that's needed.

~~~
pytester
>First of all, one of the biggest reasons to do this (and not mentioned) is
unit testing.

This is one way in which unit testing can cripple code. It's especially bad
when code that mainly has the job of integrating different systems is unit
tested - it typically means adding two additional, unnecessary layers of
indirection to test some completely trivial piece of logic while the thing
that is most likely to actually _fail_ (the integration itself) goes
_untested_.

Those unit tests covering trivial logic still fail when the APIs are changed,
but they almost never fail in the presence of an actual _bug_ and present a
huge maintenance cost for no tangible benefit. In some particularly
integration heavy code bases I've seen literally _every_ unit test is an
expensive waste that spaghettified the code.

Indirection is a good idea if you're creating a thin abstraction over a piece
of complex logic, but that shouldn't be something that is done just to
accommodate an irrational desire to see unit test coverage % go up.

------
xxxpupugo
Highly recommended this book:

A Philosophy of Software Engineering:

[https://www.amazon.com/Philosophy-Software-Design-John-
Ouste...](https://www.amazon.com/Philosophy-Software-Design-John-
Ousterhout/dp/1732102201)

This is mentioned as one of the symptoms/appearance of bad code:
Shallow/Passthrough methods, which results deep call chains, adding up
cognitive overload. The author in this book recommends deep module over
shallow module, which tends to end up with more cohesive code.

------
mannykannot
In theory, this abstraction of small details is not a problem, but in cases
like this, I would prefer to see the inlined version, simply because, in many
languages, a function call raises the possibility that it has side-effects
that you might want to know about. In code that is abstracted to the max, that
is sometimes a difficult question to answer, even if your IDE lets you easily
trace the call stack as you read.

There is a related problem that causes more trouble, but it cannot really be
blamed on the initial abstraction. It goes like this: you have something, such
as a foo-ness test, that is needed in several places, so you abstract it as a
function for all the right reasons. Later on, one of those uses changes, and
needs slightly different logic than the other cases. Instead of writing a new
function (that may, as part of its implementation, call the original), the
person making this change (not you, of course!) modifies the original function
to handle both cases. This, by itself, makes the code more complex than
necessary, and it gets worse if the criterion for which path is taken in the
function is not something that has any significance for someone who arrived at
this function through reading the code for one of the other cases that call
it. And if the new path has a side effect, now all the other cases are also
calling a function that potentially has side effects...

This is how code rots.

------
rocky1138
This could easily be remedied by our editors offering a keystroke to inline
the function definition in a smaller font and different colour so that we may
read it, in order, as though it were one big file. Once the gist of the
function has been understood, another keypress could take it away.

Visual Studio has this and it's called "Peek at function".

~~~
flukus
If we need more complicated tooling then the code is obviously less readable
x.startswith version.

Wrapping simple functionality like this inside "self documenting" functions is
the bane of my existence when doing maintenance. Not only is the code less
clear when you have to step into or peek through a myriad of micro functions
but when you need to change them you have to analyse every code path to make
sure the change is supposed to affect them all and that quickly becomes an
exponential problem.

So much of this is cargo culting attempts to keep methods short without
understanding why long methods are a problem in the first place. That problem
is the amount of state you have to juggle mentally, short abstractions like
this that don't eliminate that state, so they create no benefit and the cost
of obfuscating the code.

~~~
usrusr
Of course in real life maintenance coding, your journey will not be finished
with a nice .startsWith("foo"), instead you encounter .startsWith(FOO) and can
still count yourself lucky if you find FOO defined as "bar" (because $reasons)
in a way that does not allow principle of maximum surprise redefinitions
somewhere else. All for getting a somewhat reliable understanding of what the
trivial entry point line does when trying to understand code downstream in the
control flow.

------
ubercode5
In general I disagree. In large codebases you're going to have a number of
abstractions that you'll have to deal with by their nature. Flattening these
can cause a scenario where you have a long function with clusters of lines of
code organized in a block are doing different functionality that isn't clearly
stated. This may include a comment at the start of each block discussing what
the next few lines are doing. And without abstraction, the functionality later
may not be cleanly decoupled. Debugging these types of functions also adds
significant mental load trying to see the forest through the trees.

From the article it seems like the bigger issue might be debugging style. For
me, when debugging those abstracted functions should be treated and trusted as
black boxes until you have sufficient reason that function is the issue, and
then dig into it and ignore the rest of the function. Trying to build a mental
model by effectively flattening every function takes huge cognitive load, and
only is feasible for small amounts of code. Once the codebase is large enough
it's going to become impossible to do this.

------
JackFr
"Avoid Indirection in Code" is a foolish thing to advocate, and the example
provided is more confounding than explanatory.

Abstraction is the key. Abstraction can be used to reduce incidental
complexity by hiding an implementation. Abstraction can be used to reduce the
cognitive load of the essential complexity by making the code look like the
problem domain.

But there is a cost to abstraction. A person reading your code may not have
the same mental model of the problem, and might’ve be working at a higher or
lower level of abstraction, in which case the "indirection" might be a
stumbling block. Weigh these costs carefully.

------
reuben_scratton
John Carmack's thoughts on this topic are worth a read: [http://number-
none.com/blow/blog/programming/2014/09/26/carm...](http://number-
none.com/blow/blog/programming/2014/09/26/carmack-on-inlined-code.html)

------
cpeterso
See also Zed Shaw's "Indirection Is Not Abstraction". The blog post is a bit
overlong, but Shaw's point is important:

 _Abstraction and indirection are very different yet cooperating concepts with
two completely different purposes in software development. Abstraction is used
to reduce complexity. Indirection is used to reduce coupling or dependence._

[https://zedshaw.com/archive/indirection-is-not-
abstraction/](https://zedshaw.com/archive/indirection-is-not-abstraction/)

------
bcheung
There's nothing wrong with this kind of "indirection", I agree there are times
when you might not want to do this but I would put it in terms of cognitive
overload.

By substituting a series of lower level concepts/tasks/procedures into a
single keyword the reader can more easily think about the problem domain.
That's a good thing. If they know the abstraction, it reduces the number of
things the reader must keep in their head.

However, if the reader is not familiar with the abstraction they will need to
spend time to familiarize themselves. This can break down when trying to
understand an abstraction leads to the "endless rabbit hole" problem.
Basically, you have to open another file to see how that abstraction works,
then it uses some other abstraction which you need to open another file to
read, and so on. If the depth is too great the reader will lose context and
won't be able to fit the problem in their head.

To avoid this, pull more of the functionality into the function instead of
outsourcing it to an abstraction. Favor writing code that reads sequentially
rather than as a series of jumps through a bunch of files.

As always, this is largely a matter of taste so use your best judgement and
balance abstraction with pragmatism.

------
admax88q
Yeah avoid indirection. Instead of

> x.start_with("foo")

do

> x.length >= 3 && x[0] == "f" && x[1] == "o" && x[2] == "o"

Now the reviewer doesn't have to jump to the definition of "starts_with."
Isn't that way more readable?

------
sbov
It really depends.

If you repeat this a lot, you might have a domain concept. If you can come up
with a name that makes sense for your domain then you should probably take
that domain concept and realize it within your code in the form of some kind
of abstraction - e.g. a function call. Even if it is as simple as a
.startswith call.

If you can't come up with a name, kinda depends upon how prevalent it is. If
you just do it a handful of times, no big deal.

------
carapace
Err, this isn't _bad_ advice, just kinda junior-level.

First, if you're defining interface indirection to support _encapsulation_ and
you wind up _never_ rewriting (or adding another) implementation behind that
"external function" or method, you _probably_ didn't need it. (YAGNI, yo?)

Second, if you're defining interface indirection and it makes it _harder_ to
understand the code, you're doing it wrong anyway.

\- - - -

Indirection that hurts is like when you do dispatch through a dict of
functions and now all your tools, debugger, etc. can't tell you about the
static structure of your code:

    
    
        dispatch = {
            'name': handler_func,
            ...
        }
    
        ...
    
        dispatch[selector](*foo, **bar)
    

That kind of thing seems like a good idea but:

> "Everyone knows that debugging is twice as hard as writing a program in the
> first place. So if you're as clever as you can be when you write it, how
> will you ever debug it?"

–Brian Kernighan, "The Elements of Programming Style", 2nd ed., chapter 2

(Also, notice that, in Python at least, derp.baf( _foo,_ *bar) is the same
thing, eh? EH??)

~~~
keymone
well technically, if you type-annotate the dict you should be fine:

    
    
        dispatch: Mapping[str, Callable[arg_types, ret_type]] = {}

~~~
carapace
That's neat. I hadn't realized type hints had gotten that good now. Cheers!

I've had issues reading code where I'm trying to trace control flow and I hit
some dispatch point in the code and it's suddenly really hard to tell where to
go from there. I do (did) a lot of Python programming and sometimes it's a
little _too_ dynamic, ya know what I mean? :-)

~~~
keymone
type hints got good, editors and typecheckers didn't really :)

personally i like these dispatches more than giant if/switch statements for no
reason other than better separation and management of logic. as for debugging
- stepping through code in ipdb is still the way to go.

------
zimbatm
It's more of a problem with OOP language because the function call might hide
a side-effect. So now the reader has to go and read the function definition to
make sure no side-effect is happening in there.

With functional languages the mutable data is available in the reader's
context. The implementation might still be mysterious but hopefully the
function name is good enough to give a clue about that.

~~~
mikekchar
I'm just being a bit grumpy, but there are non-pure languages that aren't OOP,
and there is no reason why you can't write OO code that is where all objects
are immutable.

However, it _is_ a good point that mutability is a major problem. It's just
not correct that this has anything to do with OOP.

~~~
zimbatm
Fore sure. I was painting a large brush and you can find instances of
mutability in functional languages and subsets of OOP that are immutable.
Things like value objects, the builder pattern, dependency injection, ...

~~~
mikekchar
Possibly you aren't aware, but both FP and OOP are orthogonal to mutability.
There are lots of mutable FP languages (for example, the first one ;-) ).
These days it is popular to make pure FP languages, but that was not the case
40 years or more ago. Similarly, Alan Kay had quite a lot to say about
mutability in Smalltalk. I'm not aware of any pure OO languages, but very few
people knew how to deal with persistent data structures until Chris Okazaki's
PhD thesis on the topic ~1996, if I remember correctly. Mutability is really
up to the programmer. Even in C++ you can have const data structures and even
const functions -- which are guaranteed to be pure. C++'s mistake was that it
didn't make const the default; a mistake that Rust corrected. Back in the 90's
all of my C++ code had const strewn all over it -- it was on practically every
line. It was really clear to those of us doing it that you had to separate
your "const" code from your "non-const" code because if you ever decided to
omit that "const", it would rot your entire code base in a very short time
frame -- mutability is really contagious.

I know it's popular to hand-wave and say, "Well _most_ OO programmers don't
write immutable code and lots of design patterns depend on mutability so
therefore OOP is defacto-mutable while FP is defacto immutable", but this
really obfuscates the true nature of the issue. There are programmers who
prefer to write mutable code (in any paradigm) and those who prefer to write
immutable code. It's quite easy to write OO code in an immutable way and most
statically typed OOP languages even give you tools to help you do it. The
reality is that many programmers just don't care. Of course, those programmers
never reach for a pure FP language, but there are still heaps of old school FP
programmers who would never write immutable code (I know several personally).

Perhaps you knew all that as well, but if so, I would ask a favour that you
don't say "OOP is mutable" because it is highly misleading for people who
don't understand the issue well (and frustrating for programmers like me who
enjoy writing immutable OO code).

------
Thunderya
This made me think: is there an IDE or plugin that makes it easy to visualize
the source code of a function by inlining the sources of the undercalled
functions?

For example, before the transformation:

    
    
        function isFooLike(x) {
            return x.startsWith("foo");
        }
    
        function isBarLike(x) {
            return x.startsWith("bar");
        }
    
        function bigFunction(str) {
            if(isFooLike(str)) {
                console.log("Foo!");
            }
            else if(isBarLike(str)) {
                console.log("Bar!");
            }
            else {
                console.log("Unknown!");
            }
        }
    

And then:

    
    
        function bigFunction(str) {
            if((function(x) {
                return x.startsWith("foo")
            })(str)) {
                console.log("Foo!");
            }
            else if((function(x) {
                return x.startsWith("bar")
            })(str)) {
                console.log("Bar!");
            }
            else {
                console.log("Unknown!");
            }
        }

~~~
ktpsns
While I find your inlined example very bad to read, I think there are many
attemps IDEs do for this. I remember some showing the function stub of a
called function when moving the mouse on the call instruction. I also know
multiple editors that allow to quickly browse the source code with CTRL+Enter
(or CTRL+Mouse click) in a similar fashion as we click on links in the web. I
personally never found indirection to be a problem. It's rather the solution.

~~~
Thunderya
Yes unfortunately my example is not very readable, but it was to illustrate
the idea of including the sources of the called functions.

Indeed, many IDEs allow you to easily display the source code of one function
at a time, by navigating to the sources of the function or by displaying it in
a pop-up, but the idea I had in mind is to be able to view the source code of
the function with all its dependencies at a glance.

------
asavinov
Indirection [0] is a quite big topic and applying it can result in both
significantly improving the code and getting more problems than benefits.
Therefore, the question is when indirection is appropriate. One opinion [1] is
that

    
    
        "We can solve any problem by introducing an extra level of indirection" 
    

Another opinion is expressed in this post (avoid indirection). In the context
of this example, indirection is related to the problem of modularization and
separation of concerns. In particular, a typical question is whether we want
to hide some details or use the functionality _directly_.

[0]
[https://en.wikipedia.org/wiki/Indirection](https://en.wikipedia.org/wiki/Indirection)

[1]
[https://en.wikipedia.org/wiki/Fundamental_theorem_of_softwar...](https://en.wikipedia.org/wiki/Fundamental_theorem_of_software_engineering)

~~~
microtherion
I always liked the second part of this quote: "…except for the problem of too
many levels of indirection"

------
husainalshehhi
Well, indirection can be quite helpful even for simple checks:

    
    
      if is_json(somestring) {
          // do something
      }
    
      function is_json(string s) {
          return JsonLib.convert(s) != NULL
      }
    

this indirection is easier to read rather than thinking too much about what
JsonLib.convert(s) != NULL mean. This does not hinder debugging because you
only debug this is_json function when the code does not evaluate is_json as
expected. Every other time, you debug what's in the if statement.

The given example x.startswith("foo") is simple enough to include it directly.
However, if there is a business reason for why "foo" is checked here, it will
be difficult to understand why the check is there. This will be more difficult
to read a code like this:

    
    
      if x.startwith("-")
          do something
    

instead of

    
    
      if is_yaml_node(x)
          do something
    
      def is_yaml_node(x)
          x.startwith("-")

------
DarkWiiPlayer
My mental benchmark for the cost/payoff of indirection is usually

> Would this code be easier to read if A) I inlined the code or B) I replaced
> it with a descriptive function name.

Function could be read as Class or Method if you're dealing with OOP.

One important criterion for this is that other programmers already know the
language, not your program. Writing your own function can help to put a
meaningful name on a set of behaviors, but it also means the introduction of a
new specific set of behaviors the reader needs to keep in the back of their
mind. It's kind of similar to Jakob's Law [1] in a way, but applied to source
code. (In a way, source code readability in't all that different from UX, if
you think about it)

[1] [https://lawsofux.com/jakobs-law](https://lawsofux.com/jakobs-law)

------
tawy12345
I think some commenters don't understand the challenges of reviewing code
contributions as an open source maintainer. Many people already feel like
they're doing you a favour by contributing any code. This is amplified if
they've followed "best practices" and broken it down into many tiny functions
and objects. As a maintainer, having to review and maintain code that is
abstracted in the wrong way is a major headache. Trying to tell someone that
their code is over-abstracted risks starting a debate on the code review and
generally derailing the process.

An article like this at least sets some precedent and gives you something to
point to.

~~~
asdfman123
For my reference, can you give me an example of properly abstracted code and
overly abstracted code?

------
neilwilson
This is back to the usual intractable problem - naming things. If you name the
function properly and it abstracts the operation of the function precisely
then you have added information, not taken it away.

You understand what 'has_kettle_boiled' means.

Inlining functions makes it difficult in a sequence to work out what are the
separate steps. Splitting those out into a function - particularly if they
take the temporary variables with them - makes it easier to understand what
the higher level function does, not harder.

Perhaps we should revisit stepwise refinement as a technique and the need for
functions to be loosely coupled and highly cohesive.

~~~
syn0byte
What was the kettle boiling? How long ago did it start boiling? Is it _still_
boiling? That Sea level boil or high altitude boil? ARRGH! your function name
isn't descriptive enough!

------
geodel
I don't know much about Python programing. In Java though I deal with
ludicrous amount of indirection and 'design pattern' turdlets which deeply
hurt code comprehension everyday.

Just last week I was handed a project with single functionality: 'Doing ldap
bind on receiving userid/password over http and return success/failure'. This
project has about 40 Java files in 18 directories. As far as enterprise
projects go it follows all best practices of Java and Micro services etc. But
I find this project absolute turd and hopeless to refactor. A clean rewrite
would be only sane thing.

------
mgerdts
Reminds me of my first day of python performance analysis, which led to this
fix.

[https://web.archive.org/web/20130227000541/http://defect.ope...](https://web.archive.org/web/20130227000541/http://defect.opensolaris.org/bz/show_bug.cgi?id=11002)
[https://github.com/oracle/solaris-
ips/commit/53bb14f750dd111...](https://github.com/oracle/solaris-
ips/commit/53bb14f750dd1111bf8938278c2436e4141798dd)

------
DubiousPusher
There's a lot of reasons to avoid indirection but I don't think decreasing the
number of files you have to navigate is one of the better ones.

I mean, who reads code linearly anyway. My hand hovers over F12 (the shortcut
for go to definition in Visual Studio) half the day. Any good IDE will help
with this.

I feel like this person is like, I don't want to drill all these holes to
build my deck. I'll use nails. And it's like, "dude, somebody invented the 3
inch self tapping deck screw; go to town."

~~~
dnautics
I think also this idea of navigating files is wrong; shouldn't these
functions, generally speaking, be in the same file?

------
rileymat2
>However, there is also a cost to this behavior. When a new reader encounters
this code, they need to jump between many function definitions in many files.
This non-linear reading process requires more mental focus than reading linear
code.

I think this is the problem right here. As a new reader you should first trust
the function name does what it says it does and continue linearly. Later on,
if you need to go back and look at the details or where the function name
misleads or does more than it says.

------
roland35
There are a lot of examples of heavyweight indirection in Hardware Abstraction
Layers (HAL) on microcontroller vendor code - ST is one of the worst
offenders! In their sensor code library I counted 9 layers deep of functions
to simply send a command to a magnetometer over I2C.

I find this code hard to work with since without actually debugging the code
there is no easy way to just click on a function and see what is called - a
lot of functions reference a data structure of I2C function pointers...

------
lriuui0x0
I agree with the conclusion largely, but I think the reasons listed in the
article are not good. Personally I feel the problems of doing too much
abstraction are belows.

1\. It takes a lot of time.

2\. You might end up with bad abstration that in the end hinders you. You then
have to work around the abstration you created.

3\. More functions create more entries you need to understand in your mind,
this can sometimes make code harder to reason about.

------
vast
The main point here is worth a broader discussion. Part of the problem is that
we tend to hide behind mental concepts that a ambiguous, incomplete or just
bad.

DRY is a awful "thing" (a decree at best). In the worst interpretation it
simply says "never write the same code twice". There is no balance or end to
it. It doesn't have a competitor or alternative. It somewhat implies that it
is always good. It doesn't define a scope where/when it should be applied.
There is simply no broad agreement how to use it in practice. DRY touches
multiple concepts, each too complex to put behind three tiny letters.

OP fails to make a great point, but the direction is right. We cannot take DRY
and "code reuse" laws as granted. Abstractions and indirections have their
downsides. It increases systemic complexity and may add dependencies. It
certainly limits how easy the full system can be understood by humans.

~~~
tetha
The counterpoint to DRY doesn't seem to have a name, but it very much exists.
I know it under the names of 'oversharing' or 'premature sharing'.

Practically, this tends to extend the question of DRY: Is the code the same in
two places, and will the code /change/ in the same way in these two places?
Can we delay sharing this code to have more time to figure out if the code
changes in the same way?

Maybe currently two integrations with two external applications are just the
usual socket/newline separated json at the moment, so you could share the
implementation. But maybe one integration gets replaced with thrift, one gets
replaced with protobuf and suddenly you end up with an abstraction that's full
of <if service.protobuf...> and that'll be ugly.

------
Rapzid
Pretty much everything written in Ruby before "Practical Object-Oriented
Design In Ruby" by Sandi.. Haha, #KiddingNotKidding.

Looking for a non-contrived and/or non-scarecrow example? Anything in the Chef
code-base circa 2014 or prior.. No clue if it has changed.

Ruby code bases were the poster children of DRY-gone-wild.

------
2rsf
Like everything else you need to consider the context. I was involved in a
rather big internal automation project written in Python, it involved a few
platforms and targets and was ran open source style where each team could hook
up to the infrastructure independently or make changes to it using PR.

Some of the code ran remotely using RPC, so part of the code sat in different
repositories.

Naturally this called for multiple levels of abstraction and indirection. The
results ? a big ugly pile of levels if you are trying to understand what the
code does, that led to bugs, unnecessary complexity since it's hard to foresee
the future and always abstract the way the code will need and the worst was
that people simply gave up on trying to contribute do to the complexity of the
otherwise simple code.

------
skywhopper
As with anything, the key here is to strike the right balance. Sometimes the
code behind the ‘if’ test will be worth factoring out and sometimes it’s
better to keep it simple. Impossible to say what the line is, as it depends on
the team, the language, etc.

But in relation to the actual problem of legibility to a new developer still
building up the mental model of the code, this is an area where tooling can
help. You ought to be able to view the called-method’s body with a keystroke,
without losing the current context, in an IDE or a code review tool. I’m sure
some IDEs do have such features. Yes this is harder in some languages than
others but in general it’s too bad these sorts of features aren’t built in to
Github and Gerrit as well.

------
hyperpallium
How should we decide whether to abstract something into a function?

Abstract aspects _likely to change_ , via interfaces that aren't, says Parnas
( _On the Criteria To Be Used in Decomposing Systems into Modules_ ,
[https://www.win.tue.nl/~wstomv/edu/2ip30/references/criteria...](https://www.win.tue.nl/~wstomv/edu/2ip30/references/criteria_for_modularization.pdf))

abstract-unless-leaky: hide information you don't need.

In practice, writing tests helps me see what is naturally part of a module vs.
a usage of it. You want the module to contain everything it needs to do its
job, and nothing else. But this assumes the modularization (its "job").

------
ojbyrne
Decomposition and abstraction are fundamental tools of software design,
obviously it can be done badly, as can anything. Carried to the logical
extreme, this suggests that hiding the complexity of arithmetic operations
like + and - is a bad idea.

------
satyenr
I don’t quite get the point of article. It essentially boils down to:

“Abstractions — or rather, indirections — are bad for for readability, but
they are good in some cases.”

Umm... this can be said for pretty much anything. Over use of anything is bad.
Inappropriate use of anything is bad.

> Mostly, I want authors to be aware that there is a human cost to indirection
> that is felt more acutely by everyone reading the code except the original
> author.

Sure. It’s a well known fact that abstractions have associated costs — nothing
new there. However, the author seems to forget that not abstracting things can
be equally costly — even when human readability is concerned.

Imagine someone trying to figure our the business logic be reading code. All
they need to know is some conditions are being met for certain steps to be
executed. The actual steps involved in the check are irrelevant to the
business logic. The condition can be modelled separately from the business
logic and can even change independently. For example:

    
    
        def is_foo_condition_satisfied((self):
            return self.variable.startswith(‘foo’)
    

In this case, it just happens that the satisfaction of a condition is
implemented as prefix check. In the future, it might be changed to checking a
flag or reading a database, or anything really. If you sprinkle the
startswith() check all over you code, it will be hard to make the change. And
it actually hurts readability. While reading a large codebase, you want to
focus on specific parts of it — getting into the details of everything all at
once does not make it easier.

The most important thing to note is — programming is not confined to s strict
set of rules. It’s not an exact science. You can’t define a strict a rules
that must be followed. What makes sense somewhere is completely useless
elsewhere. The Linux Kernel is full of goto statements, despite the fact that
goto is considered harmful.

Dogma Driven Programming must be avoided. Even well known programming
practices are guidelines at best. One must evaluate whether a given rule fits
a given situation.

Coming back to the example at hand, functions are meant to divide the program
by the logical tasks being performed, not by number of lines or any other
metric. As I said before, it’s more art than exact science.

Is the check for foolikeness of something a logically distinct task? It
depends on the context. There is no way to write a cardinal rule either way.

PS. I sincerely hope no one comes to me for code review with a monolithic
function because of all the inlining — citing this article as inspiration.

Edit: Typo correction.

------
ivanhoe
Having a central place for logic is a HUGE pro, while both those cons apply to
a very narrow class of situations: having a super simple condition logic on
one side vs. not very smartly named indirection. In real life (at least mine)
you'll often have a complex or cryptic-looking conditions (requiring that
you're deeply familiar with business logic, laws, tax rules, etc.), and
abstracting them into sensibly named functions actually helps the review
process and debugging IMO.

------
z3t4
Code is like a bunch of cords. Each time you add something, it's like adding
another cord, and at some point you will have "spaghetti", and you pull out
every cord, unravel it, connect everything neatly and use zip-ties to hold
everything in place. The problem is when you use the wrong abstraction, eg.
you start zip-tie:ing before even knowing what other cables to add. So don't
be afraid of using the copy/paste function. Make it easy to unravel.

------
winkelwagen
There is one thing that I’m missing, yes when the only thing you do is
wrapping the standard library with a function it’s overkill. But when there is
a business decision logic behind it I would still advice to extract it.
Creating a function with a name tells you something about the why!

I think peek definition could help you out when you are really dealing with it
in the code base. I just wish it would be smarter. Peeking a function with 5
lines, peek it. If it’s larger, jump to it.

~~~
nomel
I do this when I know a future feature or requirement will make that function
complex. I also will add a comment saying so, “putting this her since we’ll
need x when y”.

------
rajangdavis
I feel as though this is a huge part of working with Rails as there is an
incredible amount of magic that happens under the hood.

From my limited experience, I think abstractions like these are helpful as
long as there is consistency.

This is easy to do by yourself but not always on a team; however, I think
Rails has a good balance of conventions that allow indirection to be
successful as long as the people writing the code use the same patterns and
communicate effectively.

------
k__
When I started programming, I tried to get functions below one screen height,
have one return and use mutations often.

Today, I try to do big functions, multiple returns and const where I can. I
still use code-blocks sometimes to scope variables.

I sometimes end up with a few hundred lines of code per function, but I can
usally scroll over it and find what I need without jumping around.

Especially in JavaScript the addition of const and async/await helped much
with this.

------
tobr
It’s not just a problem for review and debugging. I think it’s even worse for
refactoring and new feature development, because it makes it so much more
difficult to know where to make a change. If the condition in the (silly)
example needs to change, should you write a new function to replace the call
to is_foolike, should you rewrite its implementation, or should you add a
parameter that alters its behavior?

------
taneq
One very important reason to avoid this kind of unnecessary indirection is
that it wrecks locality of reference for anyone reading it. They end up
spending all of their mental energy on skipping around the codebase trying to
follow the thread of execution, making it very hard to assemble a coherent
picture of what the code's actually trying to do.

~~~
vbsteven
I like IntelliJ's "Quick Definition" shortcut for this. It allows me to see
the definition of a function in a popup window so I can quickly glance at what
a certain function does without the full context switch.

~~~
taneq
I've only spent a few weeks using IntelliJ but from that time it seemed like a
really well constructed dev environment. Definitely gave me the feeling that
this is what modern software development _should_ be like.

~~~
vbsteven
I believe it is. The big thing that made IntelliJ completely click for me is
when you buy the Ultimate edition and you download some plugins for
Ruby,Node,Python or whatever languages you use you have a full IDE that works
for pretty much any language.

In my eyes it is a platform for doing modern software development.

------
ncmncm
The simple rule is: every line of code has to earn its keep.

Sometimes we really do need a one-line function to implement a common
interface. Sometimes we need it so that, in the two places it's called, both
get the new behavior when it changes.

But abstraction for abstraction's sake is just stalling. The technical term is
"ratiocination". It's a filthy habit.

~~~
he0001
I’m not so sure about this. A one liner may have its merit when you write it
at that very precise location right now, but as the code grows the one liner
may not have the same merit any more. But that may not always clear when you
wrote it.

~~~
ncmncm
When any part of the program stops earning its keep, delete it. Feel proud of
every deleted line. Deleting code adds as much value as writing code, but
costs much less.

------
shtolcers
On the other hand fixing "is_foolike" will be pain to fix if used in many
places. And - "This non-linear reading process requires more mental focus than
reading linear code." \- isnt't this a god thing? Good abstraction requires a
bit more mental involvement and much less monkey-job when fixing the code.

------
wawhal
I slightly disagree because I prefer the code to be consistent all across my
codebase.

Sure it is some effort to backtrack to the util functions, but I think the
effort is worth it because I would rather have a developer spend multiple
hours debugging and figuring out the code than having something that is a
nightmare to iterate upon.

------
cryptica
I agree with OP, if a function just returns the result of another function
without any additional processing then it usually means that one (or both)
function's name does not accurately describe what it does.

The name of a function should abstract away from "how it does something" not
"what it does".

------
CodiePetersen
If you have defined the function and it is used in many places then you only
need to read it once and remember everytime you come across it. How is
recoding it every time going to help the readability. I think you just need to
make sure the functions purpose is clear either.

------
vendiddy
I wonder if the real solution to the indirection is better tooling.

If I can't easily see the contents of is_foolike, should I stop writing my
code like this or should the editor be showing me these details more easily?

The real problem: it takes me too much time to see what's inside the function.

------
de_watcher
"All problems in computer science can be solved by another level of
indirection"

------
XeO3
Imho, this type of abstraction should be allowed in case of code repetition.
So the reviewer/debugger will have to go non linear only one time.

In any other case the whole code should be in one file.

------
rb808
Both are clearly wrong, should have been using proper IoC with a
fooCheckerFactory creating a checking object that inherits from a generic
interface that is passed in the context.

------
raiflip
I wonder how many people who decry abstraction and indirection would choose to
work directly in Assembly rather than in higher level languages.

------
gautam1168
i regularly see functions that are over 300 lines of code that make no other
function calls except ajax. That shit is hard to read.

------
choiway
This is an overly simplistic complaint that leads to the argument that you
should never use a 3rd party library.

------
doomjunky
But _all problems in computer science can be solved by another level of
indirection_. ~David Wheeler

------
nickthemagicman
Could it be lack of good tooling?

I dont understand why there isn't a code editor that will inline functions for
you?

------
Simon_says
This feels like somebody is testing the limits of what nonsense developers can
be made to read.

------
blibli
Maybe we should avoid reading code.

------
unnouinceput
The connection has timed out

The server at www.matthewrocklin.com is taking too long to respond.

too many HN readers perhaps?

------
djohnston
if the tooling is terrible maybe, but as long as i can click thru to see the
referenced method in an IDE it's going to be fine

------
kerkeslager
Okay, maybe it's just a poor example, but in the example used, the problem is
_definitely_ not too much indirection.

If I was doing a code review and came to this:

    
    
        def is_foolike(x):
            return x.startswith("foo")
    

I would comment, but my comment would be, "Could you name this function
`starts_with_foo`?"

This addresses both concerns mentioned in the article:

1\. _During review, when a reviewer is asked to verify that code is sensible
before it can be merged into the main project. That reviewer probably has
about a tenth as much time to spend as the original author does on that code._
But if the reviewer comes to the code

    
    
        if starts_with_foo(x):
            do_something_with(x)
    

They aren't likely to be misled in any way by the indirection. They can just
keep reading, without having to look into the implementation of
`starts_with_foo`, because either that name is accurate and they know what it
does, or they'll discover that the name isn't accurate when they review the
code.

2\. _While debugging future issues. This code will eventually be involved in a
bug and some completely different developer will have to glance at this code
to figure out what’s going on. They’ll have to understand some small section
this code within a few minutes to determine what is relevant. They won’t be
able to invest the time to understand the full thought process behind it, and
a web of function definitions can slow down this process considerably._ But
again when they read the code, a good name means they can understand what the
code does without having to read the implementation.

I do think there's an argument to be made that indirection is a problem here,
but it's a small problem compared to the enormous problem that `is_foolike` is
a _really_ bad name for that function.

General rules for when to NOT pull something out into a function:

1\. If it the function call wouldn't be clearer than the code (even a better
name like `starts_with_foo(x)` loses some information contained in
`x.startswith('foo')`, and the latter is readable enough that there's no real
upside to the former. If the latter were even two lines long, it would become
a lot more worth it.

2\. If there's no repetition. Two similar pieces of code aren't enough: you
don't understand from two use cases what pattern you're abstracting, so the
result is just going to be a leaky abstraction. Three repeated pieces of code
seems to be the sweet spot: now you have enough examples to know what's
actually repeated, what should be arguments to the function, etc.[1]

[1] [https://blog.codinghorror.com/rule-of-
three/](https://blog.codinghorror.com/rule-of-three/)

~~~
justasitsounds
`is_foolike` is a bad name if you name your methods to only describe how they
are implemented internally.

At a high-level, in your business domain, you generally write your code to
hide the implementation details and to describe the intention of the method
EG. `is_carbonated`

    
    
      if is_carbonated(beverage):
        beverage.jiggle(false)
    

vs

    
    
      if beverage.startswith("co2"):
        beverage.jiggle(false)
    

(edit) for formatting

~~~
kerkeslager
I'm sorry that this will come across as rude, but I feel like you're repeating
something you've heard but didn't quite understand.

Yes, at pretty much ANY level, you should write your function names to hide
the implementation details and to describe the intention of the function. In
my previous post I hinted that the name of the function should "describe what
the function does", which is the same as "describe the intention of the
function", because if the function doesn't do what it's intended to do, that's
a bug.

The reason I say I don't think you understand what you're saying is that none
of what you're actually saying applies to the examples you're giving.

`is_foolike` _doesn 't_ describe the intention of the method. It gives an
entirely vague and inaccurate view of what the method does. So even though we
both agree that functions should describe the intention of the method, you're
saying `is_foolike` is an okay name even though it doesn't do what you say it
should?

`starts_with_foo` describes the intention of the method. It doesn't describe
the implementation: `starts_with_foo(x)` might expand to `x.startswith('foo')`
or `x[:3] == 'foo'`, but I don't care which, because the name accurately
describes what it does either way.

Your example doesn't elucidate. If we're representing beverages as strings
which are somehow guaranteed to begin with "co2" if the beverage is
carbonated, and we've decided that the deserialization should be mixed into
our "high-level, in your business domain", the program is so badly tangled
that we're not going to get any truths about good programming from it.

~~~
justasitsounds

      I'm sorry that this will come across as rude, but I feel like you're repeating something you've heard but didn't quite understand.
    

You're right, it does come across as both rude and condescending, and you know
it, so don't apologise.

    
    
      In my previous post I hinted that the name of the function should "describe what the function does", which is the same as "describe the intention of the function", because if the function doesn't do what it's intended to do, that's a bug.
    

`is_foolike` to me, implies `test_for_abstract_quality_foo`. `starts_with_foo`
implies an assertion a string beginning with a particular prefix

    
    
      If we're representing beverages as strings which are somehow guaranteed to begin with "co2" if the beverage is carbonated, and we've decided that the deserialization should be mixed into your "high-level, in your business domain", the program is so badly written that we're not going to get any truths about good programming from it
    

I feel like you're intentionally misunderstanding my point. Please don't
critique my entirely fictitious codebase as if it represents anything other
than an abstract example. The point being that even though, in this (again)
entirely fictitious example, the implementation is very simple, the intention
of the method is different from it's implementation

~~~
kerkeslager
> `is_foolike` to me, implies `test_for_abstract_quality_foo`.

Yes, which is why when `is_foolike` tests that a string starts with 'foo',
that's rather unexpected. If `is_foolike` actually describes the intention of
the function, then the function testing for the string beginning with 'foo' is
a bug, because it doesn't do what it's intended to do.

Referring to "the quality of starting with 'foo'" as "abstract quality foo"
isn't hiding implementation details, it's being opaque about what a function
does.

Put another way, "what a function does" isn't its implementation. " _How_ a
function does what it does" is its implementation.

> `starts_with_foo` implies an assertion a string beginning with a particular
> prefix

Yes, exactly. Because that is what it is intended to do, we hope, since that's
what it does. That's NOT the implementation: there are plenty of different
ways to implement testing whether a string starts with 'foo', and the name
`starts_with_foo` isn't coupled to any of them.

> The point being that even though, in this (again) entirely fictitious
> example, the intention of the method is different from it's implementation

Yes. `starts_with_foo` is _also_ different from the implementation, while
still describing the intention of the method.

Could you explain to me why you think `starts_with_foo(x)` describes the
implementation `x.startswith('foo')` and not some other implementation (such
as `x[:3] == 'foo'`)?

~~~
silveroriole
Ok, I’m curious. Let’s say that I want to fall into that if statement if my
product is named something like “foo”, costs under $100, isn’t discontinued,
and has sold over 100 units in the last month. Fine, you can say now I have
overcomplicated the domain and I should go back to requirements, but these
complicated things do happen. Surely it’s acceptable to name my function
“isFooLike” rather than “startsWithFooAndCostsLessThan...” etc, especially if
I suspect management may change some of those figures later?

If that’s acceptable, why is an IsFooLike which only checks one condition
unacceptable, even though it (imo) expresses the same intention?

~~~
kerkeslager
> Ok, I’m curious. Let’s say that I want to fall into that if statement if my
> product is named something like “foo”, costs under $100, isn’t discontinued,
> and has sold over 100 units in the last month. Fine, you can say now I have
> overcomplicated the domain and I should go back to requirements, but these
> complicated things do happen. Surely it’s acceptable to name my function
> “isFooLike” rather than “startsWithFooAndCostsLessThan...” etc, especially
> if I suspect management may change some of those figures later?

The idea with naming is to capture as much of the meaning of the function as
possible, and 'isFooLike' doesn't really capture any meaning (what does it
mean to be like a foo?). In business, there are typically names for the
topics, like `is_profitable_transaction` or `meets_sales_targets`, which may
indeed encompass some very complex logic, but are a coherent idea, so naming
is usually a bit easier than this. But you're right, sometimes the
requirements are bad and you don't have a chance to go back to requirements
before a deadline. In those cases, I don't think the naming matters much
because no name you come up with is going to represent the concept. So I guess
`is_foolike` might be the best you could come up with, but it's certainly not
good code. I'd also be less likely to pull out a function in the first place
because it's a premature abstraction: if the abstraction isn't coherent enough
to be named well, we probably don't understand it well enough yet to abstract
it out.

> If that’s acceptable, why is an IsFooLike which only checks one condition
> unacceptable, even though it (imo) expresses the same intention?

It's not the same intention.

In the more complex case, you have a bad name because the name doesn't
describe the (overly complex) intention of the code, but there isn't a better
one.

In the second case, you have something that does a clear thing, so you should
name it what it does. For a more complex case, you're not going to be able to
capture every nitpicky detail of what the function does in the name, so you
have to describe it in a broader concept. But with a short simple function
like this, there really isn't an excuse for using a vague description.

I think you're still ignoring my explanation that what a function does and
intends to do is different from how it does it. So I'm going to again insist
that the two of you answer these questions to prove they actually understand
my points before disagreeing with me:

1\. I'm claiming that in correctly-working code, the intention of the code IS
what it does. Is there a case where code would do something other than it's
intended to do, and this isn't a bug or at least misleading code?

2\. I'm claiming that what a function does is not the same as its
implementation. Why are you claiming `starts_with_foo(x)` coupled to the
implementation `x.startswith('foo')` and not `x[:3] == 'foo'`?

~~~
ncmncm
This is a really, really terrible way to name functions, or anything. I have
seen code that tried to put the spec in the name, and it was awful.

A name that is long enough to tell you all you might need to know about what
the thing named does is too long to be usable at all.

A name _needs_ to be long enough to distinguish it from the other things being
named. That's what naming means. If it also gives you a hint about which
thing, of the things named, it really does, that makes it perfect.

Anything beyond that adds cognitive load, making it exponentially worse with
each syllable added.

------
james_s_tayler
TL;DR - don't pointlessly wrap functions from the standard library in methods
with non-standard names.

Yes. Please. Stop. Doing. That.

~~~
justasitsounds
In this case the desired functionality can be encompassed in a single standard
library method.

However, the author of the code may have started with a much more complex
implementation but still feels like `is_foolike()` is better at describing the
_intention_ of the method than the eventual implementation.

Replacing the descriptive method name with the actual implementation may mean
the author now feels that they have to append a non-functional comment to
explain that in order to test for foo-ness one only needs to do a simple
string-prefix match.

~~~
james_s_tayler
You know how many times I've seen that exact scenario in code bases? Plenty.
Some developers just do really pointless shit and then name is terribly to add
insult to injury.

~~~
DonHopkins
Especially if your name is Ike and you don't appreciate being called a fool.

