
Why I'll never abbreviate a variable as long as I live - bhollan
https://beehollander.wordpress.com/2016/07/08/the-6-month-bug-and-why-i-will-never-abbreviate-variable-names/
======
benkuykendall
"Never abbreviate a variable" is a very strong statement, surely inspiring
religious wars. And maybe there are edge cases: i as a loop counter, id for an
integer primary key, whatever. But this example is something else entirely;
"ttpfe" is honestly the worst variable name I have ever seen.

~~~
dgcoffman
Ah, good ol' TransactionTemplatePayloadFactoryExtension.

~~~
foota
I've just started using Spring in my internship now and some of the class
names in our code are pretty hysterical.

------
mmaunder
When hiring we ask for sample code from developers written specifically for
the job application. We place a really high premium on readability.
Occasionally we'll get code from devs that is very concise - not just
abbreviated variable names, but complex long statements using ternary etc.

I'll usually ask them to resubmit code and really focus on readability and it
turns out fine. But I think there might be a misconception among devs where
the thought is that really compact code shows talent.

Our ideal coder is someone who cares deeply about performance and where the
code is ridiculously easy to read and trace through. I mention performance,
because in some situations making things more verbose can affect that.

~~~
component
Whenever we interview a candidate, the one thing we place a high premium on is
structure; the rest comes next.

_Bad_ variable names and long, complex code can easily be _adjusted_. If the
candidate doesn't have a good understanding of how to structure code (what
goes where, for what purpose, separation of concerns...), it'll take much more
effort to teach that candidate than to teach the _bad_ variable namer.

Properly structured code _naturally_ leads to better-performing software, and
if/when a bug arises it'll be easier to spot and test.

All that being said, if a coder names variables 'a', 'aa', 'aaa', 'aaaa',
that's a serious red flag.

~~~
Roboprog
"I ran the code through minify, but it's _bigger_ now?!?"

~~~
component
I don't understand.

Why would one submit minified code for a review?

A friend of mine used to work at a place where one of the coders actually used
'a', 'aa', 'aaa'...when naming variables.

~~~
Roboprog
I was joking. The example sort of looks like something that would fall out of
minify -- although 'a' - 'z' would have to be used/visible in a scope before
'aa' would be generated as a name.
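
The ordering described here can be sketched in a few lines of Python. This is only an illustration of the "single letters first, then doubles" idea; `short_names` is a made-up helper, and real minifiers differ in detail (they skip reserved words and may use other characters or orderings).

```python
from itertools import count, islice, product
from string import ascii_lowercase


def short_names():
    """Yield a, b, ..., z, aa, ab, ... shortest names first."""
    for length in count(1):
        # All names of the current length, in alphabetical order.
        for letters in product(ascii_lowercase, repeat=length):
            yield "".join(letters)


names = list(islice(short_names(), 28))
print(names[0], names[25], names[26], names[27])  # a z aa ab
```

So 'aa' only shows up once all 26 single-letter names are taken in the same scope.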

~~~
component
I should have clarified. The coder's _style_ of naming might look like it was
run through a minifier, but the coder _actually_ wrote 'a', 'aa', 'aaa'...

My guess is that the coder saw minified code and took it as a _standard_ or
something.

~~~
Roboprog
And that was the joke: that the original code "out-minified" minify - that
running the original gibberish through minify would NOT make it any smaller
:-)

------
gavinpc
Great writeup, full of sharp details.

In the 90's I worked on a FoxPro application, also for DoD. My boss, a retired
Navy Captain, used very terse variable names, partly because he was a hunt-
and-peck typist, and partly because earlier languages allowed only two-
character names.

But FoxPro allowed variable names of any length. However, it only _recognized_
the first ten characters. To my boss's chagrin, I often used names longer than
ten characters, risking collisions with other long names. That never happened.

But ten years after leaving, I was hired to port the product to Visual FoxPro,
which _does_ recognize the whole name. Some of the early commits were "Fixed
inconsistently-used long variable name..."

Of course, in those days we weren't using linters, or tests, or source
control, or reproducible builds... and yet still had a business. No wonder
Alan Kay calls computing "not quite a field."
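
A toy model of the prefix behavior described above, written in Python rather than FoxPro. The `Namespace` class and the variable names are hypothetical; the only detail taken from the comment is that the first ten characters of a name are significant.

```python
class Namespace:
    """Toy variable store; `significant` is how many leading characters count."""

    def __init__(self, significant=None):
        # significant=None means the whole name is significant.
        self.significant = significant
        self._values = {}

    def _key(self, name):
        # Everything past the significant prefix is ignored.
        return name[: self.significant]

    def set(self, name, value):
        self._values[self._key(name)] = value

    def get(self, name):
        return self._values[self._key(name)]


old = Namespace(significant=10)
old.set("customer_total", 100)     # stored under "customer_t"
old.set("customer_tax", 8)         # same ten-character prefix: silently overwrites
print(old.get("customer_total"))   # 8

new = Namespace()                  # whole name is significant
new.set("customer_total", 100)
new.set("customer_tax", 8)
print(new.get("customer_total"))   # 100
```

The same model shows why a port could surface old inconsistencies: in `old`, a misspelling past the tenth character such as "customer_totl" still finds the value, while in `new` it raises a `KeyError`.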

~~~
foota
Hearing that absolutely terrifies me. Silently allowing names to collide
in that way just sounds ridiculous to me.

~~~
Roboprog
Many early 90s C compilers did much the same thing, though the limit was
usually around 32 chars, rather than merely 10.

------
stevep98
I was recently implementing a geometry algorithm which I looked up on quora.
It was described using typical vector notation, using r,s . t,u. Since I
referenced the algorithm in the comments, I decided to use these same variable
names in my code.

I think this is the right choice, but my code reviewer didn't. But he didn't
click on the quora link.

Why is it okay for mathematicians to abbreviate things but not programmers? Is
it because they deal in more abstract entities where the name is irrelevant?

~~~
dang
In math, notations are designed to make statements about the problem domain
concise. Once you pass a certain degree of concision, longer names impede
readability rather than enhancing it. That is because the ability to take in
an entire complex expression or subexpression at a glance tells you things—and
lets you see patterns—that wouldn't be as apparent if longer names were used.
Programmers in the APL tradition understand this, but most programmers do not.
(Many refuse to believe it's possible when they hear about it!)

In software, programmers have grown accustomed to a notion of readability that
derives from large, complicated codebases where unless you have constantly
repeated reminders of what is going on at the lowest levels (i.e. long
descriptive names) there is no hope of understanding the program. In such a
system, long descriptive names are the breadcrumbs without which you would be
lost in the forest. But that is not true of all software; rather, it's an
artifact of the irregularity and complexity of most large systems. It's far
less true of concise programs that are regular and well-defined in their macro
structure.

In the latter kind of system, there's a different tradeoff: macro-readability
(the ability to take in complex expressions or subprograms at a glance)
becomes possible, and it turns out to be more valuable than micro-readability
(spelling out everything at the lowest levels with long names).

It also turns out that consistent naming conventions give you back most of
what you lose by trading away micro-readability, and consistent naming
conventions are possible in small, dense codebases. That of course is also how
math is written: without consistent naming conventions and symmetries
carefully chosen and enforced, mathematical writing would be less
intelligible.

Edit: The fact that readability without descriptive names is widely thought to
be impossible is probably because of how little progress we've made so far in
developing good notations, and tools for developing good notations, in
software. This may not be so hard to understand: it took many centuries to
develop the standard mathematical notations and good ways of inventing new
ones to suit new problems. Mathematics is the most advanced culture we have in
this respect, and in computing we're arguably still just beginning to retrace
those steps. If we wrote math the way we write software, mathematics as we
know it wouldn't be possible.

Edit 2: The best thing on this is Whitehead's astonishingly sophisticated 1911
piece on the importance of good notation:
http://introtologic.info/AboutLogicsite/whitehead%20Good%20Notation.html
If you read it and translate what he's saying to programming, you can glimpse
a form of software that would make what people today call "readable code" seem
as primitive as mathematics before the advent of decimal numbers seems to us.
The descriptive names that people today consider necessary for good code are
examples of what Whitehead calls "operations of thought"—laborious mental
operations that consume too much of our limited brainpower—which he contrasts
to good notations that "relieve the brain of unnecessary work".

Applying Whitehead's argument to software suggests that we'll need to let go
of descriptive names at the lowest levels in order to write more powerful
programs than we can write today. But that doesn't mean writing software like
we do now, only without descriptive names; it means developing better
notations that let us do without them. Such a breakthrough will probably come
from some weird margin, not from mainstream work in software, for the same
reason that commerce done in Roman numerals didn't produce decimal numbers.

~~~
pvg
_If you read it and translate what he's saying to programming, you can
glimpse a form of software that would make what people today call "readable
code" seem as primitive as mathematics before the advent of decimal numbers
seems to us._

This is an extraordinary (and enticing and often advocated) claim that has, so
far, failed to produce the extraordinary evidence. It says something that a
person as concerned with notation as Knuth used mathematical notation for the
analysis of algorithms and a primitive imperative machine language to describe
behaviour.

~~~
dang
I see no connection here to what I wrote, which has nothing to do with
functional vs. imperative programming. I'm talking about names and readability
in code.

Imperativeness is a separate matter. One can easily have it without
longDescriptiveNames, and although I don't have Knuth handy, I imagine he did.

~~~
abdulhaq
At first read the idea you propose is very attractive but I think you do need
to address why APL didn't take off. Perhaps they chose a poor vocabulary, are
there better ways to represent algorithms?

~~~
dang
I'm sorry I didn't reply to this during the conversation, but am traveling
this week. IMO the short answer is that questions like "why didn't APL take
off" presuppose an orderliness to history that doesn't really exist. Plenty of
historical factors (e.g. market dynamics) can intervene to prevent an idea
from taking off. Presumably if an idea is really superior it will be
rediscovered many times in multiple forms, and one of them will eventually
spark.

------
guelo
If the variables had been named timeOfTenPercentHeightOnTheFallingEdge and
timeOfTenPercentHeightOnTheRisingEdge it probably would have still been hard
to notice that they had been swapped in that one line.

~~~
DSMan195276
I think the catch is that the 'f' and 'r' keys are right next to each other,
so if you accidentally type one instead of the other then you'd get the other
variable by mistake.

That said, you raise a fair point - we don't know how this bug got there. If
it was a simple typo like above then the verbose names would have prevented
it. If it was a logic error by the programmer (For whatever reason), then
you're right that the name wouldn't matter because they typed the one they
intended, it just wasn't the right one to use.

~~~
adrianratnapala
Fat-finger is only one kind of error -- and one far less common than the good-
ole brain-spasm which can easily replace "Falling" with "Rising".

The ability to read the code and _see_ the error is more important. And on
that score, even though ttpfe is a _terrible_ name,
"timeOfTenPercentHeightOnTheFallingEdge" is even worse.

------
saghm
I definitely prefer more verbose names to abbreviated ones, but I'm not sure
that _never_ abbreviating a variable name is the right way to go either.
Surely there's a middle ground between `Ttpfe` and
`timeOfTenPercentHeightOnTheFallingEdge`?

~~~
pshc
tl;dr: use newtypes!

I like using static types to avoid these sorts of problems. Modern languages
like Swift, Rust and Haskell let you make zero-overhead type wrappers around
other types.

So here they could have defined `newtype RisingEdge(Float)` and `newtype
FallingEdge(Float)`, and then use those types in the function parameters as
appropriate.

Helps shorten function names to boot!
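
A rough Python analogue of the idea, as a sketch only. Python is not one of the languages named above, and `typing.NewType` is enforced only by a static checker such as mypy, not at runtime. The type names, `pulse_width`, and the units are hypothetical.

```python
from typing import NewType

# Distinct static types over the same underlying float (times assumed to be seconds).
RisingEdgeTime = NewType("RisingEdgeTime", float)
FallingEdgeTime = NewType("FallingEdgeTime", float)


def pulse_width(rise: RisingEdgeTime, fall: FallingEdgeTime) -> float:
    """Time between the rising-edge and falling-edge crossings."""
    return fall - rise


rise = RisingEdgeTime(1.25)
fall = FallingEdgeTime(3.75)

print(pulse_width(rise, fall))  # 2.5
# pulse_width(fall, rise)       # a static type checker reports this swapped call
```

At runtime the wrapped values are plain floats, so the check costs nothing, but it also only helps if a type checker is actually run over the code.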

~~~
Roboprog
I miss Pascal "type" block definitions (or C typedefs, at least) when I have
to work with too many Java generics.

I know, [Turbo] Pascal (etc) never had generics. But if they had, you would
have been able to make a type with a generic expression, rather than a
constantly repeated inline monstrosity of an expression.

------
adrianratnapala
One thing I often see elided in these naming wars is scope: am I the only
person who gives longer names to globally visible things than to locals?

For example:

    
    
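        #include <stddef.h>

        /* Offset of the first 'f' in source_text, or the string length if none. */
        size_t find_foo(const char *source_text)
        {
            const char *src = source_text;
            for (; *src != '\0' && *src != 'f'; src++) {
                /* ... do stuff */
            }
            /* ... do more stuff */
    
            return (size_t)(src - source_text);
        }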
    

Now the meaning of "source_text" would not be evident, except for the name.
But just glancing at the usage shows that "src" is clearly a working cursor
into the source text.

But if I called it "working_cursor" would that really explain anything to the
reader? If anything, giving a detailed name risks _misleading_ readers in much
the same way as stale comments can mislead.

------
twblalock
The problem was not that the variable was abbreviated. The problem was that
the abbreviated variable was so similar to another abbreviated variable that
was used for a similar purpose.
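
To make "so similar" concrete, here is a small Python sketch that flags pairs of names one substituted character apart. The helper names are made up, and this is a crude similarity measure, not something from the article. It would also flag every pair of single-letter names, so it is only a rough indicator.

```python
from itertools import combinations


def differs_by_one_char(a, b):
    """True when two equal-length names differ in exactly one position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


def confusable_pairs(names):
    """Return pairs of names that are one substituted character apart."""
    return [
        (a, b)
        for a, b in combinations(sorted(names), 2)
        if differs_by_one_char(a, b)
    ]


print(confusable_pairs(["ttpfe", "ttpre", "result"]))  # [('ttpfe', 'ttpre')]
```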

~~~
bryanrasmussen
Sure, but if the variable was not abbreviated the chance of collision would
have been less.

~~~
brandonbloom
Even with a 200 character variable name, the chance of collision is 100% if it
was auto-completed or copy-pasted.

~~~
philosoft
At a minimum, readable variables would have simplified identifying the issue
after it was introduced.

------
syphilis2
I say this tongue in cheek: With current IDEs having such great
autocompletion, has anyone experimented with coding far outside of the ASCII
character set? Programming language restrictions aside, I recognize the
obvious troubles this would cause. And yet I do a lot of work on simulations
where math formulas are converted to code, which means lots of compounded
Greek names like "omegaSquared" or "epsilonMinus". Naming decisions become
more challenging as subscripts and superscripts are added, let alone matrix
indices. At some point perhaps the symbolic name should be replaced with the
descriptive name, such as "first_eccentricity_flat_to_fourth". But it sure
would be nice to have access to something with such brevity.

~~~
new_hackers
I see some new emojis on the horizon!

------
buzzybee
I self-tested variable name lengths on my own code.

Three letters is enough to avoid most collisions. Words do not make sense yet.
At four letters most words become decipherable given an appropriate encoding.
At five letters a two word phrase may make sense.

I make a rough decision based on variable scope - shorter lifetime means
shorter variable name, but I rarely go with just one letter as it reduces
uniqueness.

If I need to use a really long phrase frequently I take a mathematician's
approach and alias it to an abstract and highly unique symbol. The phrase may
still exist in the addressed data structure, I just avoid it within the
algorithm. Mathy code also has a tendency to encourage numbered variables,
e.g. "x0, x1, x2".
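
The aliasing approach might look like the following Python sketch. The record layout, the key names, `edge_gap`, and the choice of `q` and `k` as symbols are all assumptions made for illustration.

```python
def edge_gap(measurement):
    """Falling-edge time minus rising-edge time for one measurement record."""
    # Alias each long phrase once, with a description, then work with the symbols.
    q = measurement["time_of_ten_percent_height_on_the_falling_edge"]  # falling-edge time
    k = measurement["time_of_ten_percent_height_on_the_rising_edge"]   # rising-edge time
    return q - k


record = {
    "time_of_ten_percent_height_on_the_rising_edge": 2.0,
    "time_of_ten_percent_height_on_the_falling_edge": 5.5,
}
print(edge_gap(record))  # 3.5
```

The long phrase still lives in the data structure; only the algorithm body uses the short symbols.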

~~~
rileymat2
Did you self test on code from years ago or was it all fresh in your mind?

~~~
buzzybee
I had approximately a month between when I wrote it and reread it. While it
was serviceable in short-term memory, comprehension suffered noticeably in the
long term with names of four letters or fewer.

At the time I did this I was pursuing Arthur Whitney's extremely terse style
to see what advantages and disadvantages it brings. My main modification was
to add a full description comment inlined into the declaration of every
variable, a practice which I have mostly kept up with even after the
experiment ended, e.g.:

    
    
        var tz : String; /* time zone */
    

This meant that I was not testing on whether I could reconstruct the entire
meaning in my head, just whether it made code flow better to use very short
names. My discovery was that it does help, up to a point, because you can
always create original short names that differentiate better than long
abbreviations:

    
    
        result = ttpfe + ttpre;
    

vs.

    
    
        result = q + k;
    

Any time you copy-paste-modify, you create a risk of error. Including the
whole phrase slows down your reading and brings diminishing returns on
comprehension as it puts friction on actually working with the variable. With
the highly differentiated "q" and "k", error has a lower likelihood of
slipping in than the abbreviation because you've minimized the amount of noise
in the data - it "chunks" better and you can read the whole algorithm more
fluidly. The only problem with using such short names is the issue of
reconstruction, which is why I opt for aliasing a long name if I need to work
with it frequently, even if it means extra boilerplate to make a local
variable. Abbreviation is the worst of both worlds and so I try not to do so
much of it.

Given the original 8-character constraint of OP I might have tried something
along the lines of:

"edge_t0" "fall0" "ef"

Because there is no way to convey the entire phrase, I see it as preferable to
focus on a key aspect, expand that word, and push everything else into more
symbolic content, especially numbers. Programmers are already trained to look
carefully for uses of 0 vs. 1 in our code. That doesn't mean it's my go-to for
an abstract symbol, but it works better than a bad abbreviation.

------
tehchromic
It's the quality of engaging posts like this one that draws me here on a
Friday night.

~~~
bhollan
Why thank you, sir.

------
foota
Anyone else feel a bit uncomfortable reading the level of detail in this post?

~~~
bhollan
I tried to keep it as high-level as I could. What would you change?

~~~
foota
Honestly, I don't have any experience working with anything classified, so your
judgement is likely oodles better than mine. Great story :)

------
ams6110
Heh. When I first learned to code as a kid, a variable was max two characters
(the first of which had to be a letter A-Z and the other could be a letter or
number).

~~~
stevep98
Ha, you're right. Commodore BASIC, probably? (written by Microsoft, I think.)

I do remember at some point having access to a language with fewer restrictions
on variable naming, and thinking, "wow, that's so amazing"

~~~
Roboprog
Or Apple "integer" BASIC. Probably a few other dialects, as well.

------
ericssmith
I would suggest that the need for Englishy variable names is due to a weakness
in programming languages and possibly the programming model itself. Why should
a set of legitimate values for a computation benefit from how you refer to
that set? Can that variable take on undesired values? Do you rely on that name
and its comprehensibility to distinguish good from bad values? I sometimes
find it hard to believe we still program this way.

~~~
correnos
We don't _have_ to still program this way - you can write code with very
strict types, with machine-checked proofs that it works correctly, etc, etc.
We don't do this very often because it turns out this level of rigor is
incredibly time-intensive.

------
TheDong
May I recommend checking out Rob Pike's document for a counterpoint:
http://doc.cat-v.org/bell_labs/pikestyle

I think that his variable names section is utterly ridiculous, but it's a
relevant read from a relatively prolific person, so worth sharing.

~~~
Roboprog
Thank you for that. That was a good little read.

While many of the Bell Labs guys didn't much like Pascal, Turbo Pascal managed
to address almost all of the complaints I've ever seen, while preserving the
good parts of Pascal (or Modula???).

Java must look irredeemable to the Bell Labs folks, though. Perhaps it is:
UglyNames; limited structured constant literals; still too clunky lambdas for
callbacks.

------
sseagull
This is common in the sciences as well, since senior professors also had the
8-character limit (from Fortran [0]). Functions are named this way as well
(see LAPACK/BLAS).

Some people also hate typing out slightly longer variable names and what not.
I try to emphasize that a section of code will tend to be read more times than
it is written, and therefore readability is more important. It's a frustrating
battle sometimes, though.

[0] Exacerbating the problem of using the wrong variable is the fact that much
existing code uses implicit types...

------
mchahn
> The term was supposed to be “Ttpfe”, but he had mistakenly called it
> “Ttpre”,

The same cognitive mistake could have happened with it spelled out. Two
concepts closely related can easily be confused.

------
iblaine
TL;DR: Abbreviated variables are not always intuitive to others. I tend to
agree. If you're going to use a pattern or abbreviation then be as non-
creative as possible.

------
hartror
Is the issue that they were abbreviations or that they were named similarly?

~~~
molteanu
Similarly. The term was supposed to be “Ttpfe”, but he had mistakenly called
it “Ttpre”

------
foota
I have the... pleasure of working with a legacy Oracle database where table
names are restricted by convention to 8 characters.

------
D-Coder
Typing is easy. Thinking is hard!

------
hoodoof
let thisIsCertainlyAnApproachThatIAdvocate = true

------
bikamonki
Pwd stands for: ________?

Correct, you guessed it!

Once an abbreviation or acronym becomes _widely used_ it makes sense to use it
as a variable name for as long as you live.

~~~
hoodoof
This comment really deeply gets to the heart of the issue.

Programmer X thinks "I'll abbreviate because MAN IT IS SO OBVIOUS what the
abbreviation is, what sort of idiot would think anything else?".

You have nothing to lose by typing:

let password = ''

as opposed to

let pwd = ''

~~~
Roboprog
It depends on how many characters are significant to the compiler/interpreter.
This was a real issue in the 80s, as laughable as it sounds now.

------
SimeVidas
That’s what linters are for :-P

~~~
bhollan
How would a linter catch the presence of a variable that exists?

------
cbd1984
I think the usual rule is that the size of a variable's name should be a
function of the size of the scope in which it's visible: If it's a global, it
should have a long, descriptive name, perhaps with_underscores or CamelCase or
similar. If it's a class member, abbreviate it some. Function locals get even
shorter names, and loop indices or temporary variables only used in one part
of the function can be single-character.
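
A small Python sketch of that rule, with name length shrinking as scope shrinks. The `Pulse` class, the threshold constant, and the sample data are hypothetical and chosen only to show the naming gradient.

```python
# Module-level constant: visible everywhere, so it gets the long descriptive name.
TEN_PERCENT_HEIGHT_THRESHOLD = 0.10


class Pulse:
    def __init__(self, samples):
        # Class member: shorter, because the class itself supplies context.
        self.samples = samples

    def first_crossing(self):
        """Index of the first sample at or above 10% of the peak, or None."""
        peak = max(self.samples)                  # function locals: shorter still
        limit = peak * TEN_PERCENT_HEIGHT_THRESHOLD
        for i, s in enumerate(self.samples):      # loop variables: single characters
            if s >= limit:
                return i
        return None


print(Pulse([0.0, 0.05, 0.2, 1.0, 0.3]).first_crossing())  # 2
```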

