
Clarity over brevity in variable and method names - llambda
http://37signals.com/svn/posts/3250-clarity-over-brevity-in-variable-and-method-names
======
Silhouette
I usually find that the wider the scope of a variable or function, the longer
and more descriptive the name I want to use. For names that are only used
locally, there's more context available, and the name can be shorter to keep
the code more concise and easily scannable.

For example, I might call a function in a geometry library gradient_of_line,
but within a line module in that library, I'd probably just call it gradient.

In mathematical code where there is a common notation that happens to be very
concise, I'm quite happy writing code that uses it, like

    
    
        y = m * x + c,
    

as long as the terminology/notation is standardised or well understood. I
don't think writing

    
    
        dependent_variable = gradient * independent_variable + crossing_point
    

is an improvement in this sort of case.

By a similar argument, I have no problem with giving abstract loop control
variables like counters a short name like i or n, since they are only used
within a small area of the code. However, if the variable represents some more
meaningful concept, then I would probably name it accordingly.
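
A rough sketch of that scoping idea in Ruby. The Geometry module, the Line class and the y_at method are hypothetical names invented for illustration, and vertical lines are deliberately not handled:

```ruby
# Hypothetical geometry library; every name here is an assumption.
module Geometry
  # Wide scope: callers see this name far from any context,
  # so it spells out what the gradient belongs to.
  def self.gradient_of_line(line)
    line.gradient
  end

  class Line
    def initialize(x1, y1, x2, y2)
      @x1, @y1, @x2, @y2 = x1, y1, x2, y2
    end

    # Narrow scope: inside Line, "gradient" needs no qualifier.
    def gradient
      (@y2 - @y1).to_f / (@x2 - @x1)
    end

    # Standard notation (y = m * x + c) kept as-is for local names.
    def y_at(x)
      m = gradient
      c = @y1 - m * @x1
      m * x + c
    end
  end
end

line = Geometry::Line.new(0, 1, 2, 5)
puts Geometry.gradient_of_line(line)  # prints 2.0
puts line.y_at(3)                     # prints 7.0
```

The short names m and c live for three lines, so the familiar notation is enough; the library-level function carries the longer name.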

~~~
pjungwir
I first read this principle in some Java coding guidelines published by Sun,
and I think it's great advice. Anything public probably deserves a longer
name; keeping local variables short promotes readability.

------
crazygringo
Of course, naming is one of the "Two Hard Things".

It's absolutely an art, not a science. But it shouldn't go to either extreme
-- names that are too short (var xi_a) are too confusing for clarity, but
names that are too long provide too much visual noise to be clear.

In my opinion, "make_person_an_outside_subscriber_if_all_accesses_revoked"
just takes too long to read. Much better would be something like this:

    
    
      # If all accesses have been revoked, then make the person an outside subscriber
    
      def updateOutsideSubscriber
        ...
      end
    

The function name is short and gets straight to the point of what it does.
Someone coming across the code for the first time can instantly figure out
roughly what the function is, and then decide if it's worth their time to read
further, or if they should just skip over it.

Then, a comment explains in further detail what it is for, and why it is used.
You only need to read it if it seems important to your task at hand.

Good variable naming strategy goes hand-in-hand with good commenting strategy.

The end goal should always be maximum clarity, but remember that too much
information, or too much information presented upfront, can _reduce_ clarity.

37signals says "clarity over brevity". I'd say "clarity over extremes".

~~~
greggman
What do you think about Joel's "Make Wrong Code Look Wrong"
(<http://www.joelonsoftware.com/articles/Wrong.html>)?

In it he makes the argument that it's a good goal to make each line of code
understandable with as little other context as possible.

Your `makeOutsideSubscriber` is fine when you're in that module. But consider
when someone else is using the module:

    
    
       import subscribe
    
       ...
    
       def SomeFunction
         subscribe.makeOutsideSubscriber(...)
    

And then, 3 months later, when someone comes and reads the code above using
your module, it sure would be nice if they didn't have to go dig through the
source of your module and read the comments to figure out that it really means
"make person an outside subscriber if all accesses revoked".

~~~
crazygringo
It's a good point. (And BTW I edited my comment to be
"updateOutsideSubscriber" when I realized that was what it was really doing.)

It seems to me that
"make_person_an_outside_subscriber_if_all_accesses_revoked" just sounds like
an internal module function. If something outside the module is calling it, it
just smells like the whole thing is organized wrong.

But sure, if the function is being used in a "global" way like that, then the
full, long version is probably better...

------
scottlilly
I once wrote a program that stored complex rules in a database (what was
required for different types of legal documents, in thousands of different
jurisdictions across the country). The table names I used made the relations
obvious, but they were long.

A few years after I left that job, I briefly ended up back at that company on
a contract assignment - for a different project, under a new development
manager. At one point, I had to pull some data from the database, and had no
problem finding the information I needed, or how to connect it together.

The dev manager joked about the long table names (not knowing I was the source
of them). But when I asked him if any of his current developers had trouble
finding the information they needed, or understanding the relations, he said,
"No, they just get tired of typing them."

That's just one anecdote, but I'm going to stick with long, descriptive names.

~~~
hackinthebochs
Agreed. Short names seem like they're optimizing for typing speed, rather than
ease of readability. If it's true that code will be read many times more than
it is written, then descriptive variable names are the answer.

~~~
Evbn
DescriptiveVariableNameUser thinks DescriptiveVariableName is great.
DescriptiveVariableName constantly reminds DescriptiveVariableNameUser of the
DescriptiveVariableNameMeaning of DescriptiveVariableName, so
DescriptiveVariableNameUser never forgets DescriptiveVariableNameMeaning.

The only way DescriptiveVariableName could be made more meaningful is if
DescriptiveVariableNameUser declares DescriptiveVariableName's type
(DescriptiveVariable) every time DescriptiveVariableNameUser accesses
(DescriptiveVariable) DescriptiveVariableName.

~~~
hackinthebochs
Of course, knowing when to use an extremely descriptive name is half of the
battle. Having a good aesthetic sense as a programmer is crucial. As a general
rule, if I can see the entire lifetime of a variable in my field of view, I'll
go for short/terse variable names (loop control, temp values, etc). Or if the
method name is descriptive, and the code in the method is fairly short, I
won't need to repeat myself in the variable names. Good sense is key.

Also, the names used in the article show a very poor aesthetic sense. Names
like that are screaming out as needing a refactor. Having a condition in your
method name is a code smell. Your methods (read: api) should be abstracted as
discrete actions, and the code that calls them should perform control flow.
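
To make that concrete, here is a minimal plain-Ruby sketch. Person, all_accesses_revoked?, make_outside_subscriber and revoke_access are hypothetical stand-ins, not the article's actual model; the point is only that the condition moves out of the method name and into the caller:

```ruby
# Plain-Ruby stand-in for the article's model; all names are assumptions.
class Person
  attr_reader :accesses, :outside_subscriber

  def initialize(accesses)
    @accesses = accesses
    @outside_subscriber = false
  end

  # A query with no side effects.
  def all_accesses_revoked?
    accesses.empty?
  end

  # A discrete action with no hidden condition.
  def make_outside_subscriber
    @outside_subscriber = true
  end
end

# The caller owns the control flow.
def revoke_access(person, access)
  person.accesses.delete(access)
  person.make_outside_subscriber if person.all_accesses_revoked?
end

person = Person.new([:project_a])
revoke_access(person, :project_a)
puts person.outside_subscriber  # prints true
```

Both method names stay short, and the call site still reads as the full sentence the long name was trying to encode.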

------
j_baker
This is a strawman. When have you ever heard someone say "Let's choose short
variable names! Screw readability!" ?

I don't buy that short variable names are necessarily less clear. I think the
ideal is to find variable names that are terse _and_ clear. And I'm strongly
in the camp that believes that brevity _aids_ clarity.

~~~
crazygringo
Oh, man. I've heard it all the time.

It's not so much "screw readability" as "long names are ugly and annoying to
type, and if you can't figure out what 'i' and 'x' and 'dx' and 'j' mean in
this context, then you lack intelligence and that's your problem, because it's
_obvious_, duh. _Real_ programmers are efficient, not verbose."

I've worked with quite a number of programmers who seem to have a very hard
time putting themselves in the shoes of someone who will have to read/maintain
their code later. Good variable names are all about communication, and there
are certainly programmers out there who don't have communication as one of
their strong suits.

~~~
debacle
> I've worked with quite a number of programmers who seem to have a very hard
> time putting themselves in the shoes of someone who will have to
> read/maintain their code later.

We all do that. How many times have you looked at something and gone "What
asshole wrote this?", only to find out from git that it was _you_?

~~~
crazygringo
Ha! Yeah... way more often than I'd like to admit.

------
dustingetz
there are really smart people in the camp of long variable names, and there
are really smart people in the camp of terse variable names. both sets are
optimizing towards whatever subjective definition of "good code" they have,
where, again, their definition is informed and intelligent even if it differs.

if someone really smart says they use terse variable names, you don't tell
them they're wrong, you try to understand them. There exist styles of writing
high quality code that don't need naming style like
`someone_else_just_finished_writing?`. there are styles that use it. if you're
only accustomed to the long way, you'd be well served to go join a successful
team that uses the short way, instead of blindly saying that they're wrong.
some of the clearest code i've ever seen was Haskell, for which i think most
people find you don't need long names.

me, when i find myself needing long names, i try to refactor until i don't
need them any more. it's not always possible given time constraints, backwards
compat, trying not to change fragile code etc, but i mostly chalk that up to a
personal or team failing.

~~~
wtetzner
> there are really smart people in the camp of long variable names, and there
> are really smart people in the camp of terse variable names.

Like the Scheme and Haskell communities, respectively.

~~~
MBlume
Haskellers don't so much want to keep the variable names terse as avoid naming
the variable at all.

~~~
wtetzner
While that's true, there's a tendency to use short names in pattern matching,
since using long names actually makes it harder to read the pattern.

------
Stratoscope
When you use it right, brevity _is_ clarity. I'm all for descriptive names,
but the elaborate wording in those method names is like writing "utilize" when
you could have said "use".

Utilize, upward, starting at, finished writing

vs.

Use, up, from, wrote

So:

    
    
      def shift_records_upward_starting_at(position)
    

could be:

    
    
      def shift_records_up_from(position)
    

and:

    
    
      def someone_else_just_finished_writing?(document)
    

could be:

    
    
      def other_user_just_wrote?(document)
    

I don't think the shorter versions are any less clear.

~~~
nonrecursive
I really like your revisions. While I agree with the spirit of the 37s post,
I've found that megalong method names like the ones they listed are hard to
reason about simply because they take up more human memory, thus making it
harder to hold other objects in memory at the same time in order to compare.

------
typicalrunt
I agree with David's sentiment and I have been doing this for years... except
that I have been consistently told by other developers that I am doing the
wrong thing. One of the issues in our industry is that there are no right
answers, everyone seems to be correct and the biggest, loudest person in the
room wins the debate. Obviously, moderation is key for anything, but it'd be
nice to have a good, clear set of rules to live by (that won't change in 10
years).

For me, clear and concise method names have always helped me understand code
that I'm reading, as well as understand a stack trace.

~~~
recursive
> that won't change in 10 years

I'd settle for six months.

~~~
typicalrunt
Heh. Yeah I wanted to give a long enough timespan, but you are absolutely
correct, even 6 months would be nice.

<rant class='mini'> When I talk with friends in other industries (e.g.: civil
engineering), they have a very specific way of doing things. These fundamental
activities do not change very often because they're based on decades of
experience (and, I assume, because bugs in their process could kill people).
That doesn't mean things don't change: concrete mixtures and such are always
improving and sometimes it sounds like the IT industry with all the new tech
coming out, but that's just materials technology, not how specs and processes
are written. </rant>

------
Dove
The preference probably depends a lot on your background. If you're spending a
lot of time stitching together APIs, you probably prefer longer names for
clarity.

If you spend a lot of time writing math, the opposite is true: shorter
variable and function names are better for clarity, to the point that
arbitrary operator overloading is essential -- as anyone who's suffered
through
matrix.transpose().multiply(matrix2.inverse()).multiply(vector.multiply(scalar))
. . . would tell you!
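
For comparison, a small sketch of the same expression with operator overloading in Ruby. Vec2 and Mat2 are toy 2x2 types written only for this illustration (not a real library), and the input values are arbitrary:

```ruby
# Toy 2D types for illustration only; not a real linear algebra library.
Vec2 = Struct.new(:x, :y) do
  # vector * scalar
  def *(scalar)
    Vec2.new(x * scalar, y * scalar)
  end
end

# Rows are [a b] and [c d].
Mat2 = Struct.new(:a, :b, :c, :d) do
  def transpose
    Mat2.new(a, c, b, d)
  end

  def inverse
    det = (a * d - b * c).to_r  # Rational keeps the arithmetic exact
    raise ZeroDivisionError, "singular matrix" if det.zero?
    Mat2.new(d / det, -b / det, -c / det, a / det)
  end

  # matrix * matrix or matrix * vector
  def *(other)
    case other
    when Mat2
      Mat2.new(a * other.a + b * other.c, a * other.b + b * other.d,
               c * other.a + d * other.c, c * other.b + d * other.d)
    when Vec2
      Vec2.new(a * other.x + b * other.y, c * other.x + d * other.y)
    else
      raise TypeError, "cannot multiply Mat2 by #{other.class}"
    end
  end
end

matrix  = Mat2.new(1, 2, 3, 4)
matrix2 = Mat2.new(2, 0, 0, 2)
vector  = Vec2.new(1, 1)
scalar  = 2

# Same computation as the chained multiply() calls, in operator form.
result = matrix.transpose * matrix2.inverse * (vector * scalar)
```

With these inputs result compares equal to Vec2.new(4, 6), and the expression reads much closer to the math it implements.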

------
davvid
This is flame war territory so I'll try and avoid fanning any flames. One
advantage to sticking to short lines (80 chars) is that you can place several
source files side by side and read them.

You can also print them. My co-workers sometimes print code. Please don't
laugh, some people do not have good vision (and one day, you will be that
person too).

------
tomx
I always feel articles such as these should be prefixed with "within my humble
experience and within my application area".

"Comments are generally only needed when you failed to be clear enough in
naming. Treat them as a code smell." ...but only in simpler applications, such
as self-describing CRUD type web applications?

I think I agree with the quote for some web apps; however, remember that some
code may take weeks/months to appreciate the complexities of. For example, a
TCP stack is a complicated thing, born and refined through much research for
several decades. Notes of various design choices and optimisations need to be
documented, and long comments are sometimes the most convenient way to do so.

~~~
ballooney
Agreed. I recently had to write a lot of digital signal processing including
some things like unscented Kalman filters and other things you need a bit of
stats background to digest. You don't just dump a load of linear algebra down
and hope the poor maintainer can keep up! It's beneficial to explain what is
going on. Just as a mathematics professor will talk while writing on the
blackboard in a lecture, rather than just writing it in silence and
arrogantly declaring it is complete and self documenting and walking out the
door.

------
nicholassmith
I always find short variable/function/class names are often developers
optimising for themselves right at that point in time, but when you get two
years down the line and the developer has disappeared into the sunset you're
stuck with it.

I think since I started programming professionally I've stuck to expressive
names (not necessarily long names), mostly for code self-documentation. It's
nice, most editors/IDEs tend to have name completion, even if it's only
document bound, so you only need to type it in full once.

------
thehotelambush
With the risk of RSI, I don't need to type any more than I already do. In a
perfect world tab completion would exist everywhere and you could make
arbitrarily long variable names. But do names like
"make_person_an_outside_subscriber_if_all_accesses_revoked" really make your
code more readable? There's only so much info you can put in a variable name,
and I'd rather be able to see more code at one time on the screen than read a
massively long name every few lines.

~~~
thehotelambush
Not to mention the increase in difficulty of _remembering_ the variable names.

------
BoppreH
I don't like long names because it's much harder to differentiate the
following:

    
    
      someone_else_just_finished_writing?(document)
    
      someone_else.just_finished_writing?(document)
    

Especially when using underscores, it's too easy to miss a period or a
subtraction.

And when you have several lines using such long names, it becomes a wall of
text and you have to read everything out loud to understand what's happening.

------
wethesheeple
I subscribe to the Church of Less Than 40 Character Lines, for no other reason
than it's easier to read. Doesn't matter if it's code, poetry or prose.

My eyes never need to drift to the right side of the page/screen. Everything
is justified at the left. Keep indentation to a minimum. Alas, this is not a
popular idea when people write code.

Books, newspapers and magazines often adopt multi-column formats. Why? Does
anyone know?

Columns also allow room for comments. Columned class notes are a great
example. You can even leave the whole right side of the page clear for adding
comments later.

Unconventional perhaps. But very useful.

Anyway, names are arbitrary. They are inherently ambiguous. The truth is that
computers work via numbers, not names. As such, naming will always be a
subjective affair, to some extent.

------
dctoedt
Here's an ignorant technical question (I'm not a developer): _Can variable-
and method names still be discerned from publicly-distributed executable code
by reverse-compiling it?_ It used to be that way years ago, but I have no idea
if it's still true. If so, verbose names would make it that much easier for a
competitor to reverse-engineer your distributed executable code. From that
perspective, comments might be preferable to verbosity, because comments would
get stripped out in compilation.

(Of course, this assumes that you distribute your executable instead of
keeping it inaccessible to others, for example by running it on a Web server.)

~~~
MichaelGG
Generally, virtual machine or interpreted languages have such symbol
information available; native compilers tend not to include such things in the
binary.

It is not that big of a deal though. On platforms like Java or .NET, you can
run an obfuscator to strip out names and leave confusing looking strings.

But it only increases reverse engineering complication very slightly. If
you're trying to protect a small secret (like a key validation or special
algorithm), you're out of luck, even if you strip symbol names. If you're
trying to prevent wholesale ripoff of your app, legal action is more
effective.

Making your code "worse" is a terrible tradeoff for approximately zero gain.

~~~
Evbn
In fact, shipping symbol names makes it easier to prove copyright violation.

------
aristidb
This is an example of insufficient abstraction, not clarity:
make_person_an_outside_subscriber_if_all_accesses_revoked

------
zerostar07
If you have to resort to _that_ long method names it means you're doing
something very specific (which means the method body will be very small); it's
more flexible and clearer to just paste the method body (case in point the
examples given)

~~~
msbarnett
By inlining that logic at all call-sites, you're abandoning encapsulating the
logic for "updating user records to mark them as outside subscribers if all
accesses are revoked" in a Single Point of Truth.

This creates at least two problems:

1) You need to remember to update every inlined use of that logic at bug
fixing time. Experience says you'll inevitably miss at least one, the first
time around.

2) You've created a Multiple Points of "Truth" maintenance burden. When you
hand the code off to somebody else for maintenance, and there's some bug
around "updating user records to mark them as outside subscribers if all
accesses are revoked", they have to figure out whether or not the fact it's
being done different ways in different locations is intentional or accidental,
and if accidental, which way is in fact correct. This is always an enormous
pain in the ass.

------
devgutt
This can improve clarity, but it's ugly as hell IMO

    
    
       def make_person_an_outside_subscriber_if_all_accesses_revoked
           person.update_attribute(:outside_subscriber, true) if person.reload.accesses.blank?
       end
    

I prefer camel case

------
Xcelerate
This is one of the reasons I like functional programming. Each function has
one explicit task, so variable names can be short while also being very
descriptive.

