
Is naming things really that hard? - jpswade
http://wade.be/development/2017/03/03/naming-things.html
======
fragmede
I've written code numerous times to deal with files, and if there's anything
more than feeding the path as given into open(), then the code inevitably
gets filled with variables of related names, and if you stare at it long
enough, they all blur together. 'filename', 'name', 'file', 'files',
'file_handle', 'path', 'dir', 'dirname', 'parent_dir', 'file_ext',
'basename'...

Do I want 'filename', which is whatever was given? Do I want 'file', which is
the file object? Do I want 'name', which is just the name of the file, with
directories, if any, stripped off? Do I want 'files', which is the list of
files, as given?

If copying/moving is going on, then there's a source and a destination and it
gets even worse.

The obvious way out is to refactor by renaming those variables, but
'unsafe_filename_as_given_by_user' or 'list_of_files_as_given' doesn't quite
roll off the keyboard.
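
A minimal sketch of one way to keep these apart in Python, using the standard
`pathlib` module. The variable names and the sample path are illustrative
assumptions, not a convention anyone in this thread prescribes:

```python
from pathlib import PurePosixPath

# Hypothetical input: the path exactly as the user supplied it (untrusted).
path_as_given = "reports/2017/summary.tar.gz"

path = PurePosixPath(path_as_given)  # parsed path; no filesystem access
parent_dir = path.parent             # directory portion of the path
basename = path.name                 # final component, directories stripped
file_ext = path.suffix               # last extension only
stem = path.stem                     # basename without the last extension

print(parent_dir, basename, file_ext, stem)
```

Each name maps to exactly one `pathlib` attribute, so 'name' versus 'filename'
stops being a judgment call at every use site.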

~~~
nine_k
One word: autocompletion.

You only need to type unsafe_file_name_from_user once. After that, even the
simplest, language-agnostic autocompletion of identifiers from the current
buffer goes a long way. A proper IDE makes it even simpler.

~~~
mikekchar
I have found that when I start reaching for long variable names it's because
there is a design problem. Ideally the use of the variable should be obvious
from the context. As you find yourself adding more and more context to the
name, you should realise that it's because you lack context in your design.

~~~
randallsquared
Deriving meaning from the context requires understanding the context. That's
fine for the original author or the refactoring dev, but the troubleshooter
swooping in to fix that problem that's obvious in the UI needs to be able to
understand the variable names with as little context as possible. "Well, if
you'd just spent a few hours to truly grok the intent of the code, ..." is not
an acceptable answer when it's possible to make the solution transparent as
easily as making more descriptive names.

~~~
barrkel
If you're in the middle of the spreadsheet reader, it should be self-evident
that 'row' means a row in the spreadsheet; if you're in the middle of the
database insertion logic, it should be self-evident that 'row' means a row in
the database.

I simply don't accept your priors.

~~~
randallsquared
I use 'row' as a variable name quite a lot, especially in loopy situations
where it's obvious. The more problematic times are when you have things like a
source and a destination row dealing with, say, cows, and you call one of them
'row' and the other one 'cow_row', and in different places pick which kind
of row is which at apparent random.

Or the situation with the file names, as someone else mentioned at more
length.
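
A small Python sketch of the alternative, where both rows carry their role in
the name. The function, the field names, and the sample data are hypothetical
and only there to illustrate the point:

```python
def copy_cow_rows(source_rows, destination_rows):
    """Append a copy of every source row to destination_rows."""
    for source_row in source_rows:
        # Copy the dict so later edits to the destination leave the source alone.
        destination_row = dict(source_row)
        destination_rows.append(destination_row)
    return destination_rows


source_rows = [{"name": "Daisy", "weight_kg": 540}]
destination_rows = copy_cow_rows(source_rows, [])
print(destination_rows)
```

With 'source_row' and 'destination_row' there is never a bare 'row' whose
meaning has to be guessed from the surrounding lines.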

------
rkrzr
In my experience there are a couple of reasons that lead to bad names in
programs:

- it takes effort to think of a good name, and that effort is more of the
linguistically creative type than the logical reasoning type which makes up
most of the rest of programming

- it takes domain expertise to pick a good name

- if you stare too long at the same piece of code it becomes harder to see it
through the eyes of someone who has never seen that code before

- almost everybody uses English in their programs, but many people are not
native speakers

But I think making an actual effort to pick a good name already goes a long
way. It forces you to think about the problem at hand and how you could
communicate that to somebody else.

------
stirner
<pre> tags are not meant for block quotes. HTML has a <blockquote> tag
instead. <pre> tags force mobile users to read in snippets as they scroll
horizontally along the entire line. If you want your block quotes to use a
monospace font, you can use CSS, and then you don't have to give up on
semantic HTML and line wrapping.

~~~
jpswade
Markdown fixed.

------
tchaffee
Good naming is hard because you are using the wrong tool. Names in code, and
even in the business world, are usually abbreviations for much more complex
things. You should not be looking for _too_ much meaning from the name itself.
Most words have many meanings, so trying to string together some verbs and
nouns to describe complex behavior will almost always fail. A BDD test suite
is far more suited to tell you what the abbreviated name really stands for
than trying to deduce it from the name itself. One exception to this could be
programming idioms, for example using 'i' as a counter in a loop which anyone
with experience recognizes without even thinking. To many programmers, 'i' has
become a word with quite a complex meaning itself: a) I'm a number b) I will
be incremented c) I will be used in a loop d) I will be tested against to end
the loop. Try naming that accurately without ending up with a really long
variable name.

~~~
jstimpfle
Programming mostly in Python and some C these days, for me names are the most
important place to put meaning in.

The trick is being systematic. And consistent in the small. Short names and
(at first) non-telling names are not a problem. It makes no sense to use
(predominantly) very long descriptive names, not because they are harder to
type, but because a long name is an indication that it's not an established
concept, not a recurring theme across the code base.

So my habit has become to just choose a one-word name for this vague but
probably important concept I have, even if it's not the best name. I put a
comment on the datatype declaration, explaining in detail the meaning of that
thing. Maybe later I will come up with a better name, so I will just switch to
that name. Or the comment gets improved. It's about growing clear concepts.
After some time that single-word name will be a natural thing to use for that
concept.
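
A rough Python illustration of that habit: a short one-word name, with the
meaning spelled out in a comment on the declaration. The type 'Spec' and what
it stands for are made up for this sketch:

```python
from dataclasses import dataclass


# Spec: which columns a report must contain, and in what order.
# (Hypothetical concept; the comment carries the detail so the name can stay
# short, and the comment or the name can be improved later.)
@dataclass(frozen=True)
class Spec:
    columns: tuple


spec = Spec(columns=("date", "account", "amount"))
print(spec.columns)
```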

~~~
natch
Fix-it-later leads to shoddy code though.

If you can have a clear understanding of the important concept (instead of a
vague understanding) that can be reflected in a good name.

Good name meaning one that is descriptive, unambiguous, and reveals a crisp,
not vague, concept.

Then with that good name, when you write the code the first time, the clarity
of the concept will help you avoid logic pitfalls and messy constructs.

Not to say comments are always bad, but probably a comment won't be needed,
because the name itself will reveal the intent.

Then later, when someone does come and edit the code, instead of having to
come up with a better name, they can focus on whatever they came there for,
and they can do so with a better understanding of each variable and what is
going on.

The difference between the two approaches is really about where you spend your
extra time related to naming. In my approach, you spend a little extra time up
front. But it begins paying dividends immediately, as you continue writing the
very first version of the code; it is easier to write clearly and correctly.

In your approach, you spend less time up front, and pour the code out faster,
which might seem like a great thing. You also see an immediate payoff, because
your process is fast. But then you (or someone who comes behind you) has to
pay a heavy bill of technical debt.

When I read your comment I read it as a description of what your current
practice is, not something that has been thought through a lot. Growing clear
concepts is a great way to think about it, so that's a good line to continue
with imho. I just think that part can and should be done up front. Sometimes
it's worth a short conversation with another developer to see if you can agree
on a good definition reflected in a clear name.

But you're also right that sometimes single word names work fine. And
sometimes concepts are just clear and don't need much thought. Obviously those
ones we can agree on. I'm more focused on the more difficult, or more vague
ones, as a place where I think up-front quality naming is better than having
the "fix-it-later" approach.

~~~
jstimpfle
Yeah I didn't mean to imply that I pick bad names intentionally. Of course I
pick the best name that I can come up with. But when I'm doing something I
haven't done before, it's hard to judge whether a name is any good, maybe
because I don't actually know precisely what I'm doing in the first place.

In these cases I err on the side of short names (preferably non-composite)
because I _know_ that making a longer name will not help. I just use that new
name a little in a more exploratory type of programming. When it sticks it was
a good choice. Examples: tree, spec, coverage, identity, isomorphism, object,
struct, key, value, tag... These names need context, but they can turn out to
be good names because they are short and _distinct_.

When the name doesn't stick and I can't come up with a better one, maybe I
should do something entirely different.

------
kornork
I think this article ignores or glosses over the hardest part about naming,
which is when you have to start naming things with abstract behavior. Naming
things with analogs in the domain is easy, but when you find yourself trying
to follow the Single Responsibility principle and breaking pieces off you run
into trouble.

My AccountingRow is something the user sees, and there's also a
SummaryAccountingRow with similarities; both of these need SQL generated, so
there's an AccountingRowSQLGenerator and an AccountingRowSummarySQLGenerator.
These generally share some of the same business logic around how to join
different tables, so there's an AccountingRowCommonQueries class. We also need
to spit this out in JSON, so there's an AccountingRowSerializer and
AccountingRowsSerializer and AccountingRowSummarySerializer,
AccountingRowSummaryRowsSerializer. An AccountingRow also has an
AccountingRowBehavior, and AccountRowAdder, and onward into infinity.

Now you have name soup, and if you are looking at this system for the first
time your eyes glaze over.

~~~
jt2190
This example would benefit from a namespace/package named "Accounting". Then
you could drop all of the "Accounting-" prefixes.
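
A sketch of what that could look like in Python. A class stands in for the
package here only so the example fits in one file; in a real project
'accounting' would presumably be a directory of modules. All names are
hypothetical:

```python
import json
from dataclasses import dataclass, asdict


class accounting:
    """Stand-in for a package named 'accounting' (hypothetical layout)."""

    @dataclass
    class Row:
        account: str
        amount: float

    class RowSerializer:
        @staticmethod
        def to_json(row):
            # sort_keys keeps the output stable regardless of field order
            return json.dumps(asdict(row), sort_keys=True)


row = accounting.Row(account="cash", amount=12.5)
print(accounting.RowSerializer.to_json(row))
```

The prefix moves into the qualifier ('accounting.Row'), so it is written once
per import rather than once per class name.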

~~~
kornork
Whoops - there's also a BudgetingRow that needs a very similar hierarchy of
support structures!

Clearly, the system needs some refactoring - perhaps there's a pattern that
can be extracted. Hopefully, though, my point came across.

------
hyporthogon
Well, even non-hard, very simply functional/x-ray naming sloppiness causes
plenty of grief, but I think this is partly because we're used to naming being
hard enough that -- at a certain level of energy, time-crunch, and lack of
investment in a massively brownfield project we really would rather refactor
entirely -- we sort of give up. There's a recentish (2015) little empirical
study that clustered 'linguistic antipatterns' (based on surveys of software
engineers):
[https://www.researchgate.net/publication/276314133_Linguisti...](https://www.researchgate.net/publication/276314133_Linguistic_antipatterns_what_they_are_and_how_developers_perceive_them)

~~~
Noumenon72
Wow, I wish every academic paper were that useful to the layman. Free
download, well-explained list of antipatterns, in-document hyperlinks to code
samples so you can recognize the antipatterns, suggestions for renaming.
Thanks very much for the link.

------
mstade
I always thought the quote was that the two difficult problems are cache
invalidation, naming things, and off by one errors.

~~~
joncrocks
[https://martinfowler.com/bliki/TwoHardThings.html](https://martinfowler.com/bliki/TwoHardThings.html)

There are only two hard things in Computer Science: cache invalidation and
naming things.

        -- Phil Karlton

But yes, there are lots of other popular variations :-)

~~~
mstade
Haha, that distributed systems tweet is excellent – thanks for sharing! :o)

    
    
        There are only two hard problems in distributed systems:
        2. Exactly-once delivery
        1. Guaranteed order of messages
        2. Exactly-once delivery

------
ganfortran
Yes. The right name hints at the right abstraction, and vice versa. If you cannot find a
good name, large chance is that you probably have some confusion as of how to
structure your program.

So yes, it is hard, and rightfully so.

~~~
Kenji
>If you cannot find a good name, large chance is that you probably have some
confusion as of how to structure your program.

I'm not sure about that. If you don't know about patterns like factory or
observer (event listener), and you invent these concepts independently, you'll
have a hard time naming it, and your names will be nonstandard. My point being
that it's possible to have an elegant solution yet it's still hard to name
succinctly.

~~~
ganfortran
> you invent these concepts independently

That sounds like confusion to me. Those patterns are called patterns because
people have practiced them over a long time and, in the course of that,
decoupled them nicely. Assigning the right name implicitly means a clear
understanding of scope and responsibility. A self-invented solution may echo
the original idea in some aspects, but most of the time it is not nearly as
clean.

------
jsemrau
Naming things is the most fun part. At least in my opinion. Closely followed
by creating the logo. Then going back to the naming again because for the
original name you could not create a good logo.

~~~
douche
Maybe I'm weird, but I think actually building things is the fun part. Naming
things is the worst - every time that we have to name a product I feel like
I'm in the Dilbert animated cartoon [1].

Plus it lets the talkers and the bullshitters look like they're doing
something while they bikeshed over naming things, but for some reason they
have to bring in the doers to spectate instead of letting them get things
done.

[1]
[http://dilbert.wikia.com/wiki/The_Name](http://dilbert.wikia.com/wiki/The_Name)

~~~
jsemrau
The name makes a product out of the tool. Something you can relate to. When I
was in charge of rolling out a regional scoring engine in Asia no one wanted
the project. Renamed it to Blaze because of the underlying Fico Blaze Advisor
software, suddenly things changed. Blaze is fast, efficient, fiery, daring.
People are weird.

------
Clubber
I always thought this referred to abstract computer science concepts like :
constant vs literal vs variable, scope, composition vs aggregation,
encapsulation, function, procedure, polymorphism, interface, loops, accessor,
mutator, class vs object and instantiation, abstraction, linked lists, arrays,
hash (not so much hash), recursion, etc.

I consider these names to be beautiful and succinct and a lot of thought was
put into them.

~~~
renox
Except that C ruined 'constant' by confusing it with a view/'read-only access',
which means that in D real constants are named 'immutable'..

Plus there are compile-time constants, runtime constants, once (Eiffel)
variables..

------
vinceguidry
One thing I do when I can't immediately think of a domain-appropriate name is
to give an obviously wrong one, like "new_function_01". During a complex
refactoring, I may have dozens of such methods or variables. Eventually the
variations are all smoothed out and I can replace 20 different method calls
with one method call with perhaps an additional argument. That method is _way_
easier to name.

Naming seems to be the last step of refactoring. Refactoring is the process of
manipulating / identifying the semantics of an application; names are
human-readable tags that ease the mental task of understanding semantics.
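
A toy Python sketch of that progression, with made-up functions: placeholder
names during the refactoring, then one function with an extra argument once
the variations have been smoothed out:

```python
# During the refactoring: obviously wrong placeholder names.
def new_function_01(values):
    return [value * 2 for value in values]


def new_function_02(values):
    return [value * 3 for value in values]


# Afterwards: the only difference turned out to be the factor, so it becomes
# an argument, and the single remaining function is much easier to name.
def scaled(values, factor):
    return [value * factor for value in values]


print(scaled([1, 2, 3], 2))
```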

~~~
BurningFrog
I do the ugly names too. One of two things usually happens:

- As I work with the code, the name becomes obvious.

- The variable/function turns out to disappear during the work.

One more case of how it can be better to decide something later, when you know
more.

------
Y_Y
"There are only two hard things in Computer Science: cache invalidation and
naming things, and off-by-one errors."

------
mybrid
Coding software needs to catch up with usability testing of presenting
software. The dynamics are equivalent. In usability theory there is no one
practice for all domains. In fact, usability theory completely embraces
marketing where branding is important, i.e. distinctness is highly valuable. I
suggest then that usability studies be companion to software written to the
same extent as software presentation. Naming is just one aspect of usability.
Information architecture, or grouping of names, is one of the most important
roles in usability. In his report this year Jakob Nielsen did a comparison
study across ten years. Usability of software has increased in every category
save one: information architecture. Grouping names into menus for tasks is
really, really hard. The current trend of mega-menus is no help from
experience. Doing usability studies and task analysis on the code would be a
huge boon to usability of the application. It would raise overall usability
awareness and its value.

------
danso
Naming is hard for all the reasons OP mentions. But why I believe it is
especially hard in computer science is that, like cache invalidation, it does
not have a deterministic or "correct" solution. At some point, humans have to
make a decision based on circumstances and tradeoffs.

------
Kiro
Does anyone know of an article filled with examples of good naming? Functions,
variables etc.

------
jondubois
Naming is very important. I think the problem is that it can take a little bit
of mental effort to come up with a good name and it can distract a developer
from their current train of thought. It just takes a bit of practice to get
used to it.

------
js8
Has someone looked into a problem of automatic (computer-generated) naming? As
in, generate a unique name that somewhat describes this thing that was just
generated (like a piece of code)?

------
gotthemwmds
I think the bigger problem with naming is simply being consistent. Ask 5
developers what the variable name for an important filehandle should be and
you'll probably get at least 3 answers. As long as everyone is consistent,
even if the names get a little sketchy here and there, that consistency goes a
long way for code with a long lifetime.

------
jwatte
The problem comes when someone else already used "product" for another concept
in the system, and you now have to come up with an appropriate synonym. (Or
the same for any other common concept: Object, connection, lock, etc)

------
jwilk
[https://en.wikipedia.org/wiki/List_of_names_for_turkeys](https://en.wikipedia.org/wiki/List_of_names_for_turkeys)

------
jlgaddis
_"There are only two hard things in Computer Science: cache invalidation,
naming things, and off-by-one errors."_

------
dredmorbius
This article is badly marred by a blockquote styling which fails to wrap lines
and has text scrolling off to the right of the enclosing box.

Author, if you're reading, please fix this. Default blockquote styling is far
preferable to what you've done here.

Thank you.

~~~
jshmrsn
I remember HN having this problem on small mobile screens for the longest time
(for block quotes in comments).

~~~
dredmorbius
It still does.

I've written a few Stylish fixes (including that) on desktop.

------
y7
Never mind naming, would putting proper paragraphs in the article have really
been too hard?

~~~
jpswade
Nope.

