Hacker News
How to name things: the hardest problem in programming (2014) (slideshare.net)
57 points by Tomte on June 20, 2017 | 36 comments

Naming things is hard because a proper name imparts the essence of the thing you're naming. To understand the essence of the thing, you have to understand what role it serves, what else it interacts with, and what part of that is interface and what is implementation.

Naming is identifying concepts that matter—and occasionally realizing that the thing you just wrote, but can't name, is not a thing that should exist. It's part of understanding what your system really is and what it does.

The naming part, the mapping of words to objects, is easy. What's hard is breaking down a system into parts with obvious names.

Indeed, naming things is hard because it involves design, and most often design is just a mess (a big ball of mud). Which is good, because bad names are a code smell. If you find yourself with lots of manager, doer, handler names maybe you should revisit your architecture. If you find yourself with lots of stuff that is similarly named or all shares the same naming prefix, maybe spin that chunk off into an object or something.
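To make the "shared prefix" point concrete, here is a minimal sketch. The invoice fields and the formatInvoice helper are made up for illustration; the only point is that a repeated prefix often wants to be the name of one object.

```javascript
// Before: three loose variables that all share the "invoice" prefix.
const invoiceNumber = "A-17";
const invoiceTotal = 120;
const invoiceCurrency = "EUR";

// After: the prefix becomes the name of a single thing.
const invoice = { number: "A-17", total: 120, currency: "EUR" };

// Hypothetical helper: it now takes one argument instead of three.
function formatInvoice({ number, total, currency }) {
  return `${number}: ${total} ${currency}`;
}

console.log(formatInvoice(invoice)); // A-17: 120 EUR
```

Once the object exists, its field names get shorter, and functions that used to take every prefixed variable separately take the one object instead.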

Well said.

I was inspired by this Confucius reference in a recent HN topic[0]:

As Master Confucius wisely advises, any effort to bring order to the world should begin by “rectifying names.” If language is not correct, then what is said is not what is meant; if what is said is not what is meant, then what must be done remains undone; if this remains undone, morals and art will deteriorate; if justice goes astray, the people will stand about in helpless confusion. Hence there must be no arbitrariness in what is said. This matters above everything. (Confucius, 1980 edn: bk 13, v. 3)

[0] http://rosalindwilliams.com/writing/history-as-crisis-and-me... https://news.ycombinator.com/item?id=14580362

One of the very rare times I've lost my cool in a Stack Overflow comment thread was an argument about a name. Not a function or variable name, but a piece of terminology.

The question was from someone who was using this to detect whether a JavaScript array was empty:

  if( JSON.stringify(array) === "[]" ) { /* it's empty */ }

They wanted to know if this was a good way to do it or if there was a better way. Naturally, several answers showed up within seconds suggesting the usual (and much faster) way of doing it:

  if( array.length === 0 ) { /* it's empty */ }

Then someone commented that if you did "new Array(23)", this test would give incorrect results: it would report that the array was not empty even though it actually is empty.

I replied that it didn't make sense to call this an empty array, because it had a length of 23, and it was no different from "[ undefined, undefined, ...23 times... ]", and surely that wouldn't be considered an "empty array", would it?

Of course I was wrong on that detail. This person explained that "new Array(23)" doesn't create an array of 23 elements, it creates an array with 23 empty slots but no elements at all, so regardless of its .length it should also be considered an empty array. He pointed out that `array.forEach()` skips over empty slots, and gave the analogy of a schoolbus: if a bus has 23 seats and no one is sitting in any of them, you could call it an empty bus. So by analogy an array of 23 empty slots should be called an empty array.
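For anyone who wants to check the difference in a console, here is a small sketch. The countVisits helper is a name made up for this example; everything else is standard JavaScript.

```javascript
// Sparse array: length is 23, but no index has ever been assigned.
const sparse = new Array(23);

// Dense array: 23 real elements, each holding the value undefined.
const dense = Array.from({ length: 23 }, () => undefined);

// Hypothetical helper: counts how many times forEach invokes its callback.
function countVisits(array) {
  let visits = 0;
  array.forEach(() => { visits += 1; });
  return visits;
}

console.log(sparse.length, dense.length); // 23 23
console.log(0 in sparse, 0 in dense);     // false true
console.log(countVisits(sparse));         // 0
console.log(countVisits(dense));          // 23
```

Both arrays report the same .length, but only the dense one has anything at index 0, and forEach only visits indexes that actually exist.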

Eventually we just agreed to disagree.

At this point the poor OP must have been getting pretty confused. They were already using a test that would have called "new Array(23)" a non-empty array, just like the .length test does.

And then someone else came along asking which was really the right way to test for an empty array, .length or something else that would take into account the possibility of an array with empty slots and no elements.
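A rough sketch of how the candidate tests compare. All three function names are hypothetical and none of them is a built-in; hasNoElements relies on the fact that Array.prototype.some, like forEach, skips empty slots.

```javascript
// The OP's original test.
const looksEmptyAsJson = (array) => JSON.stringify(array) === "[]";

// The usual test.
const hasZeroLength = (array) => array.length === 0;

// True when no index holds an element, even if length is greater than 0.
const hasNoElements = (array) => !array.some(() => true);

const slots = new Array(23);

console.log(looksEmptyAsJson(slots));    // false (holes serialize as null)
console.log(hasZeroLength(slots));       // false
console.log(hasNoElements(slots));       // true
console.log(hasNoElements([undefined])); // false
console.log(hasNoElements([]));          // true
```

Which of these is "right" depends on what the caller actually needs to know, which is the point made in the replies.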

Empty array or not? Naming things is a minefield!

That is the wrong question. Trying to find the "best" meaning of "empty" in the abstract is mooted by the DWIM nature of the language. JavaScript does not have a concept of "empty"; what do you really need to know about the variable?

Bingo. The above example reminds me of the old situation where one asks about X when they really care about Y.

> Empty array or not? Naming things is a minefield!

I'll go for option 3 and say that the naming error was further up the stack and that they really wanted a list and not an array. A list would have both a length (0) and a capacity (23).
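A minimal sketch of what that could look like, assuming a made-up FixedCapacityList class (JavaScript has no such built-in): capacity is fixed at construction, while length counts only what has been added.

```javascript
// Hypothetical list type that tracks capacity separately from length.
class FixedCapacityList {
  constructor(capacity) {
    this.capacity = capacity;
    this._items = [];
  }

  get length() {
    return this._items.length;
  }

  get isEmpty() {
    return this._items.length === 0;
  }

  push(item) {
    if (this._items.length >= this.capacity) {
      throw new RangeError("list is full");
    }
    this._items.push(item);
    return this.length;
  }
}

const seats = new FixedCapacityList(23);
console.log(seats.length, seats.capacity, seats.isEmpty); // 0 23 true

seats.push("first passenger");
console.log(seats.length, seats.isEmpty); // 1 false
```

With the two numbers named separately, "empty" has only one sensible meaning: a length of 0, whatever the capacity.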

Two biggest issues in programming: naming things, cache invalidation and off by one errors.

Let's not forget tabs vs spaces.

And my pay raise now that I've switched to using spaces.

This is a joke conflating causation and correlation with regard to this article: https://stackoverflow.blog/2017/06/15/developers-use-spaces-...

So... slide 38, except you mangled the quote.

Actually, I think the slide mangled the quote - I've often seen "off by one error" as the punchline, and found at least one post going back to 2010 with that variation: https://martinfowler.com/bliki/TwoHardThings.html

Brian Will, the guy behind the 'why OOP is bad' YouTube video, brings this into his argument. In a follow-up to that video he rewrites an OO program in procedural style. In doing so he reduced the number of methods/functions from 321 to 113, 54 of which were contained inside other functions as they were only accessed by the containing function. The total number of lines of code remained about the same. That's over 200 fewer meaningful names to try to come up with in a short program. Plus, functions that do bigger tasks are easier to name in the first place, imo.
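A tiny illustration of the nested-function idea. This is not code from the video; summarizeScores and its helpers are invented for the example.

```javascript
function summarizeScores(scores) {
  // Helpers used only here live inside the function, so their short
  // names cannot collide with anything else in the program.
  const sum = (xs) => xs.reduce((a, b) => a + b, 0);
  const mean = (xs) => sum(xs) / xs.length;

  return { total: sum(scores), average: mean(scores) };
}

console.log(summarizeScores([2, 4, 6])); // total 12, average 4
```

Only summarizeScores needs a name that makes sense to the rest of the codebase; sum and mean only have to make sense within a few lines.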


> [use] class PlanEvents [instead of] class EventPlanner (slide 15)

Eehhhhh... Classes are nouns. Methods are verbs. Generally speaking.

A crisp blog post would be quite convenient than clicking 90 times on the next button. Just saying.

You probably meant to say 'more' convenient. It is from a talk that he is giving. A blog post would probably have been easier than creating a load of slides. The text from the slides is listed underneath.

The more I teach programming, the more I emphasize to students how hard/important the naming problem is. In fact, I've found the vast majority of novice debugging time is lost to naming-related problems. Sounds stupid, I know, but at the introductory level of programming (e.g. before trickier concepts, such as recursion, optimization, and memory management), typos and variable/reference confusion are a huge stumbling block, more so because novices haven't learned debugging tools.

The naming problem leads to problems other than typos, of course, but it's amazing to me how much time can be wasted on that, especially when moving to real-world file/net access, when typos/confusion happen at the file path and URL level, and the confusion is compounded by misunderstanding of how the file system and HTTP work.

Some anonymous comment on the Internet told me once that the 'naming things' bit was about some problem in distributed systems. I guess it was a lie, then?

Good slides: not only the advice, but also the steady pace of delivery, the programming ~ writing theme and the overall structure.

I missed advice about never naming classes with words ending in -or and -er. Is this still a thing?

In the context of functional languages, one-letter variables are often sold as a feature: the hindrance of cryptic names is outweighed by the benefit of fitting a smaller amount of text into the programmer's mind more easily (this may require some practice). Personally, I also recently observed that this abstract, mathematical notation helps me focus on the very abstract, mathematical core of some definitions in Haskell. It's good to use the word "Car" when dealing with cars, but inside a one-line morphism-monad-whatever function definition, "a" is OK; who knows what datatype will implement it?

One piece of advice that helped me with naming things is not to have things to name in the first place. Very often I used to create lots of variables representing intermediate and not really interesting steps of computation, resulting in horrible names like file_content, validated_content, parsed_content and so on... Just thinking first about how to structure things will prevent many risky naming situations from even appearing.
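One way this can play out, as a sketch: compose the steps so the intermediate values never need names. The validate, parse, and pipe functions below are hypothetical stand-ins, not anything from the slides.

```javascript
// Hypothetical steps; each stands in for real validation/parsing logic.
const validate = (text) => {
  if (text.trim() === "") throw new Error("empty input");
  return text;
};
const parse = (text) =>
  text.trim().split("\n").map((line) => line.split(","));

// Small helper: run a value through the given functions, left to right.
const pipe = (...steps) => (input) =>
  steps.reduce((value, step) => step(value), input);

// Only the steps and the final result get names.
const loadRows = pipe(validate, parse);

console.log(loadRows("a,b\nc,d\n")); // two rows: a,b and c,d
```

The trade-off is that intermediate values are harder to inspect in a debugger, so this is a matter of taste rather than a rule.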

The suffix thing is generally a sign you're doing functional programming and should instead explicitly name the state if any is needed.

The problem with that idea is that it clashes with the OOP notion typical in languages like C++, Java, C# and so on. (Though it does make dependency injection and persistence of the state easier.)

Plus some names like "allocator" or "garbage collector" are ingrained in standard.

(FYI the state for an allocator could be a freelist, for GC could be a graph.)
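To illustrate what "name the state" might mean for the allocator case, here is a toy sketch. FreeList and its methods are invented for the example and do not correspond to any real allocator API.

```javascript
// Named after the state it holds (the free slots), not after the
// act of allocating.
class FreeList {
  constructor(slotCount) {
    // Every slot index starts out free.
    this.free = Array.from({ length: slotCount }, (_, i) => i);
  }

  acquire() {
    if (this.free.length === 0) throw new Error("no free slots");
    return this.free.pop();
  }

  release(slot) {
    this.free.push(slot);
  }
}

const pool = new FreeList(2);
const first = pool.acquire();
const second = pool.acquire();
pool.release(first);
console.log(pool.free.length); // 1
```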

I got that advice from some talk about refactoring in OOP languages. As far as I can remember, Manager and Calculator were the bad class names discussed: "what do you manage/calculate?". Explicitly naming the things you operate on was part of the advice given in the talk.

When I started Haskell I carried over my descriptive variable naming conventions. Didn't go well ha. Just be pragmatic, write lots of software, and adapt.

I totally agree that I find naming things to be difficult. The worst is arguing over names in a code review.

Is there a dictionary/encyclopedia of programming names somewhere?

There are the design patterns (where some of the examples in the slides come from), but sometimes you just need the word, or want to look at other words. Sometimes getting a different word can even make you think about the architecture in a different way.

Kind of like a programmer's rhyming dictionary?

Yes. It's the same dictionary as the one for non-programmers.

I will sometimes put off (when I have other tasks) writing a new class for hours because I can't think of a reasonably short but descriptive name.

That's why almost all IDEs have "renaming" a class as a feature.

I'd say roll with whatever name comes first, as you can always change it later. Dwelling too much on it can be a sign of perfectionism.... or procrastination, or ADD.... or just inexperience.

As I got older and gained more experience, "naming" things stopped being a problem, as I always knew I could change things later on, and trying to be perfect on the first shot is futile.

Thinking of programming as an essay that will need some editing/revisions when done helps a lot.

"I'd say roll with whatever name comes first, as you can always change it later. Dwelling too much on it can be a sign of perfectionism.... or procrastination, or ADD.... or just inexperience."

-- I will take "things I will never do" for 100, Alex.

And then you end up with the eternal TO-DO. Congratulations.

I was trying to be funny, but the point I was making is that "I'll fix this later" just leads to a ton of technical debt that usually doesn't get fixed.

In this one specific example being paralyzed about naming something is silly.

Interesting, I usually give my class a slightly stupid name and figure out a better one once some or all of the behavior is in place.

Yeah, that works for me. A stupid enough name that I'd never check it in will probably be replaced within an hour, as I've gotten to know the class or function better.

Another thing I do sometimes is tricking people into talking about the thing I'm trying to name without giving away that I'm looking for a name. They often just use the natural term I couldn't think of.

Right now, somewhere, some kid having descended several levels of Java into a forgotten class is wondering what the hell a SnuffleupagusFactory does...

"Abbreviations are ambiguous," "one letter is too short," and wrap lines at 80 characters (e.g. PEP 8). Facepalm.

Yes, the advice in this is pretty bad, especially since the author does not know what active vs passive means. It is about verbs, not nouns.

Albeit one letter is too short.
