Hacker News
How Developers Choose Names (arxiv.org)
198 points by matt_d 27 days ago | 153 comments



I just use a UUID for every variable name to make sure the codebase never has any ambiguity.

No one is going to confuse e7693160-b5cf-4761-9202-de019cfd0fc9 with c3d8b9ac-d0da-4bbc-912b-025ce4e47f62


A good name has maximal information density. If the string is compressible, you're wasting your reader's time. That's why I use all utf-8 glyphs to render my uniformly random and highly informative (from an information theory standpoint) variable names.


You joke but symbolic programming languages like APL are kinda neat.

https://en.wikipedia.org/wiki/APL_syntax_and_symbols


I do see this more often in UTF8 native platforms.

arr1 and arr2 might use subscripts for 1 and 2 with no side effect on plan9.

Why use "delta" when writing a delta symbol is so easy?


Julia not only supports Unicode, but embraces it culturally. You indeed very often see Δ in Julia code.


A good name is also memorable, consistent, easily decipherable, and works around namespace conflicts. Might be a best 2.7 out of 4 situation, I guess. Why 2.7/4 and not 3/4? Because the remainder was lost in translation to the ASCII gods.


This implies your code has fewer than 2¹²² variables. A real coder should have more than that in each method. Sounds like you're basically a glorified VB "dev."


> you're basically a glorified VB "dev."

How did you know?! Actually it's probably not far from the truth... most of the code I write is more like scripts in R & Python and small utilities than fully-fledged applications. And lots of SQL. I glue a bunch of stuff together, and after that is where my real job begins. I used to have a bit more fun writing C code to work around the limitations of an ancient ERP: that system's foundations literally extended back to before the moon landing and long before I was born.


I once ported a 3rd party telecommunication middleware stack to a new OS. Every single function was 12 characters long, 4 fields of 3 characters each. Each field had meaning, but I wasn't given the decoder ring.

That was not fun.


There's some usefulness in this idea, which is the core idea behind Unison https://www.unisonweb.org/docs/tour


You could rename the functions to match their symbols, after compiling.

The output of nm for the function originally named "write" from musl libc might look like

   000000000006ee20 T 000000000006ee20


Actually, you'd be really surprised about that: https://twitter.com/Foone/status/1229641258370355200


lol


With naming, what IMO matters more than choosing any one name is choosing a name that is consistent with other similar names in the codebase. A new developer will learn your project's specific convention in a few minutes, but will be completely lost if the convention is routinely violated here and there. This is especially important with what verbs to use in methods, and how to name interfaces/classes in a way that respects both the business domain and the technical use.

I've seen way too many times how much confusion is created and time wasted when a developer not too familiar with the project comes up with a brand new verb for a common action, and couples it with a new synonym for an existing business domain entity – or worse still, uses the same name previously meant for a completely different domain entity.

The unfortunate burden with this is that a new developer might be right that their name is better for their use case, which then leads to laborious debates about which is more important: consistency or a slightly crisper name for the one use case.


You have to be careful with your definitions. Don’t use different definitions of the same verb/noun in different places. You get one. If you need another definition elsewhere, use a synonym.

It is not beneath you to hit thesaurus.com while picking a name.


A thesaurus is one of my favorite dev tools.


100%. I use RhymeZone and Wordnik more than I care to admit during development work.


Naming gets far worse than that. I've worked on code where the naming was not just unhelpful, it was outright misleading. One of the worst I remember was a project which unsurprisingly was named after a Game of Thrones character. As I never watched Game of Thrones, it took me about two weeks before I could even remember the stupid name, during which time I was also dumbfounded as to how the program worked, until I realised the naming in the code was equally bad and completely misleading. Then there's the place where the stg1 environment was actually the dev0 environment...


When I was working at a small game startup 20 years ago, I renamed all the high-level projects of our codebase into specifically ancient Greek and then left a "glossary.txt" in the root directory so people could brush up on their language skills... this did not last for very long ;P.


One of the best coding interviews I had was a paired debugging session, and the problem boiled down to a poorly named parameter at the beginning.


> One of the worst I remember was a project which unsurprisingly was named after a Game of Thrones character.

I remember the era when sys admins would name machines or environments after nerdy shit like Tatooine, Dagobah. And you had to memorise which was the test environment and which was the mail server.

Thankfully I have never seen that in code.


> I've worked on code where the naming was not just unhelpful, it was outright misleading.

It's one way to achieve job security for life though: https://github.com/Droogans/unmaintainable-code


> Then there's the place where the stg1 environment was actually the dev0 environment...

Hey, think yourself lucky. It's only a pretty recent occurrence that my team has more than just the two "Prod" and "Non-Prod" environments...


Using the name “prod” for anything other than delivering code to external customers is one of my eternal pet peeves.


This goes double for APIs, and triple for externally facing APIs, where typically the people who are working with it aren't familiar with your code base or naming conventions.

That's why good API design is so hard. Every name you assign to something becomes a convention by default.


Couldn’t agree more with this. Consistency is key even if the convention used is awkward


Yes, convention is more important than the quality of the convention. The convention obviates the need to think about things, and for a developer, thoughts are precious.


With a good convention, even more importantly, structure can be derived and deduced without reading the documentation. I love it when I begin to grasp the structure of a codebase, write a line, hit auto-complete, and the codebase turns out to be reliable: the function name exists and the expected parameters are served up. The code just feels so much more discoverable in such moments.


One of the most cruel things evolution did to us is that we constantly seek meaning, order, correctness, where there are none to speak of.


I'd argue that the ability to unite people by assigning meaning onto things, places, and ideas is the single reason why humans have been so successful in advancing civilization to where it is today.

Edit: eschew -> assign


That's precisely the problem. It's very useful, when it's useful. Which leaves us spinning in circles when it's not.


It would be impossible for you to say this (circular) sentence and call something a “problem” if you didn’t think that there was some basis of right/wrong, which contradicts your top-level comment.


See, this shouldn't cause a deep sorrowful void within me, but it does.


Pedant note: eschew means ‘to go without’. You probably meant ‘assigning’.


Whoops, thanks


Sure, but now it's actually DE-volving our society in the form of QAnon and religious "interpretations", and technology here is not blameless in the least.


How do you know that there isn't meaning, order and correctness if you don't search for it?

There are very few situations where you can definitely say that there isn't meaning, order and correctness as opposed to not knowing if you are just too dumb to find them.


Well, with the internet we don't need to do it anymore. Everyone has lost the battle to organize the internet, and runaway AI algorithms are now at its helm. And the same is happening to other areas which are controlled by AI and "big data".


That sounds awfully anti-knowledge. Humans didn’t use to think that there was an order or correctness to all the facts that fall under science and mathematics today.


I have a rule called “name it what it does, not how it’s used”. Breaking it is a mistake I often see from junior developers, and it sacrifices information density.

Eg it’s more helpful for readability to do “onClick => updateDate()” rather than “onClick => handleClick()”

(Not the best example and I’ve seen far more egregious, but those examples are escaping me now)


I’m an engineering lead and I made a rule of using the handleClick() style instead of updateDate() in our codebase. The reasoning is that the latter sounds more like an atomic, standalone function, whereas handleClick() may have some internal checks that could eventually call something like updateDate(), but the internal checks are beyond the responsibility of the updateDate() function.


> may have some internal checks that could eventually call something like updateDate

This sounds like exactly the kind of premature optimization that leads to overly-abstract function names throughout the codebase. Your point is a good one, once the function does something more than `updateDate()`. But until then, just call it `updateDate()`. As a bonus, this way any time you do choose to use `handleClick()`, the name is conspicuous enough that you pause to see what side effects it might have.


It’s not premature optimization, it’s just boilerplate because it’s a pattern that happens so often.

Besides, there’ll be less work if the requirements do in fact change, and there’s nothing wrong with accounting for the possibility of change when you’re writing code because what you write now constrains what you can write in the future.


I agree. It's also nice to know when you see a `handle` function, it is getting slotted into an `on` somewhere. Within the handle I prefer to call functions such as updateDate().

But this whole thread brings up something about coding that has always irked me, it's very opinion based. When two people strongly hold the opposite opinions on the same team, it can be a massive hassle for absolutely zero benefit.


It also quickly explains to you that no, you can’t refactor this method to change the arguments.


In training interns I had to figure out what part of my coding advice to them is best practice and what part is just the way I like to do things.

It's good to know the difference and it gave them the space to challenge me on some of those methods, which in turn helped me write better code.


> When two people strongly hold the opposite opinions on the same team, it can be a massive hassle for absolutely zero benefit.

On teams, or in life in general, IMO when there is no consensus, or even an anti-consensus, the only rational global decision is not to make a global decision.


I cannot disagree strongly enough, this is how you end up working on teams that are just disconnected, hostile fiefdoms.

Solving an issue that is intractable via discussion is when you escalate. Probably to a tech lead, or even to a manager.


Maybe we're talking about different kinds of issues. In the context of opinion-based disagreements (rather than matters of fact or evidence), an anti-consensus seems to indicate that there isn't (and can't be) one right answer. If one group imposes its will on another group, or appeals to the powers that be to do so on their behalf, the second group will definitely become hostile to the first. Thus, anything other than "agreeing to disagree" and moving on to areas where you do agree would just build more resentful, disconnected hostility.


Personally I would have both: within the handleClick() function, do whatever it needs to do and call an atomic updateDate() function (assuming it's updating the date globally). Although for any generalized standalone function, I like to put that in its own namespace, such as helper.updateDate().


Or be explicit. Call it AtomicUpdateDate if such functionality is really desired. Nobody should assume any operation is atomic ... Unless it is explicitly stated to be so.

EDIT: sorry meant to reply to the parent comment.


Yes, my example was if updateDate literally just updates the date. I take this a step further too: e.g. if the function has some extra checks, I’ll name it maybeDoX(), which is a somewhat uncommon practice in common programming languages.
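
A rough TypeScript sketch of the maybeDoX() idea. The state shape, the `isEditable` check and the name maybeUpdateDate are all made up for illustration; the point is only that the prefix warns the caller the function may do nothing.

```typescript
// Hypothetical state shape, for illustration only.
interface FormState {
  date: string;
  isEditable: boolean;
}

// Unconditional: always returns a state carrying the new date.
function updateDate(state: FormState, newDate: string): FormState {
  return { ...state, date: newDate };
}

// Conditional: the "maybe" prefix warns callers that nothing may happen.
function maybeUpdateDate(state: FormState, newDate: string): FormState {
  if (!state.isEditable) {
    return state; // check failed, so the state comes back unchanged
  }
  return updateDate(state, newDate);
}
```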


I take this approach as well; especially, when the handle takes an event param and can then let the object react with the correct behaviour.


validateAndUpdateDate()


I completely disagree with this example. Only in first order implementations does that click just change the date. As you refine UX you may add some debounce, error handling, alternative behavior, etc. DOM events rarely end up corresponding to such simply expressed handlers.

Also, when you are reading in the reverse direction, it’s not clear what updates the date. The worst scenario is when two listeners share a handler.


What are your naming recommendations?


Exactly what OP discourages. handleClick for onClick, handleDragOver for onDragOver, etc.


It entirely depends on what the function is receiving.

In your example, assuming we're looking at a React codebase (since this seems to square with React's style of events, etc), the resulting data being sent into the updateDate function would be an event.

This, unfortunately, doesn't make any sense for a function by the name of updateDate to receive. I would expect to receive at least a new date to update with in the parameters, or ideally in a functional world, both the state to update and the new date we want applied to it. Anyone thinking they could simply reuse the updateDate function somewhere else is going to be woefully disappointed they largely cannot, since it would have been constructed around receiving an Event object.

In that case, I find the "handle" nomenclature to be very useful, as it appears to be largely shorthand for functions designed to handle an _event_ (and we tend to see this pattern being used in various React libraries and documentation). React does have a few of these shorthands it tends to use (such as useEffect largely being short for useSideEffect).

Ultimately, I recommend using both functions. One, a handleClickUpdateDate function (notably not a hyper-generic handleClick function which conveys nothing) that receives the event and pulls the necessary information out of that event to determine what date the user has selected. It then calls the updateDate function with only that date, which creates a nice, reusable updateDate function if we need it anywhere else.

This roughly squares with the idea of having functions that handle the data transport layer (such as receiving data via HTTP or gRPC, etc) whose responsibility it is to receive those events, pull the necessary information out of the package, route the request to the correct controller, and ultimately return the result in a shape that satisfies the original transport mechanism. In this case, our handle* function is responsible for receiving the original transport of data, and then routes the request through to the appropriate controller, which is entirely unaware of the means of data transport.

It also means we have a nice, easily testable unit of updateDate to verify our state modification is working to our liking without needing to assemble an entire Event object.
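
A minimal TypeScript sketch of that split, under stated assumptions: DateClickEvent is a hand-rolled stand-in rather than React's real event type, and AppState is a hypothetical state shape. Only the names handleClickUpdateDate and updateDate come from the discussion.

```typescript
// Minimal stand-in for the event a date picker might deliver
// (an assumption, not React's actual event type).
interface DateClickEvent {
  target: { value: string };
}

// Hypothetical application state.
interface AppState {
  date: string;
}

// Reusable and transport-agnostic: knows nothing about events.
function updateDate(state: AppState, newDate: string): AppState {
  return { ...state, date: newDate };
}

// Event-facing shim: pulls the date out of the event, then delegates.
function handleClickUpdateDate(state: AppState, event: DateClickEvent): AppState {
  const selectedDate = event.target.value;
  return updateDate(state, selectedDate);
}
```

Testing updateDate then needs only a state and a string, while the handler test needs just enough of an event to carry the value.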

Anyhow, that's how I think of these things ;p


I’d agree in some cases, but then you get weird debouncing code in your update date function, because no one wanted to change the onClick function name in 38 different places when it started to do more.


If you don't have refactoring that can rename functions across your code base, now you have two problems


In 16 years as a developer, I have never seen this beyond the scope of a single project/language. Once things cross boundaries it stops.


I've hit this many times.

Microservices and JSON APIs.

Common library code (eg. common session handling) incorporated into dozens of microservices.

Monorepos (which actually make this problem somewhat easier).


Even within those bounds, I found it doesn't always work terribly well (depending on the language, tooling and project complexity)


What boundary would be involved in the handleClick example? Or are you speaking more generally about renames across API contracts?


Here is a great example that came up last week. Almost exactly like the OP. (It is even an onclick!)

I have a C# .NET MVC web project. All of the JS, .cshtml templates and C# controllers are in the same Visual Studio solution. So it has complete vision into the code.

So the boundary is from cshtml template to javascript function and then another one from the js function (ajax call) to C# controller.

Visual Studio has complete vision, but loses the connection at each boundary, so IntelliSense says that the C# controller function is never called.

I am not criticizing Visual Studio, it is a great tool, but this is a hard problem.


onClick => onAccept => submitRequest

The downside is lots of little functions, the plus side is smaller behaviors. (Ie avoiding 5 levels of nesting in a single function.)


This sort of thinking leads to codebases that are too abstract, where it’s hard to follow the flow of execution.

There’s nothing wrong with 5 levels of nesting. It’s much easier to follow than 5 separate functions.


Depends on the size of the function. If these are literal one liners: don't break them up. When they are shared between multiple pieces it's more useful. Ie: submit request has multiple callers.


the benefit for handleClick in a react codebase at least is, aside from convention, generally there will only be one click action per component and then you can see right when you come into the component - hey there is a click somewhere in this component.

In that case what I find preferable is onClick => handleClick()

handleClick = () => { /* stuff in handleClick */ updateDate(); }

on edit: I see lots of others made same point one node lower.


I completely agree with this, but I additionally advocate for function/method names to be formed verb-object. Plus IMHO active, descriptive verbs are best, instead of 'do', 'handle', or 'manage'. As others have mentioned, handle click might be appropriate as a shim/landing point for satisfying an interface. Logic is best when focused without unnecessary dependencies/side effects. The best naming comes along with a strong design and architecture, when all the pieces can justify their existence and boil down to the essence of what the system is trying to accomplish.


These are both pretty bad if used non-locally across files. Maybe it's a bad habit from Java, but verbosity helps a lot,

updateDateCallback()

updateDateFromClick()

These show how it's used and what it's doing.

(actually, "up-date date" kind of irks me,)

handleDateChangeClick()

dateChangeModalClickEvent()

And what does this thing actually do? Is it updating data, or making the update UI visible? It's kind of unclear.

If you're showing/hiding UI rather than performing the model update,

showChangeDate[Field/Modal/...]()

beginChangeDateFlow()

...

Code should read kind of like a book.


Agreed, and something to add in the explanation of what to do is the "why."

When someone (or you) is reading the code later, you see a method call and know exactly what it does. This prevents you from having to click in or find it just to figure out what it does.

Frequently, if you click into a vaguely named method, it doesn't do the thing you were searching for and you wasted a little time and need to backtrack, only to repeat the cycle with other vaguely named methods.


But how it's used is often inseparable from what it does. And in UI code it's nice to have all user actions called OnXxx or some convention.


This example is not good for talking about naming as there is no proper design which would have a function like this. In any proper design there would always be a separate function for each of the concerns.


That is all well and good, until you have many components all needing to handle clicks. Then it is better to do a handleClick abstraction and put the updateDate inside it.


As a non native english speaker I have a recurring problem with this:

> getInstances(); // ok, returns an array of instances

> getInstanceId(); // ok, returns an id

> getInstancesId(); // ??, returns an array of id

> getInstancesIds(); // ?? returns an array of id

> getIdsOfInstances(); // ?? returns an array of id

> getAllInstancesId(); // ?? returns an array of id

> getAllInstancesIds(); // ?? returns an array of id

How do I conjugate that ? And what about the possessive `s` ?


As a native English speaker (and former English teacher), hope this helps:

> getInstances(); // ok, returns an array of instances

Correct.

> getInstanceId(); // ok, returns an id

Correct.

> getInstancesId(); // ??, returns an array of id

Unusual, but would return a single ID referring to a collection of instances.

> getInstancesIds(); // ?? returns an array of id

Correct.

Though more common would be getInstanceIds() since you don't need to pluralize twice. E.g. we don't say "blues shoes", we say "blue shoes", so these aren't "instances ID's" but "instance ID's".

However, in naming functions or variables you'll sometimes see plurals repeated for clarity. E.g. getInstanceIds() is ambiguous because it could refer to the ID's of multiple instances, or multiple ID's of the same instance. While getInstancesIds() makes it clearer that it's the ID's of multiple instances, even though it's not actually grammatically correct.

> getIdsOfInstances(); // ?? returns an array of id

Same as getInstanceIds(). Both are correct, though getInstanceIds() would probably be more common -- most people wouldn't put in the extra word "of" simply because it's an extra word, but sometimes it might be clearer for consistency with other function names.

> getAllInstancesId(); // ?? returns an array of id

It would mean the ID for the collection of all instances, though that would seem very unusual. Perhaps if there were a global account ID for a service or API you used.

> getAllInstancesIds(); // ?? returns an array of id

Same as getInstanceIds(), but "All" presumably means it doesn't include a parameter to filter -- so I'd assume this implied the existence of another function such as getCollectionInstanceIds() or similar that it was contrasted against.

> How do I conjugate that ? And what about the possessive `s`?

Obviously you can't or shouldn't use apostrophes in variable names, so people generally try to avoid the possessive s. That's why we generally change nouns to noun adjectives -- instead of "computer's ID" (computer = noun) we use "computer ID" (computer = noun adjective).


> Though more common would be getInstanceIds() since you don't need to pluralize twice. E.g. we don't say "blues shoes", we say "blue shoes", so these aren't "instances ID's" but "instance ID's".

Completely agree with all your points (as a native English speaker who also minored in English)

However, for the sake of clarity (since we're trying to disambiguate between places the 's' could indicate possession vs. plurality) "instance IDs" should not have an apostrophe.

The only time an apostrophe goes in a non-possessive plural is in a contraction. E.g. "the '90s"

Then we also have fun words that probably should never be used which get an apostrophe on both ends (contraction at beginning + possessive plural)

e.g. " ... the '60s' countercultural attitudes ..."

But for clarity it might be better to go with

"... the countercultural attitudes of the '60s ..."


That's a hotly debated issue that different style guides give different answers to.

Some style guides insist plural all-capital abbreviations should never have an apostrophe ("IDs"), others insist they should ("ID's").

Your personal usage is absolutely valid, but it should be recognized as opinion and not fact. "ID's" is equally valid in modern usage.

Edit: You can read an extensive discussion of it here:

https://en.wikipedia.org/wiki/Acronym#Representing_plurals_a...


Interesting. I've seen the apostrophe for acronyms which involve periods: e.g. "C.D.'s" (NYT has an article about this being their preferred style)

I've seen it for one-letter words where you don't want to switch case: e.g. "dot your i's and cross your t's"

But I haven't seen it in upper-cased abbreviations followed by a lower case plural 's'. Or if I have, I've assumed it's wrong. Would love it if you can refer me to a modern style guide which doesn't recommend dropping the `'` in this usage, as my own background has drilled into me that "IDs" is preferred for clarity.


I just added a link to a Wikipedia discussion in my comment above, which should answer your question. It gives an example:

> ...whereas The New York Times Manual of Style and Usage requires an apostrophe when pluralizing all abbreviations regardless of periods (preferring "PC's, TV's and VCR's").


I would make a case that in our field, one which has an abundance of initialisms and non-period-separated acronyms, we shouldn't shadow the possessive. Consider "this script retrieves JSONs from all API URLs and validates them against their respective schemas" and "this function extracts the URL's fragment, if any, and checks it against a set of page targets". If you use an apostrophetic plural, does the latter receive one URL, or a collection of them?

URL was carefully chosen, since you can't make a consistent ruling about initialism vs. acronym, I've worked with people who pronounce it "you are ell" and people who say "earl", and sometimes both! Absolutely no one is going to start spelling it U.R.L., either.

Especially given the abundance of non-native speakers, let's not add a peculiar and ambiguous edge case to one of the more frustrating rules of English grammar.


I'm intrigued, but I'll admit entirely confused, by your comment.

First, I don't know what it means to "shadow" the possessive. Do you mean we shouldn't hide it? Although it always has an apostrophe so we're never hiding it.

Second, I don't understand your examples. Your second example clearly necessarily uses the possessive, while the first is a plural, but the first would clearly be plural even with an apostrophe:

> "this script retrieves JSON's from all API URL's"

And since a possessive necessarily precedes another noun in this type of construction, it would still be clear:

> "this script retrieves JSON's from all of the API's URL's"

It's clear in speech and clear as written, even though the "'s" serves two grammatical purposes here.


> First, I don't know what it means to "shadow" the possessive.

“Adopt a plural form identical in form to the possessive so as to create avoidable ambiguity between the two.”

(Incidentally, that’s why the Times shouldn’t do it either, even if context will usually disambiguate it.)

> And since a possessive necessarily precedes another noun

Not necessarily a simple noun, though, it could be a noun phrase. In practice, this will usually be disambiguated by context still, but merging the possessive and plural means that there will be more situations where you have to process more context to disambiguate meaning, which distracts from whatever your purpose was in reading the material in the first place.

Reducing the inherent redundancy in natural language reduces quality and clarity of communication.


One thing that getInstancesIds() might mean is "get the (multiple) ids for each of multiple instances".

I don't know why an instance would have more than one id, probably more plausible with different nouns, but a similarly named function might return an array of arrays, or a flattened equivalent, possibly uniqued.

Probably a bad idea though. I have been guilty (though pleasurably, pridefully guilty) of abusing plurals -- day, days, dayses etc. If you can let the types do the talking that's obviously better, and more descriptive (and less eccentric) names can help a lot with communication and clarity, but there is also sometimes a place for brevity, and for levity.
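
To make "let the types do the talking" concrete, here is a small TypeScript sketch. The Instance shape with an `aliases` field and the name getIdsPerInstance are assumptions invented for the example; the return types, not the plurals, carry the distinction between the three readings.

```typescript
type InstanceId = string;

// Hypothetical shape: an instance with a primary id plus extra ids.
interface Instance {
  id: InstanceId;
  aliases: InstanceId[];
}

// One id per instance: the return type says "flat list".
function getInstanceIds(instances: Instance[]): InstanceId[] {
  return instances.map((instance) => instance.id);
}

// Several ids for a single instance.
function getAllIdsOfInstance(instance: Instance): InstanceId[] {
  return [instance.id, ...instance.aliases];
}

// Several ids for each of several instances: an array of arrays.
function getIdsPerInstance(instances: Instance[]): InstanceId[][] {
  return instances.map((instance) => getAllIdsOfInstance(instance));
}
```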


I have never understood the practice of using a verb phrase ("get") for functions with no side-effects. To me it makes more sense to name the function after the result, i.e. Instances, InstanceIds, AllInstanceIds etc.


Performing I/O is a side effect. Accessing an external database is a side effect. Using a verb to imply external access to "get" some resource is appropriate in this instance.


Like others have said, I might use "Get" to signify that this is a complex operation. Although I prefer more descriptive words like "Fetch" or "Find".

So FetchInstances() is expensive (each time it's called) but the property Instances is cheaper or only expensive on the first access.
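
A sketch of that distinction in TypeScript, with a hypothetical InstanceRegistry class and an injected loader standing in for whatever the expensive call really is (a database query, an HTTP request). Names follow TypeScript casing, so FetchInstances() becomes fetchInstances().

```typescript
// Hypothetical data source; a real one might hit a database or an HTTP API.
type Loader = () => string[];

class InstanceRegistry {
  private readonly load: Loader;
  private cached: string[] | null = null;

  constructor(load: Loader) {
    this.load = load;
  }

  // Verb name: signals that every call does the expensive work again.
  fetchInstances(): string[] {
    return this.load();
  }

  // Noun-like property: expensive at most once, then served from the cache.
  get instances(): string[] {
    if (this.cached === null) {
      this.cached = this.load();
    }
    return this.cached;
  }
}
```

A caller who wants stable values reads `registry.instances`; a caller who wants fresh data calls `registry.fetchInstances()` and keeps the result in their own variable.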


But then the name of the function is somewhat tied to its inner workings. What if you change the implementation so that it returns a cached value? To quote Michael Caine, an interface should "Be like a duck. Calm on the surface, but always paddling like the dickens underneath."


> But then the name of the function is somewhat tied to its inner workings.

Not necessarily tied to its inner workings, but more tied to how you should use it. There are good reasons that a function might not cache its value -- maybe because you want that fresh data each time. The name is the signal to the user of how they should use it. I expect an "Instances" property to give me the same values each time. However with "FetchInstances()" I would store that result in my own variable if I want to keep referring to those instances.

> What if you change the implementation so that it returns a cached value?

That's a different function then. The semantics have changed significantly.


The advice I once got and since internalized is to reserve 'getX' to give a hint that something happens besides returning 'X', lazy initialization is a common use case.


I think of ensureX as getting x whilst initializing it if needed
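
One way to read ensureX, sketched in TypeScript. The Config shape, the default locale and the `initCount` counter are all hypothetical; the counter exists only to make the one-time initialization observable.

```typescript
// Hypothetical configuration object.
interface Config {
  locale: string;
}

let config: Config | undefined;
let initCount = 0; // only here to show that initialization runs once

// "ensure" promises a value: initializes on first use, then reuses it.
function ensureConfig(): Config {
  if (config === undefined) {
    initCount += 1;
    config = { locale: "en" }; // made-up default
  }
  return config;
}
```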


getInstancesId() would usually mean a singular id corresponding to the group of instances.

Only ones with an "s" at the end of the id should return an array if you have the existing "getInstanceId". Either that, or invert that function to be getIdOfInstance and subsequently getAllIdsOfInstance for the array version.

Generally speaking, don't use the possessive "s"; it is implied by being next to it. Also, I usually use "instance" in this case more like an adjective.

Edit: usually I think of it as (current subject) verb adjective object.

So server get instance IDs becomes getInstanceIds() or zoo clean active dogs becomes cleanActiveDogs()


getInstanceIds


Thanks everyone, much appreciated :).


I find that I am remarkably internally consistent with how I name things. I sometimes forget about a filename, class or function that I wrote months ago (that's another problem in itself). But when I go to name the new thing I am about to write I realize that it already exists. I then pat myself on the back.


I wish that was me. If I'm not vigilant about it, I always end up choosing a synonym that I didn't previously use (e.g. pressed vs clicked vs activated).


My process:

(1) How would I explain face-to-face what this code does to another developer?

(2) Use the key words from that explanation to make a name as short as possible without losing clarity.

Example:

Step 1: This class handles all the navigation work for the sign-up flow in our mobile app.

Step 2: SignUpFlowNavigator


In this video [1] the developer uses the prefix *old* where I normally used *prev*. Today I renamed all those "previous in time" to "old" for distinguishing them from the previous or next of a list.

[1] https://youtu.be/AQ0lcm2_skI?t=319


This reminds me of this old quote from qotd: We had fairly random naming for our servers, so I proposed we named them after planets - and the admin responded with "great idea, you mean like planet01, planet02, planet03.."

It's a funny story, but naming means a lot when you are developing, or building systems or processes that require a name or an identifier.

Also, don't be afraid of changing the name if it no longer reflects its purpose.


I would be interested if any task-performance based studies had been run with different naming schemes and enough developers to see which naming conventions worked best. Some simple job like fixing a bug or a simple refactoring, judged on time to completion and correctness.


That would be great empirical data to see. However, from anecdotal data across almost two decades, it is extremely obvious to me that naming is second only to application architecture in its impact on the ease, difficulty, and risk of updating the code base in the future.


This seems such a weird study. Names are important, but not so much as what is being named. In one sense, names become easy given a properly abstracted design. Further, it’s not names so much that are paramount to program comprehension, but more the quality of the design. With a bad design that doesn’t properly separate concerns it won’t be possible to come up with good names because functions/data structures will represent multiple concerns. Any time names are controversial it probably indicates a design problem; even for unique/complex abstractions, there is usually a name which everyone agrees is the perfect name for a thing.


Say what you want, but it's better than how mathematicians choose names imho.


In defense of mathematicians, having conventional single letter variable makes stuff a lot easier to read.

For example in graph algorithms, I used to write descriptive names like:

  weight = adjacencyMatrix[node][neighbor]
Nowadays I write it as:

  w = G[u][v]
Because in graph theory, an edge is usually (u, v), weight is usually w, graph is usually G, etc.

It's basically the same reason why people don't name their indices "index" and instead use "i", "j", "k". And instead of point.first and point.second, you use point.x and point.y.

Naming conventions are awesome when people can agree on them.
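
As a rough illustration of how those conventions read in a complete function, here is a minimal Python sketch of Dijkstra's algorithm. The choice of algorithm, the adjacency-dict layout `G[u][v] = w`, the function name `shortest_distances`, and the sample graph are all assumptions made for this example.

```python
import heapq


def shortest_distances(G, s):
    """Dijkstra over an adjacency dict G[u][v] = w, starting from s."""
    dist = {s: 0}
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a shorter path to u was already found
        for v, w in G[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


# Small made-up graph: edges a->b (1), a->c (4), b->c (2).
G = {
    "a": {"b": 1, "c": 4},
    "b": {"c": 2},
    "c": {},
}
dist = shortest_distances(G, "a")
```

Anyone who has seen the textbook version can map `G`, `u`, `v`, `w` and `dist` straight back to it, which is the argument for the short names.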


When writing code that is mathematical, I firmly believe "descriptive" names are an anti-pattern. To figure out what anything complex is doing, I'm going to have to write the equations on paper anyway, and a matrix called G instead of adjacencyMatrix speeds that process along greatly. Even better, if the variable names are sufficiently short and mathy, I may not need to translate back to paper, since the equations become more recognizable in place.

This zealotry a lot of devs have over descriptive names needs boundaries.


In maths, you usually write "where x is ..., y is ... etc." after equations where what the symbols represent isn't abundantly obvious anyway.

Likewise, when programming, if I reuse some variable very often in a short space, I will temporarily rename it to something single-lettered, e.g. "s = volumeScalingFactor; x *= s; y *= s; z *= s".

I think maths style vs. programming style is only one factor here, with the other simply being this application of DRY: describe one-off things well, but shorten their names if they're oft-repeated within a given context.

Maths tends to reuse the same variable for a lot of different operations in a single context whereas programming doesn't as much, which naturally leads to this single-letter name vs. long name difference in styles.


The single letters are pretty bad, but using a person's name instead of a concept to describe something is where it gets truly awful. What the heck is an Eigen vector? Gaussian curve? Cartesian coordinates? Hilbert space? Julia set? Mersenne prime? Zero chance of intuiting any of that.


Just to quickly clarify: eigenvectors are not named after a person. It's a German word meaning "own" or "inherent" and was meant to denote something which resembled itself.

https://hsm.stackexchange.com/questions/5563/where-does-the-...

https://jeff560.tripod.com/e.html


Math needs way more names than English can support and since they are used by humans they lack namespaces. Human names are both easy for humans to speak and plentiful enough to satisfy the need for names in math, I don't see a better solution.


The following article has influenced my naming choices ever since I first read it:

https://www.joelonsoftware.com/2005/05/11/making-wrong-code-...


It's interesting how programmers tend to reinvent the idea of type systems after enough experience. The us_ prefix is a manually checked Us(String) type. Fortunately in modern languages we can express that at the function signature level and we don't need to spend "three weeks" training our eyes!
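
A minimal Python sketch of that idea, under some assumptions: the class names `UnsafeString` and `SafeString` and the functions `encode` and `write` are invented here for illustration. Python does not enforce type hints at runtime, so the sketch adds an explicit `isinstance` check; a static checker would catch the same mistake from the signatures alone.

```python
import html
from dataclasses import dataclass


@dataclass(frozen=True)
class UnsafeString:
    """Raw user input; plays the role of the us_ prefix."""
    value: str


@dataclass(frozen=True)
class SafeString:
    """HTML-escaped text; only encode() below is meant to create it."""
    value: str


def encode(us: UnsafeString) -> SafeString:
    return SafeString(html.escape(us.value))


def write(s: SafeString) -> str:
    # Rejects anything that has not gone through encode().
    if not isinstance(s, SafeString):
        raise TypeError("write() expects a SafeString")
    return s.value


name = UnsafeString("<script>alert(1)</script>")
rendered = write(encode(name))
```

Calling `write(name)` directly raises a `TypeError`, so the mistake that the prefix convention asks a reviewer to spot by eye is rejected by the code itself.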


Relying on naming discipline as opposed to the more reliable features available in a language with any of (1) static typing, or (2) class-based OOP, or (3) prototypical OOP, seems...unnecessarily error-prone.


> Based on the results of this experiment we developed a 3-step model of how names are constructed: (1) selecting the concepts to include in the name, (2) choosing the words to represent each concept, and (3) creating a name from these words.

This seems flawed, when I look to name something, one of the most key things is what other things are named in that context, and patterns and consistency between them.

It appears the study described a problem, instead of putting them in preexisting code?


This is actually the way I name important things. There's even a 4th step: Choose only a few of the important or distinguishing concepts to include in the name. For example, in a database model you end up with x_y_z_name_id because of relations on relations; that might get reduced to y_z_name or x_z_name, dropping the part that is of widest/implicit context.


Says in the paper they had 337 subjects, so we'll have to match them with 337 irrelevant anecdotes so that we too can be scientists


you think that's bad? this paper

http://pdinda.org/Papers/ipdps18.pdf

surveyed ~150 and concluded therefrom. the more hilarious thing being that they drew conclusions completely unscientifically, i.e. by just interpreting vague plots (without performing t-tests or anything like that).


“But choosing good meaningful names is hard. We perform a sequence of experiments in which a total of *334 subjects* are required to choose names in given programming scenarios.”


This is one of the most interesting artifacts of programming, and it's probably impossible to explain to people outside the field. How at the same time it's trivial and important, and a constant cause of neuroticism.

It takes up far too much thought-space, to the point that I would gladly follow some ugly set of conventions merely to never have to think about it.


Sure, following some guidelines and standards is helpful. But never having to think about naming is almost like never having to think about logic. What you name things dictates how future developers think about them. Just this week we had a feature that could belong in an existing module or it could require a new module all depending on how we define the domain and therefore the name of the existing module.


"is almost like never having to think about logic."

Not really, because naming is 98% convention when it's done right.

The challenge is establishing all the conventions for a project, and then sticking to them.

If there are really good conventions in place, well known, then naming is considerably easier.

In fact, even if the conventions are not super good - they can be powerful when they are very well adhered to.

Example: recently broke a rule and decided to use a known naming anti-pattern by suffixing variable names with the system type. This is normally not good. But within this module, the meta typing was ambiguous - by adding the suffix, the code was magically more clear. That little convention, very easy to apply, solved a clarity problem far more so than any issues around what functions should be called. So we used it for the module and that module only.

Apple has some pretty hardcore naming conventions that I don't really like, but what's more important than whether they are good or not is that they are very consistently applied. In other words, there is a lot less to think about.


Research like this could lead to IDEs suggesting names for some of our methods. Exciting!


Given how much it freaks the IDE out I can't imagine writing a method before naming it.


very badly, sometimes randomly, magic 8 ball

would not be surprised if some dev somewhere considered naming his/her firstborn "/tmp/first_born" until coming up with a better name

we can also add mathematicians and physicists to the group of people that are exceptionally bad at naming stuff


Primo is an actual name in Italian; the others have fallen (more) out of fashion, but certainly Quintus, Septimus and so on used to be perfectly reasonable Latin names.


sharing a relevant bit of research I did collecting naming practices across a bunch of different languages and frameworks: https://www.swyx.io/how-to-name-things/


Variable names are alright, available domain names are a different beast


The more shared a name is, the greater the pain in selecting it. Given domain names are worldwide, no wonder then.


I thought you just added "ly" and called it a day?


You watch your program buzz,

But don't know what anything does.

How do you name things so they're obvious to the eye?

Descriptively,

descriptively,

descriptive... L-Y!


For those unfamiliar with the reference, Tom Lehrer did several songs for the educational TV program The Electric Company, including one on the use of the suffix -ly to produce adverbs: https://www.youtube.com/watch?v=dB2Ff8H7oVo


Naming things - one of the four hardest problems in computing (the others are sorting and off-by-one errors) :)


Looks like you need to invalidate your cache of hardest problems in computing ;)


Proper caching makes any problem 10x as hard.


Twice as hard is an underestimate imo.


Twice as hard? Or is it three times?

Like they say, there are 10 kinds of people, those who understand ternary, those who don't, and those who mistake it for binary.


Don't forget time comparison.

Your unit test of a function comparing dates relies on server time and usually fails if run around 12pm. Whatever. Nobody uses this app around lunch time anyway. Just tell your teammates to ignore that. Wait, why is the test failing every time today?! Did we just switch to DST? Uh oh...


Funny related story: For an internship, I worked on an app that would show your sleep data in a nice visualization (this was for a smartwatch). I implemented a basic version, and used it for a few weeks. It seemed to work fine for me, so I released a beta version for others in the company to dogfood.

Within a day, I got a bug report from a coworker that it rendered incorrectly for them. I was pretty surprised, since I had been using it for long enough that I would have expected to encounter all the edge cases.

I dug into the code, and realized that this bug only happened if you went to sleep before midnight, which I never did...


Funnily enough, I recently had to help with an issue where a company's payment screen was not loading correctly and instead displaying "Invalid session". They use a third-party vendor to handle payments and display their payment form in an iFrame on their website.

There is an initial request to create a session that has to pass the desired expiration time of the session. Unfortunately, the vendor requires the time to be in Eastern Time. The poor, naive soul that originally implemented this just got the current date, added 15 minutes, and converted it to "EST". As soon as daylight savings hit a few weeks ago, the expiration time was automatically being set to 45 minutes in the past, the vendor was responding with "Invalid Session" and the company was unable to take payments from customers.
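
The arithmetic behind the "45 minutes in the past" can be reproduced with fixed UTC offsets. This is only a guess at the mechanics, not the vendor's real API: the variable names and the sample date are assumptions, and the sketch assumes the vendor reads the submitted wall-clock time as Eastern local time.

```python
from datetime import datetime, timedelta, timezone

EST = timezone(timedelta(hours=-5), "EST")  # fixed offset, never observes DST
EDT = timezone(timedelta(hours=-4), "EDT")  # Eastern offset while DST is in effect

# An arbitrary instant during daylight saving time (date chosen for illustration).
now = datetime(2021, 3, 20, 16, 0, tzinfo=timezone.utc)

# What the buggy code sent: now + 15 minutes rendered as an "EST" wall-clock time.
sent_wall_clock = (now + timedelta(minutes=15)).astimezone(EST).replace(tzinfo=None)

# How the vendor is assumed to read it: as Eastern local time, which is EDT here.
vendor_reading = sent_wall_clock.replace(tzinfo=EDT)

minutes_until_expiry = (vendor_reading - now) / timedelta(minutes=1)
```

With these inputs `minutes_until_expiry` comes out as -45.0: the 15-minute lifetime minus the one-hour gap between EST and EDT. Converting to a named zone with DST rules, rather than a fixed "EST" offset, would avoid the gap.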


“Our JavaScript date library and the date library we’re using in Clojure disagree on the first day of the week. Elasticsearch disagrees with both too”


I recently came across a 15 year old bug in qt involving both time comparison _and_ cache invalidation: the code was trying to set timers to fire when the next cache entry was set to expire, but the timeout was in seconds while the eviction code compared milliseconds. The result was 100% cpu usage for up to a second as the timer spun without evicting the cache entry.


You can avoid this problem by always passing in either a time or a time provider and never just relying on Time.now or whatever it is your language provides.
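
A minimal Python sketch of passing the time in, assuming a hypothetical `is_expired` function and made-up timestamps. The function never reads the system clock itself, so a test can pin `now` to any instant, including ones around noon, midnight, or a DST switch.

```python
from datetime import datetime, timedelta, timezone


def is_expired(expires_at, now):
    """Return True once `now` has reached `expires_at`.

    `now` is always supplied by the caller; production code passes the
    real clock reading and tests pass a fixed value.
    """
    return now >= expires_at


expires_at = datetime(2021, 4, 1, 12, 0, tzinfo=timezone.utc)
before = is_expired(expires_at, now=expires_at - timedelta(seconds=1))
after = is_expired(expires_at, now=expires_at + timedelta(seconds=1))
```

In production the caller would pass something like `datetime.now(timezone.utc)`; the test results no longer depend on when the suite happens to run.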


This. Nearly every date/time library comes with this sort of attractive nuisance that is just waiting for a chance to ruin your day.


This has to come with a trigger warning.


Sorting actually isn't one, at least anymore. Exactly-once and in-order delivery, although part of 'distributed' computing, should be, since most computing is distributed, right down to our multicore processing hardware.


It is - do your sorting, then show the results to stakeholders and you’ll see they all want different, mutually conflicting sort orders.


Discussions about naming schemes make me think of the old joke "until I got married, I had no idea there was a wrong way to boil water" or other variations of banal tasks :)

Perhaps there is an answer lurking out there in the universe, but it feels more likely it'll only be invalid pointer Aborted (core dumped)


five... naming, sorting, of by one concurrency errors and.


I see a transaction rolled back, and the autoincrement id column no longer matches the row count.


Well, the app is broken now because we relied on the autoincrement id column never having a missing value and didn't handle the error when it has one missing.


Honorary mention: Copy/Pass by reference or value


oh man; they should have selected 42 or 1337 study subjects!


How about numbering? I always used to get a sideways glance back in the day (before we all started using uuids) when starting the number for a server group at 0. You’ll only have 9 servers in the first group and 10 thereafter if you start at 1! Yet some people insisted.



