Do I want 'filename', which is whatever was given?
Do I want 'file', which is the file object?
Do I want 'name', which is just the name of the file, with directories, if any, stripped off?
Do I want 'files', which is the list of files, as given?
If copying/moving is going on, then there's a source and a destination and it gets even worse.
The obvious way out is to refactor by renaming those variables, but 'unsafe_filename_as_given_by_user' or 'list_of_files_as_given' doesn't quite roll off the keyboard.
That's absolutely the way to do it.
"unsafe_filename_as_given_by_user" is clear, precise and a much better name than just "filename" when used in a large scope. It's a stepping stone to better code.
The length and unwieldy nature of the specifically named variable is a code smell, but it's a code smell indicating a different problem this time - the problem is that the variable's scope and/or context is too large.
When I set out to refactor a large code base I almost always increase the variable name length to begin with to prevent name conflation, ending up with the 'unsafe_filename_as_given_by_user' kind of name and then bring it down again later when short variable names in a short scope are precise enough.
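A minimal Python sketch of that lengthen-then-shorten move, as I understand it. The function names, the base-directory check and `count_lines` are my own illustration, not anything from the code base being discussed: the long name lives only at the boundary where the ambiguity exists, and once the value is checked, a short name in a short scope is precise enough.

```python
from pathlib import Path


def resolve_user_path(unsafe_filename_as_given_by_user: str, base_dir: Path) -> Path:
    """Boundary: the only place where the unchecked, user-supplied string exists."""
    root = base_dir.resolve()
    candidate = (root / unsafe_filename_as_given_by_user).resolve()
    # Reject anything that resolves to a location outside the base directory.
    if root not in (candidate, *candidate.parents):
        raise ValueError("path escapes the base directory")
    return candidate


def count_lines(path: Path) -> int:
    """Short scope: 'path' and 'f' cannot be confused with anything else here."""
    with path.open() as f:
        return sum(1 for _ in f)
```

Whether the check belongs in one function or several is a separate question; the point is only that the unwieldy name never leaves the small scope where it is needed.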
You only need to type unsafe_file_name_from_user once. After that, even the simplest language-agnostic autocompletion of identifiers from the current buffer goes a long way. A proper IDE makes it even simpler.
DescriptivelyNamedFoo = new DescriptivelyNamedFoo();
I simply don't accept your priors.
Or the situation with the file names, as someone else mentioned at more length.
I've come across that rule (Gerrand's rule?) which sums up my personal experience pretty well:
> The greater the distance between a name's declaration and its uses, the longer the name should be.
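Roughly how I read that rule in practice, as a hypothetical Python sketch (the constant and both functions are invented for illustration):

```python
# Declared once at module level and used far away, so the name carries its context.
MAX_UPLOAD_SIZE_IN_BYTES = 10 * 1024 * 1024


def total_size(sizes):
    total = 0
    # 'n' is declared and used within two lines; a longer name would add nothing.
    for n in sizes:
        total += n
    return total


def oversized_uploads(upload_sizes_by_filename):
    # The constant is used a long way from its declaration and still reads clearly.
    return sorted(
        name
        for name, size in upload_sizes_by_filename.items()
        if size > MAX_UPLOAD_SIZE_IN_BYTES
    )
```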
I often notice that I can remove a comment line after renaming a few identifiers in the code below it, because the code starts to read self-explanatory enough.
I've found that people who believe too much in the ability for identifiers to tell the story of the code factor too much stuff into separate non-reused functions. The result being that you can't take the name at face value; you need to push the current context onto your mental stack, drill into the identifier being called, and if necessary continue drilling, until the unstated side-effects and extra bits and bobs that always accrete in continuously maintained code become clear. This code is ironically harder to read than inlined code.
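A contrived Python sketch of what I mean, with made-up names: the helper's name promises a pure check, but the body has picked up a side effect that the caller now silently depends on.

```python
def is_valid(order: dict) -> bool:
    # The name suggests a read-only check, yet this line mutates the argument.
    order.setdefault("currency", "USD")
    return order.get("quantity", 0) > 0


def submit(order: dict) -> str:
    if not is_valid(order):
        return "rejected"
    # Only works because is_valid() filled in the currency as a side effect.
    return f"accepted in {order['currency']}"
```

Reading `submit` alone, you cannot tell where `currency` comes from without drilling into `is_valid`.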
Careful with that. I'm of the mind that the code author is incapable of describing how readable any codebase is. It's too easy to mistake your innate understanding of your own creation for readability.
Descriptive identifiers often make code read like plain English.
Opening a file, i.e. interacting with the filesystem, is a totally different thing than writing out stuff, so I try to separate these tasks into different procedures. It's not a big problem if "file" might actually mean "filepath of the file in the filesystem" in one place, and "file handle" in another.
In your example, you're dealing with a file system, so the file is the entity and everything else is just a property of that.
With regards to source and destination, I find that people tend to struggle with this, so instead I tend to think about the entity in question (in this case a file) and the target entity (eg a remote file system).
You don't need to add more words to make it obvious. Just be concise.
- it takes effort to think of a good name. And more of the linguistically creative type instead of the logical reasoning type which makes up most of the rest of programming.
- it takes domain expertise to pick a good name
- if you stare too long at the same piece of code it becomes harder to see it through the eyes of someone who has never seen that code before
- almost everybody uses English in their programs, but many people are not native speakers
But I think making an actual effort to pick a good name already goes a long way. It forces you to think about the problem at hand and how you could communicate that to somebody else.
The trick is being systematic. And consistent in the small. Short names and (at first) non-telling names are not a problem. It makes no sense to use (predominantly) very long descriptive names, not because they are harder to type, but because a long name is an indication that it's not an established concept, not a recurring theme across the code base.
So my habit has become to just choose a one-word name for this vague but probably important concept I have, even if it's not the best name. I put a comment on the datatype declaration, explaining in detail the meaning of that thing. Maybe later I will come up with a better name, so I will just switch to that name. Or the comment gets improved. It's about growing clear concepts. After some time that single-word name will be a natural thing to use for that concept.
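Something like this, as a hypothetical Python sketch (the `Spec` type, its fields and the default are invented for illustration): the one-word name stays short, and the comment on the declaration carries the definition until the concept settles.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_RETRIES = 3


@dataclass(frozen=True)
class Spec:
    """What the caller asked for, before any defaults are filled in.

    A Spec may be incomplete and is not validated. If a better word for
    this concept turns up later, the type is renamed and this comment
    shrinks or goes away.
    """

    target: str
    retries: Optional[int] = None


def effective_retries(spec: Spec) -> int:
    # An explicit 0 is a real request, so test for None rather than truthiness.
    return DEFAULT_RETRIES if spec.retries is None else spec.retries
```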
If you can have a clear understanding of the important concept (instead of a vague understanding) that can be reflected in a good name.
Good name meaning one that is descriptive, unambiguous, and reveals a crisp, not vague, concept.
Then with that good name, when you write the code the first time, the clarity of the concept will help you avoid logic pitfalls and messy constructs.
Not to say comments are always bad, but probably a comment won't be needed, because the name itself will reveal the intent.
Then later, when someone does come and edit the code, instead of having to come up with a better name, they can focus on whatever they came there for, and they can do so with a better understanding of each variable and what is going on.
The difference between the two approaches is really about where you spend your extra time related to naming. In my approach, you spend a little extra time up front. But it begins paying dividends immediately, as you continue writing the very first version of the code; it is easier to write clearly and correctly.
In your approach, you spend less time up front, and pour the code out faster, which might seem like a great thing. You also see an immediate payoff, because your process is fast. But then you (or someone who comes behind you) has to pay a heavy bill of technical debt.
When I read your comment I read it as a description of what your current practice is, not something that has been thought through a lot. Growing clear concepts is a great way to think about it, so that's a good line to continue with imho. I just think that part can and should be done up front. Sometimes it's worth a short conversation with another developer to see if you can agree on a good definition reflected in a clear name.
But you're also right that sometimes single word names work fine. And sometimes concepts are just clear and don't need much thought. Obviously those ones we can agree on. I'm more focused on the more difficult, or more vague ones, as a place where I think up-front quality naming is better than having the "fix-it-later" approach.
In these cases I err on the side of short names (preferably non-composite) because I know that making a longer name will not help. I just use that new name a little in a more exploratory type of programming. When it sticks it was a good choice. Examples: tree, spec, coverage, identity, isomorphism, object, struct, key, value, tag... These names need context, but they can turn out to be good names because they are short and distinct.
When the name doesn't stick and I can't come up with a better one, maybe I should do something entirely different.
This actually isn't too bad provided the loop's scope is small and it is obvious from context what it is doing.
It's variable names which could have a variety of different meanings used in larger contexts and scopes - that's when maintainer confusion really sets in.
My AccountingRow is something the user sees, and there's also a SummaryAccountingRow with similarities; both of these need SQL generated, so there's an AccountingRowSQLGenerator and an AccountingRowSummarySQLGenerator. These generally share some of the same business logic around how to join different tables, so there's an AccountingRowCommonQueries class. We also need to spit this out in JSON, so there's an AccountingRowSerializer and AccountingRowsSerializer and AccountingRowSummarySerializer, AccountingRowSummaryRowsSerializer. An AccountingRow also has an AccountingRowBehavior, and AccountRowAdder, and onward into infinity.
Now you have name soup, and if you are looking at this system for the first time your eyes glaze over.
Clearly, the system needs some refactoring - perhaps there's a pattern that can be extracted. Hopefully, though, my point came across.
There are only two hard things in Computer Science: cache invalidation and naming things.
-- Phil Karlton
There are only two hard problems in distributed systems:
2. Exactly-once delivery
1. Guaranteed order of messages
2. Exactly-once delivery
So yes, it is hard, and rightfully so.
I'm not sure about that. If you don't know about patterns like factory or observer (event listener), and you invent these concepts independently, you'll have a hard time naming it, and your names will be nonstandard. My point being that it's possible to have an elegant solution yet it's still hard to name succinctly.
That sounds confused to me. Those patterns are called patterns because people have practiced them over a long time and, in the course of that, decoupled them nicely. Assigning the right name implicitly means having a clear understanding of scope and responsibility. A self-invented solution may echo the original idea in some aspects, but most of the time it is not nearly as clean.
Plus it lets the talkers and the bullshitters look like they're doing something while they bikeshed over naming things, but for some reason they have to bring in the doers to spectate instead of letting them get things done.
I consider these names to be beautiful and succinct and a lot of thought was put into them.
Plus there are compile-time constants, runtime constants, once (Eiffel) variables...
Naming seems to be the last step of refactoring. Refactoring is the process of identifying and manipulating the semantics of an application; names are human-readable tags that ease the mental task of understanding those semantics.
- As I work with the code, the name becomes obvious.
- The variable/function turns out to disappear during the work.
One more case of how it can be better to decide something later, when you know more.
Author, if you're reading, please fix this. Default blockquote styling is far preferable to what you've done here.
I've written a few Stylish fixes (including that) on desktop.