Is there an excuse for short variable names? (stackexchange.com)
24 points by iProject 908 days ago | 36 comments



I am reminded of a complicated piece of indexing that I had to figure out. I kept on getting confused, reversing things, etc.

Eventually I put a big comment at the top saying that to understand the code you needed to draw the picture, and named my variables x and y. About a page of code later, I was done. When I had to tinker with it a couple of years later, my first reaction was to wonder WTF I had been thinking and to wince at the memory of how I had struggled with it. Then I drew the picture, and I was amazed at how easy it was.

The rule is not that you need long or short variables. You need meaningful ones. A short variable name is inherently ambiguous, which can lead to confusion and mistakes. Thinking through anything with long variable names taxes your working memory, limiting how complex your thoughts can be.

It is a trade-off. Use the right one for your code. For me, index variables are short, and so are any variables that refer to math concepts that I understand well. Otherwise I use (concise if possible) descriptions without any abbreviation, separated by underscores.

-----


A short variable name is not ambiguous when the code is based on some math or physics that has a more or less standardized notation. In this case, what's needed is to connect the variable names to the mathematics, as Rex Kerr pointed out to the OP:

"Once you understand the math, the length of the variable names is irrelevant. Do others a favor and leave a citation (in a comment) to some relevant description of the math, though, if you had to learn it!"

-----


> A short variable name is not ambiguous when the code is based on some math or physics that has a more or less standardized notation.

The problem is that the same notation is often used in different parts of math in different ways. You sometimes need to clarify. (Except in the case of differential geometry - there the notation is so horrible that you should find a different job! I only partly kid...)

Conversely even if you aren't using a standard math notation, meaningless names and a simple labeled diagram can often make something unambiguous in a way that no amount of verbiage in variable names can hope to do.

-----


Variable name length should be inversely proportional to scope visibility. Names in complicated math should also be kept very short. I like to get the best of both worlds by renaming just before the formula:

    var G  = Physics.Gravitational_Constant;
    var M1 = Physics.Mass_of_Earth;
    var M2 = Simulation_Constants.Mass_of_Asteroid; 
    return (G * M1 * M2) / (r*r);
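
The snippet above leaves r undefined. A self-contained sketch of the same renaming idea in Python might look like the following. The constant names, the rounded values, and the asteroid mass are placeholders chosen for illustration, not taken from any real library.

```python
# Long, descriptive names at module scope (rounded, illustrative values).
GRAVITATIONAL_CONSTANT = 6.674e-11   # N m^2 / kg^2
MASS_OF_EARTH = 5.972e24             # kg
MASS_OF_ASTEROID = 1.0e12            # kg, hypothetical simulation constant


def gravitational_force(distance_between_centers: float) -> float:
    """Magnitude of the Newtonian attraction between Earth and the asteroid, in newtons."""
    # Rename to the textbook symbols just before the formula.
    G = GRAVITATIONAL_CONSTANT
    M1 = MASS_OF_EARTH
    M2 = MASS_OF_ASTEROID
    r = distance_between_centers
    return (G * M1 * M2) / (r * r)
```

The short names live for five lines, so there is little room to confuse them, while every other part of the program still sees the long ones.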

-----


An argument that usually lets programmers understand why physicists do this: to a physicist variables like x (position), v (velocity), p (momentum), R (radius) are as clear as i,j,k,l to a programmer. Using the longhand would not improve readability for them, just like using arrayIndex instead of i would not improve readability for a programmer. In fact for a physicist it is much easier to read the short variable names, since that's what we always use, and it makes it easier to recognize an entire expression in the blink of an eye. It also keeps expressions on one line instead of 5. Another thing is if you're not a physicist, chances are very good you are not going to understand the code anyway, or at least not without consulting a physics text or paper, in which they will use exactly the same variables x,p,v,R, and now you're glad that these correspond exactly to your program's variables. If you update an equation like T = dE/dS to temperature = infinitesimal_change_in_energy / infinitesimal_change_in_entropy, you have gained only superficial readability. If you don't understand the former you will not truly understand the latter either.

-----


> If you update an equation like T = dE/dS to temperature = infinitesimal_change_in_energy / infinitesimal_change_in_entropy

And in the case where the changes in energy or entropy are not so infinitesimal, the semantics of the variable name can cause all sorts of trouble.

-----


Simple solution: in a program source, create a very clear comment header that won't be missed, that explains the meanings of the variables.

Also, even though physicists can use any names or lengths they like in their papers, they seem to prefer concision:

f = G M1 M2 / r^2

If the reader understands the math, short variable names aid comprehension, not the reverse.
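
As a rough illustration of such a header, here is a Python sketch that keeps the textbook symbols in the code and spells them out once in a docstring. The choice of formula (escape velocity) and the function name are assumptions made for the example, not something from the original question.

```python
import math


def escape_velocity(G: float, M: float, R: float) -> float:
    """Escape velocity from the surface of a spherical body.

    Symbols (standard textbook notation):
        G  gravitational constant, N m^2 / kg^2
        M  mass of the body, kg
        R  radius of the body, m
        v  escape velocity, m/s (the return value)

    Formula: v = sqrt(2 G M / R)
    """
    v = math.sqrt(2 * G * M / R)
    return v
```

With rounded values for Earth (G = 6.674e-11, M = 5.972e24, R = 6.371e6) this comes out at roughly 11.2 km/s, and the body of the function still reads like the equation.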

-----


This actually bothered the hell out of me. For anything more complicated than high school physics, this is terrible. You start to run out of letters and start using capitals and noncapitals, and even worse is that there are some "globals" that are not really variables but well-known constants. So you're trying to distinguish between uppercase / lowercase / variable / constant and keep all that in your head.

There really is no logical reason to do this, other than established practice.

I've found that even scientists and mathematicians can be closed-minded enough to get angry when you suggest that they're being inefficient and wasteful, and their conventions are crap.

Don't even get me started on the mathematician way of explaining things, where instead of explaining the concept and filling in the gaps, you take a 20 variable equation and start by describing all the variables one by one. By the time you reach the end, you can't remember what the beginning did, so you have to go back and forth all the time.

-----


Most people tend to get unhappy when you start dismissing their knowledge as crap.

Also in this case you are wrong. With a compact notation you can think more complicated thoughts than you can with a verbose one. The downside is that a compact notation requires more from the person reading it. The tradeoff is correct, and the domain expert is not necessarily wrong.

Your domain expertise is maintaining a lot of very precise instructions for a variety of different problems. For you, constant signposts are an assistance. For a domain expert in their domain of expertise, they are overhead.

-----


This is a wonderfully lucid and subtle comment. I think any earnest programmer who wants to get better would do well to study your contributions to this thread until they are sure they understand them.

> For a domain expert in their domain of expertise, [long explicit names] are overhead.

Once you pass a certain threshold of complexity and/or time-investment, a well-designed program becomes its own domain of expertise. At that point there is great leverage to be had in finding a compact notation suitable for the recurring concepts of the system. In some ways, finding such a notation is how one ensures that the system has a good design and will keep it. As you point out, it (critically) is what enables us to keep more complex things in our heads. It also becomes an intimate part of the creative process – a good notation suggests new concepts and hints at how the system should grow.

Long explicit names are exactly what you don't want when it comes to the core concepts of a well-designed system. They are useful for things that aren't familiar and so need to be spelled out. But in a well-designed system, the most important concepts are familiar and it is a poor use of our limited cognitive capacity to constantly spell them out, for much the same reason that we prefer to say "gas" instead of "liquid hydrocarbons". Since cognitive capacity is our principal bottleneck in software, this is a big deal.

-----


Well, I put it more politely than that, but it's a longstanding nagging issue for me, so I tend to get worked up when talking about it.

I think your point is a very valid one. I guess when I try to read a paper that's "way over my head" academically, it's fair that the authors don't care about me.

But I don't think they know it's a tradeoff. The way I know this is by reading the majority of intro / mid level textbooks I've had to deal with as a freshman. At that point you're not nearly a domain expert, but the texts are written in the same way.

-----


> You start to run out of letters and start using capitals and noncapitals, and even worse is that there are some "globals" that are not really variables but well-known constants.

Fair enough, but don't forget the Greek alphabet, both upper and lowercase. There are plenty of math notation issues, but a shortage of symbols isn't on the list. I don't think anyone in physics thinks multiple-character variables solve any real problems. Practitioners are more likely to craft a new symbol, like h-bar (ℏ) for the reduced Planck constant, at the point where ordinary h wasn't adequate:

ℏ = h / (2π)

> There really is no logical reason to do this, other than established practice.

I can think of one logical reason -- it works for people who don't think very much about the symbols because they're thinking about the math.

> Don't even get me started on the mathematician way of explaining things, where instead of explaining the concept and filling in the gaps, you take a 20 variable equation and start by describing all the variables one by one.

I won't try to excuse every example of this, but in physics, explaining each variable and constant is a very good way to approach an understanding of the equation as a whole.

> By the time you reach the end, you can't remember what the beginning did, so you have to go back and forth all the time.

That's temporary, it only lasts until real familiarity sets in, until the relationships become instinctive. It's like the old joke about prisoners who tell jokes by number.

-----


The comment by btilly is ok as the mathematicians are trying to communicate in a highly efficient manner which is not the same as communicating with us mere mortals.

-----


You mean a comment header like http://futureboy.us/frinkdata/units.txt? No matter how clear and detailed, there comes a point when nobody is likely to remember much of it. (And that would be an exemplary example of clarity. Read it if you don't believe me.)

There is an upper limit to how much state we can stuff into our head. We can memorize things for longer. But unless we will amortize that state over many, many uses, the effort is not worthwhile.

That said, if we have taken the effort to memorize it, there is no reason not to leverage the already spent effort.

-----


> You mean a comment header like http://futureboy.us/frinkdata/units.txt? No matter how clear and detailed, there comes a point when nobody is likely to remember much of it.

Yes, but I'm not sure that's a very good example. There's a lot going on there -- things not really soluble using multiple-character variables.

> There is an upper limit to how much state we can stuff into our head.

Yep. It's why we have computers. :)

In case that seemed flip, consider this -- if I want to verify that I am using some program variable X in a consistent way, all I need to do is globally search for it and evaluate each case. Modern programming editors will happily list every instance of that variable's use, so I can make sure I'm being consistent. In such a case, a multiple-character variable would likely tend to obscure the issue and complicate the search.
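
A minimal sketch of what "list every instance of that variable's use" can look like, using Python's standard `ast` module. The helper name `find_uses` is hypothetical. Because it compares whole identifiers, it does not trip over longer names that merely contain the letter, which is where a plain text search for a one-character name tends to go wrong.

```python
import ast


def find_uses(source: str, name: str) -> list[tuple[int, int]]:
    """Return (line, column) for every plain use of `name` in `source`.

    Only ast.Name nodes are inspected, so function parameters and
    attribute accesses such as obj.x are deliberately not reported.
    """
    tree = ast.parse(source)
    return sorted(
        (node.lineno, node.col_offset)
        for node in ast.walk(tree)
        if isinstance(node, ast.Name) and node.id == name
    )


sample = "x = 1\nmax_x = x + 2\nprint(x, max_x)\n"
uses_of_x = find_uses(sample, "x")
```

Here `uses_of_x` holds three positions, one per line of `sample`, while a substring search for "x" would also match inside `max_x`.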

-----


It sounds very programming 101, but I've gotten into the habit of writing boring old comments at the top of all my methods/functions explaining what the function does, what the input should be, and what the output should be, and then putting some comments near the top about what all the variables are, what they do, why I have them, etc. (usually only 4-10 words). It sounds dumb and time consuming, but this little bit of polish has pulled my ass out of the fire more times than I care to count, and has actually helped me write saner code more times than I care to admit (since sometimes writing out the semantics of this stuff helps clarify thoughts that get muddied and confused in the details of code).

You don't have to do this while you're writing the code. Oftentimes just hammering the code out gets you something running, but doing this after the fact really helps.

-----


Variable name length should be proportional to how long the variable is in scope and to its importance. So short indices for loops are fine. S for a string in a loop over a list of strings is fine (so long as the loop isn't too long).

Assuming everyone knows your equation and what your variables mean is a terrible idea. Unless your method is less than 10 lines and you explain all the variables in a comment... just write stuff out. Text is free. Don't go crazy, but 4-8 characters won't kill anyone and may well prevent bugs.
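
A small Python sketch of the scope rule described here. The names `MAX_LABEL_LENGTH` and `truncate_labels` are made up for the example.

```python
# Wide scope: visible across the module, so the name spells itself out.
MAX_LABEL_LENGTH = 8


def truncate_labels(labels: list[str]) -> list[str]:
    """Return the labels cut to at most MAX_LABEL_LENGTH characters."""
    truncated = []
    # Narrow scope: `s` lives for two lines, so one letter is enough.
    for s in labels:
        truncated.append(s[:MAX_LABEL_LENGTH])
    return truncated
```

If the loop body grew to a screenful, `s` would probably deserve a longer name.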

-----


Ugh. He's complaining about commonly known physics abbreviations. Short variable names are fine as long as they are descriptive. The key thing to keep in mind is that what is descriptive to you may not be descriptive to someone else (or to future you, for that matter).

You can even scrape by with naming the full version in the declaration. Nothing is worse than a cryptic name you can't figure out at first glance, as in this instance.

-----


You might be able to make the case that short variable names are reasonable as long as they are common and used consistently.

I am not completely sold on this, but for example LLVM has a lot of 1 character variables where 'I' is always the instruction you're working on, 'F' is always the function you're compiling, 'B' is the current basic block you're processing, etc.

Having worked on several compilers over the years, it's certainly not much worse than other practices I have seen like using 't1' and 't2' to refer to the two instruction temporaries you just created, or 'insn' for the instruction you're currently examining.

-----


The root problem seems clearly (to me) that he is the only programmer maintaining a system whose original authors are no longer available. The life of a software system is very much in the minds of the people who are making it, and when they are gone, this is a kind of death.

This has nothing to do with length of variable names, although it is tempting to ascribe one's frustrations with a codebase to lack of skill on the part of the original programmers, rather than lack of access to their thinking.

-----


Uh, the example in the article is from a numerical computation, and following conventions (typically from a book, a research paper, etc.) is often a good idea, just as long as they are not confusing. An example would be using g for the gravitational constant. If they start becoming confusing, you can move them into a namespace (if the language allows it): just as Math/e is a better practice than e, Physics/g could be better than g. As a previous comment (http://news.ycombinator.com/item?id=5157849) indicates, when used in a short piece of code, quick synonyms can be used, thus g instead of Physics/g (or Physics/Some_Longer_Name).

As mentioned elsewhere in this thread, short-lived names can often be short, too. In some cases they are little more than placeholders. An example is an index for a for-loop that is in fact a "for-each" loop when the language lacks the latter and the index does not have much of a meaning of its own.
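
To make the placeholder point concrete, here is a Python sketch of the same loop written both ways. Both function names are hypothetical.

```python
def total_length_indexed(words: list[str]) -> int:
    """Sum of word lengths, using an index that is only a placeholder."""
    total = 0
    for i in range(len(words)):  # `i` means nothing beyond "the next item"
        total += len(words[i])
    return total


def total_length_foreach(words: list[str]) -> int:
    """The same computation where the language offers a for-each loop."""
    total = 0
    for word in words:
        total += len(word)
    return total
```

In the first version a longer name for `i` would add characters but no information.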

-----


  //for every 
  for(int i = 0; i < things.length; i++){
    //there's a
    dx = x - tx/mf;
    dy = y - ty/mf;
  }

  //but then for every
  distanceToTarget = Math.SquareRoot(xDistance*xDistance + yDistance*yDistance);
  //there's a
  for(int index = 0; index < robotsCurrentlyInField.length; index++)
  {
    robotsCurrentlyInField[index].ohGodMyFingersAreBleeding(distanceToTarget);
  }

-----


    di[ctrl+space] = Math.Sq[ctrl+space](xD+[ctrl+space]
    ...

    robo[ctrl+space][index].ohG[ctrl+space]

Every editor I've used in about 10 years has had an excellent autocomplete feature. I've never had to type out such long variable names.

-----


Physicist here. I never have to read such long variable names, which would make it much more difficult to spot bugs. I do understand that long variable names have their place but, when it comes to programming numerical code which is a translation of actual equations, shorter names make it much easier to read and spot mistakes. I do wish that I could use a programming editor that could look like a TeX output on the screen though ...

-----


> I do wish that I could use a programming editor that could look like a TeX output on the screen though ...

This was one of the design goals of the Fortress project

http://labs.oracle.com/projects/plrg/Publications/fortress.1...

Scroll down to section 2.3 'Rendering'.

Sadly the project is now defunct.

-----


> I do wish that I could use a programming editor that could look like a TeX output on the screen though ...

For other readers (I suspect you already know this), Mathematica and Sage (and other environments) do a pretty good job of providing TeX-like feedback on what you've just typed in (but not keystroke-by-keystroke, which would be nice).

For those unfamiliar with Sage:

http://www.sagemath.org/

My Sage tutorial:

http://arachnoid.com/sage

-----


Reading is arguably more important than typing. Being able to fit more code on the screen has its advantages, too.

-----


Even in the age of 1080p I agree with you. Side by side code samples, or having documentation up next to code, is very useful. However, if super verbosity pushes your code width to 120 characters, the screen real-estate needed is too much.

-----


Something like thing.horizontalPositionInTransformedAndScaledUnits may be slightly clearer the first time but becomes counterproductive when there are 20 of them on the same screen, especially if there are 20 thing.verticalPositionInTransformedAndScaledUnits mixed in.

Your eyes just start to glaze over (or at least mine do).

If you're only ever using one coordinate system, just use thing.x and thing.y or thing.h and thing.v, maybe with a comment that explains that these are in transformed and scaled units. Your eyes will thank you. :-)
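
A sketch of that suggestion in Python, assuming a single coordinate system. The `Thing` class and the `midpoint` helper are illustrative names only.

```python
from dataclasses import dataclass


@dataclass
class Thing:
    # All coordinates are in transformed and scaled units.
    x: float  # horizontal position
    y: float  # vertical position


def midpoint(a: Thing, b: Thing) -> Thing:
    """Midpoint of two things, in the same transformed and scaled units."""
    return Thing((a.x + b.x) / 2, (a.y + b.y) / 2)
```

The unit information is stated once at the declaration instead of being repeated in every expression that touches a coordinate.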

-----


I'm surprised Hacker News readers are even having this conversation. :-) Kernighan and Pike's excellent _The Practice of Programming_ addresses it early on; chapter one _Style_, section one _Names_. It includes suggesting length should be inversely proportional to scope and that “clarity is often achieved through brevity”. http://amazon.com/exec/obidos/ASIN/020161586X/mqq-20 http://cm.bell-labs.com/cm/cs/tpop/

-----


After reading many of the responses, I am wondering if there is a difference between the 'business programmers' and the 'scientific/mathematical programmers'. Most of the examples below show code that could have come from physics or maths code. I have noticed that my code (scientific) tends to have much shorter variable names than the code of some of my "not so mathematical" counterparts. Also, 'business' languages like Java and COBOL seem to encourage longer names more than C and FORTRAN.

-----


Original question on Programmers:

http://programmers.stackexchange.com/q/176582/55712

-----


dhh recently posted about this very issue. Made me feel a bit more confident in being more expressive with variables.

http://37signals.com/svn/posts/3250-clarity-over-brevity-in-...

-----


Readability. Whatever is the most readable. Do that.

-----


I think that the second most upvoted answer is spot on: variables that have a short lifespan should have shorter names than the ones that have a long lifespan.

Of course when you're working in a functional language you basically only have short-lived "variables" (they're not even "variable" but whatever). Better yet, in some languages you don't even need that many variables.

The recently upvoted article about the guy refactoring Java code to Clojure was precisely an example of that.

So why even have "variables" when you can very often do without!?

-----


> So why even have "variables" when you can very often do without!?

In many of those cases, as often as not, the problem of variable names is replaced by the problem of function names, and we're back to square one.
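
A small Python sketch of that trade, with made-up names and rates. The first version names each intermediate result. The second has no intermediate variables, but it only reads well because the helper functions are well named.

```python
def with_intermediate_variables(prices: list[float]) -> float:
    """Total after a 10% discount and 20% tax, naming each step's result."""
    discounted = [p * 0.9 for p in prices]
    taxed = [p * 1.2 for p in discounted]
    return sum(taxed)


def apply_discount(p: float) -> float:
    """Hypothetical 10% discount."""
    return p * 0.9


def add_tax(p: float) -> float:
    """Hypothetical 20% tax."""
    return p * 1.2


def without_intermediate_variables(prices: list[float]) -> float:
    """Same total; the names moved from variables to functions."""
    return sum(map(add_tax, map(apply_discount, prices)))
```

The naming problem has not gone away; `discounted` and `taxed` have simply become `apply_discount` and `add_tax`.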

-----



