Since a Lisp programmer spends all his time writing “(- foo bar),” when he sees “foo-bar,” his brain easily parses it as a symbol, and he never needs to wonder whether it should actually have been “foo - bar” with spaces to separate the infix operator from the symbols representing variables.
If you wrote a version of Ruby that was still infix but where you needed to put spaces around all symbols, then "foo-bar" would work fine there, too.
But what’s also at stake is whether that language would be readable to humans without mistakes. Part of what makes Lisp work, IMO, is that you never try to type foo - bar, so you never run into an accident by mistakenly typing foo-bar and having the parser think you meant a hitherto undefined symbol.
Whether that’s a worthy tradeoff is a matter of opinion, and I’ll take Avdi’s word for it that he believes the benefits of calling symbols foo-bar outweigh the odd time you might type the wrong thing. I’m just pointing out that, in addition to the “whitespace is the separator” argument you both make, there’s also the “foo - bar doesn’t mean foo subtract bar” argument.
Unless you have a function called "foo", and you want to pass it the subtraction function as its first parameter and the value "bar" as its second. In which case (foo - bar) is perfectly valid, reasonable, and completely different from (foo-bar).
Even though Lisp is not an infix language, foo-bar does mean something different to the parser than foo - bar because of whitespace.
I’m sure you knew perfectly well that I meant you never type foo - bar meaning foo subtract bar :-)
Even if I don't know what programming language you're talking about, if I see x2-x1 or margin-75 I know you probably mean subtraction, and if I see last-name or message-id you probably mean a single identifier. I've seen plenty of Lisp code that says things like x2-x1 in docstrings (because it is shorter) and I've never found it to be confusing.
What he's referring to as a dash is actually a hyphen (-), which is pretty ambiguous when compared to the en dash (–) and em dash (—), both of which require modifier keys to type (and are frequently used in prose, to counter another of his claims). Fortunately, I've never seen em dashes or en dashes used in code.
I think it's pretty clear from the article that his comments were meant to be taken in that context. If he'd chosen to preempt the pedantic readers by making that context explicit, it would have only diluted the article with noise that most people would rightly assume is implied by the context.
I don't know what your keyboard looks like, but mine doesn't have en- or em-dashes on it, and I wouldn't even consider them for a programming language syntax.
 Even when using en-dashes and em-dashes in prose, they can often be typed with multiple consecutive hyphens, e.g., '--' and '---' in LaTeX.
While most of the article talks about code, the author mentions underscores being ambiguous in the context of underlined fonts. (I don't even think my text editor [for code] supports underlined fonts.)
He also mentions only using shift once or twice per sentence when writing prose.
So it would seem that the author is just hating on _ in all contexts. Which is kind of silly. (Though I'll admit, the point of it being improperly handled with underlined fonts is certainly a valid complaint.)
Well, mine does, and I used to have a co-worker who actually used it for coding. So there you go. :-)
How would you use it for coding, though? Unless you stripped it before the compile phase I would think any formatting in a rich-text format would choke up almost any interpreter or compiler.
Lots of editors do. In Eclipse, for one, it's common to see function names and such shown as hyperlinks when you click on them (with underlines), which makes their definition open (like a function call is "linked" to the function definition).
The people who think only in terms of the “big picture” while glossing over the details are not programmers. They are “architects.”
EDIT: This second statement is probably not correct. I would use strikethrough if I could. But the first statement still rings true with me: many programmers are detail-oriented, which means that they care about lots of things other people would consider irrelevant. Whether this is, on the whole, a good thing or not is another matter. But I do not think it is surprising.
For example, I'm a great typist so the underscore key has never really bothered me.
But to call the author out on being detail oriented would be inane of me. I myself don't like keyboards that have small backspace keys. I also still hate OS X for having the window control cluster on the top-left corner.
I know I certainly get paid to think about the details. The only variable here is which details.
Many country specific keyboards also need modifier keys to access the hyphen.
I, for one, don't like the "->" operator in C. It requires two characters and the use of the shift key. I would like the next standard to just use "." for all member access and let the compiler work its magic and guess what needs to be done.
Of course that would break a sizeable amount of legacy code, but who cares about that anyway. The important thing is that the code looks good, especially if I enable drop shadows and anisotropic filtering in my editor.
If you want that, use a higher-level language. One of the foundations of C is that there is as little implicit behavior as possible. I'd hate to be left in the dark about whether my . will now result in a pointer or a dereferenced value, and I doubt it would even work for the compiler to figure this out in all edge cases.
Similar in concept: &function and function are the same. If you have a fixed-size array ary, then &ary and ary yield the same address when you pass them into a function (though their types differ).
Then there are all the other numeric conversions, where "unsigned" is a common trap, and truthiness.
Even if most conversions are trivial, I wouldn't call C a very explicit language. Compared to what?
The main problem with this is that it hides memory accesses. The offset of x.y.z.w relative to &x can be determined at compile time, so if &x is available, accessing x.y.z.w requires roughly only one load instruction. (DISCLAIMER: ABI- and compiler-dependent. One instruction is an estimate. This estimate may not be precisely true for all possible programs across all CPUs and compilers.) But x->y->z->w requires 3 pointer dereferences, even if you have x to hand. 3 load instructions and therefore 3 memory accesses. (The same disclaimer applies.) People already complain enough about the inefficiency of C++ operator overloading, and the difficulty of spotting when it occurs just by examination of the source code...
Additionally, this change wouldn't play nicely with C++ operator overloading.
I just set up my text editors to insert -> when I press <Alt-,>.
For example, you cannot tell how many indirections happen in a either. Every game programmer must have seen a newbie try to cast a into a :)
I guess it'd break because of (a* +b) but that's just all the more reason to disallow whitespace around infix operators.
It would be p^.foo instead of (^p).foo or p->foo. Less awkward when adding or removing one indirection level too.
It's especially annoying when shuffling array elements or function parameters around: all items need to be separated by a comma, except for the last one, which must not have a trailing comma, making line swapping a nightmare. Drives me nuts.
Then use a language which allows trailing commas in lists, param-lists, maps, e.g. grOOvy
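To make the trailing-comma point concrete, here is a small sketch in Python, which (like the Groovy suggestion above) accepts a trailing comma in list literals, parameter lists, and call arguments. The names `colours` and `describe` are made up for the example.

```python
# Every element ends with a comma, including the last one, so lines can be
# swapped or deleted without touching their neighbours.
colours = [
    "red",
    "green",
    "blue",
]


# Parameter lists accept a trailing comma as well.
def describe(
    name,
    size,
):
    return f"{name}:{size}"


print(len(colours))
print(describe("box", 3,))
```

Swapping the "green" and "blue" lines, or the two parameters, needs no comma fix-ups.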
Granted you don't need to type all that fast for programming. However I do have to type full-speed when writing documentation, composing an e-mail, working on a presentation, etc.
Not to mention Dvorak and "crazy non-standard keyboards" demonstrably reduce RSI, which is a far greater benefit than typing speed. (Unless you like having disgusting chemical compounds injected into your wrists; or want your wrist cut open so that doctors can work on some of the most delicate tissues in the body.)
xmodmap -e 'keycode 65 = space underscore'
Properly typing an underscore should utilize two hands and shouldn't require a stretch at all.
Either way, reaching the + key on QWERTY-102 is objectively more difficult than the underscore [uses the pinky, is farther away than the underscore from homerow], and also requires a shift-modifier. I use + far more than _ while I'm coding. (And I'm currently writing most of my code in Go, so I use _ quite a bit since it's a blank identifier.)
Maybe I just have big hands, but it honestly doesn't impede my typing at all to add an underscore to the text.
There are languages that survive in spite of the heavy use of uppercase letters, without useless complaints like this.
Let me repeat, in German:
"Es gibt Sprachen die trotz starker Nutzung von Grossbuchstaben ueberleben, ohne ueberfluessige Beschwerden wie diese."
Damn, I just broke my pinky..
That passage sticks out to me as well. Especially the "Awkward. Inefficient" part.
Maybe the guy needs to learn how to type, because hitting shift for me is muscle memory that requires no thinking. That Grade 9 typing class in 1984 just continues to pay dividends for me.
i - -
A sometimes used _ is okay. Its_using_it_almost_all_the_time_that_the_author_is_questioning.
Oh my, oh my, how do people whose languages regularly capitalise all nouns etc. ever cope with that vile "having to hit shift"? Surely such a thing is impossible, as is hitting shift when you need an underscore.
I understand that some things could be more efficient, but "hitting shift" hardly qualifies in my opinion...
It does qualify as something that probably contributes to Repetitive Strain Injury, which most programmers should worry about.
I use alternate hands when using the shift key, so that's not the issue. Perhaps it's just that I've got into a bad habit of depressing the shift key with a little more force and holding it for longer. With sticky keys I seem to dance off the keys better.
If you have a number pad, perhaps it might be better to switch the number row with the symbols (i.e. don't use SHIFT to get to the symbol.) I think that's the default for French layouts.
And yet we insist that a programmer be able to write every modern programming language on a 50-year-old typewriter. You know, just in case.
Personally, I'd prefer to see the English conventions carried over to
the general use of hyphen and underscore in identifiers in
the core (and everywhere else).
By that, I mean that, in English, the hyphen is notionally a
"higher precedence" word-separator than the space
(or than its intra-identifier stand-in: the underscore).
For example: there's an important difference between:
The former initiates the detonator phase for the main sequence;
the latter initiates the main phase of the sequence detonator.
More simply, there's a difference between:
The first is setting a difference; the second is computing a difference-of-sets.
The rule I intend to use and recommend when employing this new
identifier character in multiword names is that you should place an
underscore between "ordinary unrelated" words, and a hyphen only
between a word and some modifier that applies specifically to that word.
Which, if applied to Temporal, would lead to:
my $now = DateTime.from_epoch(time);
The C<day> method also has the synonym C<day-of-month>.
(These are also available through the methods C<week-year> and
There's a C<day-of-week> method,
The C<weekday-of-month> method returns a number 1..5
The C<day-of-quarter> method returns the day of the quarter.
The C<day-of-year> method returns the day of the year,
The method C<whole-second> returns the second truncated to an integer.
The C<time-zone> method returns the C<DateTime::TimeZone> object
(i.e. only C<.from_epoch()> actually uses underscore).
Oh, and the optional C<:timezone> argument to C<.new()> should probably
become C<:time-zone> for consistency with the C<.time-zone()> method
(or, preferably, we should just bite the bullet and go with C<timezone>
1) The goal of a programming language is to be as unambiguous as possible. This is why we have syntax, and do not program in English. As soon as you use the same token identifier for two purposes, you introduce ambiguity.
Take markdown, for example:
* Hello *world*
Lisp does not have this problem, because it is governed by an extremely simple and effective set of rules:
- An identifier can contain most unambiguous special characters, including - ! ? / + < > = etc.
- There are no operators, only functions.
- A hyphen by itself is a perfectly valid identifier.
- The standard library includes a function named hyphen which performs a subtraction operation.
Taking that into consideration, it makes the entire situation completely unambiguous. `a-b` is one identifier. `a - b` and `- a b` are both a string of meaningless identifiers. `(a - b)` performs function `a` with arguments `-` and `b`, and `(- a b)` performs function `-` with `a` and `b`.
Ruby is actually interesting in that it is one of the few languages which, like Lisp, does not have operators. You could remove ambiguity by doing `a.-(b)`. However, readability immediately goes out of the window. Much more so than in Lisp's case. Consider this:
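A rough sketch of that splitting rule, written in Python rather than Lisp. The `tokenize` function and its regex are invented for illustration and are far simpler than a real Lisp reader (no strings, quotes, or comments); the only assumption is that parentheses stand alone and everything else is separated purely by whitespace.

```python
import re

# Hypothetical, simplified rule: a parenthesis is its own token, and any
# other run of non-whitespace, non-parenthesis characters is one token.
TOKEN = re.compile(r"[()]|[^\s()]+")


def tokenize(source):
    """Return the list of tokens in a Lisp-like source string."""
    return TOKEN.findall(source)


print(tokenize("a-b"))      # a single identifier
print(tokenize("(a - b)"))  # call a with arguments - and b
print(tokenize("(- a b)"))  # call - with arguments a and b
```

Under this rule the hyphen never needs special treatment: it is either part of a longer token or a token by itself, depending only on the whitespace around it.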
(4.*(5)).+((7.*(9))./(3)).+(1) #<= the meaning of life
4*5 + 7*9/3 + 1
With that said, who cares? Don't use them if you don't want to. It might be less efficient, but typing speed has never been the thing that slows me down while I'm writing code.
(I'm using a Kinesis Advantage keyboard.)
my.long.name = 1
a.short.name = 2
one.long.result = a.short.name - my.long.name
Now we just need to design a whole new language :(
Page-long commenting on a fluffy article (e.g. "5 * 5 - 3" beats "5*5-3") may be good for your mind, but imho this thing should not be on the front page.
I dunno. For some reason this super-optimisation of keystrokes doesn't bother me. Avoiding reaching for the mouse is good, but I don't particularly care if I have to hit shift.* Maybe I prefer to think more and type less, or something. I can't quite say.
* Admittedly the French keyboard, where all numbers in the top row are shifted and symbols are unshifted, drives me kind of crazy, but that could just be due to deeply ingrained training on an American keyboard layout.
At my old job, we had a DSL that allowed and encouraged very descriptive variable and procedure names with spaces in them. To implement this unambiguously, the language had to exclude all the operator characters from variable names and mark procedure calls with a sigil. A line of code may have appeared like this:
line slope estimate = @some function( x - step size ) / ( x - step size);
Feels a bit strange at first but quickly becomes natural, especially when dealing with wordy domains (math was a bad example).
Honestly, I've always found them to be more human readable than CamelCase.
Also, underscores in URI strings look awful.
I wish Python allowed dashes in its script names... I cannot import from module-test.py, but I can from module_test.py
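For what it's worth, the restriction comes from the `import` statement, where `module-test` is not a valid identifier. A possible workaround, sketched below, is `importlib.import_module`, which takes the module name as a plain string. The temporary directory, the file contents, and the `ANSWER` variable are made up so the example runs on its own.

```python
import importlib
import pathlib
import sys
import tempfile

# Create a throwaway directory holding a file named module-test.py.
# The file contents (ANSWER = 42) are invented for this example.
workdir = tempfile.mkdtemp()
pathlib.Path(workdir, "module-test.py").write_text("ANSWER = 42\n")
sys.path.insert(0, workdir)
importlib.invalidate_caches()

# `import module-test` would be a syntax error, but import_module only
# needs the name as a string, so the hyphen is not a problem here.
module_test = importlib.import_module("module-test")
print(module_test.ANSWER)
```

Whether this is a good idea is another matter: other modules still cannot use a plain `import` statement for the file, so renaming it to module_test.py is usually less hassle.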
Thank you for posting this, I "laughed-out-loud".
(I mean, it's not like I'm physically incapable. I just instinctively hit SPACE with my right thumb.)
Getting used to using my left thumb would just be awkward.
I also have a nasty habit of using the LSHIFT exclusively. Even when it's not technically appropriate to do so.
Japanese ThinkPad USB keyboard (with TrackPoint!):
Perhaps the spacebar is just too big, and could be divided better into other modifiers. My left thumb is usually wasted (though not on a Mac, where the thumb can easily reach the Command key.)
Some people, however, seem to alternate thumbs on the spacebar, or rather use the thumb of the hand opposite the one that just typed a character. I personally always use the right thumb to space.
Install AutoHotkey and then create a file called Underspace.ahk with this content:
; Underspace for AutoHotkey
; Converts Shift+Spacebar to underscore
+Space:: Send _
Try Shift+Space and it should type an underscore for you.
If you want to stop the script later, find a green "H" in your system tray and right-click it.
After you try it, I'll be curious how it works out for you. Myself, after trying it for just a minute or two, I'm not so sure if I like it. I'm finding a lot of spurious underscores before capitalized words, or after a word like "I" that ends with a capital letter. That's because I've been sloppy about the timing of when I press the Shift key before or after a space.
So I won't be using this script myself (not that I need it - I don't like to code using names_with_underscores) but maybe others will find it useful.
 AutoHotkey home page: http://www.autohotkey.com/
 AutoHotkey_L: http://l.autohotkey.net/
(minor edit for clarity)
Not me. I would gladly trade off having to expand all my equations for having to type underscores a few times. In fact, this is very subjective - and one could argue having to sacrifice compactness of expressions is stupid.
All you have to give up for this is tightly-packed arithmetic statements, like this:
Instead, you have to give your tokens some breathing room:
5 * 5 - 3
a = 5
b = 3
everybody from nearly every programming language would expect that after
c = a-b
you will find c to be 2 not nil (or throwing an exception or whatever)
I agree with the OP. CamelCase and underscores are both scourges.
The difference in length is pretty small anyway.
Names tend to be pretty small anyway. Here are some random ones from the code I'm working on (Sococo Teamspace)
CreateVariant - a method
soLogWriteInfo - a method (I didn't write)
clHeadSSt - a circular list head for a stream
It's just like qwerty, but with less pain.