
Most Pressed Keys and Programming Syntaxes - deniszgonjanin
http://www.mahdiyusuf.com/post/9947002105/most-pressed-keys-and-programming-syntaxes
======
quellhorst
I photoshopped this really quick to compare Ruby on Dvorak and Qwerty.
[https://img.skitch.com/20110908-q24qths9k4u6438wpd989qreci.j...](https://img.skitch.com/20110908-q24qths9k4u6438wpd989qreci.jpg)

~~~
jholman
Nice. This is the most interesting question about Dvorak, to me.

Everyone I know who recommends Dvorak is a programmer. And I'm more than
content with my comfort and speed typing English; the pain of slowing down to
type brackets all the time when programming is more of a pain point for me.

So with this in mind, it's interesting to note that Dvorak has moved the quote
key to a slightly less favourable location, and basically banished
paren/bracket. Not a solution for my pain point!

~~~
IvarTJ
I personally use a variant of Programmer’s Dvorak. Brackets and various
symbols are placed where you usually see numbers, while you have to press
shift to access numbers. I think that on a proper Programmer’s Dvorak Caps
Lock should cause shift to be in effect on the numbers/symbols buttons.

------
andylei
it'd be interesting to see these heatmaps in some sort of normalized way. for
example, 'e' is the most common letter in English, so it's the most commonly
used letter in these programming languages. it'd be very interesting to see,
for example, this heatmap with the intensities divided by each letter's
frequency of use in the English language, or across a large set of data
including a lot of different programming languages

~~~
alextgordon
Just did it for 28000 C files. Here's the results:

    
    
        a  0.772163
        b  1.2679
        c  1.78209
        d  1.1195
        e  0.881398
        f  1.47252
        g  0.924242
        h  0.358954
        i  1.06756
        j  0.835313
        k  1.41458
        l  0.981729
        m  1.08955
        n  0.9156
        o  0.73849
        p  1.74468
        q  4.2497
        r  1.21577
        s  1.05023
        t  1.03627
        u  1.2967
        v  1.77662
        w  0.396003
        x  13.7292
        y  0.47566
        z  3.78748
    

The numbers are (relative frequency in C) / (relative frequency in English).
So "b" is slightly more common in C than English, but "w" is a lot more common
in English than C.
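
For anyone who wants to reproduce that ratio on their own files, here is a minimal sketch. It is not the script used for the numbers above; the function names `letter_frequencies` and `frequency_ratios` are made up for illustration, and it assumes case-insensitive counting of a-z only.

```python
from collections import Counter
import string


def letter_frequencies(text):
    """Relative frequency of each letter a-z in text, ignoring case."""
    counts = Counter(ch for ch in text.lower() if ch in string.ascii_lowercase)
    total = sum(counts.values())
    if total == 0:
        return {letter: 0.0 for letter in string.ascii_lowercase}
    return {letter: counts[letter] / total for letter in string.ascii_lowercase}


def frequency_ratios(code_text, reference_text):
    """(relative frequency in code) / (relative frequency in reference).

    Letters that never occur in the reference text are skipped to
    avoid dividing by zero.
    """
    code = letter_frequencies(code_text)
    ref = letter_frequencies(reference_text)
    return {
        letter: code[letter] / ref[letter]
        for letter in string.ascii_lowercase
        if ref[letter] > 0
    }
```

A value above 1 means the letter is relatively more common in the code sample than in the reference text, and a value below 1 means the opposite.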

The raw counts for symbol characters:

    
    
        _  22890057
        ,  10895692
        )  10749798
        (  10745839
        *  9211904
        ;  8187969
        -  6628768
        =  5878296
        >  4428291
        /  3468260
        .  3011078
        {  2212412
        }  2211783
        "  2120264
        &  1647188
        :  1032587
        +  962554
        #  909859
        [  889538
        ]  888722
        <  839910
        |  643903
        %  583092
        !  561462
        \  540456
        '  454201
        @  131199
        ?  112488
        ~  84629
        ^  19064
        $  17922
        `  7272
        [space] 74199965

~~~
dredmorbius
What would be interesting here would be a difference analysis or regression
giving the preference for any given key in a given language. E.g.: '|' is
highly predictive of shell, '$' of perl, '()' for lisp. Might be fun to do in
R.
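
Not R and not a real regression, but a rough Python sketch of the idea: for each language, rank symbols by how far their share in that language sits above the mean share across all languages. The names `symbol_shares` and `most_distinctive` and the `samples` dict (language name to source text) are assumptions for illustration only.

```python
from collections import Counter


def symbol_shares(text):
    """Share of each symbol among all non-alphanumeric, non-space characters."""
    counts = Counter(ch for ch in text if not ch.isalnum() and not ch.isspace())
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()} if total else {}


def most_distinctive(samples, top=3):
    """For each language, list the symbols whose share most exceeds the mean.

    samples maps a language name to a string of source code.
    """
    shares = {lang: symbol_shares(text) for lang, text in samples.items()}
    symbols = set().union(*shares.values())
    mean = {
        s: sum(sh.get(s, 0.0) for sh in shares.values()) / len(shares)
        for s in symbols
    }
    result = {}
    for lang, sh in shares.items():
        ranked = sorted(symbols, key=lambda s: sh.get(s, 0.0) - mean[s], reverse=True)
        result[lang] = ranked[:top]
    return result
```

With a large enough sample per language, one would expect '|' to surface for shell and '$' for Perl, though that depends entirely on the input files.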

~~~
pwnguin
I really need to do reading and research on this, but I'm pretty sure that's
what Hidden Markov Models are for. You could watch a webpage go from HTML to
javascript and back!

------
swannodette
It's interesting to note that a big reason ( ) dominate in Lisp here is that
pg adopts the FP habit of short var names. If anything this is probably just a
measure of the tendency to use long vs short name - mainstream OO practice
encourages the former. It would be interesting to rerun the heatmap for Lisp
with a typical CLOS program. I think you'll find that ( ) no longer dominate.

EDIT: And in fact here's a heatmap of core.logic (1K LOC) which is fairly OO-
ish in its design - <http://twitpic.com/6hwj88>. ( ) are strong but do not
dominate everything.

UPDATE: And here's a 1.4K LOC Clojure program, core.match
<http://twitpic.com/6hwo8w/full>. ( ) again do not dominate.

------
saintfiends
This just graphically displays what I whine about most of the time. Why does
my pinky have to do most of the work? My pinky is pretty short and all the pinky
movements are awkward. It considerably slows down my code typing speed.

I wonder if there could be another keyboard layout specially made for
programmers. If you look at it you'll see that most of it has a similar
pattern.

~~~
tomjen3
It is funny that you should mention it, because I have spent a lot of time
finding keyboard layouts more suitable for programming.

In the end I settled on the programmer dvorak layout. It is basically a
standard dvorak keyboard but with the special keys you need to program moved
to easier locations (and the dvorak keyboard itself uses the homerow much more
efficiently, so there is much less strain on your fingers and your wrists).

~~~
saintfiends
I have thought about switching to Dvorak, but it's hard to find where I live.
I'd love to know what model or brand you settled for.

~~~
ZeroGravitas
If you mean buying a hardware keyboard with dvorak letters, then don't. If you
learn to touch type (and there's no point switching to Dvorak if you don't
intend to) then you don't need to look at your keyboard, indeed being able to
will only teach you bad habits.

Print a paper copy of the key layout to pin next to your monitor when starting
out and use a typing tutor like _gtypist_ to train yourself instead.

~~~
saintfiends
I have no problem typing without looking. The problem, as I mentioned above, is
that it is difficult to type with my pinky. It is kind of short (not oddly short, just
that I have shorter fingers than average) and it's awkward. That's what slows
me down.

So I'm looking for a layout which makes my other fingers work more.

~~~
ZeroGravitas
It sounds like a smaller and/or curved keyboard, e.g. the expensive Kinesis
contoured devices, might be a physical alternative to learning a new layout.

<http://www.kinesis-ergo.com/contoured.htm>

~~~
cytzol
Funny you mention the Kinesis, which can switch between qwerty and dvorak with
a keypress, without you having to consult the OS about it. (great if you can't
configure the computer you're using)

------
tomh-
Lisp: <http://dl.dropbox.com/u/2196687/lisp-keystrokes.png>

~~~
snprbob86
My experience with Lisp is minimal, and I'm a Vim guy, so I may be totally
wrong about this, but...

Doesn't nearly all serious Lisp developers use Emacs? And doesn't Emacs have
piles of shortcuts for wrapping/unwrapping/manipulating s-expressions? I'd
imagine that the resulting number of parens is wildly different than the
number originally typed.

Can any experienced Lispers comment on this? Where can I find a cheat sheet of
such shortcuts?

~~~
spacemanaki
Not an experienced Lisper, but yes, you're right. If you made heatmaps of the
original keystrokes, for a Lisper using Paredit, there would likely be a
decent highlight on the open paren but almost nothing on the close, as Paredit
inserts parens in pairs. Hitting ( inserts ().

I don't think Paredit is included in Emacs, but here's the relevant page on
the Emacs Wiki: <http://www.emacswiki.org/emacs/ParEdit> and here's a
cheatsheet: <http://www.emacswiki.org/emacs/PareditCheatsheet>

There's also something similar for Vim, or at least something bundled with
slimv, which is more than just Paredit (supporting something like SLIME)
<http://www.vim.org/scripts/script.php?script_id=2531>

------
robert_nsu
I looked at the heatmaps, then looked at my keyboard. The keys with rubbed out
labels nearly match his findings 100%. My 'N' isn't (only because the key is
slightly larger than my other keys). Other than that, he is spot on.

~~~
jevinskie
I'm curious, why do you have a large N key?

~~~
camtarn
Ergonomic style split keyboard, I'm guessing?

[http://www.compkeyboard.com/uploadpic/Microsoft%20Natural%20...](http://www.compkeyboard.com/uploadpic/Microsoft%20Natural%20Keyboard%204000-2.jpg)

~~~
robert_nsu
Yeah, that's the one. I use this for my office keyboard and a Logitech G510 at
home.

------
zyb09
Clearly ObjC programmers are the only ones who comment their code
responsibly.

~~~
fullmoon
(or have to ;) )

~~~
mmariani
If you code responsibly in ObjC there's no need to write comments. Therefore,
if someone "has to" comment ObjC code there's something wrong with their
practices.

The same holds for almost any language, but in ObjC that's just natural.

------
quellhorst
Would like to see a Dvorak version of this.

------
5hoom
Interesting to note the difference between C and C++ with regards to the '*'
and '&' keys.

I know there is a lot of raw pointer and address usage in C, but I'm surprised
at how little these keys show up in C++.

It's good to see that people are taking advantage of smart pointers ;)

(It's subtle though, so I could be reading too much into it).

~~~
frou_dh
Seems odd that < and > are said to be used more in C than in C++ (templates).

------
KirinDave
4 of my haskell files put into heatmap. One of them is an applicative-functor-
style use of attoparsec, which tends to have more punctuation than normal
haskell code. Even with the frequent use of :'s, $'s and ()'s, the
alphanumeric keystrokes dominate the input.

<http://fayr.am/9xkE>

You can compare this to the Lorem Ipsum text map and see it's only slightly
different: <http://fayr.am/9yk6>

I dunno what that means or what sort of value judgements it drives, but it's
pretty different from the other heatmaps.

~~~
jerf
And on Dvorak, all of the yellow letters except the "R" are on the home
row.... (R is on the O.)

------
jemfinch
This really needs to take into account modifier keys (in particular, shift).

------
duck
_Whitespace hasn’t been taken into consideration (tabs and spaces) which would
have been a cool thing to see._

I think if that was included this would be a lot more useful. Is there a
reason it wasn't?

~~~
tadfisher
Different editors take different amounts of effort to insert whitespace.

~~~
aangjie
Still without whitespaces, i have to look at the python heatmap a little
differently..

------
Newky
The JavaScript image shows little to no usage of the $ key. Doesn't say a lot
for jQuery usage.

~~~
s00pcan
In the codebase here we use 'jQuery' instead of '$'.

~~~
falcolas
Ditto. If you pull in multiple plugins or additional frameworks (for some
reason, we mix YUI with jquery), the $() notation can become broken, and you
have to explicitly call jQuery().

~~~
udp
Unless you put all your code in a big closure to make a correct $ in function
scope:

    
    
      (function ($)
      {
      
      }) (jQuery);

------
cwp
Here's one for Smalltalk. It's based on my .changes file - about 200K LOC,
with all the lines containing '----' and $! removed. What's left is, I think,
stuff that actually got typed into a browser.

You can definitely see $:, but otherwise it looks pretty much like English.

------
bryze
Has anyone done this for programmer Dvorak yet? I guess I'm just looking for
validation..

------
pa7
If anyone cares: I just added the DVORAK keyboard layout to the keyboard
heatmap and open sourced the code. Here is the repo URL:
<https://github.com/pa7/Keyboard-Heatmap>

------
danobeavis
apropos of the brainfuck reference earlier today, here is a brainfuck
interpreter, written in brainfuck, visualized through the keyboard heatmap.

<http://i.imgur.com/lSDYJ.jpg>

------
swannodette
Just to drive the point home, Clojure's core.clj is 6,500+ lines of Lisp,
funny enough, _parens do not dominate_ - <http://twitpic.com/6hwt28/full>.

~~~
daniel_solano
Well, there is a very good reason for this: Clojure tries to reduce the number
of parentheses by substituting other characters, namely square brackets [].
For example:

    
    
        ; Common Lisp
        (defun add (x y) (+ x y))
        ; Scheme
        (define (add x y) (+ x y))
        ; Clojure
        (defn add [x y] (+ x y))
    

Also, Clojure eliminates some parentheses that are used in other Lisps:

    
    
        ; Common Lisp (Scheme is the same, with else in place of t)
        (cond ((> x 0) 1)
              ((= x 0) 0)
              (t -1))
        ; Clojure
        (cond (> x 0) 1
              (= x 0) 0
              :else -1)

------
MrVitaliy
The title is misleading. They just extracted character frequencies from source
files, which fails to capture 'Delete', 'Shift', 'Ctrl', 'Alt', etc. keys.

Even has Paul Graham's name at the end, as if to say 'Look, this is totally legit!'

~~~
bostonpete
Well, Ctrl and Alt aren't related to the programming language (more the editor
or OS) and neither is Delete.

Shift is, but that could have (should have IMO) been extracted from the
character frequencies in the source files...
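
A rough sketch of how that extraction could look, assuming a US QWERTY layout and one Shift press per shifted character (so held-down Shift and Caps Lock are ignored). The names `SHIFTED_TO_BASE` and `key_presses` are invented here and have nothing to do with the heatmap's actual code.

```python
from collections import Counter

# Shifted symbol -> the unshifted character on the same physical key
# (US QWERTY assumed; other layouts need a different table).
SHIFTED_TO_BASE = dict(zip('~!@#$%^&*()_+{}|:"<>?', "`1234567890-=[]\\;',./"))


def key_presses(text):
    """Count physical key presses for text, with Shift as its own entry."""
    presses = Counter()
    for ch in text:
        if ch in SHIFTED_TO_BASE:
            # Shifted symbol: credit the base key plus one Shift press.
            presses[SHIFTED_TO_BASE[ch]] += 1
            presses["shift"] += 1
        elif ch.isalpha() and ch.isupper():
            # Uppercase letter: credit the lowercase key plus one Shift press.
            presses[ch.lower()] += 1
            presses["shift"] += 1
        else:
            presses[ch] += 1
    return presses
```

This folds '_' and '-' onto the same physical key while still recording how often Shift was needed, which is what a per-key heatmap would want.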

------
bfung
Of course there's also the missing shortcut keys. For example, Java projects
using an IDE would probably have ctrl-space be the most frequently pressed
keys (autocomplete).

------
landhar
The problem with this is that you can't tell the difference between numerals
and symbols or even worse between two symbols in the same key (such as '_' and
'-').

------
ori_b
Interesting. It seems strange that Javascript and Ruby seem to use 'r'
significantly less than other languages. I have no idea why that would be.

------
yvdriess
It appears that it just scans source files, not actual key presses. I barely
touch the parenthesis keys when programming Lisp for example.

------
hernan7
Perl programmers do seem to comment a lot... the "#" looks almost as heavily
used as the "$", which is mandatory for variables.

------
saintfiends
In reality though enclosing glyphs will not be very well balanced. Opening
brackets will be typed more than closing brackets.

------
dodo53
vim and emacs would be fun too :oP

------
4ad
based on my visual observation, apart from lisp, python seems to skew furthest
away from average. Its heatmap is much cooler, with fewer extremes. I wonder
why.

------
doki_pen
I'd love to see a keyboard layout based on data like this.

~~~
egiva
It already exists in terms of a more logical layout for faster typing of many
Latin-based languages. Instead of the QWERTY keyboard, it's called the Dvorak
layout and is really common in certain circles and places - I believe that in
OS X you can switch to this alternative layout pretty easily if you're
interested in learning a faster way of typing. More info here:
<http://atmac.org/dvorak-keyboard-layout-switching>

~~~
ZeroGravitas
There are also some programmer versions of Dvorak that make the punctuation
more accessible too (by demoting numerals to needing the shift-key to be
held):

[http://en.wikipedia.org/wiki/Dvorak_Simplified_Keyboard#Prog...](http://en.wikipedia.org/wiki/Dvorak_Simplified_Keyboard#Programmer_Dvorak)

Just as Dvorak was based on letter usage, this is based on examining large
bodies of code just like the heatmaps in the OP.

~~~
tomjen3
It is a bit more than just switching the keys and numbers around - it uses the
information from the code it examined to intelligently put the special chars
that are used most often where they are easiest to access.

And yes if you wonder, it is awesome to use.

------
killion
I bet the backspace key is used the most.

------
francescolaffi
time for a programmer keyboard layout? "ASERTNIOL" in the middle line would be
good for several langs

------
MicahWedemeyer
The most pressed keys are ⌘ (or CTRL), C, and V.

------
Kwpolska
Doesn't PHP use a colon at the end of every line? WTF?

~~~
shabble
No.

It uses a semicolon to delimit statements within blocks, but (afaik) placing
each statement inside its own <?php ... ?> tag would be valid.

There's also the alternative syntax[1], which is mostly used for templating
these days, which looks like

    
    
        <?php if ($foo): ?>
         ... html ...
        <?php endif; ?>
    

and is probably the only place other than a ternary operator that might
conceivably end a line with a colon.

[1] [http://php.net/manual/en/control-structures.alternative-synt...](http://php.net/manual/en/control-structures.alternative-syntax.php)

~~~
Kwpolska
but the <?php ?> don't seem to be used too much on this heatmap.

~~~
Androsynth
As php code gets more structured, it's moving towards MVC frameworks where php
tags are used sparingly in the views. (they are a fast way to make spaghetti
code)

