The impact of syntax colouring on program comprehension [pdf] (ppig.org)
125 points by joubert on Nov 26, 2015 | 86 comments

I've been writing code professionally for 20 years, and as an amateur for 30. I can parse code much quicker if it has syntax highlighting. If you don't, that's fine, but it doesn't mean you're more experienced or somehow better than those who do. I always hate conversations about this because they always seem to come down to a dick measuring contest where everyone seems to think that the more monochromatic it is, the longer it is. It's not. Use what feels right to you.

> dick measuring contest

A more experienced colleague: "Pah, that's just a crutch". Well, yes. And?

My personal tipping point was watching a sysadmin edit resolv.conf and type "namesrever": vim picked it out in reverse red. This mistake could otherwise have gone unnoticed for a while. Good crutch.

When I first started writing python, my number one error was writing "else" without the necessary colon. A simple adjustment to my syntax highlighter and I stopped that class of error completely.

Other crutches: keyboards (just enter your data with switches), non-volatile memory (reload everything manually at power on), compilers (generate machine code by hand), etc.

If someone says "pah, that's just a crutch" in a dick measuring contest, it's safe to say that he has won.

I can parse code much quicker if it doesn't have syntax highlighting... or so I thought until I tried a properly subtle colour scheme (offwhite with various dark foreground colours, as it happens). I wonder how many of the colour-objectors really object to inappropriate colours rather than to their existence. Pale blue on white, bright green on white, etc.

I'm also in the light background, dark text camp, when it comes to code at least. I think Visual Studio (using the light rather than dark or blue themes) has the colours just about spot-on for me, especially in the 2015 editions where they seem to have softened the palette a bit.

That said, I'm completely the opposite when it comes to the command line. I think colour coding can add a lot (ls for example) but it has to be light-on-dark or not at all. Funny how that works.

How that probably works is that with this distinction, you know at a glance which windows on your desktop are editor windows with code and which are command lines (or possibly editors that have suspended to the background to go to a command line).

It's ... meta-syntactic-highlighting for your entire desktop! Different types of windows are colored differently, just like different kinds of identifiers in one window.

Yeah. IMO, the particular color scheme matters. I think I'd get much more benefit, vs no highlight, from the color scheme I'm used to, than the one the researchers chose (somewhat arbitrarily).

I could easily spend an hour switching through color schemes, and have done so. It shows. My wife looks over my shoulder sometimes and says that it looks gorgeous!

I believe that the color scheme is crucially important; it's a form of ergonomics. Bad colors will lead to fatigue.

Certain choices are obviously bad, like hues that make comments or other elements almost impossible to read; insufficient contrasts and so on.

Yeah, I like syntax highlighting but prefer subtle colors. I do prefer a dark background. The "Solarized" scheme has been my favorite for a few years now.

Of the canned ones in Vim, what do you think of "fnaqevan"?

By the way, something is wrong: my Vim isn't responding to ":set background=light" (or dark). The colors change, but not the background.

In [this talk](https://youtu.be/b0EF0VTs9Dc?t=15m2s) by Douglas Crockford he talks about syntax highlighting being for kids - "that's something we put in our text editors for kindergarteners to do programming...I am more sort of a grown up and a professional programmer so I don't need the colours".

I found it weird that he was so derisive towards syntax highlighting.

He is working backward from his own personal preference and working awfully hard to justify that as if he came to that conclusion for logical reasons.

People often think they are acting logically when frequently they are working backwards to write narratives that justify the way they choose to act.

Low self esteem I'm guessing.

I wouldn't even think about releasing a programming language without an accurate syntax coloring definition for at least one major editor, maintained side by side.

Looking at code without coloring is like looking at a road atlas that has been photocopied in black and white.

Even some grayscale cues are better than nothing.

I actually run the Vim editor dynamically out of Apache to colorize TXR files. You can see that here and here:



The second example shows a mixture of two languages: a Lisp dialect and an extraction language. Both have their own sets of standard symbols (which intersect: there are some symbols in common). They are correctly rendered in a different color: the Lisp words are green, and the extraction language words are burgundy red. This kind of thing is very helpful.

The Vim definition I maintain is quite accurate --- and not only for correct programs. I stuffed in rules so that errors are boldly flagged. For instance, even little things like using an undefined escape sequence in a character string or regex literal. (And it knows the exact set for all literals.) This works remarkably well; hardly any typos I make get by it.

Syntax highlighting definitions also help with formatting; they provide the editor with nesting cues for indentation.

Working without this stuff in this day and age is completely silly.

I've been programming for 30 years now, I remember back before syntax highlighting. I had a bug that I couldn't track down - it took forever. I had an infinite loop that would only run 3 times and then exit.

Here was the problem:

  for (int i=0;i<3;++i) {
    /* Initialize important things.

  while (1) {
    /* Do ALL THE THINGS. */

Syntax highlighting prevents this entire class of bugs (as do compiler warnings, but hey). I told one of the other guys I was working with, and he introduced me to Emacs, which had just gained font-lock mode.
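For anyone who didn't spot it: the first comment is never closed, so everything up to the `*/` on the last line, including the `while (1)`, is comment text. A rough sketch of the sort of check a compiler warning could make here, written in Python for brevity. The function name `find_comment_problems` is made up for illustration, and the scanner deliberately ignores string literals and `//` comments, so it is a toy and not how any particular compiler does it:

```python
def find_comment_problems(source: str) -> list[tuple[int, str]]:
    """Return (line_number, message) pairs for suspicious C block comments.

    Hypothetical helper: flags a '/*' that appears inside an already open
    block comment, and a block comment that is still open at end of input.
    """
    problems = []
    opened_at = None  # line where the current block comment started, if any
    line = 1
    i = 0
    while i < len(source):
        pair = source[i:i + 2]
        if source[i] == "\n":
            line += 1
            i += 1
        elif opened_at is not None and pair == "*/":
            opened_at = None
            i += 2
        elif opened_at is not None and pair == "/*":
            problems.append((line, f"'/*' inside the comment opened on line {opened_at}"))
            i += 2
        elif opened_at is None and pair == "/*":
            opened_at = line
            i += 2
        else:
            i += 1
    if opened_at is not None:
        problems.append((opened_at, "block comment is never closed"))
    return problems


snippet = """for (int i=0;i<3;++i) {
  /* Initialize important things.

while (1) {
  /* Do ALL THE THINGS. */
"""

for line_no, message in find_comment_problems(snippet):
    print(f"line {line_no}: {message}")
```

Run on the snippet above, it points at line 5 and names line 2 as the place where the runaway comment began, which is the same information a comment colour gives you at a glance.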

The lack of syntax highlighting in your comment really illustrates the problem nicely. Took me longer than it should have to spot the problem!

Heh, I tried to scroll to the right to see the rest of the line. Swift's nested block comments would sort of fix the problem, but syntax highlighting would work with C.

I will say though that highlighting comments and strings separately from the code doesn't necessitate full syntax highlighting. I remember (and have saved somewhere) an article on syntax highlighting where only comments were highlighted.

Also automatic indentation.

Yes, this alone will often spot problems such as unterminated comments or unbalanced brackets. When the next line doesn't indent to where it should be, you know there's a problem somewhere above.
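The bracket half of that can be sketched in a few lines of Python. This is only an illustration of the idea (track nesting and complain when it stops making sense), not what any editor's indentation engine actually does; `check_brackets` is a hypothetical name, and the sketch ignores brackets inside strings and comments:

```python
PAIRS = {")": "(", "]": "[", "}": "{"}


def check_brackets(source: str):
    """Return a message describing the first bracket problem, or None."""
    stack = []  # (opening bracket, line number) for every bracket still open
    for line_no, text in enumerate(source.splitlines(), start=1):
        for ch in text:
            if ch in "([{":
                stack.append((ch, line_no))
            elif ch in PAIRS:
                # A closer must match the most recently opened bracket.
                if not stack or stack[-1][0] != PAIRS[ch]:
                    return f"line {line_no}: unexpected '{ch}'"
                stack.pop()
    if stack:
        ch, line_no = stack[-1]
        return f"line {line_no}: '{ch}' is never closed"
    return None


print(check_brackets("f(a[1], {2})"))
print(check_brackets("f(\n  g(1)\n"))
print(check_brackets("x = (1]"))
```

An auto-indenter effectively shows you the depth of that stack as whitespace, which is why a line landing in the wrong column is such a good hint that something above it is unbalanced.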

I wish we had more understanding of how effective different presentation styles really are when working with code.

For example, personally I much prefer quite mild syntax highlighting — depending on the language I might distinguish things like comments, literals, and maybe some routine keywords and punctuation that could be de-emphasized. I’d also choose colours that didn’t jump out, but were distinctive enough not to confuse, say, a literal string with an identifier.

I don’t see the attraction of colouring 67 different classes of code fragment in 67 barely distinct colours myself. On the other hand, many popular editors and colour schemes do this, so it would be interesting to know whether I’m missing out on something that might be helpful.

What I do find very useful is the kinds of context-sensitive highlighting that some editors will do, for example showing matching delimiters and maybe the region between the innermost ones, or showing all references to and any visible definition of the identifier under the cursor. Subtle highlighting is still preferable to in-your-face bold colours, at least for my taste, but I find these kinds of tools do help me to navigate and understand code more efficiently.

I've been experimenting with minimal highlighting using a theme I stole from jcs (mine (haskell, rust): http://imgur.com/jNgBHpn, his (c): https://i.imgur.com/PkNOjBZ.png) and I'm surprised how much I enjoy it.

edit: forgot to link to original post: https://lobste.rs/s/9ruzgw/screenshots_from_lobsters_users_2...

The dotfiles are linked in that post. It only really looks as pictured if you copy both the vim theme and the Xresources and use something that respects .Xresources like urxvt.

I often see language keywords and punctuation being emphasized, but I find emphasizing my own symbols more useful, since that is what defines my program, like nouns and verbs are the important parts of natural language. Reading code with emphasized punctuation makes me feel like I have the hiccups, pausing off-beat.

I also think emphasizing identifiers helps to see some of the structure of the code and makes it easier to notice if I misspell something.

I prefer my highlighting to not so much highlight specific things but to partition the code into easily parse-able blocks.

The default --angry fruit salad-- syntax highlighting of most editors and even other people's themes are usually too much for me. This is what I use: https://raw.githubusercontent.com/aerique/emacs-theme-aeriqu...

I dig it. Which font is that?

Terminus, I've tried out (and still try out) many fonts for programming but I always come back to this classic.

I think I might like to use this. Is it published anywhere?

Never mind the results, but 10 graduate students? That's a rather narrow margin of experience and small set in general.

n=10 is a small set. In addition, you don't know if there is a prior bias here. Say most of the 10 grads use syntax colors in their daily work. This is likely to perturb the numbers.

Figure 2 shows the plain samples have much greater variance, but does that translate to significance? Some analysis here would be nice, but with n=10 you don't have enough samples for this to be significant I'd wager.

Also, the test should have thrown in a color scheme which nobody has ever used as a control. I think the choice of color scheme matters a great deal as well.

> Say most of the 10 grads use syntax colors in their daily work. This is likely to perturb the numbers.

Wouldn't you say that is representative of the general programmer population?

Fascinating, but not surprising that the study concludes that the "effect becomes weaker with an increase in programming experience".

Maybe of interest is Hughes' discussion [1] of the different forms of syntax colouring. It was discussed on HN before [2].

[1] http://www.wilfred.me.uk/blog/2014/09/27/the-definitive-guid...

[2] https://news.ycombinator.com/item?id=8378799

Been programming for over 30 years, I like syntax highlighting, along with good formatting.

I do say that there's good and bad highlighting. I will tweak the highlighting colors and appearance if I can; I find many highlighting schemes are more distracting than useful. And I'm sure my preferences may not meld with others'.

Can we now remove the comment in .bashrc for every Ubuntu install:

  # uncomment for a colored prompt, if the terminal has the capability; turned
  # off by default to not distract the user: the focus in a terminal window
  # should be on the output of commands, not on the prompt

This comment annoys me. I want a coloured prompt so that I can focus on the output of the commands, which become neatly boxed by my coloured prompts.

Having a monospaced vs. proportional font test as a sort of control would have been nice.

I've been wondering what is the analogue of syntax highlighting for natural languages. Still experimenting with it, but most promising is simple keyword highlighting. I've written a Chrome extension that does this:


Interestingly, keyword highlighting apparently helps some dyslexic people focus their attention on the text. With the web full of distractions, I've noticed a similar effect myself. Keywords also seem to help with skimming, when you're reading the text with a specific purpose in mind.

> what is the analogue of syntax highlighting for natural languages

Nouns begin with a capital letter in German, and used to in Dutch and English until a few hundred years ago. Perhaps it was because nouns are generally spoken with the greatest stress of all words in an English sentence. Perhaps such capitalization is an analogue of programming syntax highlighting.

Sometimes long complex English sentences can be very confusing to me, especially in finding the verb in there. When two or three consecutive words have different grammatical meaning, and can be used in different grammatical meanings, it can become very confusing if English is not your mother tongue. In that case, color coding (or bold / italic / underscore) could help a lot. But for a word processor to do that, it has to understand grammar...

I've tried part-of-speech tagging and subjectively it doesn't seem to help much. POS taggers for English are a solved problem, so it can be done automatically, but for prototyping I tried marking up some text myself. Indeed what helped the most were the verbs (vs other parts of speech), but it still didn't seem to be worth doing.

Also marginally helpful was to delimit noun / verb phrases with 2 spaces instead of one.

> Nouns begin with a capital letter in German

Personally, I don't like that. I prefer if capitalization is a marker of names (e.g. the difference between apple and Apple).

I think of it as a space to explore. Punctuation is another preexisting visual aid in parsing. However, printed text is rather limited compared to the ways a computer can style text dynamically for you.

Edit: The fact that this capitalization disappeared suggests the returns from the effort of capitalization weren't so great. But now the investment per word is much lower because the process can be automated.

My Japanese is pretty elementary, but it's quite fun how most "conceptual" things like verbs, adjectives, and nouns are expressed via kanji, and they're related to each other via conjugation with hiragana, with particle tagging also in hiragana. And on top of that, foreign words or emphasis are expressed in katakana.

So for example, 私の名前はロデリックです。The words for "I" and "name" are expressed in "complex" kanji (私、名前), connectors and grammatical stuff in hiragana (の、は、です). And my name, transliterated, in katakana (ロデリック).

I've read somewhere that, as a result, Japanese is painful to write, but a pleasure to read. I'm not there yet but I can sort of see it.

>I've been wondering what is the analogue of syntax highlighting for natural languages.

Well, we have bold and italics to emphasize important parts of a phrase (and underline, which is often used in handwriting for that purpose).

Since we don't work with different colors when reading/writing, we also use things such as punctuation, e.g. ? to denote that a statement is a question, ! that it intends to show wonder, etc.

I'm used to writing with ~3-8 different colours. My colleagues are amazed (and make a bit of fun) that when I lend them an essay, every colour has a meaning. I also use block letters and cursive to subtly alter the meaning of a phrase.

I've done this since highschool, because a full page with single coloured text seems incredibly boring to me, and it draws my attention away.

That sounds incredibly difficult to read for most people. How sure are you that it's an effective method to communicate with other people? (Or is it just something you do for your own stuff, and other people get plain text?)

It's an effective method to write for myself, most of this stuff is meant as drafts that only I (or few people) are meant to read.

But it still sounds worse than it actually is, I promise it's a lot clearer than it sounds. Most of this is technical stuff with formulas or code, so it helps a lot more than it seems from the description. And the "full 8 colours" are only used in special cases with a lot going around.

That's interesting! Can I see such an essay of yours?

Most of my writing is in Spanish, is that a problem? I may have a few things written in English though.

English is better as I'd have to run Spanish through Google Translate. Email is in my profile if you don't want to post it here.

I would love to see it too!

> I've been wondering what is the analogue of syntax highlighting for natural languages

The Writer Pro app from iA has a "Syntax Control" feature: https://brooksreview.net/2014/02/does-syntax-highlighting-ac...

Thinking about it one of the things that bothers me about Haskell is that there's so little syntax that one can't really colour it. Coupled with the lack of brackets or any other visually distinct symbols, a piece of code always looks like an undifferentiated mass.

I personally find my syntax highlighting preferences change with mood and time. Mostly I find them useful. Sometimes I've found them to be tiring to look at, at which point I use the monochrome[1] theme, which is a good compromise between no colors and many colors (i.e. 2 or 3 colors).

[1] https://github.com/fxn/vim-monochrome

For Xcode (Objective-C) I've recently started using the Salander theme, which looks kind of similar:


Having colors for different classes of code fragment just helps to reduce the amount of processing of the code in the brain. Seems to me that this is a win, at least it is for me.

In daily life I use a very colorful scheme (darcula on steroids) and just thinking about monochrome makes my head hurt.

I started learning programming without syntax highlighting (on a monochromatic display), and actually find it distracting to read multiply-coloured text. In fact I tend to close my eyes whenever I "mentally execute" code, as then I'm visualising variable values and output instead of the code itself.

Another article from the opposition, which I partly agree with: http://www.linusakesson.net/programming/syntaxhighlighting/

(Linus Akesson is what I'd consider a highly experienced programmer, and the rest of his site is worth looking at for lots of other interesting stuff.)

Oh god that article, I was waiting for someone to post it. I don't mean to insult your opinion but I just find it cliche because it is always posted in any discussion about syntax highlighting and it has some serious issues with false analogies. Comparing syntax highlighted prose to syntax highlighted code, equating typing on a keyboard to typing on a piano? Seriously? The author may or may not be a good programmer but that article stinks of BS.

It's nice to see that the comments section is still going strong, 8 years later.

> Comparing syntax highlighted prose to syntax highlighted code

This just makes me wish you could get syntax-highlighted prose. It's not necessary for languages you have a native understanding of, but think how helpful it would be for foreign languages. :/

Lack of syntax highlighting in prose is a feature - if you had it, the authors couldn't play you with stupid tricks like garden path sentences :).

Imagine how much more dense complex sentences could be if colors were used to disambiguate meanings, and yet still understandable!

Now I'm imagining a variant of Lojban, but where the colors have semantic meaning. It could be super dense.

I've read this article before, it's anecdotal evidence and he's jumping to one-sided conclusions (like "'=' and '==', the only situation where it would have been useful"), not leaving any space for YMMV.

I'm not questioning he's a very experienced programmer, but I know this type of highly skilled programmers - they're awesome working on solo projects and you'll be amazed at their nuclear alarm clock with Klingon language recognition, but working with them as a team tends to be an ordeal. I'm >80% sure I'd have to put the guy in this category had I met him.

Reminds me of Damian Conway, who finds syntax coloring distracting; I always wondered why. https://youtu.be/aHm36-na4-4?t=10m23s

In the video, he shows the standard coloring for vim. That coloring is indeed flashy and more distracting than no coloring at all.

I tried having no syntax highlighting for a few months, and it was usable, but not as good as with syntax highlighting.

"I tried having no syntax highlighting for a few months, and it was usable, but not as good as with syntax highlighting."

A lot of the problems I have with Vim and syntax highlighting come from using VT100 terminals or badly configured graphics systems. As a matter of personal taste, one style I do use is the old Borland colours which came out with Turbo Pascal. Works on all the monitors I've got ~ http://www.vim.org/scripts/script.php?script_id=92

I wonder if these results are somehow related to the weird fact that dyslexics have far less trouble understanding coloured text (even a transparent coloured overlay can be sufficient).

I wonder about the potential of syntax highlighting English.

Syntax highlighting works because colour components are preattentive i.e. your brain picks out hue, saturation, intensity/luminance differences in constant time. Size, proximity and orientation work too.

That's why some schemes work better than others: what you want to find quickly is key structural signals in the code. If you highlight comments strongly, say, it'll miss the point (unless you're proofreading comments, of course).

So, to your question, the value will depend on whether or not there are structural cues that it'd help sufficiently to highlight.

Interestingly we do do that to some extent already. For example, indenting and capitalising the start of sentences, indenting (and spacing) paragraphs, italics and bolding for emphasis. That sort of thing.

So using highlighting in the places we currently do could well improve what we already have. For example, colouring the first word in a paragraph or increasing the font of the sentence capital.

What would be fascinating to study would be if we could use it to help with word or sentence stress. I could imagine that, done carefully, this could help with learning languages.

So for example you decide to mark adjectives with blue. Then you have: The grass is green. Nope, thanks.

I don’t understand figure 5. The left and right images differ not only in whether the code is highlighted; they show two completely different functions. Of course eye tracking is going to be different.

Even though it’s not really worth noting due to the first issue, the right example also seems to require more focuses (87 over 65)?

Funny that nobody has mentioned unbalanced parentheses yet... Too much color can be distracting, but catching paren structure at a glance is priceless. Very interesting article, we need more of this 'measure, don't speculate' kind of science.

Here's Douglas Crockford on coloring he'd like to see in his code editor:


Next hypothesis to test: Does the existence of syntax highlighting enable less skilled programmers to write overly complex code?

Note that experience has exponential speedup for completion time.

From figure 4 we can draw the conclusion that the most experienced programmers shouldn't use syntax highlighting?

For 9 of the 30 examples, syntax coloring resulted in poorer performance. In about 5 examples, much poorer performance. The experimenters showed no interest in finding out why. Yet they had eye-tracking data. Did syntax coloring attract attention to the wrong text for some problems? Which ones?

The fault lies in the simplicity of the test. Experienced programmers will not be slowed by lack of syntax coloring when parsing a simple example, but it is reasonable to think that they still benefit from it in larger, more complex codebases -- that was not studied here.

Note that that is a negative log, and so isn't saying the person was slowed down (which you may or may not have thought, but just verifying).


Actually they were slowed down:

"Thus, if a participant completed the plain version of a task in 60s, and the highlighted counterpart in 30s, the time advantage for that task is 60s/30s = 2"

The part of the range below 0 corresponds to time advantages < 1, i.e. it took longer for the highlighted part than the plain one. Several times longer, if the results are to be believed (there's one around -0.7 and three at -0.5; that's a "time advantage" of approximately 0.2 and 0.3, or 5x to 3x slower.)
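To make the arithmetic explicit, here is a small sketch in Python. It assumes the plotted values are base-10 logs of the time advantage, which is what the 0.2 and 0.3 figures above imply but is an assumption about the figure all the same; the function names are invented for the example and the log values are the ones read off the plot by eye in the comment above:

```python
def time_advantage(plain_seconds: float, highlighted_seconds: float) -> float:
    """Ratio as defined in the quoted sentence: > 1 means highlighting was faster."""
    return plain_seconds / highlighted_seconds


def advantage_from_log(log_value: float, base: float = 10.0) -> float:
    """Undo the log scale. Base 10 is an assumption about the plot."""
    return base ** log_value


# The paper's own example: 60s plain versus 30s highlighted.
print(time_advantage(60, 30))

# Points read off the plot by eye, not exact data.
for log_value in (-0.7, -0.5):
    advantage = advantage_from_log(log_value)
    print(f"log {log_value}: advantage {advantage:.2f}, about {1 / advantage:.1f}x slower")
```

Under that assumption the two readings come out at roughly 0.20 and 0.32, i.e. about 5x and 3x slower with highlighting, in line with the numbers quoted above.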

Huh. You win this round. FWIW, I actually get pretty seriously tripped up for a moment when I am staring at syntax highlighting that is unlike the syntax highlighting I have become overly used to...

I think the only sensible conclusion we should draw is to find what works best for you and stick with it. :)

Syntax colouring is for children.


Crockford is always right.

He's joking, his proposed "scope highlighting" still has syntax highlighting.

What is good for children is most often also good for adults.

Of course it would be terrible if children were able to learn to code!
