For 9 of the 30 examples, syntax coloring resulted in poorer performance, and in about 5 of those, much poorer performance. The experimenters showed no interest in finding out why, even though they had eye-tracking data. Did syntax coloring attract attention to the wrong text for some problems? Which ones?
The fault lies in the simplicity of the test. Experienced programmers will not be slowed by lack of syntax coloring when parsing a simple example, but it is reasonable to think that they still benefit from it in larger, more complex codebases -- that was not studied here.
"Thus, if a participant completed the plain version of a task in 60s, and the highlighted counterpart in 30s, the time advantage for that task is 60s/30s = 2"
The part of the range below 0 corresponds to time advantages < 1, i.e. the highlighted version took longer than the plain one. Several times longer, if the results are to be believed (there's one around -0.7 and three at -0.5; that's a "time advantage" of approximately 0.2 and 0.3, or roughly 5x and 3x slower, respectively).
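To make that arithmetic concrete, here's a rough sketch. It assumes the plotted axis is log10 of the time advantage, which is my reading of the numbers above rather than something the quoted definition states. The function names are made up for illustration.

```python
def time_advantage(plain_seconds, highlighted_seconds):
    """Ratio from the quoted definition: plain time / highlighted time."""
    return plain_seconds / highlighted_seconds


def advantage_from_axis(axis_value):
    """Undo the (assumed) log10 scaling of the plot's axis."""
    return 10 ** axis_value


# The quoted example: 60s plain vs 30s highlighted gives an advantage of 2.
print(time_advantage(60, 30))

# Points read off the plot by eye; the values are approximate.
for axis_value in (-0.7, -0.5):
    ratio = advantage_from_axis(axis_value)
    slowdown = 1 / ratio  # how many times slower the highlighted version was
    print(f"axis {axis_value}: advantage ~{ratio:.2f}, ~{slowdown:.1f}x slower")
```

Under that assumption, -0.7 and -0.5 come out at about 0.2 and 0.32, which is where the 5x and 3x figures come from. If the axis uses a different base or transform, the numbers change.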
Huh. You win this round. FWIW, I actually get pretty seriously tripped up for a moment when I am staring at syntax highlighting that is unlike the syntax highlighting I have become overly used to...