I get that these are the errors that are simple to analyze. Reference/value confusion, for instance, would be more interesting, but I guess that's harder to autodetect.
The first programming language I used had magic keys that produced whole keywords (e.g., pressing "P" produced "PRINT"). Typos were impossible! If you still managed to type a line with a syntax error, the cursor changed to a flashing $ over the error, and you literally couldn't submit the line of code until you'd fixed it. And that was on a machine with 1 kilobyte of RAM and a < 4 MHz processor!
It might seem silly, but the biggest impediment I can think of is copy-pasting code from elsewhere. If your variable names are different, the IDE should reject the code (since you shouldn't be able to commit variables that don't exist), but then what do you do? Open it in Notepad and "fix" it?
EDIT: structured haskell mode, complete with GIFs showing how it works https://github.com/chrisdone/structured-haskell-mode#feature...
Of course, in Smalltalk you do work with code-as-text, but it then becomes code-as-program (technically bytecode, with, among other things, a text representation). That does seem more reasonable than the anachronistic insistence on text -> parse-to-AST -> (whatever magic, and however many transforms, the actual compile part is).
Then again, we all know what a rich editor is. It's MS Word. And MS Word eats documents and deprecates file formats.
On the other hand, I think Emacs Org-mode, GIMP images, and Leo[1], the literate editor written in Python, are examples of more pleasant "rich format" editors. It is a bit odd that what we take for granted for image files (edit history, binary format++) we fear in our IDEs.
See for example the Lisp 1.5 manual. Appendix E describes the handling of images:
> Overlord is the monitor of the LISP System. It controls the handling of tapes, the reading and writing of entire core images, the historical memory of the system, and the taking of dumps.
That's such a horrible expectation. Do we expect children to spell correctly right off the bat? Or do we teach them, slowly & patiently, instead? Or are you saying we should just let autocorrect be their guide?
No? Then why would programming languages be any different? It's part of the learning process. And it's by far one of the easier parts.
D can arguably be mitigated by differentiating the bitwise operators more clearly from their logical counterparts (for example, by spelling the logical operators in plain English).
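Amusingly, C itself already offers a plain-English option: the standard <iso646.h> header defines and, or, not, bitand, bitor, etc. as macros for the symbolic operators. A minimal sketch of how that reads (the flag values are made up for illustration):

    /* The logical/bitwise distinction is one keystroke with symbols,
       but a whole word with the <iso646.h> spellings. */
    #include <iso646.h>
    #include <stdio.h>

    int main(void)
    {
        int flags = 0x5, mask = 0x4;

        if (flags & mask && flags != 0)             /* & vs && : easy typo */
            puts("symbolic operators");

        if ((flags bitand mask) and flags not_eq 0) /* hard to typo */
            puts("plain-English operators");

        return 0;
    }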
Many more are better handled at the editor/IDE level, but this seems like a really interesting read for anyone involved in PL design.
C broke BCPL's syntax, which was clear, memorable, and consistent, and replaced it with a math-like syntax that must have wasted millions of person-hours of debugging time in the decades since - for no good reason.
Similarly, & (address-of) and && (logical AND) are too close and too easy to typo.
Language design really should have considered human factors much more than it did.
Not to mention & (bitwise AND).
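To make the hazard concrete, here's a minimal sketch of the three meanings of & in C. The one-character typo between & and && compiles cleanly and only misbehaves at run time:

    #include <stdio.h>

    int main(void)
    {
        int x = 2, y = 4;
        int *p = &x;      /* & as address-of */

        if (x & y)        /* & as bitwise AND: 2 & 4 == 0, branch not taken */
            puts("bitwise");

        if (x && y)       /* && as logical AND: both non-zero, branch taken */
            puts("logical");

        (void)p;
        return 0;
    }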
    if (x = y)
        x += 10
The optional parens remind me that this is a deliberate assignment. But as someone who now teaches Python, I thank the Python gods for disallowing such syntactic sugar. I've found it impossible to overestimate how difficult it is for beginners to grok "a = b"...and can you blame them, after years of math instruction telling them that the equals sign means something else? (Never mind the clusterfuck that occurs when trying to teach SQL -- which also uses a single equals sign for equality -- in tandem with a scripting language. The cognitive difficulty is so high that I've considered switching to teaching R, which at least has the optional arrow operator, "a <- b".)
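For anyone who hasn't been bitten by this in C: a minimal sketch of the pitfall and of the extra-parentheses convention (GCC and Clang warn on the first form under -Wparentheses, which -Wall enables, and stay quiet on the second):

    #include <stdio.h>

    int main(void)
    {
        int x = 0, y = 5;

        if (x = y)        /* typo for x == y: assigns 5 to x, then tests it */
            puts("taken whenever y is non-zero");

        if ((x = y))      /* extra parens signal a deliberate assignment
                             and silence -Wparentheses */
            puts("deliberate");

        return 0;
    }

Python, by contrast, rejects "if x = y:" outright as a syntax error, and when assignment-in-expression finally arrived in Python 3.8 it got the visually distinct walrus operator, "x := y".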
Moreover, good syntax highlighting or an IDE with some static analysis in it would help a lot too; that might be a useful thing to put in intro programming classes. Eclipse is free, right? I use IDEA-based editors for most of my work, but even the out-of-the-box syntax highlighting in, say, Emacs (at least on the OSes/distros I'm familiar with) would go a fair distance toward this goal. I assume the same would be true of vi.
It's sad that so many new programming languages chose exactly this error-prone syntax.
C was a very good programming language, and the terse syntax may have some appeal -- but for learning programming, this syntax is not the best option, unless you intend it as a kind of intellectual test to find the best computer people...
"The right thing for the wrong reasons".
(2) The authors assume every type error is unintentional. This may not be true: consider transitioning from using a String to represent a number (e.g., a command-line argument) to a numeric type. The transition may be done to check for errors up front and to avoid parsing the number in multiple locations. After the programmer changes the type, all those locations will be pointed to by type errors.
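As a concrete sketch of that transition in C (the set_port API here is purely hypothetical): parse the string once at the program's edge, and the compiler's diagnostics then enumerate every leftover call site -- which is exactly what the programmer wanted.

    #include <stdio.h>
    #include <stdlib.h>

    /* Before the change (hypothetical old API):
     *     void set_port(const char *port_str);   // re-parsed the string inside
     */

    /* After the change: parse once at the boundary, pass a number around. */
    static void set_port(long port)
    {
        printf("listening on %ld\n", port);
    }

    int main(int argc, char **argv)
    {
        long port = (argc > 1) ? strtol(argv[1], NULL, 10) : 8080;
        set_port(port);          /* fine */
        /* set_port(argv[1]); */ /* any old call site now draws a compile-time
                                    diagnostic -- the intended "type errors" */
        return 0;
    }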
What matters more are the mistakes that people continue to make even after they are no longer novices.
> Knowledge about students’ mistakes and the time taken to fix errors is useful for many reasons. For example, Sadler et al. suggest that understanding student misconceptions is important to educator efficacy. Knowing which mistakes novices are likely to make or find challenging informs the writing of instructional materials, such as textbooks, and can help improve the design and impact of beginner’s IDEs or other educational programming tools.
In other words, yes, understanding what types of errors novice programmers make can be very interesting and useful.
I had my language-designer goggles on, but you and the paper are right that educator goggles matter too.