
Lexical differential highlighting instead of syntax highlighting - based2
https://wordsandbuttons.online/lexical_differential_highlighting_instead_of_syntax_highlighting.html
======
dwheeler
I understood the problem, but I found the page's explanation a little
confusing at first. In particular, "lexical differential highlighting" misled
me, because the word "differential" made me think that his algorithm was
comparing lines or tokens in some way, and it doesn't do that.

Basically, this algorithm tokenizes the source code, and tries to color each
token so that _identical_ tokens have the same color, but _similar-looking_
tokens have very different colors. When tokenizing it specially handles
comments and quoted text.
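
A minimal sketch of that idea, not the author's code: the tokenizer regex, the function names, and the hash-to-hue mapping are all assumptions made for illustration. Hashing guarantees that identical tokens share a color and makes it likely, though not certain, that similar-looking tokens land far apart.

```python
import colorsys
import hashlib
import re


def tokenize(line):
    # Hypothetical tokenizer: a quoted string or a ';' comment stays whole,
    # everything else is split on whitespace and commas.
    pattern = r'"[^"]*"|;.*|[^\s,]+'
    return re.findall(pattern, line)


def token_color(token):
    # Derive a hue from a hash of the token text, so the color depends on
    # every character and a one-letter change usually moves the hue a lot.
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    hue = digest[0] / 256.0
    r, g, b = colorsys.hls_to_rgb(hue, 0.45, 0.9)
    return "#{:02x}{:02x}{:02x}".format(
        round(r * 255), round(g * 255), round(b * 255)
    )


if __name__ == "__main__":
    for tok in tokenize('mov %ecx, %eax ; copy "x"'):
        print(token_color(tok), tok)
```

With only 256 hues in this sketch, two different tokens can still collide; a real implementation would need to check for that.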

That's an interesting approach to countering errors from "it's almost the same
but I didn't notice they were different". I wonder - if I were trying to
review source code that was malicious, maybe I could vary the color algorithm
using a random source so that the source code writer couldn't make different
tokens look similar in color. That might be an interesting countermeasure to
some kinds of underhanded code.
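
One way that could look, purely as a sketch under my own assumptions: key the token hash with a secret chosen by the reviewer, so the code's author cannot precompute which tokens will end up with similar hues. `make_colorizer` and its hue range are hypothetical names and choices, not part of the linked page.

```python
import hashlib
import hmac
import secrets


def make_colorizer(key=None):
    # A fresh random key per review session (assumption: 16 bytes is plenty).
    key = key if key is not None else secrets.token_bytes(16)

    def hue(token):
        # Keyed hash of the token; the first two bytes are scaled to 0..359.
        mac = hmac.new(key, token.encode("utf-8"), hashlib.sha256).digest()
        return int.from_bytes(mac[:2], "big") * 360 // 65536

    return hue


if __name__ == "__main__":
    hue = make_colorizer()
    for name in ("p_value_default", "v_value_default"):
        print(name, hue(name))
```

Identical tokens still get identical hues within a session, but the mapping changes every time a new key is drawn.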

~~~
saagarjha
Yeah, I thought this would do something like highlight all "mov" derivatives
the same way and was somewhat surprised at the brevity of the code at the
bottom…

------
kazinator
This idea is related to "rainbow parentheses" (e.g. for Lisp): different
levels of parens just get arbitrary different colors. But matching parens are
the same color, just like two occurrences of %ecx in the same line are the
same.
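
A small sketch of the rainbow-parentheses rule, assuming a fixed palette that is cycled by nesting depth (the function name and palette size are made up for illustration). Each paren gets the palette index of its depth, so a matching pair always shares an index.

```python
def paren_levels(text, palette_size=4):
    # Returns (position, character, palette index) for every paren.
    levels = []
    depth = 0
    for i, ch in enumerate(text):
        if ch == "(":
            levels.append((i, ch, depth % palette_size))
            depth += 1
        elif ch == ")":
            # Clamp at zero so a stray ')' does not produce a negative depth.
            depth = max(depth - 1, 0)
            levels.append((i, ch, depth % palette_size))
    return levels


if __name__ == "__main__":
    for entry in paren_levels("(a (b) c)"):
        print(entry)
```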

~~~
andrepd
It's legitimately one of the best features of Excel. Does anybody know how I
can achieve that in Sublime? The few options I found were subpar.

~~~
kaibee
Don't know about Sublime, but there's a plugin that does this for Visual
Studio.

[https://marketplace.visualstudio.com/items?itemName=TomasRes...](https://marketplace.visualstudio.com/items?itemName=TomasRestrepo.Viasfora)

Probably not helpful to you, but maybe some other lurker.

------
fake-name
There's a sublime text package that does this for a bunch of different
languages: [https://github.com/vprimachenko/Sublime-Colorcoder](https://github.com/vprimachenko/Sublime-Colorcoder)

I'm not involved in any way, I just ran it for a while at one point.

~~~
guessmyname
> _There's a sublime text package that does this for a bunch of different
> languages_

You don’t need a package for this, Sublime Text 3 already does this
automatically [1].

[1]
[https://www.sublimetext.com/docs/3/color_schemes.html#hashed...](https://www.sublimetext.com/docs/3/color_schemes.html#hashed_syntax_highlighting)

~~~
nh2
How can I use it?

The simplest way seems to be to use the "Celeste" color scheme which
implements this. Is this the only way? I'd like to use a dark theme, like the
default Monokai.

~~~
guessmyname
Yes, “Celeste” is the only theme with support for semantic highlighting.

For dark mode, I use this project —
[https://github.com/cixtor/monnokay](https://github.com/cixtor/monnokay)

------
gpspake
I remember Doug Crockford mentioning the idea of scope based highlighting for
JavaScript in a workshop years back and thinking it would be useful. Cool to
see it pop back up here.

Edit: Here's a scope based js highlighting repo that cites Crockford as the
inspiration but unfortunately he posted the linked description on Google+
so... uh... oops [https://github.com/azz/vscode-levels](https://github.com/azz/vscode-levels)

------
zokier
Complete tangent but one thing that I've wondered about modernish asm
mnemonics is how complex they are, and especially how much type information
they encode in a semi-structured way. Taking the author's example of PMULHUW,
the core operation is MUL(tiply), P for packed integers, H for high result, U
for unsigned, and W for word sized (16 bit). I feel like there must be a
better way to express the same thing that wouldn't lead to stuff looking like one
word all caps alphabet soup. I don't know exactly what that would be, spelling
out everything would probably make assembly way too verbose. So some sort of
middle ground would be nice.
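
To make the breakdown concrete, here is a sketch of the same information as structured fields instead of letters packed into one word. The `VectorOp` type and its field names are hypothetical; only the PMULHUW decomposition itself comes from the paragraph above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VectorOp:
    operation: str      # core operation, e.g. "multiply" for MUL
    packed: bool        # P prefix: packed integers
    result_half: str    # H: keep the high half of each result
    signed: bool        # U means unsigned, so signed=False
    element_bits: int   # W: word-sized elements, 16 bits


# PMULHUW spelled out field by field.
PMULHUW = VectorOp("multiply", True, "high", False, 16)


def describe(op):
    parts = [
        "packed" if op.packed else "scalar",
        "signed" if op.signed else "unsigned",
        f"{op.element_bits}-bit",
        op.operation,
        f"({op.result_half} half of result)",
    ]
    return " ".join(parts)


if __name__ == "__main__":
    print(describe(PMULHUW))
```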

~~~
chc4
> I feel like there must be a better way to express the same thing that
> wouldn't lead to stuff looking like one word all caps alphabet soup.

Yes, that's called a programming language :^)

Assembly is usually essentially a macro engine over the actual instructions
you are emitting for your processor, and the Intel x86 chip manuals or
whatever you're targeting use the outrageously long proper names, so your
assembly will too. Heck, the author mentions specifically _reading_ assembly
too, so knowing what you're reading is 1:1 with the actual instruction stream
is helpful, no matter how bad the official names are.

 _Actual_ programming languages just abstract away some complex instructions
like SSE vectorizing (which have famously terrible names) to some high-level
API and intrinsic functions. And you should too.

~~~
zokier
> the Intel x86 chip manuals or whatever you're targeting use the outrageously
> long proper names, so your assembly will too.

I don't see why that has to be the case; why must I use Intel-specified
mnemonics instead of my own syntax? While not as radical, the AT&T vs. Intel
syntax demonstrates that the vendor syntax is not the only option. As long as
the syntax captures all the details of instructions to be completely
unambiguous then it should be perfectly interchangeable.

I specifically do not desire higher level of abstraction because I want to
maintain that 1:1 relation with the actual machine code. Heck, even Intel
mnemonics do not truly have 1:1 relation to machine code, because the
instruction (encoding) can depend on operand types.

------
lifthrasiir
[1] was a similar idea where color is determined by the prefix, so for example
`currentIndex` and `randomIndex` are distinguished from each other but
`currentIndex` and `currentIdx` are not.
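
A rough sketch of prefix-based coloring, under my own assumptions rather than the method in [1]: hash only the first few characters, so identifiers sharing a prefix share a hue. The function name and the prefix length of 4 are arbitrary choices.

```python
import hashlib


def prefix_hue(identifier, prefix_len=4):
    # Only the leading characters feed the hash, so the tail is ignored.
    prefix = identifier[:prefix_len].lower()
    digest = hashlib.sha256(prefix.encode("utf-8")).digest()
    return digest[0] * 360 // 256


if __name__ == "__main__":
    for name in ("currentIndex", "currentIdx", "randomIndex"):
        print(name, prefix_hue(name))
```

Here `currentIndex` and `currentIdx` always get the same hue; `randomIndex` will usually get a different one, though a collision is possible with this few hues.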

I'm not sure about either because i) there are only a handful of
mutually distinguishable colors ([1] does mention the same complication), ii)
we often want to highlight both the similarity and difference among
identifiers and the cutoff is not clear. For i) we may want to leverage more
formattings; for ii) I really don't have a good solution.

[1] [https://medium.com/@evnbr/coding-in-color-3a6db2743a1e](https://medium.com/@evnbr/coding-in-color-3a6db2743a1e)

------
css
Wow, this actually looks amazing for math (though it seems to be stripping out
a lot of the code I pasted in):
[https://i.imgur.com/Iur9FgK.png](https://i.imgur.com/Iur9FgK.png)

How difficult would it be to implement this as a VSCode extension?

~~~
petschge
This looks pretty good, but notice how it does not split
"log(difference_squared" into two tokens. Adding '(' and ')' as delimiters
should fix that.
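
A sketch of the effect, not the page's actual tokenizer: the `split_tokens` helper and its default delimiter set are assumptions. Adding the parentheses to the delimiter set is what separates the function name from its argument.

```python
import re


def split_tokens(line, delimiters=" \t,"):
    # Build a character class from the delimiter set and split on runs of it.
    pattern = "[" + re.escape(delimiters) + "]+"
    return [tok for tok in re.split(pattern, line) if tok]


if __name__ == "__main__":
    line = "log(difference_squared)"
    print(split_tokens(line))             # parens are not delimiters here
    print(split_tokens(line, " \t,()"))   # parens now split the token
```

Note that this sketch drops the delimiters themselves; a highlighter that needs to redraw the line would have to keep them.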

~~~
css
Good point. That helps, but it still strips about half of the lines of my code
out for some reason. Specifically, this part:
[https://i.imgur.com/L117fYm.png](https://i.imgur.com/L117fYm.png)

------
panopticon
Tangential, but "Just as every other piece of code on Words and Buttons, it's
properly unlicensed." reads like the code is literally unlicensed and not
using the Unlicense license.

It's a little weird to me because unlicensed code is very different than the
Unlicense license.

~~~
ChrisSD
And I'd add that CC0 is more "properly unlicensed" than the Unlicense is. Or at
least more thoroughly so.

------
canadaduane
I think this is also called semantic coloring. Visual Studio Code has it on
the roadmap to try this year:
[https://github.com/Microsoft/vscode/wiki/Roadmap#editor](https://github.com/Microsoft/vscode/wiki/Roadmap#editor)

~~~
jcelerier
KDevelop pioneered this a decade ago:
[https://zwabel.wordpress.com/2009/01/08/c-ide-evolution-from...](https://zwabel.wordpress.com/2009/01/08/c-ide-evolution-from-syntax-highlighting-to-semantic-highlighting/)

~~~
gmueckl
Eclipse has also had this for ages at this point. I don't remember when they
introduced it, but when you can memorize the meanings of all the colors, it's
great.

------
m0zg
I'm not a fan of this approach in general, but I am a fan of highlighting
instructions from different subsets in different colors in asm, and perhaps
differentiating the saturation by latency/throughput. I.e. a "heavy"
instruction should probably be bright, urgent red, whereas loads, stores,
adds, bit ops should probably be more muted.

------
IshKebab
Something like this is implemented in vscode-clangd. I used it for a bit but
it's just too colourful. There are just colours everywhere and it's
overwhelming. I went back to normal syntax highlighting.

------
KuhlMensch
Curious. I mean it sounds like relying simply on contrast rather than the
structure. I know our visual system is insane at contrast, and we, as humans
tend to group tokens as a shorthand.

What makes me immediately pause is when I reflect on reading JavaScript: How
often do I scan past 3+ lines using colour as my "bridge"? As far as I can
remember, not often. Maybe I've overestimated colour-to-lead-me-through-
structure. Maybe it is often, colour-to-give-me-token-rhythm. Curious.

I'll have to remember to load up CSS or a test suite (with lots of framework
calls) using this approach.

------
SilkySailor
I really like this idea. I always wanted to try to take this to insane levels.
For example, for large code bases have different images associated with
different modules. So that your brain has more things to latch on to. e.g.:
This function from the banana module is calling the teddy bear module. It
seems a bit absurd since there is no correlation between the image and the
module functionality but I still want to try it.

------
stochastimus
This is really cool. It kinda looks like rainbow salad, but who cares? For me
at least, it is much easier to visually parse.

------
DarmokJalad1701
Nice to see some MASM32 code in there in one of the examples. That's from a
WIN32 app if I am not wrong.

Brings back memories.

------
FrancisNarwhal
Oh my god this would have saved my bacon two days ago. p_value_default is so
visually similar to v_value_default that after sitting there with another
developer trying to figure out the problem for 30 mins we rewrote the whole
method.

Only the next day after the deadline pressure was gone did I spot the problem.

------
Avamander
I understand it in the case of assembly, but I don't think it'd work for
something like Python better than existing syntax highlighting. So it's nice
and I hope things like Radare or IDA adopt it where people even intentionally
make syntax highlighting nearly impossible.

------
ggm
I encourage the original author to find a way to talk about assembly coding in
the nuclear industry.

~~~
gcbw2
what do you expect to be different from your run-of-the-mill maintenance of
outdated industrial automation gig?

~~~
YeGoblynQueenne
At a guess, an increased probability of causing a criticality accident as a
result of getting a program slightly wrong.

------
pcwalton
In this particular case, the highlighting is a clever workaround for the fact
that x86 register naming conventions are awful. RISC architectures tend to
number the registers, which makes things significantly easier to read.

------
m463
Not code, but I'm surprised that email clients don't have better colorization
from the get-go.

I think it would be the single best thing to help a huge amount of people.

------
gnuvince
There are too many colors in too many places. Everything is highlighted and
nothing stands out.

~~~
galaxyLogic
I agree. Rather than rainbow the brackets I think a better solution is to
highlight the matching brackets with a temporarily different color as user
moves the cursor.

Or at least make it easy to turn the rainbows on and off.

------
Analemma_
> In 2013 I was working in nuclear power plant automation ... the job required
> reading a lot of assembly code.

Does anyone else find this _terrifying_? Nuclear power plant automation should
be done in the safest of the safe languages. I would be alarmed at the thought
of stuff like this being written in C, never mind in assembly!

~~~
pvg
Systems like that tend to be designed with different kinds of safeties. A
mildly silly example - your typical Rails app doesn't have a watchdog timer,
your toaster probably does.

~~~
okaleniuk
An excellent example!

------
splittingTimes
Does something like this exist for Java in Eclipse?

