
Myers Diff Algorithm – Code and Interactive Visualization - robertelder
http://blog.robertelder.org/diff-algorithm/
======
gefh
I don't want a diff algorithm that finds the mathematically smallest edit
distance, I want one that preserves meaningful chunks of code. I never delete
the closing } from one function and then the entire function following it
except its last }.
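
For reference, here is roughly what "mathematically smallest" means. This is a minimal Python sketch of the greedy forward search from Myers' algorithm. It returns only the number of insertions plus deletions, not the edit script itself. The name `myers_ses_length` is made up for illustration, and this is neither the article's code nor git's implementation.

```python
def myers_ses_length(a, b):
    """Return the minimum number of insertions + deletions turning a into b."""
    n, m = len(a), len(b)
    max_d = n + m
    # v[k] is the furthest x reached so far on diagonal k, where k = x - y.
    v = {1: 0}
    for d in range(max_d + 1):
        for k in range(-d, d + 1, 2):
            # Decide whether to arrive on diagonal k by stepping down
            # (an insertion) or by stepping right (a deletion).
            if k == -d or (k != d and v[k - 1] < v[k + 1]):
                x = v[k + 1]       # step down from diagonal k + 1
            else:
                x = v[k - 1] + 1   # step right from diagonal k - 1
            y = x - k
            # Follow the free diagonal "snake" while elements match.
            while x < n and y < m and a[x] == b[y]:
                x += 1
                y += 1
            v[k] = x
            if x >= n and y >= m:
                return d
    return max_d


if __name__ == "__main__":
    print(myers_ses_length("ABCABBA", "CBABAC"))
```

Nothing in that search knows what a function or a brace is. It only minimizes the count, which is why the result can split a block in an awkward place.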

~~~
taeric
This is a bit of a holy grail that, in practice, isn't actually that important.
[https://stackoverflow.com/questions/523307/semantic-diff-
uti...](https://stackoverflow.com/questions/523307/semantic-diff-utilities)
has a good overview of some of the possibilities.

Makes for amazing demos. Might some day be what we are all using. My hopes
have been dampened severely by how long this has taken to make any headway.
(And, honestly, it couples your diff tool to the rest of your tool chain
rather heavily. I can see why it has a hard time taking off.)

~~~
zaptheimpaler
I'm a newbie to how diffing is done in production: does git use vanilla
edit-distance diffing too?

~~~
peff
Git uses Myers diff, but recently added some heuristics to "shift" the hunks
in semantically meaningful ways. See [https://github.com/mhagger/diff-slider-
tools](https://github.com/mhagger/diff-slider-tools) for the experiments that
led to this feature.
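
To illustrate what "shifting" means, here is a small Python sketch that lists every equivalent placement of an added block. It is only an illustration, not git's code: `slide_positions` and the sample lines are hypothetical, the block is assumed to be non-empty, and the scoring that picks one placement is not attempted.

```python
def slide_positions(lines, start, end):
    """List every (start, end) placement equivalent to adding lines[start:end].

    `lines` is the NEW file and lines[start:end] is the added block
    (assumed non-empty). Each returned placement yields the same old file.
    """
    positions = [(start, end)]
    s, e = start, end
    # Slide up while the line just above the block equals the block's last line.
    while s > 0 and lines[s - 1] == lines[e - 1]:
        s -= 1
        e -= 1
        positions.insert(0, (s, e))
    s, e = start, end
    # Slide down while the block's first line equals the line just below it.
    while e < len(lines) and lines[s] == lines[e]:
        s += 1
        e += 1
        positions.append((s, e))
    return positions


if __name__ == "__main__":
    new_file = [
        "void a() {",
        "  foo();",
        "}",
        "void b() {",
        "  bar();",
        "}",
    ]
    # Function b (lines 3..5) was added; the hunk can also sit one line higher.
    for s, e in slide_positions(new_file, 3, 6):
        print((s, e), new_file[s:e])
```

Both placements are equally minimal edits. One starts at the closing `}` of the previous function, and the other covers the new function exactly, so a heuristic has to choose between them.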

It also supports a few other diff algorithms: `--patience` and `--histogram`,
but in my experience they produce the same output as Myers in most cases.

------
rajathagasthya
Another explanation of Myers diff algorithm that I found interesting:
[https://blog.jcoglan.com/2017/02/12/the-myers-diff-
algorithm...](https://blog.jcoglan.com/2017/02/12/the-myers-diff-algorithm-
part-1/)

------
fny
My God, that code is terrifying. It looks like output from a code obfuscator.

~~~
Boxxed
Pretty sure this is the same Eugene Myers I had worked with a bit in the past
-- he's the kind of guy who took, "Wow, this looks complicated," or "I'm
having a hard time following this," as the highest sort of compliment.

------
pbnjay
This algorithm is actually really useful in genomic sequencing - daily usage
for me right now. Cool to see it pop up here, although I think it's less
generally useful than other edit distance methods.

------
enriquto
Wow! This is very beautiful code and a beautiful website:

- single-letter variables, as in the rest of mathematics

- succinct without being obfuscated

- the _whole_ thing fits on your screen without any need to scroll (no
dependencies, no hidden pieces)

- the complex algorithm is clarified in the text below, which breaks it into
smaller parts and explains each one in detail

- comprehensive examples, figures and references

This is truly a model of exposition.

------
sordina
Seems like this would have been super useful when I was creating a Levenshtein
transition effect:
[https://jsfiddle.net/kdqost0z/2/](https://jsfiddle.net/kdqost0z/2/)

Not that efficiency really matters much for my use-case!

------
jgalt212
I always thought finding the optimal diff (from the user's perspective) was an
NP-complete problem.

~~~
Someone
I would expect that to be an _unsolvable_ problem, because there will be
plenty of cases where not all users agree on what is the optimal diff.

------
s_bywater
Wweereee

------
taeric
Haven't read it yet. Curious about the implication of the title. I'm expecting
the article to say that it could be asymptotically faster but practically
slower. :) (That probably says more about how I choose to read headlines than
anything else.)

Edit: Read the article now. Incredibly happy to see that the implication was
on purpose. :)

