
The Myers diff algorithm that is used in Git (2017) - dgudkov
https://blog.jcoglan.com/2017/02/12/the-myers-diff-algorithm-part-1/
======
ufo
By the way, git also comes with other diff algorithms which may give cleaner
diffs than the default one

[https://gist.github.com/roryokane/6f9061d3a60c1ba41237](https://gist.github.com/roryokane/6f9061d3a60c1ba41237)

Those alternative algorithms can be requested with command line flags like
"git diff --histogram" or "git diff --patience". The default algorithm can
also be changed via the git config file.

~~~
GolDDranks
Patience in this example looks a lot better for what I consider a good diff to
do a code review with. Is there any settings you could toggle in GitHub
Enterprise you could enable this diffing in the UI?

~~~
JoshTriplett
Note that the same user also has a gist showing a case where myers gives a
better result:
[https://gist.github.com/roryokane/79b8ebcb3813ebd934c4](https://gist.github.com/roryokane/79b8ebcb3813ebd934c4)

~~~
kohtatsu
Feels like there ought to be a cute scoring algorithm to determine which to
use. It is a pretty important UX to get right IMO.

I reckon it'd be nice to include which to use in the commit too.

~~~
swinglock
It'd be even nicer if git could figure it out.

From those two examples only, looks like fewer hunks are better, but so are
smaller hunks, at least when the number of hunks are equal.

~~~
GolDDranks
I hazard a guess that there doesn't exist a simple heuristic (such as "number
of chunks") that always does the right thing. I think that to achieve better
diffs, the diff tool should have some understanding over the structural
properties of the language – but that would, of course, mean that the diff
tool should be different for each language. (Or more like, there should be a
parsing frontend that parses a "labeled tree" structure, and then a generic
diffing algorithm over those kinds of trees.)

~~~
yencabulator
git's diff already has language-specific configurability, for example
xfuncname sets a "function declaration" or similar header (e.g. markdown
heading, org-mode section) that is shown in a diff.

Similarly, one could imagine a language-specific regexp that gives "anchor
points" such as declaration start, where one would prefer that a single diff
hunk doesn't mutate things across that anchor, if it doesn't have to.

------
jdsully
The one thing I really miss from perforce/source depot was seeing the original
code block when performing a merge. Its so much more difficult to merge just
seeing both new sections.

~~~
matvore
For diffs that include the common ancestor, add this to .gitconfig:

    
    
      [merge]
      conflictStyle = diff3

------
lawn
I'm currently going through his book, Building Git, and doing it with Rust.
It's a lot of effort but it's very good.

------
mitchtbaum
for diff algorithims in Rust, see
[https://nest.pijul.com/pijul_org/pijul:master/6afda19ba69307...](https://nest.pijul.com/pijul_org/pijul:master/6afda19ba693071c0f)

for a Rust diff library crates, see
[https://github.com/changeutils](https://github.com/changeutils)

------
dominotw
hey I got this 'edit distance' question in my FB phone screen last month

[https://leetcode.com/problems/edit-
distance/](https://leetcode.com/problems/edit-distance/)

graph search + dynamic programming to pass the phone screen.

Ofcourse the consideration here just the smallest number of edits.

------
DEADBEEFC0FFEE
I always wonder how it was done.

