
I think I'd appreciate some sort of "semantic grouping" of individual changes more than drawing some arbitrary line and classifying all changes below it as "trivial".

The problem is that even a lot of the changes that normally constitute clutter can become relevant in certain situations or even introduce bugs.

One example would be the ordering of Python imports: changing the order of imports should have no effect on program behaviour if all your packages are well-behaved, and in 99.99% of cases it indeed has none. But the fact remains that imports are statements that are executed and can have side effects. If a package does something nontrivial at import time, changing the import order can change behaviour. Hiding such a change could mask the introduction of a bug.
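To make the import-order point concrete, here is a minimal sketch. The module names (`demo_config`, `demo_a`, `demo_b`) are made up for illustration; the two "packages" each overwrite a shared setting at import time, so whichever is imported last wins.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical modules: demo_a and demo_b both mutate demo_config on import.
SOURCES = {
    "demo_config": "mode = 'default'\n",
    "demo_a": "import demo_config\ndemo_config.mode = 'a'\n",
    "demo_b": "import demo_config\ndemo_config.mode = 'b'\n",
}


def mode_after_imports(order):
    """Import the demo modules in the given order and return the final mode."""
    with tempfile.TemporaryDirectory() as tmp:
        # Write the modules to a throwaway directory and make it importable.
        for name, source in SOURCES.items():
            Path(tmp, f"{name}.py").write_text(source)
        sys.path.insert(0, tmp)
        try:
            # Forget earlier imports so every call starts from a clean slate.
            for name in SOURCES:
                sys.modules.pop(name, None)
            importlib.invalidate_caches()
            for name in order:
                importlib.import_module(name)
            return sys.modules["demo_config"].mode
        finally:
            sys.path.remove(tmp)
            for name in SOURCES:
                sys.modules.pop(name, None)


if __name__ == "__main__":
    print(mode_after_imports(["demo_a", "demo_b"]))
    print(mode_after_imports(["demo_b", "demo_a"]))
```

The two calls differ only in import order, yet they leave `demo_config.mode` in different states, which is exactly the kind of change an "imports reordered, nothing to see here" filter would hide.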

Hiding changes can also lead to confusion if you are trying to understand a series of changes that build on each other, or if all changes in a commit are hidden. I've had the latter situation with IntelliJ, where the working tree was shown as "unclean" but the diff was completely empty. It turned out the diff wasn't actually empty; IntelliJ was just set to hide the changes.

I think a more interesting solution would be to build a sort of "tree of changes": At the bottom, you'd have the individual changes in the file; one level up, the changes would be grouped into higher-level operations, such as "change formatting", "rename identifier", "remove field", "move function", etc. If possible, those could be grouped into even higher-level changes, such as "implement new class" or "extract expression into function", etc.
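A rough sketch of what such a tree could look like as a data structure. The `Change` class and the labels are hypothetical, and the hard part (inferring the higher-level nodes from the raw edits) is not attempted here; the tree is built by hand.

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    """One node in the tree: a raw edit (leaf) or a grouping of edits."""
    label: str
    children: list["Change"] = field(default_factory=list)

    def leaves(self):
        """Return the individual low-level changes under this node."""
        if not self.children:
            return [self]
        return [leaf for child in self.children for leaf in child.leaves()]

    def render(self, depth=0):
        """Return the tree as indented lines, one per node."""
        lines = ["  " * depth + self.label]
        for child in self.children:
            lines.extend(child.render(depth + 1))
        return lines


# Hand-built example: one high-level refactoring made of two mid-level
# operations, which in turn group the raw line edits.
root = Change("extract expression into function", [
    Change("move function", [
        Change("- lines 10-14 removed"),
        Change("+ lines 40-44 added"),
    ]),
    Change("change formatting", [
        Change("~ line 9 reindented"),
    ]),
])

if __name__ == "__main__":
    print("\n".join(root.render()))
```

Nothing is hidden in this model: a viewer could collapse a node by default, but the leaves are always reachable via `leaves()`.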




Agreed, I don't think the value of a semantic diff would be in hiding changes. Instead, the value should be in generating more useful diffs.

Normal diff often gets "confused" compared to how you'd logically identify the code. For example, if you extract a piece of a larger function into a smaller function, instead of showing that a piece of code was moved, it will show that you changed a header, deleted some lines, added others below, etc. A semantic diff should be able to present these changes in a better way, but shouldn't hide them. Even for whitespace changes, I'd like it to show the diff, but with an overlay explaining that only whitespace is different, so I know I don't need to look at it carefully.
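As a sketch of the "show it, but annotate it" idea, the following uses the standard library's `difflib` to keep every hunk and attach a note to those that differ only in whitespace. The function name `annotate_hunks` and the note strings are my own, and the whitespace check is deliberately crude: it collapses whitespace per line, so it would also label a Python indentation change as "whitespace only" even though that can change meaning.

```python
import difflib


def normalize(lines):
    """Collapse runs of whitespace and strip line ends (a crude heuristic)."""
    return [" ".join(line.split()) for line in lines]


def annotate_hunks(old_lines, new_lines):
    """Return every changed hunk together with a note, hiding nothing."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    hunks = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        old, new = old_lines[i1:i2], new_lines[j1:j2]
        if normalize(old) == normalize(new):
            note = "whitespace only"
        else:
            note = "needs review"
        hunks.append((tag, old, new, note))
    return hunks


if __name__ == "__main__":
    old = ["def f(x):", "\treturn x + 1", "", "print(f(1))"]
    new = ["def f(x):", "    return x + 1", "", "print(f(2))"]
    for tag, before, after, note in annotate_hunks(old, new):
        print(f"{tag}: {before} -> {after}  [{note}]")
```

Detecting moved code, as in the extract-function example, would need more than this (for instance matching deleted blocks against added blocks elsewhere), which this sketch does not attempt.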


I think the problem you'll eventually run into is figuring out intent from the diff. It seems like an easier version of reverse compiling.

When it comes down to semantic diffs I'm more interested in something like the Semantic Patch Language by Coccinelle. Being able to represent mundane refactorings across an entire codebase in a few lines seems great. And it unifies intent with the diff.


And just like that, another GPT-4 wrapper startup was born.



