
Ask HN: Why doesn't GitHub save just the AST, letting devs decide on formatting? - Eugeleo
Big companies and repos have standardised code styles, a large part of which is formatter settings. Would it be possible to track only the AST in git, leaving the formatting of the code 100% up to the viewer?

That would allow different devs at the same company to view the code however they like, separating meaning from style. And I'm sure working internally with only the AST would bring further benefits that I fail to see at the moment.

Is anybody working in this space? Or is it just too much work to be worth it?
======
gvb
GitHub is based on git and inherits its features, assumptions, and quirks.

1) Git's diff and merge tooling works on line-oriented differences between
text files. This works well for any and every text file regardless of what
the text represents. If you switched to ASTs, you would have to create an
AST-oriented diff algorithm.

1a) You would have to write a language parser to generate the AST for every
language you wanted to support.

2) ASTs generally don't represent comments. You would have to make your
parser understand and preserve comments.

2a) Comments are often formatted in a specific way, e.g. using spacing and
line breaks, to aid understanding. Reformatting the code may destroy the
information inherent in that formatting.

3) While there are "standardized" coding styles, there are innumerable corner
cases that the styles don't cover explicitly or that programmers "interpret
differently." Programmers tend to get unhappy if you reformat their code
_with_ ironclad justification and get really pissed if you do it _without_
ironclad justification.

So, yes, IMHO it is just too much work to be worth it.
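
To make point 2) concrete, here is a minimal sketch using Python's standard
`ast` and `tokenize` modules (Python 3.9+ for `ast.unparse`). The helper names
`roundtrip` and `comments` and the sample source are made up for illustration.
It only shows that a plain AST round trip drops comments, and that they would
have to be collected separately with their positions.

```python
import ast
import io
import tokenize


def roundtrip(source: str) -> str:
    """Parse the source to an AST and render it back to text."""
    return ast.unparse(ast.parse(source))


def comments(source: str) -> list[tuple[int, int, str]]:
    """Return (line, column, text) for every comment token in the source."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [
        (tok.start[0], tok.start[1], tok.string)
        for tok in tokens
        if tok.type == tokenize.COMMENT
    ]


# Hypothetical sample: one statement with a trailing comment.
SOURCE = "total = price*qty  # qty is never negative\n"

print(roundtrip(SOURCE))  # the comment is gone after the round trip
print(comments(SOURCE))   # the tokenizer still sees it, with its position
```

An AST-only store would need something like the second function's output
attached to nodes to avoid losing comments.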

~~~
Eugeleo
Thanks for the thoughtful comment. I think most of those things could be
solved, along the lines of

1) Just swap out the diff algo, keeping the rest of git intact.

1a) Is already done for most popular languages.

2)+2a) Save the comments verbatim, with whitespace. Shouldn't shake up the
normal AST very much.

3) No counterargument here, they would lose the cases that currently slip
under the radar.

But yeah, you're probably right... Too much work. Shame we didn't have this
from the start.
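
For what it's worth, part of 1) can be approximated today without changing
what git stores: git lets you assign a custom diff driver per path in
`.gitattributes`, and a driver's `textconv` command converts a file to text
before diffing. A rough sketch of such a filter for Python files follows. The
script name `normalize.py` and driver name `pyast` are made up, and this only
changes how diffs are displayed, not what is committed.

```python
import ast
import sys

# Assumed wiring (names are hypothetical):
#   .gitattributes:   *.py diff=pyast
#   git config diff.pyast.textconv "python3 normalize.py"
# Git calls the textconv command with a file path and reads its stdout.


def normalize(source: str) -> str:
    """Re-render source from its AST; layout and comments are discarded."""
    return ast.unparse(ast.parse(source)) + "\n"


def main(argv: list[str]) -> int:
    with open(argv[1], encoding="utf-8") as handle:
        sys.stdout.write(normalize(handle.read()))
    return 0


if __name__ == "__main__" and len(sys.argv) > 1:
    raise SystemExit(main(sys.argv))
```

With this in place, two versions that differ only in formatting produce the
same normalized text, so `git diff` shows nothing for them.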

~~~
karmakaze
This is something I've thought about, not from the perspective of source
management but editor operations. The best way to handle semantic diffs is to
record the change semantically. E.g. rename field in struct. This only works
for languages that can exactly and reliably find/change all instances.
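
A toy sketch of what "record the change semantically" could look like, using
Python's `ast.NodeTransformer`. The `RenameAttribute` record and `apply_change`
helper are invented for illustration. Note that it matches attributes by name
only, with no type information, so it renames every `.x` it sees. That is
exactly the reliability problem mentioned above.

```python
import ast
from dataclasses import dataclass


@dataclass(frozen=True)
class RenameAttribute:
    """A recorded change: rename attribute `old` to `new` everywhere."""
    old: str
    new: str


class _Renamer(ast.NodeTransformer):
    def __init__(self, change: RenameAttribute) -> None:
        self.change = change

    def visit_Attribute(self, node: ast.Attribute) -> ast.AST:
        self.generic_visit(node)  # handle nested attributes such as a.x.x
        if node.attr == self.change.old:
            node.attr = self.change.new
        return node


def apply_change(source: str, change: RenameAttribute) -> str:
    """Replay a recorded change against source and return the new text."""
    tree = _Renamer(change).visit(ast.parse(source))
    return ast.unparse(tree)


print(apply_change("p.x = p.x + q.y", RenameAttribute("x", "pos_x")))
```

The stored history would then be a list of such records rather than text
patches.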

------
richardjennings
What would a diff look like?
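
One possible answer, as a sketch only: dump each version's AST as indented
text and run an ordinary line diff over the dumps. This uses Python's `ast`
and `difflib` from the standard library. The function names are made up, and a
real tool would more likely diff the trees directly.

```python
import ast
import difflib


def ast_lines(source: str) -> list[str]:
    """Render the AST of the source as indented text, one node field per line."""
    return ast.dump(ast.parse(source), indent=2).splitlines()


def ast_diff(old: str, new: str) -> list[str]:
    """Unified diff between the AST dumps of two source versions."""
    return list(
        difflib.unified_diff(
            ast_lines(old), ast_lines(new), "old", "new", lineterm=""
        )
    )


# Formatting-only edit: the AST dumps match, so the diff is empty.
print(ast_diff("x=1\n", "x   =   1\n"))

# Real edit: the diff shows the changed Constant node.
print("\n".join(ast_diff("x = 1\n", "x = 2\n")))
```

The output is readable for small changes but gets verbose quickly, since one
source line expands into many node lines.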

