This is something I have noticed in practice: Copilot and other AI coding tools help you solve your use case faster because they see the patterns in your open files and reuse them.
While I was working at Facebook, one of the unspoken rules drilled into me was to avoid code repetition. Copilot does not go the extra length and tell you: "hey, these 2 code blocks are very similar, why not extract a common function?" It's on the developer to take care of these things.
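The kind of consolidation described above is exactly what an assistant won't suggest on its own. A minimal sketch (the function names and data shape are hypothetical, purely for illustration):

```python
# Two near-identical blocks of the kind an AI assistant happily generates,
# followed by the DRY version the developer still has to extract themselves.
# (Hypothetical example: names and logic are illustrative only.)

def report_active_users(users):
    active = [u for u in users if u["active"]]
    names = sorted(u["name"] for u in active)
    return ", ".join(names)

def report_admin_users(users):
    admins = [u for u in users if u["admin"]]
    names = sorted(u["name"] for u in admins)
    return ", ".join(names)

# The common function: same shape, with the varying part passed in.
def report_users(users, predicate):
    names = sorted(u["name"] for u in users if predicate(u))
    return ", ".join(names)

# report_users(users, lambda u: u["active"]) now replaces both variants.
```

Nothing deep here, but it is the step the tool skips: spotting that two generated blocks differ only in one predicate.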
I do love Copilot (it saves me from writing a lot of boilerplate code), but since it doesn't go that extra step, we probably need another abstraction (ha!) on top of the Copilot-generated code to fix this behaviour, or we just stop using Copilot and go back to writing code with an LSP.
I do think it's solvable though, and we will get there sooner rather than later.
I have not used AI-generated code in professional settings (due to institutional reasons), but I have thought about this and postulated that there is a high risk the code will become more bloated and lose architectural cohesion. It is interesting to have this hunch backed by some data.
Now this is not necessarily a bad thing. I have seen a lot of human-written code that tries to follow DRY principles, and the result is a horribly tangled mess where you have no idea what gets executed when just by looking at the code.
There is also hope that AI's summarizing ability could help you gain a better systematic understanding of the code. Perhaps its generalization capabilities could even help refactor the code and build up a good test suite.
But that remains to be seen, and so far a better awareness of the problem is itself an advantage.
"We examine 4 years worth of data" then they extrapolate (using two data points?!) to get five data points.
Can you call this data science? Yes. Can you call this stupid? I would argue for yes.
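To make the objection concrete: a line through two points fits them exactly by construction, so extrapolating from it carries no information about uncertainty. A sketch with made-up numbers (not values from the study):

```python
# Hypothetical illustration: with two data points, a linear "trend" is
# exact by construction -- zero residual, so there is no way to estimate
# error bars, yet the fit will happily "predict" any future year.
# (Years and percentages below are invented for illustration.)
years = [2022, 2023]
churn_pct = [3.3, 5.5]

slope = (churn_pct[1] - churn_pct[0]) / (years[1] - years[0])

def extrapolate(year):
    """Project the two-point line out to an arbitrary year."""
    return churn_pct[0] + slope * (year - years[0])

projection_2024 = extrapolate(2024)  # about 7.7, with false precision
```

Any pair of points yields a perfect fit like this, which is why projecting additional data points from two observations is closer to drawing a line with a ruler than to data science.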
Heise Media states that GitHub has also responded, but I cannot find the statement. According to [0], GitHub said the study did not control for other factors.
GitHub has also published research that makes the opposite claim [1].
Tbh I wouldn't trust either study.
[0] https://www.heise.de/news/Schlechte-Code-Qualitaet-durch-die...
[1] https://github.blog/2023-10-10-research-quantifying-github-c...