Very fascinating. I'm excited to see where they go with these data. The final paragraph is where the money is.
In particular, I think it'd be interesting to track students over time. Do some clusters have more difficulty picking up later concepts? Does a student's submission, while correct, reveal some systematic error in their mental model of the language or topic?
It'd be very cool to give qualitative feedback in addition to the quantitative unit tests based on these clusters. E.g., "Your code, while correct, is demonstrating characteristics that may be less maintainable than other submissions. In addition, we recommend a review of [some topic]; using those concepts would simplify your code."
From what I read, I don't believe it is. In my opinion it should also never be used for this purpose.
While you could probably catch a lot of cheaters this way, there is a possibility of a large false-positive rate. If so, I would especially advise against deploying this type of software at a traditional university, since academic dishonesty policies can often cause significant and undue harm to an innocent student.
As an instructor of programming at the university level, I like to think that I have enough sense to know that, particularly for "trivial" assignments, some similarity is expected. However, as I've encountered, a great deal of similarity across multiple assignments (and exams) between two students of the same nationality who sit together in class provides additional evidence of plagiarism.
So, yes, I agree a single data point of similarity is insufficient, but a history of similarity, particularly in complex projects, becomes more damning.
I got flagged as a freshman for "55% similarity" (whatever that meant) to another student's submission in a "learn how to write shit in C++" type assignment. As far as I could tell, the only thing that triggered the software was the fact that both the other kid and I used do-while loops, while nobody else in the course did. The rest of the programs were semi-similar, just a few lines of cout/cin/<</>>/... to ask your name and echo it back.
So basically what I'm saying is that "for 'trivial' assignments, some similarity is expected" isn't always widely understood, to the detriment of students.
I think these sorts of systems become most valuable when used to check work against submissions from previous years, to bust frat-house collections of answers, though varying questions from year to year probably helps even more in that regard. Similarity between complex projects, at the class sizes typical at my university (in classes advanced enough to have complex answers), was pretty easy to spot manually. Maybe edit-distance software is useful there to put some weight behind accusations?
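For what it's worth, the "edit-distance" comparison I have in mind is something like Levenshtein distance. A minimal Python sketch (not any particular tool's actual implementation):

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance: the minimum number of
    insertions, deletions, and substitutions needed to turn `a` into `b`.
    Uses a rolling row so memory is O(len(b))."""
    prev = list(range(len(b) + 1))  # distance from a[:0] to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to empty string
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if chars match)
            ))
        prev = curr
    return prev[len(b)]
```

Real plagiarism detectors would presumably run something like this over normalized token streams rather than raw text, so renaming variables or reformatting doesn't hide much.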
"Now it's a very long time since I did fractal geometry..." Yep, same here, I'm afraid. But a Pollock at the fundamental level is just atoms, which I would think is just another countable set. Still, research has been done into the fractal nature of his paintings; apparently, as he matured, the Hausdorff dimension increased.
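For the curious: that sort of dimension estimate is typically done by box counting, covering the image with boxes of shrinking size s and fitting the slope of log N(s) against log(1/s). A toy sketch on a 2D point cloud (an assumption for illustration, not the actual painting analysis):

```python
import math

def box_count_dimension(points, scales):
    """Estimate fractal dimension of a set of (x, y) points by box counting:
    count occupied grid boxes at each scale, then fit the slope of
    log N(s) vs log(1/s) by least squares."""
    counts = []
    for s in scales:
        boxes = {(math.floor(x / s), math.floor(y / s)) for x, y in points}
        counts.append(len(boxes))
    xs = [math.log(1 / s) for s in scales]
    ys = [math.log(n) for n in counts]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # least-squares slope = estimated dimension
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
           sum((x - mx) ** 2 for x in xs)
```

Points sampled along a straight line should come out with a dimension near 1; a dense filled region, near 2. A "partial" fractal, as above, would land somewhere in between over the scales where the fractal structure holds.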
I'm thinking of real phenomena exhibiting a 'partial' fractal nature. I think you are thinking in the pure maths realm.
BTW I still don't like his paintings generally. Manet and Holbein are more my thing, or Morandi on certain occasions.
Read the article: it's not code edit distance, it's AST edit distance. If the extracted function were the same as the inline code, I think this would have little effect, though that might depend on the AST parsing.
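To make the distinction concrete, here's a crude stand-in using Python's `ast` module. Comparing multisets of node types is not the proper tree edit distance the paper presumably computes, but it shows that extracting a helper function perturbs the tree only slightly, while the raw text changes a lot:

```python
import ast
from collections import Counter

def node_counts(src: str) -> Counter:
    """Multiset of AST node-type names -- a crude proxy for tree structure."""
    return Counter(type(n).__name__ for n in ast.walk(ast.parse(src)))

INLINE = "def f(xs):\n    return sum(x * x for x in xs)\n"
EXTRACTED = (
    "def sq(x):\n    return x * x\n"
    "def f(xs):\n    return sum(sq(x) for x in xs)\n"
)

# Counter subtraction keeps only positive counts: what extraction added
added = node_counts(EXTRACTED) - node_counts(INLINE)
```

The only structural additions are the extra function definition and the call to it; the arithmetic subtree (`x * x`) is identical in both versions, so a tree-based distance would score these as near neighbors.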