
A System for Detecting Software Plagiarism - lightonphiri
https://theory.stanford.edu/~aiken/moss/
======
todd8
I took a brief peek at the paper cited by the article
([http://theory.stanford.edu/~aiken/publications/papers/sigmod...](http://theory.stanford.edu/~aiken/publications/papers/sigmod03.pdf))
to see how the system works. It hashes n-grams of the submitted programs and
compares the hashes to measure similarity.
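
A minimal sketch of that idea, in the spirit of the winnowing scheme the paper
describes: hash every k-gram, keep the minimum hash from each sliding window as
a fingerprint, and compare the fingerprint sets. The function names, the CRC32
hash, the values of `k` and `w`, and the Jaccard score are all assumptions made
for illustration, not MOSS's actual implementation.

```python
import zlib


def kgram_hashes(text, k=5):
    """Hash every run of k consecutive characters, ignoring whitespace and case."""
    cleaned = "".join(text.split()).lower()
    return [
        zlib.crc32(cleaned[i:i + k].encode("utf-8"))
        for i in range(len(cleaned) - k + 1)
    ]


def winnow(hashes, w=4):
    """Keep the minimum hash from each window of w consecutive hashes."""
    if not hashes:
        return set()
    if len(hashes) < w:
        # Too short for a full window: fall back to a single fingerprint.
        return {min(hashes)}
    return {min(hashes[i:i + w]) for i in range(len(hashes) - w + 1)}


def similarity(a, b, k=5, w=4):
    """Jaccard overlap of the two fingerprint sets, from 0.0 to 1.0 (an assumed metric)."""
    fa = winnow(kgram_hashes(a, k), w)
    fb = winnow(kgram_hashes(b, k), w)
    if not fa and not fb:
        return 0.0
    return len(fa & fb) / len(fa | fb)
```

Because whitespace and case are stripped before hashing, `similarity("x = 1 + 2", "x=1+2")`
treats the two strings as the same text. Renamed variables would still change
the k-grams in this sketch, so a real tool presumably normalizes identifiers first.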

My own experience with plagiarism came when teaching a university course for
CS majors. For most, the class was their second course in programming. While
grading an early programming assignment I noticed a program that reminded me
of one of the early ones I had already graded. Comparing the two programs
revealed that they were identical, except that the variable names had been
changed to protect the guilty.

I wonder if this system would have detected this case of collaboration.

~~~
lightonphiri
Yes, MOSS does that.

I have been TA'ing a first-year Python programming course since the beginning
of this year. What you describe is the most common trick students use, even
though they have been told time and again that it will be detected.

As an example, MOSS reported 91% similarity between two students' submissions
for one of the module files.

    
    
      # Student 1
      at1 = input("Attempt 1:\n")
      at2 = input("Attempt 2:\n")
      at3 = input("Attempt 3:\n")
      maxi.append(max(eval(at1),eval(at2),eval(at3)))
    
      # Student 2
      attempt1 = input("Attempt 1:\n")
      attempt2 = input("Attempt 2:\n")
      attempt3 = input("Attempt 3:\n")
      maximums.append(max(eval(attempt1),eval(attempt2),eval(attempt3)))
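
One way a tool could see through renaming like this is to replace every
identifier with a placeholder before comparing. The sketch below uses Python's
standard `tokenize` module. The function name `normalize_identifiers` and the
`ID` placeholder are hypothetical, and this is not a claim about how MOSS does
it internally.

```python
import io
import keyword
import tokenize


def normalize_identifiers(source):
    """Return the token strings of source with every non-keyword name replaced by ID."""
    # Layout and comment tokens carry no program logic, so they are dropped.
    skip = {
        tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
        tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER,
    }
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in skip:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            out.append("ID")  # variables, functions, and builtins all collapse to ID
        else:
            out.append(tok.string)
    return out


student1 = "maxi.append(max(eval(at1), eval(at2), eval(at3)))\n"
student2 = "maximums.append(max(eval(attempt1), eval(attempt2), eval(attempt3)))\n"
print(normalize_identifiers(student1) == normalize_identifiers(student2))  # True
```

After normalization the two lines are token-for-token identical, so any
similarity measure applied afterwards no longer depends on the variable names.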
    

I find it interesting that MOSS compares the submissions pairwise, one student
against another.
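
To illustrate what comparing every student against every other involves, here
is a small sketch that scores all unordered pairs, which is n(n-1)/2
comparisons for n submissions. `rank_pairs` is a hypothetical helper, and
`difflib.SequenceMatcher` is only a stand-in scorer, not the measure MOSS uses.

```python
from difflib import SequenceMatcher
from itertools import combinations


def rank_pairs(submissions):
    """Score every unordered pair of submissions, most similar first.

    submissions maps a student name to that student's source text.
    """
    scored = [
        (SequenceMatcher(None, submissions[a], submissions[b]).ratio(), a, b)
        for a, b in combinations(sorted(submissions), 2)
    ]
    return sorted(scored, reverse=True)
```

The top of the returned list is where a grader would start looking by hand.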

------
skissane
I wonder, can MOSS detect cross-language plagiarism? For example, if a student
is asked to submit an assignment in Python, what if they find a solution on
the Internet in another language (e.g. Ruby), then translate it into Python?
Assuming MOSS had a database of code samples taken from the Internet to
compare against (containing the solution the student used), could it detect
this?

------
pvinis
we used moss once a few years ago while i was a ta. it was interesting. we
actually found a few similar programs. we were also doing a 15-minute oral
exam with the person's code in front of us, so among the ones that had similar
code, we were figuring out who wrote the original code, who copied, etc.

