
Plagiarizing and Paraphrasing Code from an Online Class for Content Marketing - minimaxir
http://minimaxir.com/2015/09/code-of-plagiarism/
======
danso
> _But the willful and unnecessary paraphrasing of code, replacing all the
> CamelCase variables with underscored variables, and code which is worse than
> the original, is hard to overlook. “I recommend you for example this edX
> course” does not give sufficient credit to the teachers and staff of
> CS100.1x._

Does anyone here have access to one of the code plagiarism detection systems
used at CS schools? How would it fare against this kind of obfuscation?

I know that a plagiarized program that just futzes with variable names is
probably computationally easy to detect. And even computationally expensive,
exhaustive comparisons would be feasible, because the average college only has
to deal with a few hundred to a thousand students a semester.
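To illustrate why renamed variables are easy to see through, here is a minimal sketch that replaces every identifier with a placeholder before comparing two programs. It uses Python's standard `tokenize` module; the function name `normalized_tokens` and the sample snippets are made up for illustration and are not taken from any real detection system.

```python
import io
import keyword
import tokenize


def normalized_tokens(source):
    """Return the token strings of `source` with identifiers replaced by "ID".

    Comments and layout tokens are dropped, so renaming variables or
    re-indenting the code does not change the result. Builtins such as
    `len` are also treated as identifiers and become "ID".
    """
    skip = {tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
            tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER}
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in skip:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            out.append("ID")  # any non-keyword name collapses to one symbol
        else:
            out.append(tok.string)
    return out


# Hypothetical snippets: same logic, CamelCase vs. underscored names.
original = "def wordCount(wordList):\n    return len(wordList)\n"
renamed = "def word_count(word_list):\n    return len(word_list)\n"
different = "def f(x):\n    return x + 1\n"

print(normalized_tokens(original) == normalized_tokens(renamed))
print(normalized_tokens(original) == normalized_tokens(different))
```

Comparing the normalized streams pairwise across a few hundred submissions is cheap, which is the point about class sizes above.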

But would the algorithm even have to be more sophisticated than doing a plain
diff and applying some heuristic, even just the ratio of identical lines to
total lines? My perception of students who would cheat in a CS 101 class is
that they are so incompetent that they don't even know how to change a
variable name in a program without breaking the code, never mind rearranging
the order of a loop operation. Any student who could confidently obfuscate
their plagiarism of a 101-level programming assignment is probably smart
enough to just do the program themselves.
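As a rough sketch of that heuristic, the function below computes the fraction of non-blank lines in a submission that also appear verbatim (ignoring indentation) in a reference program. The name `identical_line_ratio` and the sample programs are hypothetical, and any cutoff for "suspicious" would have to be chosen by the grader.

```python
def identical_line_ratio(submission, reference):
    """Fraction of non-blank lines in `submission` that also occur in `reference`.

    Lines are compared after stripping leading/trailing whitespace.
    Returns 0.0 for an empty submission.
    """
    def lines(text):
        return [ln.strip() for ln in text.splitlines() if ln.strip()]

    sub = lines(submission)
    ref = set(lines(reference))
    if not sub:
        return 0.0
    return sum(ln in ref for ln in sub) / len(sub)


# Hypothetical programs: a verbatim copy and a copy with one variable renamed.
reference = "total = 0\nfor x in items:\n    total += x\nprint(total)\n"
verbatim = reference
renamed = "s = 0\nfor x in items:\n    s += x\nprint(s)\n"

print(identical_line_ratio(verbatim, reference))  # 1.0
print(identical_line_ratio(renamed, reference))   # 0.25
```

The second result shows the weakness: a single renamed variable already drops the score sharply, so this heuristic only catches copies that were left essentially untouched.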

~~~
stpe
At the Royal Institute of Technology (KTH) in Stockholm, a tool called KATTIS
([https://kth.kattis.com/](https://kth.kattis.com/)) was developed that not
only checks submitted code for plagiarism but also verifies it against test
input/output and measures execution time and memory usage.

There is a paper with more details, "Five Years with Kattis - Using an
Automated Assessment System in Teaching":
[https://www.csc.kth.se/~gkreitz/kattis-fie11/](https://www.csc.kth.se/~gkreitz/kattis-fie11/)

The system is still being developed, and there is now an open version of it
with a huge problem archive (competitive programming style) used by several
universities: [https://open.kattis.com/](https://open.kattis.com/)

Trivia: KATTIS was originally developed largely by KTH's successful
competitive programming team "Three-headed monkey", consisting of some PhD
students
([https://www.csc.kth.se/~gkreitz/](https://www.csc.kth.se/~gkreitz/)) who
did their theses on things like streaming and encryption, which in turn
resulted in the company... Spotify.

------
imauld
> And more underscore variables?

Variable names and function names are supposed to use underscores, not
CamelCase, in Python. I wish people who write tutorials teaching Python would
at least glance at PEP 8.
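For reference, PEP 8 recommends lowercase names with underscores for functions and variables. A small sketch of what such a rename looks like mechanically is below; `to_snake_case` is a hypothetical helper, and it deliberately does not try to handle acronyms such as `HTTPResponse`.

```python
import re


def to_snake_case(name):
    """Convert a camelCase or CamelCase identifier to snake_case.

    An underscore is inserted before each capital letter that directly
    follows a lowercase letter or digit, then the result is lowercased.
    """
    return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name).lower()


print(to_snake_case("wordCount"))          # word_count
print(to_snake_case("removePunctuation"))  # remove_punctuation
print(to_snake_case("already_snake"))      # already_snake
```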

------
noobie
I can't seem to get the page to load, even though ping works:

    --- minimaxir.com ping statistics ---
    7 packets transmitted, 7 received, 0% packet loss, time 6223ms
    rtt min/avg/max/mdev = 443.694/691.180/969.051/196.041 ms

~~~
minimaxir
It should work fine after a reload or two.

I've been trying to debug connection issues for a long time without success.
Somehow, _curl_ works, but not _dig_, which baffles me.

I _think_ it's because of the interaction between CloudFlare and GitHub Pages.
If anyone has a solution, let me know.

------
nl
I did the first of the Spark courses too. I'd second the recommendation,
though I did it without knowing Python and I got through OK.

Is it possible that the author wrote this without mentioning the course to
avoid giving away the answers to the course?

------
tekromancr
Alternate theory: there was never an obfuscation attempt. They simply ran it
through a code formatter. Still, it's unattributed, so I am not sure it
matters.
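One way to probe the formatter theory, sketched here under the assumption that a formatter only changes layout: parse both versions with the standard `ast` module and compare the trees. Pure reformatting leaves the tree unchanged, while renamed identifiers do not. The helper `same_ast` and the snippets are illustrative only.

```python
import ast


def same_ast(a, b):
    """True if two Python sources parse to the same syntax tree.

    ast.dump() omits line and column attributes by default, so whitespace
    and line-break differences do not affect the comparison. Comments are
    discarded by the parser as well.
    """
    return ast.dump(ast.parse(a)) == ast.dump(ast.parse(b))


# Hypothetical snippets: one reformatted only, one with names changed too.
original = "def wordCount(wordList):\n  return len( wordList )\n"
reformatted = "def wordCount(wordList):\n    return len(wordList)\n"
renamed = "def word_count(word_list):\n    return len(word_list)\n"

print(same_ast(original, reformatted))  # True
print(same_ast(original, renamed))      # False
```

If the two versions in question differ in identifier names, a layout-only formatter would not account for that on its own.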

