This wasn't at some community college or anything. This was at Georgia Tech.
That's an extreme case, and I'm certainly not saying that just because it happened at a respected engineering school, that makes it acceptable. But my point is that in entry-level courses (like the ones where you'd be implementing a clip function), even professors grade on getting the job done. Code quality just doesn't enter the picture at that level.
The thing is, trying to teach good code directly is pointless. Your less bright students will accept the dogma and never actually understand how to apply it usefully. Your brightest students will see it as a bunch of useless bullshit that's holding them back.
If you want to teach good code, here's how you do it: Make a student write and maintain a large project. Make them keep it running for two years, while you make them add more and more features. Keep checking it against an automated test suite which they do not have access to, and grade them on its correctness. Give them the resources to learn about best practices, but never tell them they have to use them.
Then, at the end of two years, let them rewrite it from scratch. Then you will see a student who has learned the value of good coding practices.
Code auto-grading, at least at Coursera, is usually done by running comprehensive unit tests, which extensively test border cases as well. These test suites are often 5-10 times larger than the actual submitted code, and it is difficult to imagine anybody outside of this type of environment spending so much extra time designing (and testing!) test suites with 100% coverage.
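As a sketch of what such a suite looks like (a made-up example, not an actual Coursera grader; the `clip` function and test names are hypothetical), note how the border cases alone outnumber the "happy path" tests:

```python
import unittest

# Hypothetical student submission: clip(x, lo, hi) bounds x to [lo, hi].
def clip(x, lo, hi):
    return max(lo, min(x, hi))

# A grader-style suite is typically several times larger than the
# submission itself and leans hard on border cases.
class TestClip(unittest.TestCase):
    def test_inside_range(self):
        self.assertEqual(clip(5, 0, 10), 5)

    def test_below_lower_bound(self):
        self.assertEqual(clip(-3, 0, 10), 0)

    def test_above_upper_bound(self):
        self.assertEqual(clip(42, 0, 10), 10)

    def test_borders_are_inclusive(self):
        self.assertEqual(clip(0, 0, 10), 0)
        self.assertEqual(clip(10, 0, 10), 10)

    def test_degenerate_range(self):
        # lo == hi collapses everything to that single value
        self.assertEqual(clip(7, 3, 3), 3)
```

Run with `python -m unittest` against the submitted file; a real grader would of course ship the suite server-side so students never see it.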
Moreover, code submissions have to comply with (or implement, in the case of Java) predefined interfaces. And some courses (e.g. Scala) take style checker output into account (20% of the grade in the Scala course is decided by the style checker).
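In Python terms, the predefined interface might look like an abstract base class handed out with the assignment; the grader only ever calls methods declared on it, so a submission that deviates from the contract fails immediately. The `Stack` interface and names below are illustrative, not from any actual course:

```python
from abc import ABC, abstractmethod

# Hypothetical interface shipped with the assignment. The autograder
# calls only these methods, so the submission must match it exactly.
class Stack(ABC):
    @abstractmethod
    def push(self, item): ...

    @abstractmethod
    def pop(self): ...

    @abstractmethod
    def is_empty(self) -> bool: ...

# A student submission implementing the required interface.
class ListStack(Stack):
    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def is_empty(self):
        return not self._items

s = ListStack()
s.push(1)
s.push(2)
assert s.pop() == 2 and not s.is_empty()
```

Forgetting to implement any abstract method makes `ListStack()` raise a `TypeError` at instantiation, which is exactly the kind of hard contract the grader relies on.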
In summary, well-thought-out test suites and interface specifications demand well-designed code submissions; in real life, poor comments or sloppy expressions are a very minor nuisance compared to poorly designed interfaces and forgotten border cases.
I am mostly concerned with "soft" aspects. Just consider the case where a student has to define variables, but picks variable names in a language other than English, or where control flow in a submission is more convoluted than it would have to be. Those are the cases I discuss in the article.
Moments ago, someone left a very fitting comment on my blog:
"I am taking the edX CS169.1 course and I find that I will consistently have a "less than elegant" solution that the auto grader accepts but that I feel is sub-par. The irony is this class has a large BDD/TDD aspect and is teaching RED-GREEN-REFACTOR, but with an auto grader once its green there is little reason to go back and refactor."
If a fiction book has a great, gripping plot and interesting, relatable, wonderfully drawn characters, then odd spelling and heavy sentences are not a big deal and can easily be fixed by a competent editor. But nothing can save a grammatically flawless, cleanly written text that is flat, boring, or makes no sense at all. Ask any publisher which kind of book they prefer.
Similarly, in software, getting the big picture right is much, much more important than "elegance" in each individual line.
Michael O. Church briefly talks about this in his article on startup culture:
Also, see the recent HN post, "Ask HN: I just inherited 700K+ lines of bad PHP. Advice?":
Lastly, in my article I highlight OpenOffice, which still has comments in German in its source code.
In our corporate environment we do just enough to pass the tests, with one extra 'test' being a peer review that takes into account a list of criteria that aren't easy to check automatically: house code style, test coverage, future maintainability, g11n/i18n-ness, etc.
We often only go as far as 'just good enough' but the standard to which that is assessed is pretty high.
Try the (extensive!) programming challenges at http://uva.onlinejudge.org/
In almost all of the challenges the example input data is sized such that even the most naive algorithm will run within a second or so. When submitted the code is judged against much larger/more-complex sets of input that will catch out inappropriate algorithm choice, unhandled edge cases, etc.
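To illustrate the kind of gap those hidden inputs expose (a generic example, not an actual UVa problem): a brute-force maximum-subarray solution passes the tiny sample input, but only the linear-time version survives judge-sized data.

```python
# Maximum subarray sum: the naive O(n^2) version sails through the
# sample input, but times out on judge-scale inputs (n ~ 10^6), where
# only the O(n) version (Kadane's algorithm) is fast enough.

def max_subarray_naive(xs):
    # Try every (i, j) window: fine for n ~ 100, hopeless at scale.
    best = xs[0]
    for i in range(len(xs)):
        total = 0
        for j in range(i, len(xs)):
            total += xs[j]
            best = max(best, total)
    return best

def max_subarray_kadane(xs):
    # Track the best sum of a subarray ending at the current element.
    best = cur = xs[0]
    for x in xs[1:]:
        cur = max(x, cur + x)
        best = max(best, cur)
    return best

sample = [-2, 1, -3, 4, -1, 2, 1, -5, 4]
assert max_subarray_naive(sample) == max_subarray_kadane(sample) == 6
```

Both functions agree on the sample, which is precisely why sample-sized inputs alone can't distinguish an appropriate algorithm from a naive one.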
If anyone has a similar idea and would like to team up, please contact me.
Having been a teaching assistant who corrected programming assignments (and also a student), I always wondered how many of the students would read my comments, go back to their solution, and actually improve it. Probably none. When I (as a student) received a comment about a solution I had submitted two weeks earlier, I often didn't instantly know what the corrector was talking about; I had to go back and look at my code. I'm not sure I always did that when I was busy. And even when I acknowledged the comment, I wouldn't actually go ahead and fix my solution.
I'm taking the Scala course right now, and when I submit a solution and something is flagged, my thoughts are still in the code (I have all the files open in vim, sbt running), so I can instantly go and fix it. And there is a real incentive to do so, because my score will improve.
For style problems that are a real detriment, students will drift toward better style to save themselves time when correcting and resubmitting.
I know that there is no easy answer for running a MOOC (massive open online course) in the humanities, but, according to the web, Coursera's solution is not working very well and, what is more striking to me, Coursera doesn't seem to respond.
But again, I have no easy solution for grading essays in a MOOC.
More information here:
We have a mixture of an autograder for functionality and human grading for style.
It's really important to get both. Our class uses a mastery model rather than grades, so you shouldn't move on until you've mastered an exercise, and mastery does not just stop at functionality. Style is included.
Making your code readable to other people is really important, and it can and should be taught and stressed even on small exercises.
At Stanford, code quality is half your grade in the first two intro classes because it's just as important that someone else understand your code as it is to just make it work.
I think comprehensive grading of homework programs is a good thing, and even if it is not perfect, in the 5 classes I have taken the assignments helped me dig into the material.
I also like the model of letting students take graded quizzes more than once. I find that the time spent between the first and second time taking a quiz is very productive for improving my understanding of the material.
These classes are fundamentally superior to just reading through a good textbook.
Don't get me wrong, I do agree with you that MOOCs are a boon. A well-structured course may be able to provide a better experience than working through a textbook on your own. Still, this doesn't mean that those courses live up to the hype.
MOOCs are not going to replace formal education, and I think the "limitations" mentioned are perfectly acceptable given the costs and incentives involved. In Coursera's Scala course, for example, there are more than 10K weekly assignment submissions; you need a scalable assessment method. (The grader is actually not bad: it knows about cyclomatic complexity, warns if you use mutable collections, etc.)
The code checkers give you immediate feedback with test suites that are more comprehensive than what students would (or could, in most cases) design themselves.
Sure, there's no professorial feedback on your code, but 90% of the time the comments you receive back on your printed-out code will go unread. Not to mention that the lead time from submission to receiving comments, often as long as two weeks, frequently makes those comments worthless.
As for style, my university's intro CS courses didn't check my style either. I find 6.00x and CS101 to be vastly superior in almost every respect.
Finally, 6.00x and CS101 actually provide you with the "correct" answers after you've passed their tests with an adequate solution. A few times I've found myself hitting my head and thinking, "Why didn't I think of that? That's more elegant than my solution," and going back to implement their approach. Try finding that in anything other than an online course.
Because for some problems they give a hint (e.g. "this is solvable with a one-liner"), the student can figure out for himself whether he is on the right track.
Line length could also be checked by implementing something similar to a style checker, which could also verify that methods that are supposed to be used (e.g. min/max) are in fact being used.
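A checker along those lines can itself be only a few lines. This sketch (limits and names are made up, not from any actual grader) flags over-long lines and verifies that the required min/max calls actually appear in the submission:

```python
import re

MAX_LINE_LENGTH = 80
REQUIRED_CALLS = ("min", "max")  # functions the assignment says to use

def check_style(source: str):
    """Return a list of human-readable style complaints."""
    problems = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if len(line) > MAX_LINE_LENGTH:
            problems.append(f"line {lineno}: longer than {MAX_LINE_LENGTH} chars")
    for name in REQUIRED_CALLS:
        # Look for a call like `min(` anywhere in the submission.
        if not re.search(rf"\b{name}\s*\(", source):
            problems.append(f"expected a call to {name}() but found none")
    return problems

submission = "def clip(x, lo, hi):\n    return max(lo, min(x, hi))\n"
assert check_style(submission) == []
```

A production checker would parse the code properly rather than grep for names, but even this level of checking catches the complaints the article raises.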
I guess the author is correct in that automatic grading is not perfect and it's never going to be as good as talking with someone more experienced... but it can go pretty far.
Having corrected programming assignments as a teaching assistant myself, I have to say that it can be a really tough job, and an automatic grading system can help. When I corrected an assignment that had a lot of issues, I would point out the most important ones; producing an exhaustive list is really tough under time pressure. Also, I think getting a list of 30+ issues from the corrector could be too demotivating for the student; I'd rather have him acknowledge the 3 most important ones.
1. Run a style checker for the language
2. Run a comprehensive suite of unit tests
3. Perform static analysis of the code
Together, these tools can catch most problems: bad formatting, fragile code (unhandled edge cases, errors, etc.), and structural errors. Additionally, you could take performance into account: does the code solve the problem in a reasonable amount of time?
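To make the combination concrete, here is a toy stdlib-only pipeline (the scoring weights, the `grade` function, and the static check are all invented for illustration; a real system would plug in established tools for each stage):

```python
import ast
import time

def static_analysis(source: str):
    """Toy static check: flag bare `except:` clauses."""
    problems = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ExceptHandler) and node.type is None:
            problems.append(f"line {node.lineno}: bare except swallows all errors")
    return problems

def grade(source: str, func_name: str, tests, time_limit=1.0):
    """Combine style, correctness, and performance into a score out of 100.

    `tests` is a list of (args, expected) pairs; the weights are arbitrary.
    """
    namespace = {}
    exec(source, namespace)  # load the submission
    func = namespace[func_name]

    # 1. style / static analysis: 20 points if clean
    style_score = 20 if not static_analysis(source) else 0

    # 2. correctness: 60 points, scaled by tests passed
    passed = sum(1 for args, expected in tests if func(*args) == expected)
    correctness_score = 60 * passed // len(tests)

    # 3. performance: 20 points if the whole suite runs within the limit
    start = time.perf_counter()
    for args, _ in tests:
        func(*args)
    perf_score = 20 if time.perf_counter() - start <= time_limit else 0

    return style_score + correctness_score + perf_score

submission = "def clip(x, lo, hi):\n    return max(lo, min(x, hi))\n"
tests = [((5, 0, 10), 5), ((-3, 0, 10), 0), ((42, 0, 10), 10)]
assert grade(submission, "clip", tests) == 100
```

(Running untrusted submissions via `exec` obviously needs sandboxing in any real grader; the point here is only how the three scores compose.)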
By using standard industry tools, one could build a good grading system that is entirely automated.
You were graded on complying with an interface, code style, correctness, performance, and memory usage, with strict requirements on the latter two. You couldn't get away with, say, implementing a brute-force solution and calling it a day; you had to solve the problem optimally.