CMU Computer Systems: Self-Grading Lab Assignments (2018) (cmu.edu)
206 points by georgecmu 3 months ago | 46 comments

Former student and TA here. There is so much great stuff in this class and its assignments! Couple of highlights:

- Data lab formally verifies all of the submissions and gives you an exact failing test case if the submission isn't correct.

- Malloc lab is run competitively. You don't need to compete to do well, but you do if you want to do great. And the competition drives students to great feats. I know people who rewrote their inner loops in assembly to get better perf. Another built splay trees to get lower fragmentation. They both had comfortable A's in the class at the time.

- The automated grading of assignments meant that the class could scale to many hundreds of students while the TAs could dedicate their time to office hours. Turnaround time for exams was sub-one-day.

- The instructors are wonderful (and well dressed!) people and they ran the class expertly.
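To give a feel for the "exact failing test case" idea from the data lab point above, here is a minimal sketch. It is not the course's verifier: it swaps formal verification for brute-force search over every 8-bit input, and the names `find_counterexample`, `reference_is_pow2`, `buggy_is_pow2` and `fixed_is_pow2` are made up for illustration.

```python
def find_counterexample(candidate, reference, bits=8):
    """Compare two functions on every input of the given width.

    Returns the first input where they disagree, or None if they agree
    everywhere. This is exhaustive testing, not formal verification.
    """
    for x in range(1 << bits):
        if candidate(x) != reference(x):
            return x
    return None


def reference_is_pow2(x):
    # Slow but obviously correct: exactly one bit is set.
    return bin(x).count("1") == 1


def buggy_is_pow2(x):
    # Classic bit trick, but it wrongly accepts 0.
    return (x & (x - 1)) == 0


def fixed_is_pow2(x):
    # Same trick with the zero case excluded.
    return x != 0 and (x & (x - 1)) == 0
```

Calling `find_counterexample(buggy_is_pow2, reference_is_pow2)` hands back the single input that breaks the submission, which is a far more useful thing to show a student than "test 7 failed".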

> Turnaround time for exams was sub-one-day.

What's the current status on accurately grading mistakes? From the exams I remember (Fall 2015), the autograder handled perfect answers well, but it gave nearly correct (but still wrong) answers about the same marks as complete gibberish. I felt that unless I had the answers exactly correct, it was hard to demonstrate my knowledge of the topic.

When I was studying software engineering, "nearly correct" answers would get you the same grade as "complete gibberish" on purpose. Partially correct programs would have no more value than completely incorrect ones.

In retrospect, I think it was crucially important for students to learn the price of their mistakes. Most of my previous classmates are reputable engineers now.

I think this is only fair if the exam takes place under similar circumstances to actual engineering (which is rarely the case).

Now there is a regrade system that lets you ask for more partial credit if you feel that the autograder was too harsh. That being said, it's still relatively easy to get a lot of points off for a small mistake :(. You definitely have to be really careful during 213 exams but thankfully there's plenty of time to redo all the questions 2 or 3 times.

At UNSW we had a similar system. The marking was a test suite, and each passing test would give you a portion of your grade. The system had a couple of simple smoke tests which were run at submission time (to catch issues that would make the entire test run fail erroneously). As a TA I spent a fair bit of time fixing trivial issues - e.g., sometimes a student would get 3/10 on the auto marker with a basically working solution because of a typo or something. I would dock a mark, fix the bug and rerun the tests - leaving the student with 8/10 or something.
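A rough sketch of that kind of marking scheme, for the curious. This is not the actual UNSW system: a "submission" is just a Python callable here, and `run_case`, `smoke_test`, `grade` and the 10-mark scale are assumptions made for illustration.

```python
from typing import Callable, List, Tuple

# Hypothetical types: each test case is a name plus a check that takes
# the submission and returns True on success.
Submission = Callable[[int], int]
TestCase = Tuple[str, Callable[[Submission], bool]]


def run_case(check: Callable[[Submission], bool], submission: Submission) -> bool:
    """A test passes only if the check returns True without raising."""
    try:
        return bool(check(submission))
    except Exception:
        return False


def smoke_test(submission: Submission, smoke_cases: List[TestCase]) -> List[str]:
    """Run at submission time; returns the names of failed smoke tests."""
    return [name for name, check in smoke_cases
            if not run_case(check, submission)]


def grade(submission: Submission, cases: List[TestCase], max_marks: int = 10) -> float:
    """Each passing test is worth an equal share of the marks."""
    passed = sum(run_case(check, submission) for _, check in cases)
    return max_marks * passed / len(cases)
```

The smoke tests exist so that a submission which crashes on every input gets bounced immediately instead of quietly scoring zero on the full run.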

When did you TA?

I was TA for 1911 (I think?) around 2008-2009 and don't recall any automated grading system. It would have been so nice too, as labs would end up with students trying (struggling!) to complete the lab, get the lab graded, and ask questions about lecture material they hadn't yet understood.

The grading would inevitably take up almost all the lab...

I was a TA around 2004-2005. (I started in 2003 and graduated in 2007).

We didn't have automated grading for the labs then either, but we did for all the assignments. Still manual marking on style though, which to this day I have complex feelings about.

Ok, I think it was fairly similar then. Assignments were graded automatically (based on passing test suites essentially) but individual labs hadn't been set up the same way.

With style I always felt it was easy to say if it was bad or good, but hard to justify sometimes; 'good style' often just means 'I can follow this easily as I read it' and that becomes subjective based on how you like to read code.

Even so, it was always obvious when a student had spent time on their program, and gone back over the code to rewrite or clean it up - those tended to get the good style marks.

University is fundamentally a social thing, both between peers and between students and knowledgeable people like the professor.

>- The automated grading of assignments meant that the class could scale to many hundreds of students

When you have things like this why are we paying so much?

That social thing exists sometimes, but I think it's overstated. Consider, for example, the many students whose only direct interaction with the professor is to try to get an extension on an assignment (arguably late because homework was encroaching on their hangover obligations).

You can think of ways with simple technology and social organization to scale some version of the mythical social thing to much larger numbers of students, and to be available to anyone who will use that sincerely, not some lottery of acceptance and whether/how you can manage to swing the huge tuition (possibly taking on crippling debt for the rest of your life).

I look forward to my degrees becoming officially worthless, with everyone having the same or better education easily available to them. We're already getting there, but we need to keep going in this direction, and to dispense with pretense. Things like better scalable feedback help.

> I look forward to my degrees becoming officially worthless, with everyone having the same or better education easily available to them

Lots of degrees have become worthless, while tuition fees keep going up...

Definitely was one of my favorite classes at CMU. Absolutely loved it.

I loved this class at CMU. Recently I did shell lab again in Rust (trying to learn Rust) and used the test scripts for sanity checks. 213 helps me to this day :)

I remember when my University copied a couple of these, I decided to go the full mile on the one where you exploit buffer overflow. I got myself a remote connection via execve, and was going to mess around, but soon realized I could also see and edit others' scores which could get me in significant trouble, so I had to make one log-scrubbing attack then stop. I'm surprised they never fixed the hole by containerizing it or something.

I ran an in-class test/competition once where students submitted C code to a web server running on my laptop. My laptop compiled and ran their code - then checked it against a test suite and graded it all live.

One student asked about security - and we got into a great discussion about permissions and what havoc he could and couldn’t pull off with the restricted user account. I encouraged him to figure out how to write a fork bomb - which he got working after the test. He was nervous about it because he didn’t want to get in trouble. With some reassurances he dove into API documentation and got it working. My computer totally died and needed a reboot. It was a great little teaching opportunity that I’m glad I didn’t pass up.

I forkbombed our server once by accident. I normally code a snippet at a time, see if it works, then go a little bit farther. Unfortunately you don't really want to have a loop with a fork in it with no exit condition. Really failed to think that one through.

(in my defense you probably should have ulimits set on a server where students are going to be logging in and working...)
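For anyone wondering what "ulimits" could look like when a grader launches student code, here is a minimal sketch, assuming a Unix-like host and Python's standard `resource` module. The names `clamp_to_hard`, `make_limiter`, `run_limited` and the default caps (64 processes, 5 CPU seconds) are hypothetical, not taken from any course setup.

```python
import resource
import subprocess


def clamp_to_hard(desired: int, hard: int) -> int:
    """Never ask for more than the hard limit allows."""
    if hard == resource.RLIM_INFINITY:
        return desired
    return min(desired, hard)


def make_limiter(max_procs: int, cpu_seconds: int):
    """Build a function that runs in the child just before exec."""
    def apply_limits():
        wanted = (
            # RLIMIT_NPROC counts processes for the whole user,
            # not only the descendants of this child.
            (resource.RLIMIT_NPROC, max_procs),
            (resource.RLIMIT_CPU, cpu_seconds),
        )
        for which, desired in wanted:
            _, hard = resource.getrlimit(which)
            resource.setrlimit(which, (clamp_to_hard(desired, hard), hard))
    return apply_limits


def run_limited(argv, max_procs=64, cpu_seconds=5, timeout=10):
    """Run a command with process-count and CPU-time caps applied."""
    return subprocess.run(
        argv,
        preexec_fn=make_limiter(max_procs, cpu_seconds),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
```

With a cap like this, a runaway fork loop starts failing with an error once the user hits the process limit, rather than taking the whole box down. Running student code under a dedicated account matters too, since the process limit is per user.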

The web interface to our local cloud at university became downright unusable for a week before we found that the problem was that a server on the cloud used for teaching was getting ddos'ed.

Apparently some student set up a mmorpg server and then made some enemies...

Ah, how simple it would have been to run in a virtual machine.

How simple it would have been to miss this teaching opportunity!

If my laptop were fully protected from that attack, it wouldn't have been as much fun for my student to wreck my machine in front of everyone. And so he might not have done it, and my class would have missed out on a beautiful demonstration and discussion of practical security engineering.

CS:APP is hands down THE book that every person interested in the low-level stuff in computer systems should own. But heads up to anyone who is considering buying the book - please get the North American Edition, NOT the global edition.

I made the mistake of getting the Global edition, because it costs considerably less and I couldn't afford the North American one - it was only afterwards that I checked out the book site, where the authors mention that the global edition is chock full of errors [0].

I don't blame the authors, nor even the people who were responsible for 'the generation of a different set of practice and homework problems'. I can get printing the book in B&W, reducing paper quality, and publishing as a paperback to cut costs, but it's baffling why the publishers compromise on the actual quality of the content itself.

Amazon is full of similar 'PSAs' about not buying the global edition [1].

[0] http://csapp.cs.cmu.edu/3e/errata.html

[1] https://www.amazon.com/Computer-Systems-Programmers-Perspect...

Ah 15-213, these labs were the most well designed assignments I ever had in college.

If I recall correctly, I believe the computer architecture course at USC (ducks) also uses these exact CS:APP labs. I loved them, it was fun realizing partway through some of the earlier labs which type of data structure they were having you walk. Decent prep for OS, if you really dig into the malloc lab.

13 years on, I continue to reference this class as the gold standard in lab design & execution. It’s the one class I continue to tell stories about to this day.

we all remember and respect 15-213 :)

My professor for my senior operating systems class, who was a CMU PhD grad, gave us these assignments (with the permission of CMU). They were some of the most important learning experiences in my undergraduate career.

Didn't go to CMU, but I did take cs61 at Harvard extension last year (https://cs61.seas.harvard.edu/site/2018/). It too had some self grading assignments (time bomb). Thoroughly enjoyed that class.

What are the contents of the files behind the wall? Can anyone share those files?

Is there a way for non-CMU students to complete these assignments and get scored? I worked through this book a few years back but would love the chance to do actual projects.

I worked through the book and did the actual projects. If I remember correctly, most of the grading logic is included in the projects and can be run locally on your machine.

Former head TA (Jack) here - 15-213 still remains incredibly foundational to the work I do for my job. Working with the professors of that course (who also wrote the CS:APP textbook, which I still regret never getting signed by them!) was an amazing opportunity and something I hope I never forget.

If you want some exercises to help you learn the foundations of computer systems, I honestly cannot think of a better resource than this!

When I took this class many, many moons ago, the close equivalent of the "cache lab" was graded not on the number of cache misses but on the cache hit ratio. When I pointed the problem with this metric out to a TA he blew me off. So I turned in the original code plus a loop that read the same volatile value a few billion times. It took a long time for the cache simulator to grade, but they dutifully gave me a very high grade (something like 200%) on the lab :-)
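A toy illustration of why hit ratio is gameable where miss count is not. This is a sketch under stated assumptions, not the course's cache simulator: it models a tiny direct-mapped cache, and `simulate`, `hit_ratio`, the 4-line / 16-byte geometry and both traces are made up for the example.

```python
def simulate(addresses, num_lines=4, block_size=16):
    """Direct-mapped cache: return (hits, misses) for an address trace."""
    lines = [None] * num_lines          # block currently held by each line
    hits = misses = 0
    for addr in addresses:
        block = addr // block_size
        index = block % num_lines
        if lines[index] == block:
            hits += 1
        else:
            misses += 1
            lines[index] = block        # evict whatever was there
    return hits, misses


def hit_ratio(hits, misses):
    return hits / (hits + misses)


# "Real work": touch 8 distinct blocks once each, so every access misses.
honest = [i * 16 for i in range(8)]

# Same work, followed by 1000 reads of one address.
padded = honest + [0] * 1000
```

The honest trace scores 0 hits and 8 misses. The padded trace scores 999 hits and 9 misses, so the hit ratio jumps above 99% even though the number of misses went up by one. Grading on misses closes that loophole.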

Former 15-213 student as well. This was one of the best CS classes I've ever had the pleasure of taking.

Also worth noting the plagiarism detector actually did its job, at least in some of the most egregious cases that I heard of.

These look fun; in particular the "Attack Lab".

Dockerfiles might be helpful and easy to keep updated. Alpine Linux or just busybox are probably sufficient?

The instructor set could extend FROM the assignment image and run a few tests with e.g. testinfra (pytest)

You can also test code written in C with gtest.

I haven't read through all of the materials: are there suggested (automated) fuzzing tools? Does OSS-Fuzz solve?

Are there references to CWE and/or the SEI CERT C Coding Standard rules? https://wiki.sei.cmu.edu/confluence/plugins/servlet/mobile?c...

"How could we have changed our development process to catch these bugs/vulns before release?"

"If we have 100% [...] test coverage, would that mean we've prevented these vulns?"

What about 200%?

What on earth would 200% test coverage mean?

Coverage for all the code that was written, and all the code that has yet to be typed, of course.

All thanks to quantum computing :)

⟨100%| + |100%⟩ = 200%!

(Even code with 100% branch coverage may have common weaknesses like those that these (great) labs have students exploit)
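A small sketch of that point, written in Python for brevity. The function `lookup` and its two tests are hypothetical; in C the same missing check would be an out-of-bounds read (CWE-125) rather than a clean exception.

```python
def lookup(table, index):
    """Return table[index], rejecting negative indices."""
    if index < 0:
        return None
    return table[index]        # bug: no upper-bound check


def branch_covering_tests():
    """Two tests that together take both sides of the only branch."""
    table = [10, 20, 30]
    assert lookup(table, -1) is None    # "index < 0" is true
    assert lookup(table, 0) == 10       # "index < 0" is false
```

Both tests pass and every branch is exercised, yet `lookup(table, 3)` still raises `IndexError`. Coverage says which code ran, not which inputs were considered.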

15-213 you beauty...

These would have been fun!

For the bomb one, are there protections for taking the binary and running it in an isolated environment?

> For the bomb one, are there protections for taking the binary and running it in an isolated environment?

Running it through a debugger is easy enough that this would be sorta unnecessary

You set a breakpoint at the phone-home routine.

My friends and I went the extra mile and overwrote the call to the server with nops.
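For readers who haven't done this kind of patch, a minimal sketch of the idea on a plain byte buffer, assuming the call is the common 5-byte x86 form (opcode `0xE8` plus a 32-bit relative offset). The helper `nop_out_call` and the sample bytes are made up for illustration; finding the real offset in a binary is the part a debugger or disassembler does for you.

```python
CALL_REL32 = 0xE8   # x86 opcode: near call with a 32-bit relative offset
NOP = 0x90          # x86 single-byte no-op
CALL_LEN = 5        # 1 opcode byte + 4 offset bytes


def nop_out_call(code: bytes, offset: int) -> bytes:
    """Return a copy of code with the call at offset replaced by NOPs."""
    if offset < 0 or offset + CALL_LEN > len(code):
        raise ValueError("offset out of range")
    if code[offset] != CALL_REL32:
        raise ValueError("no call rel32 at this offset")
    patched = bytearray(code)
    # Same length in, same length out, so later addresses do not shift.
    patched[offset:offset + CALL_LEN] = bytes([NOP]) * CALL_LEN
    return bytes(patched)
```

Keeping the patch the same size as the instruction it replaces is the whole trick: nothing after it moves, so every other jump and call in the binary stays valid.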

Took the CSAPP course at Peking University. Easily the most valuable course through 4 years...

Another CMU alum here. Just looking at this gave me PTSD.

213, man. It's a killer.

Oh malloc lab.

Oh, good times. By the way, CMU's recitation class ppt for this lab is fucking hilarious: http://www.cs.cmu.edu/afs/cs/academic/class/15213-s08/www/re...
