
Having had some experience teaching, designing labs, and evaluating students, in my opinion there is basically no problem here that can't be solved with more instructor work.

The problem is that the structure pushes for teaching productivity which basically directly opposes good pedagogy at this point in the optimization.

Some specifics:

1. Multiple choice sucks. It's obvious that written response better evaluates students and oral is even better. But multiple choice is graded instantly by a computer. Written response needs TAs. Oral is such a time sink and needs so many TAs and lots of space if you want to run them in parallel.

1.5 Similarly, having students do things on computers is nice because you don't have to print anything, and even errors in the question can be fixed live: you just ask students to refresh the page. But if the chatbots let them cheat too easily on computers, doing handwritten assessments sucks because you have to go arrange for printing and scanning.

2. Designing labs is a clear LLM tradeoff. Autograded labs with testbenches and fill-in-the-middle-style completions or API completions are incredibly easy to grade. You just pull the commit before some specific deadline and run some scripts.

You can do 200 students in the background while doing other work, it's so easy. But the problem is that LLMs are so good at fill-in-the-middle and at making testbenches pass.

I've actually tried some more open-ended labs before, and it's very impressive how creative students are. They are obviously not LLMs: there is a diversity of thought and a simplicity of code that you do not get with ChatGPT.

But it is ridiculously time consuming to pull people's code and try to run open ended testbenches that they have created.

3. Having students do class presentations is great for evaluating them. But you can only do like 6 or 7 presentations in a one-hour block, so you will need to spend like a week on them even in a relatively small class.

4. What I will say LLMs are fun for is having students do open-ended projects with faster iterations. You can scope-creep them if you expect them to use AI coding.





I know a teacher who basically only does open questions, but since everything is digital nowadays, students just use tools like Cluely [0] that run in the background and provide answers.

Since the testing tool they use does notice and register 'paste' events, they've resorted to simply assigning 0 points to every answer that was pasted.

A few of us have been telling her to move to in-class testing etc., but as you also note, everything in the school organization pushes for teaching productivity, so this requires convincing management / the school board etc., which is a slow(er) process.

[0] https://cluely.com/


> Written response needs TAs.

Can AI not grade written responses?


I tried that once, specifically because I wanted to see if we could get some sort of productivity enhancement.

I was using local LLMs in the 4B to 14B range; I tried Phi, Gemma, Qwen, and Llama. The idea was to prompt the LLM with the question, the answer key/rubric, and the student answer. Putting the student answer at the end allowed prompt caching of the shared prefix, which made it much faster.

It was okay but not good. There were a lot of things I tried:

* Endlessly messing with the prompt.

* A few examples of grading.

* Messing with the rubric to give more specific instructions.

* Average of K.

* Think step by step, then give a grade.

It was janky, and I'll chalk it up to local LLMs at the time being somewhat too stupid for this to be reasonable. They basically didn't follow the rubric very well. Qwen in particular was very strict, giving zeros regardless of the part marks described in the answer key, as I recall.
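The scaffolding around the model was roughly the following sketch: the static parts (question, rubric, few-shot examples) go first so the runtime can cache that prefix, the model is asked to end with a parseable "GRADE: x" line, and average-of-K smooths over sampling noise. The prompt wording and helper names here are illustrative, and the actual model call is left out:

```python
import re
import statistics

def build_grading_prompt(question, rubric, student_answer, examples=()):
    """Static parts first so a local runtime can cache the prefix;
    only the student answer at the end changes between students."""
    parts = [
        "You are grading a short written response.",
        f"Question:\n{question}",
        f"Answer key / rubric:\n{rubric}",
    ]
    for ex_answer, ex_grade in examples:  # a few worked grading examples
        parts.append(f"Example answer:\n{ex_answer}\nGRADE: {ex_grade}")
    parts.append(
        "Think step by step, then end with a line of the form 'GRADE: <number>'."
        f"\n\nStudent answer:\n{student_answer}"
    )
    return "\n\n".join(parts)

def parse_grade(completion):
    """Take the number off the model's last 'GRADE: x' line, if any."""
    nums = re.findall(r"GRADE:\s*(\d+(?:\.\d+)?)", completion)
    return float(nums[-1]) if nums else None

def average_of_k(completions):
    """Average over K sampled completions, dropping ones where the
    model ignored the output format."""
    grades = [g for g in map(parse_grade, completions) if g is not None]
    return statistics.mean(grades) if grades else None
```

The failure mode described above shows up exactly here: if the model ignores the rubric (or the "GRADE:" format), no amount of parsing and averaging rescues it.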

I'm sure that with the right type of question, the right prompt, and a good GPU it could work, but it wasn't as trivially easy as I had thought at the time.


I would try it now with GPT-5.1.

You still need someone to check the grades. AI can and will totally misgrade.


