> a big question here is how wise it is for the US education system to be based so heavily on multiple choice questions
There was a great blog post I read a while back on redesigning multiple choice tests to allow the student to indicate the "confidence" of their response, with a more confident answer being rewarded/penalized more heavily than a response with low confidence. This allowed for a statistically better measure of how well the student had learned the material.
I thought for sure the post was written by Scott Aaronson, but I haven't been able to find it despite extensively searching his blog, so maybe it was someone else.
If I'm taking that test, why would I ever give a confidence estimate other than 0% or 100%? If I think I have a positive expected return, it's worth the risk to go for max points. If I don't, then I don't want to lose fewer points; I want to not answer at all, or equivalently give it zero weight.
Assuming the sibling to your comment linking to Terence Tao is the correct one, the resolution is to harshly penalize being confidently wrong. The points are proportional to log(2p) (where p is your subjective probability of being correct), so you can theoretically lose unboundedly many points by saying your confidence is 100% and choosing the wrong answer, since log(0) is negative infinity.
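To make that asymmetry concrete, here's a tiny sketch of the rule in Python (the log base is my assumption; the comment only pins down "proportional to log(2p)"):

    import math

    def points(p):
        # p is the probability you assigned to the correct answer.
        # Base 2 is an assumption; a different base only rescales points.
        return math.log2(2 * p) if p > 0 else float("-inf")

    print(points(1.0))  # 1.0  -> fully confident and right
    print(points(0.5))  # 0.0  -> admitted coin flip, no gain or loss
    print(points(0.0))  # -inf -> fully confident and wrong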
That nightmare test design, even if it is the post they meant, doesn't fit the description I replied to. You are not "allowing" the student to state their certainty if you make the default penalty for a wrong answer either minus 4 or minus infinity. You are forcing a huge change.
Mean squared error is a loss function used in linear algebra, machine learning, and statistics to fit estimators. To calculate it, you take the difference between the observed result and the "guess" of your model, then square it to get your loss.
In the case of taking a test, let's say you're answering a true/false question, with true represented by 1 and false by 0. Let's also assume you have no idea which one is correct; it's a coin flip to you.
If you choose True, then 50% of the time the correct answer is true and your loss is 0, because (1-1)^2 = 0. The other 50% of the time your loss is (1-0)^2 = 1.
So your expected loss is 0.5(0) + 0.5(1) = 0.5.
On the other hand, if you guess 0.5 (true with a confidence level of 50%), then 100% of the time your error is 0.5, so your squared error is 0.25.
In other words, you minimize your expected loss by guessing your true confidence level. This can be mathematically proven to work for any confidence level.
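The proof is one line of calculus: if the answer is True with probability p and you report g, your expected loss is p(1-g)^2 + (1-p)g^2, whose derivative 2(g-p) is zero exactly at g = p. A quick numerical sanity check (an illustrative sketch; the names are mine):

    def expected_loss(guess, p):
        # Expected squared error when the answer is True with
        # probability p and you report `guess`.
        return p * (1 - guess) ** 2 + (1 - p) * guess ** 2

    p = 0.7
    # Sweep reported confidences; the minimum sits at guess = p.
    best = min(range(101), key=lambda g: expected_loss(g / 100, p))
    print(best / 100)  # 0.7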
This could be adapted to multiple choice questions by treating each option as a true/false question.
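Scored that way, it's essentially the multi-class Brier score. A minimal sketch (function and argument names are mine):

    def brier_loss(reported, correct):
        # `reported` holds your probability for each option; each
        # option is scored as its own true/false question.
        return sum((p - (i == correct)) ** 2 for i, p in enumerate(reported))

    # Torn between A and B, leaning A; the answer turns out to be B.
    print(brier_loss([0.7, 0.3, 0.0, 0.0], correct=1))  # 0.49 + 0.49 = 0.98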
Correct. Or for more complex questions you could feasibly model the answer as a point in a multi-dimensional space. For example, suppose the answer is a single word and the loss is the squared semantic distance between your guess and the true answer in the latent space of a large language model. If you, the student, think the answer might be one of three words, you minimize your expected squared error by answering with the weighted average of those three words' vectors, where each weight is your estimated probability that the word is correct. See the sketch below.
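A sketch of that idea, where embed is a hypothetical word-to-vector function standing in for any embedding model; the weighted mean minimizes expected squared distance for the same reason as in the true/false case:

    import numpy as np

    def best_vector_guess(words, probs, embed):
        # `embed` is a stand-in for whatever maps a word into the
        # model's latent space. The probability-weighted mean of the
        # candidate vectors minimizes expected squared distance to
        # the true answer's vector.
        vectors = np.array([embed(w) for w in words])
        return np.average(vectors, axis=0, weights=probs)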
Sorry, that was very wordy, but hopefully you get the point.
An easier to understand, but perhaps less sensible, example would be to do the same thing in a quiz about arithmetic: then 5+5=9 and 6+2=7 is less wrong than 5+5=10 and 6+2=1, because the one wild miss gets squared.
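Checking the arithmetic of that arithmetic: the two near misses cost 1 + 1 = 2, while the exact hit plus the wild miss costs 0 + 49 = 49.

    truths = [10, 8]             # the real answers to 5+5 and 6+2

    def total_loss(guesses):
        return sum((g - t) ** 2 for g, t in zip(guesses, truths))

    print(total_loss([9, 7]))    # 2  -> both off by one
    print(total_loss([10, 1]))   # 49 -> one exact, one wild miss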