Section 2.1
Then the GitHub repo also has wording around this:
> We double-verify manually that the grading of the test set is correct. https://github.com/idrori/MITQ/blob/main/index.html#L552
I agree: given some of the questions and answers in the dataset, it looks like this may not actually have been done.