
Higher math proficiency test: An extended-problem-solving skill evaluation tool - R3G1R
https://mathvault.ca/math-test/
======
SamReidHughes
I looked at most or all of the questions, before the test system went down.
(Cue rant about modern software.)

Some of the questions are quite poorly worded, and I'm highly suspicious about
one question with non-integral boundaries on a probability distribution
defined over the integers. Since I didn't get to see results, I can't say
whether they screwed up or not.

It seems like the question level was appropriate for students who have taken
Calc I/II and some course like Discrete Structures.

~~~
bloaf
Yeah, the wording is quite poor.

The second question about the distribution of random answers to the test is
literally incoherent. They probably _meant_ to ask a question about the
expected number of random answers on a randomly completed test, but they
worded it such that they are asking about the distribution of answers on a
single test.

~~~
virtuous_signal
>Assuming that every question in this 20-question test has only 1 right answer
and you choose to answer each question randomly, what is the chance that
you're so bad that you only get 4 questions right or less?

They probably mean for you to assume that 1) each question has 5 answer
choices, and 2) you select one of them with equal probability. But given that
this is question #2, assumption (1) is kind of a guess, and (2) should be
explicitly stated... (the answer btw has to do with the binomial distribution
- and you definitely need some heavy arithmetic, so are they testing your gut
instinct?)
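
Under assumptions (1) and (2), the number of correct guesses is
Binomial(20, 1/5), and the tail probability can be sketched in a few lines
(to be clear, both the "5 choices" and "uniform guessing" parts are the
assumptions above, not anything the question states):

```python
from math import comb

# Assumed (not stated by the test): 20 questions, 5 answer choices each,
# one choice picked uniformly at random -> X ~ Binomial(20, 1/5).
n, p = 20, 1 / 5

# P(X <= 4): chance of getting 4 or fewer questions right by pure guessing.
p_at_most_4 = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(5))

print(f"P(X <= 4) = {p_at_most_4:.4f}")  # roughly 0.63
```

So under those assumptions the "heavy arithmetic" lands near 63%, which is
not something you'd eyeball.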

~~~
SamReidHughes
The messed up question I was referring to, and I think bloaf was referring to,
is a later one, the second one involving random answers:

> Assuming that every question in this 20-question test has only 1 right
> answer and you choose to guess the answer of each question randomly, then in
> terms of the number of correctly-guessed answers, approximately 68% of them
> will fall between:

> A. 2.2 and 5.8

> B. 0.8 and 7.2

> C. 6.8 and 12.5

> D. 7 and 13

> E. None of the above
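
The answer choices look like they come from a one-sigma normal approximation
to Binomial(20, 1/5) — again assuming 5 choices per question, which the
question doesn't state. A quick sketch of both the approximation and the
exact probability over the integers:

```python
from math import comb, sqrt

# Assumed: 5 choices per question, uniform guessing -> X ~ Binomial(20, 1/5).
n, p = 20, 1 / 5
mu = n * p                     # mean = 4
sigma = sqrt(n * p * (1 - p))  # sd ~= 1.79

# The one-sigma interval of the normal approximation matches option A.
print(f"{mu - sigma:.1f} to {mu + sigma:.1f}")  # 2.2 to 5.8

# But X only takes integer values, so "between 2.2 and 5.8" means X in {3,4,5};
# the exact probability of that event is noticeably below 68%.
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in (3, 4, 5))
print(f"P(3 <= X <= 5) = {exact:.3f}")
```

Which is exactly the problem with putting non-integral boundaries on a
distribution over the integers: the "68%" figure only holds for the
continuous approximation, not for the actual count.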

Another messed up question was the one early on asking whether the dot product
operator is "associative/commutative on addition" or not. That's not a phrase
that actually means anything.

~~~
tromp
I answered none of the above precisely because associativity doesn't apply to
an operator that maps 2 vectors to a scalar. But it still felt wrong.

~~~
SamReidHughes
Oh, I was parsing the responses like "(Non-associative, commutative and
distributive) on addition". And no, the dot product is not commutative on
addition. But it's meant to be "Non-associative, commutative and (distributive
on addition)".

------
eindiran
Who is this test aimed at? It mentions the math GRE, which would make me think
that it targets people applying to graduate programs, but it doesn't really
have that vibe, and I imagine getting graduate programs to actually consider a
new test is quite a difficult undertaking.

------
Traubenfuchs
I pretty much couldn't answer a single question. With a lot of time I could
have "bruteforced" a few, maybe. Getting something better than an F?
Impossible.

Soon I will have a MSc in Software Engineering... I simply memorized my way
through college math classes, just like most of my colleagues.

Math always made me feel dumb. I never had intuition for it. Deeper
understanding was always just out of reach. I truly loathe math.

~~~
ur-whale
Given what you just said, if I were you, I'd strongly reconsider my career
choice (if you ever plan to excel at it, that is).

------
tromp
The test fails to show you which answers (it thinks) you got wrong, which
makes it pretty useless...

~~~
5bd35
The frontend actually receives all the quiz data, with the correct answers,
from the server; here is a cleaned-up version of it:
[https://pastebin.com/MF5TTBGr](https://pastebin.com/MF5TTBGr)

------
ogogmad
Haven't read the article (it's down) but isn't the best test for proficiency
in higher maths that you can actually _do_ higher maths? Why do we need these
proxies?

~~~
thaumasiotes
Cost of testing. You can get a good idea of someone's ability to do higher
math in an hour with an IQ test. If instead you give them a test of higher
math, you'd need to train them to do it first, which would take much, much
longer. The added accuracy of that test won't make up for the several-orders-
of-magnitude increase in cost.

~~~
ogogmad
Fair enough. But there needs to be evidence that this stuff actually works.
I'm reminded of the fact that Richard Feynman didn't have a high enough IQ to
get into Mensa.

~~~
barry-cotter
> But there needs to be evidence that this stuff actually works.

[https://www.gwern.net/docs/www/www1.udel.edu/63a495ef6b79bb4...](https://www.gwern.net/docs/www/www1.udel.edu/63a495ef6b79bb4ce508ed205546bcba826e42a5.pdf)

Why g Matters: The Complexity of Everyday Life

Personnel selection research provides much evidence that intelligence (g) is
an important predictor of performance in training and on the job, especially
in higher level work. This article provides evidence that g has pervasive
utility in work settings because it is essentially the ability to deal with
cognitive complexity, in particular, with complex information processing. The
more complex a work task, the greater the advantages that higher g confers in
performing it well. Everyday tasks, like job duties, also differ in their
level of complexity. The importance of intelligence therefore differs
systematically across different arenas of social life as well as economic
endeavor. Data from the National Adult Literacy Survey are used to show how
higher levels of cognitive ability systematically improve individuals' odds of
dealing successfully with the ordinary demands of modern life (such as banking,
using maps and transportation schedules, reading and understanding forms,
interpreting news articles). These and other data are summarized to illustrate
how the advantages of higher g, even when they are small, cumulate to affect
the overall life chances of individuals at different ranges of the IQ bell
curve. The article concludes by suggesting ways to reduce the risks for low-IQ
individuals of being left behind by an increasingly complex postindustrial
economy.

~~~
est31
The "not much more than g" era in research has ended. TLDR: cognitive ability
is not just a single scalar value.

> For several decades, the question of whether measures of specific cognitive
> ability contributed anything meaningful to the prediction of performance on
> the job or performance in training once measures of general mental ability
> were taken into account appeared to be settled, and a consensus developed
> that there was little value in using specific ability measures in contexts
> where more general measures were available. It now appears that this
> consensus was premature, and that measures of specific abilities can make
> important contributions even if general measures are taken into account.

[https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6526477/](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6526477/)

~~~
ogogmad
The "g" hypothesis itself was built upon an abuse of factor analysis. An abuse
that had a high chance of producing something spurious.

~~~
barry-cotter
For a much, much longer treatment of the argument ogogmad gestures towards,
read Cosma Shalizi's[1] _g, a Statistical Myth_. Note that most of the
criticism applies as well to temperature or pressure as to g. For a detailed
response to Shalizi, see Human Varieties[2].

[1] [http://bactra.org/weblog/523.html](http://bactra.org/weblog/523.html)

[2] [https://humanvarieties.org/2013/04/03/is-psychometric-g-a-my...](https://humanvarieties.org/2013/04/03/is-psychometric-g-a-myth/)

------
posix_compliant
Data point: 70%.

Background: Software Engineer.

I found this to be challenging.

~~~
5bd35
Just so you know, the test sends a request to the server on every answer, and
sometimes these requests silently fail with a 503 error code. In that case you
don't get any points, even if you answered correctly.

------
fazza99
looks like it's been slashdotted.

------
anonmidniteshpr
(The link will not load for me, as of writing.)

The US SAT I of the late 1990s was far too easy (I missed only one question,
with a mis-selected answer, but I'm no genius and didn't study or prep for it
one iota (or epsilon, as the case may be)). Static, multiple-guess Q&A tests
aren't able to assess a broad range of orders of magnitude of capability
because of their various vulnerabilities.

Professional Engineer and physics tests that include fewer, open-ended,
written-response problems that build on each other tend to be more rigorous
forms of domain-knowledge testing. Some Q&A can be used as a first-pass
filter, but it shouldn't be relied on the way US K-12 education under the
NCLB Act leans on excessive multiple-choice standardized testing.

