
Ghent University Vocabulary Test - zhte415
http://vocabulary.ugent.be/
======
sbirch
I enjoyed reading some of their results. I see some methodological issues
here, however:

1) The test makes you distinguish between real words and a set of words
they've made up in some way. As others have pointed out, some of them are
pretty obvious non-words. I would expect the results to change depending on
the method used to generate non-words.

2) Measuring the performance of a binary classification is a well studied
problem with many metrics and approaches to quantifying performance
([http://en.m.wikipedia.org/wiki/Binary_classification#Evaluat...](http://en.m.wikipedia.org/wiki/Binary_classification#Evaluation_of_binary_classifiers)).
Subtracting the false positive rate from the true positive rate is not among
them. The final score is not a consistent estimator of the fraction of words
in their corpus you know.
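
To make point 2 concrete, here is a rough sketch under a guessing model that is purely my own assumption, not anything the test's authors have published: you say yes to every word you know, and say yes with some probability `guess` to anything you don't know, real or made up. The function names are hypothetical. Under that model the score as I understand it (true positive rate minus false positive rate) comes out as `known * (1 - guess)`, so it undershoots the known fraction whenever you guess at all. Dividing by `1 - fpr` recovers it, but only if the model holds.

```python
def expected_rates(known: float, guess: float) -> tuple[float, float]:
    """Expected (TPR, FPR) under the assumed guessing model.

    known: fraction of real words the taker actually knows (0..1)
    guess: probability of answering yes to any item they do not know (0..1)
    """
    tpr = known + (1.0 - known) * guess  # known words, plus lucky guesses on unknown ones
    fpr = guess                          # non-words are never "known", so only guesses hit
    return tpr, fpr


def difference_score(tpr: float, fpr: float) -> float:
    """Score as described above: true positive rate minus false positive rate."""
    return tpr - fpr


def guessing_corrected(tpr: float, fpr: float) -> float:
    """Classic correction for guessing; equals `known` only if the assumed model holds."""
    if fpr >= 1.0:
        raise ValueError("cannot correct when every non-word was accepted")
    return (tpr - fpr) / (1.0 - fpr)


if __name__ == "__main__":
    tpr, fpr = expected_rates(known=0.6, guess=0.25)
    print(f"TPR={tpr:.2f} FPR={fpr:.2f}")
    print(f"difference score: {difference_score(tpr, fpr):.2f}")
    print(f"corrected score:  {guessing_corrected(tpr, fpr):.2f}")
```

With `guess=0` the two scores agree, which would fit the reports of people who marked every doubtful word as a non-word. Real test takers surely don't guess uniformly, so treat this as an illustration of the bias rather than a better estimator.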

------
ForHackernews
A lot of these were strange formations of regular words.

For example, I said I didn't know "symphonize" and "discussible" because I have
never ever seen either form used anywhere. But obviously, I know the words
"symphony" and "discuss" so I can infer their meaning from the suffixes.

------
josephlord
I tried to only answer yes when I knew what the words meant rather than
guessing. I also said "bubba", "bumf" and "nonsuccess" weren't words which it
disagreed with.

Most of the other things that I didn't get were from biology. "lymphoid"
(guessed it might have been a word but hadn't heard it so entered no),
"dabchick" etc.

"bubba" and "bumf" are interesting ones as they ask the question of where the
language ends and local dialects and slang begin (or are local dialects and
slang within the language in which case an exhaustive list is impossible).

~~~
jameshart
Interesting, for me, to see 'bumf', because even though I recognize the word
from my English dialect (Southern UK, originally), I'd mark that form as a
nonword because I feel it's spelled wrong: I would write it 'bumph'. I have no
idea why I would feel so strongly about that, though, because I can't imagine
it's a word I've written or seen written down particularly frequently...

~~~
josephlord
I don't think I've seen "bumf" written down but I've certainly heard it a few
times.

------
sosuke
The non-words threw me off the first time because I wasn't expecting them to
be there. By the end of the test I felt like I didn't know enough English;
scored 89%.

~~~
will_work4tears
Yeah, same here, I got 89% of the real words, but messed up with 4-5 non-words
which lowered my score to 76%.

------
fchollet
Non native, 81%. I think the test might be easier for non-natives whose native
languages are European, as they accumulate etymological understanding from
several languages.

~~~
jonnathanson
I'd also guess that non-native speakers are less likely to say "yes" to fake
words. Non-native speakers are, presumably, more self-aware of the extents and
limitations of their English vocabularies. Native speakers seem more likely to
succumb to issues of vanity, overconfidence, or bet-hedging on this sort of
test.

------
reedlaw
The non-words are pretty easy to spot. On my first try I marked any word I was
uncertain of as a non-word. I got a 75% with zero false positives. I looked at
all the words that I marked as non-words and saw that most of them I had
suspected were real words. Then I tried it again with an increased confidence
and got 90% with zero false positives.

------
Patrick_Devine
Native speaker and did 77% and 94% the first and second times respectively. My
favourite "non-words" though would have to be "meedcave" and "cunstalize". I'm
not quite sure what "cunstalize" would mean, but I feel like I desperately
need a meedcave.

------
xchip
Non native, 73%. 0 non words.

accurse, glycol, propitiation, tumescence, landlocked, klystron, squab,
blithesome, lacertian, dingbat, gradate, adjudge, microsomal, latescence,
intercut, aviary, semis, vie, dollarfish

But given that Shakespeare wrote all his books with about 1k5 words this test
doesn't prove much :)

------
super_mario
Non-native speaker, got 93%. But English is almost like my first language. I
haven't spoken anything else for decades, even though I speak 3 other
languages, and think, dream and express myself best in English.

------
fecund
It seems I know 90% of English language words and it is extremely unlikely
that I know that many. I believe the numbers are off by a large margin. I
would love to look how they arrive at those estimates.

------
impy
Non Native: 69%. Took Iter as a non word. Been using that too much as a name
in programming I guess. Been too trigger happy on the 'f' key with a few words
I did know though.

------
debugunit
See also
[https://news.ycombinator.com/item?id=7949183](https://news.ycombinator.com/item?id=7949183)

------
didgeoridoo
Native; 93%.

Missed: rood, ceil, catchfly, slickens, tuberculation

Some of those non-words were gorgeous: costyhibbles, neatherden, quiffiness,
concodion

~~~
alexandros
Maybe they should spin their word generator off into a domain name finder?

------
Lord_DeathMatch
Can't seem to progress beyond the initial press yes or no screen, with no js
errors in the console. Odd.

------
yread
related [http://testyourvocab.com/](http://testyourvocab.com/)

------
djf1
It seems to me that this much personal information should be transmitted over a
secure connection.

------
nodata
Why did they make the interface so tricky? I think they're testing two things
here...

~~~
koffiezet
They also test response time, they want to make it an easy left/right decision
to record the time it takes to recognize a word.

------
fredley
Native, 84%. I missed 'pshaw', but 'myeah' is apparently a non-word!

~~~
theophrastus
Same for me. However, they declared "clead" to be a non-word, and, well,
[http://www.merriam-webster.com/dictionary/clead](http://www.merriam-webster.com/dictionary/clead)
makes one wonder how well their database would score against the OED.

------
bitexploder
90%. Native speaker. The non words were gems, blurishness... why did I say yes
to that?

~~~
Smaug123
But "gems" is a word - as in "gemstones".

~~~
Dragonai
Haha you may have parsed bitexploder's comment slightly incorrectly.

> The non words were gems, blurishness... why did I say yes to that?

Try reading that as:

> The non words were gems. Blurishness... why did I say yes to that?

------
judk
Apropos, the title spells Ghent differently from the domain name.

~~~
impy
Ghent is the English translation of the Belgian city Gent.

------
namenotrequired
Non native, 67%.

------
yread
for science! Non-native: 81%

EDIT: and 0 non-words

~~~
hornd
Pretty impressive! Native speaker: 69%.

Leaned on the side of `no`: 0 non word yesses at least.

~~~
pc86
Yes, given the admonishment about "heavy penalties" at the beginning, I hit no
on any word I was on the fence about (and accidentally when I got to
"triennially"). 71% as a native speaker.

------
sean_the_geek
non-native : 67% It's good to hear that I know 67% of English words.

------
anathebealio
I wish they had the statistics available for people who complete the test. It
would be interesting to look at how native speakers and non native speakers
differ.

------
coriny
OK, I failed this test before I even got to it. Apparently you are only
allowed to put one country in the "Where did you grow up in?" question. I
don't know why they aren't interested in the verbal abilities of the large
number of people whose parents weren't sedentary - e.g. military, mining
industry, diplomats, aid, disaster, mega-construction etc etc.

/whine over.

EDIT: Also, no way to communicate this flaw to them (I'm not on Twitter).

~~~
codeulike
Because having a multi-select plays havoc with stats.

~~~
coriny
Which is a terrible excuse for putting a fundamental flaw into the analysis.
It's building a big assumption into the model that's known not to hold true in
reality. Especially as the mis-represented cohort will, linguistically, be one
of the most interesting - likely above average intelligence and a very
different exposure to language. I suspect personally it's an oversight.

EDIT: Or, as is dawning on me, I might have missed a bit of sarcasm?

