The MBTI example and the use of weighting only make it clearer that this "case study" is dead from the start, because they show that it takes many questions just to ascertain someone's position on each eigenvector. For example, the most important explanatory personality trait is masculinity vs. femininity. How do you ascertain that with one question and 8 answers? A question asking whether you are male or female only tells you so much, not the whole story. None of the real underlying explanatory variables can be measured directly with one question, or even with 8 questions.
And I highly doubt they have no matches in 10k samples, especially given the homogeneity of their audience. If they really have none, then they are just measuring personality traits that, while diverse, nobody cares about. Check out the birthday paradox. Even if there are 16M possibilities, the chance that some two people out of 10,000 match is much higher than you think.
If it were trying to match people based on their personalities and not their direct line of thinking, you would be correct.
Until the results speak for themselves (i.e., the accuracy of the matches), you have no way of knowing whether this is a complete failure or a successful venture. Although you are free to speculate that it will be meaningless due to the methodology, you have no evidence to back that up yet.
Many dating sites work on a similar principle, although they all use weighted questions. Weighted personality questions increase the number of possible matches by lowering the precision required. The goal here is to decrease the number of possible matches by requiring exact matches.
Dating sites want to match you quickly with someone who is "close enough" to be compatible, based on a personality test and their user data on which answers pair well with other people's answers.
8x8 doesn't care how long it takes to find a match; they want an "exact match", with compatibility based on how you answer arbitrary questions.
8 questions were chosen because answers that are "100% exact" are already rare. If they had chosen 50 questions, an exact match would become so rare that one might never exist in a person's lifetime.
E:
The birthday paradox is based on 1-in-365 odds, and I'm well aware of the statistics on two people in a group of 20 random people sharing a birthday.
They intend to look for matches offline with a few days' delay. The count of people who have taken it looks like it is updated live.
It is quite possible that they have not tried to do the matching online. Given their demonstrated competence, I wouldn't be surprised if it comes as a shock to them that they can simply put the results in a flat file, sort them, and then scan for matches very quickly...
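For what it's worth, the sort-and-scan idea is only a few lines. This is a rough sketch, not how the site actually stores anything: the `(user_id, answers)` layout, the function name `find_exact_matches`, and the sample users are all made up for illustration.

```python
from itertools import groupby


def find_exact_matches(results):
    """Return (answers, [user_ids]) for every answer set shared by 2+ users.

    `results` is a list of (user_id, answers) pairs, where `answers` is a
    tuple of 8 ints in range(8). This record layout is hypothetical.
    """
    # Sort by the answer tuple so identical answer sets end up adjacent.
    ordered = sorted(results, key=lambda record: record[1])
    matches = []
    # One linear scan: any run longer than 1 is a group of exact matches.
    for answers, run in groupby(ordered, key=lambda record: record[1]):
        users = [user_id for user_id, _ in run]
        if len(users) > 1:
            matches.append((answers, users))
    return matches


# Made-up sample data: alice and carol answered identically.
sample = [
    ("alice", (0, 1, 2, 3, 4, 5, 6, 7)),
    ("bob", (7, 6, 5, 4, 3, 2, 1, 0)),
    ("carol", (0, 1, 2, 3, 4, 5, 6, 7)),
]
print(find_exact_matches(sample))
```

The sort dominates the cost at O(n log n), which is trivial for tens of thousands of rows; a dictionary keyed on the answer tuple would work just as well.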
According to the Reddit page, the numbers of tests and matches found are accurate.
There were ~11,300 tests when the page last loaded for me. They are caching the test results and refreshing them on a delay rather than live, in an attempt to put less stress on the server.
There were 0 matches at that time. While I'm wary of site counters, I'll give them the benefit of the doubt.
At ~15,000 there should be a match. So while the odds are pretty high at ~12,000, it's still within the realm of possibility that there is no match, just very unlikely.
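The numbers being thrown around here can be checked with the standard birthday-problem calculation. A rough sketch, under two assumptions of mine: that the "16M possibilities" are 8 questions with 8 answers each (8^8 = 16,777,216), and that every answer combination is equally likely and test takers are independent. The function name `collision_probability` is just for illustration.

```python
import math


def collision_probability(n, possibilities):
    """Probability that at least two of n uniform, independent samples coincide."""
    if n > possibilities:
        return 1.0  # pigeonhole: a repeat is guaranteed
    # P(no collision) = prod_{k=0}^{n-1} (1 - k/N); sum logs for numerical stability.
    log_no_collision = sum(math.log1p(-k / possibilities) for k in range(n))
    return 1.0 - math.exp(log_no_collision)


# Assumption: 8 questions x 8 answers each, all combinations equally likely.
POSSIBILITIES = 8 ** 8

for n in (10_000, 11_300, 12_000, 15_000):
    print(n, round(collision_probability(n, POSSIBILITIES), 4))
```

Under those assumptions the chance of at least one exact match is already about 95% at 10,000 tests and above 99% at 15,000. Real answers are not uniform, and skewed answers make collisions more likely, so these figures are a lower bound.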
There's also the chance that they haven't been able to check for matches because their servers have been down so often. So there may be a match in the data that simply hadn't been calculated yet, and so wasn't showing when I checked at ~11,300.
And it only gets worse as the possibilities become less evenly distributed! We are approaching 100% probability that this is all bullshit and won't work at all.