

The only brain teaser I use in interviews - MichaelGagnon
http://mikegagnon.com/2012/01/08/the-only-brain-teaser-i-use-in-interviews/

======
srean
What you broached open, intentionally or unintentionally, is a deep and a
fundamental problem in machine learning. One could attempt to write a program
that will guess (near) correctly. If the total number of hypothesis possible
is finite the optimal guesser should take no more than log of the the size of
the hypothesis class. The optimal strategy is to ask that example to be
labeled that would cut down the hypothesis class in half regardless of whether
the answer was true or not.

Things get more interesting when the hypothesis class is infinite and the
guesser is no longer in control of which examples to choose. What is known is
that if the examples are chosen randomly from the space of all possible
examples, more specifically if the random samples are drawn independently from
the same distribution, it is still possible to guess the correct hypothesis
even if the hypothesis class is infinite. There of course are a couple of
catches.

(i) The hypothesis class must have a finite VC dimension. What that means is
given any n sample there can be atmost a n^d possible different labelings that
can be induced on them.

(ii) the algorithm has to be allowed to be wrong on a certain non-zero
percentage of the sample space with some non-zero probability.

These two hedges can be made arbitrarily small, just that the minimum number
of training examples required will depend on how small they are. The reverse
is also true, if the hypothesis class does not have a finite VC dimension the
guessing problem is not learnable in this set up. This means one can come up
with a scheme in which the learner will almost always fail to stay within the
number of mistakes it was allowed. This learning problem set up is known as
PAC learning. It stands for probably approximately correct.

The question you ask is different because unlike in the PAC set up the guesser
has control over which example he/she wants labeled. Your setup is called
active learning. If I remember correctly the case for the infinite hypothesis
space is still open.

So in all fairness you do have to tell the interviewers what class the secret
hypothesis lies in. Or to turn tables I can easily construct a hypothesis that
you will not be able to guess.

An interesting case would be you tell the interviewee the token length of the
little schemer'ish lisp code that encodes the secret hypothesis.

Quite unbelievably enough, there is another framework of machine learning
problems where the choice of the example can be _adversarial_. Here of course
it is not possible to arrive at the correct hypothesis. What one tries here is
to do as best as the best of the "experts" you have access to. You of course
do not know a-priori who is the best. This framework is known as online
learning and is one of the most demanding, yet has simple algorithms that can
solve it for certain hypothesis classes.

------
FaceKicker
I like the spirit of the question, but it kind of bothers me. The fact is, no
matter what finite number of hypotheses you test before making your one guess
about the rule, there will always be an infinite number of possible "rules"
that could fit all the information you have so far (where a "rule" is a
function mapping triplets to booleans).

So in the example he gave, let's say I decide to try his "falsification"
technique as well. I ask about all the same 6 triplets he gives as examples,
and then decide that the rule is "ascending numbers". So I guess that. Nope,
that's wrong in this particular case - the rule was actually "ascending
numbers where the second one is at least as close to the first number as it is
to the third". As you can imagine, you could play that game forever. I don't
think his guess of "even increasing numbers" is any less reasonable than the
correct answer of "increasing numbers". Maybe the rule is "ascending integers"
- he didn't try falsifying that by guessing 3.14 in one of his triplets. There
are unlimited "patterns" he'd have to try falsifying.

I think the ideas behind the problem are good but you'd need to constrain it
in some way for it to be reasonable.

------
teaspoon
You're claiming that you can infer, just from a triple that I guess, whether I
am attempting to confirm or refute the "ascending even numbers" hypothesis.
But if I guess (2,4,6) and learn that it doesn't adhere, that's as much a
refutation as if I guess (1,2,3) and learn that it does adhere. (2,4,6) looks
like an attempted confirmation to you only because you have prior knowledge
that the rule is not _more_ specific than "ascending even numbers".

~~~
jeremysalwen
So I suppose it would be more accurate to say he is testing whether you
_adequately_ attempt to falsify your theory. If you only attempt to falsify it
by finding false positives, but never false negatives, you are not doing it
with sufficient rigor.

~~~
teaspoon
Is there an objective metric according to which guessing only positives is a
less rigorous strategy than guessing both positives and negatives? Given no
prior knowledge about the space of possible rules?

------
xxqs
here we discussed some technology interviewing topics:
<http://news.ycombinator.com/item?id=3437831>

your approach looks interesting, although I would prefer some practical micro-
assignments from the candidate's field of technology. If the questions require
some thinking and solving micro-problems, you immediately learn how the person
analyzes and solves them.

