Ask HN: What's your speciality, and what's your "FizzBuzz" equivalent?
260 points by ColinWright on Jan 2, 2014 | 331 comments
I was recently chatting with friends and the question came up: When hiring a programmer, FizzBuzz is sometimes (and controversially) used as an initial filter. What's an equivalent filter for mathematicians?

I know that HN is filled with people who have specializations outside of programming, so I ask:

    What is your specialization, and
    what is your "FizzBuzz" equivalent?
Added in edit: Fascinating answers - thank you. I'd love to respond to many of them but that would probably trip the HN flame-war detector, penalise the item, and I'd get no more replies! If you want a reply, email me, or make sure your email is in your profile. And thanks again to all.



This is an off-topic answer, but I see no reason why this thread shouldn't be generalized to FizzBuzz equivalents in other careers (in fact, it might be interesting to do so). Our electrical engineering FizzBuzz is an introductory, Freshman-level voltage drop problem, found within the first chapter of any Circuit Analysis textbook:

         R1  (N)  R2
  [VDD]-^^^^--*--^^^^----|
Given VDD, R1, and R2, compute the voltage at node N.

This works surprisingly well as a negative hiring filter. It's not intended to be a trick question at all: it can be solved with Ohm's Law:

  I = VDD / (R1 + R2)
  Vn = VDD - R1 * I
or faster and more intuitively using voltage dividers:

  Vn = VDD * R2 / (R1 + R2).
What's disappointing is that a large number of senior undergraduate students are unable to solve this problem -- which I think makes it a good EE equivalent of the FizzBuzz problem.
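
Both routes give the same node voltage; a quick numeric sanity check in Python (a sketch with made-up component values, taking the right-hand end of R2 as ground):

    VDD, R1, R2 = 5.0, 1000.0, 4000.0     # volts and ohms, example values

    I = VDD / (R1 + R2)                   # Ohm's law: current through the series pair
    vn_ohms_law = VDD - R1 * I            # subtract the drop across R1
    vn_divider = VDD * R2 / (R1 + R2)     # voltage-divider shortcut

    assert abs(vn_ohms_law - vn_divider) < 1e-12
    print(vn_ohms_law)                    # 4.0 (volts at node N)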


>What's disappointing is that a large number of senior undergraduate students are unable to solve this problem

...Wat? That's like, one of the easiest questions you could ever be asked. As a senior undergrad in his final semester for EE (Computer Engineering specialization), could you give me an idea of some of the more difficult questions you would ask?


I was once given a boolean function and asked to implement it using logic gates, and then to optimize it based on a few criteria (for example: GNDs and VDDs cost a lot, reduce the gate count, etc.).

The most annoying thing was that I was not expecting any gate-related questions, so my optimizations were very... ad hoc.


"What's disappointing is that a large number of senior undergraduate students are unable to solve this problem "

Ouch! :)

I'd say it's even simpler than a regular FizzBuzz.


I hesitate to say this, because it might be misconstrued. However, some people seem not to know it, so it might be of value regardless ...

Once an item has more than 40 comments, if the number of comments is more than the number of points, there is a ranking penalty applied. This is because it's a good proxy for an item being a flame-war.

So if you're interested in seeing more answers to this question, or carrying on the discussion, it's worth making sure you up-vote the item to try to avoid tripping the flame-war detector.

Of course, it may already be too late ...


Is it known whether the flame-war detector gives any weight to particular words appearing in the comments, to determine whether a war is actually going on, rather than relying only on the item not having enough upvotes?


To the best of my knowledge it depends purely on the number of comments and the number of points. There was a submission some time ago talking about all the factors involved in getting to, or not getting to, the front page.

I'll see if I can find it.


I was a mathematician, and now work in finance (systematic trading). I've found a reasonable negative filter is

  A jar has 1000 coins, of which 999 are fair and 1 is double
  headed. Pick a coin at random, and toss it 10 times. Given
  that you see 10 heads, what is the probability that the next
  toss of that coin is also a head?
That tests their ability to turn a problem into mathematics, and some very basic conditional probability. Another common question (that I don't use myself) is to ask what happens to bond prices if interest rates go up.


That's a nice filter. (Of course, I'm a former mathematician as well.) Here's how I think of it:

- the prior odds that you picked the double-headed coin are 1/999.

- after seeing ten heads, the posterior odds that you picked the double-headed coin are (2^10)/999 - let's approximate this as 1. (Bayes' theorem usually gets expressed in terms of probabilities, but it's so much simpler in terms of odds.)

- so it's roughly equally likely that you have the double-headed coin or any non-double-headed coin; the probability of flipping an eleventh head is then approximately (1/2)(1) + (1/2)(1/2) = 3/4.


For the interested, some links to Bayes' theorem: http://en.wikipedia.org/wiki/Bayes'_theorem

http://yudkowsky.net/rational/bayes

Useful if you want to know (or need a good way to explain) what a posterior probability is and how it's different from a prior probability.


I vastly prefer Bayes' Theorem as ratios vs. percentages (plug, wrote about it here: http://betterexplained.com/articles/understanding-bayes-theo...)

I like a "factor label" style approach where you can see the prior probability (Fair: Biased), the information about the flips, and the posterior probability (the revised chances after the new information is taken into account):

Prior * Information = Posterior

so

(Fair : Biased) * (10 Fair heads : Fair) / (10 Biased heads : Biased) = (10 Fair Heads : 10 Biased Heads)

Plugging in, we'd have:

(999 / 1) * (1 / 1024 ) / (1 / 1) = 999 / 1024

The odds are ever so slightly in favor of a biased coin. So we can mentally guess ever so slightly above 3/4 for the chance of another heads. We could write (999/2 + 1024) / (999 + 1024) on the whiteboard to be exact.


Why is it 1/999? Shouldn't it be 1/1000 since there are 1000 total coins?


Odds, not probability. Probability p means odds of p:(1-p) or, if you prefer writing it as a fraction, p/(1-p).

(Note 1. The odds of a thing are the ratio Pr(thing) : Pr(not thing). You can generalize this to any mutually exclusive and exhaustive set of things: the odds are the ratio of the probabilities. The fact that there may therefore be more than 2 such things is the reason why I prefer not to turn odds into fractions as above.)

(Note 2. Bayes' theorem is, as others have mentioned, much nicer when you work with odds rather than probabilities for your prior and posterior probabilities. If you're comfortable with logarithms, it's nicer still when you work with logarithms of odds. Now you're just adding the vector of log-likelihoods to the prior odds vector to get the posterior odds vector. Which is how I think of the question above, at least if I'm allowed to be sloppy and imprecise. You start with almost exactly 10 bits of prior prejudice for "fair" over "two-headed", then you get exactly 10 bits of evidence for "two-headed" over "fair", at which point those cancel out almost exactly so you should assign almost equal probabilities to those two possibilities.)
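
Making the log-odds arithmetic concrete (a sketch in Python, using the numbers from the problem above):

    from math import log2

    prior_bits = log2(999)                    # ≈ 9.96 bits favoring "fair"
    evidence_bits = 10                        # ten heads: 10 bits favoring "two-headed"
    posterior_bits = prior_bits - evidence_bits   # ≈ -0.036: barely favors "two-headed"

    odds_fair = 2 ** posterior_bits           # ≈ 0.976, i.e. 999 : 1024
    p_fair = odds_fair / (1 + odds_fair)      # ≈ 0.494
    print(0.5 * p_fair + 1.0 * (1 - p_fair))  # ≈ 0.753, P(next flip is a head)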


That makes sense. I've never dealt with odds as a fraction before.


Can you elaborate on the posterior calculation of (2^10)/999?


Sure. In terms of odds, Bayes' theorem says

(posterior odds) = (prior odds) * (likelihood ratio)

The prior odds are 1/999, so we need to show that the likelihood ratio is 2^10.

The likelihood ratio is the probability of seeing 10 heads from a double-headed coin divided by the probability of seeing 10 heads from a fair coin, which is 1/((1/2)^10) or 2^10.


0.7531 if you don't assume that 2^10 = 999


Should I parse it as tossing the same coin 10 times, or choosing from the jar 10 times?


Good question, and one that sometimes comes up when I ask it in interviews. You are tossing the same coin 10 times.


0.753?

At the beginning, there's a 1/1000 chance that you pick the double-headed coin, and a 999/1000 chance you pick a fair coin.

A fair coin would act the way you've observed 1/1024 times. A double-headed coin would act that way 100% of the time.

(This is where I get fuzzy): Given what you've observed, there is a (1000+1024)/1024 = 0.506 chance that the coin is double-headed. There is a 0.494 chance that it's fair.

A double-headed coin would come up heads next 100% of the time. A fair coin would come up heads 50% of the time. So, 0.506 x 1 + 0.494 x 0.5 = 0.753.

How far off am I?


Your .506 is right but the arithmetic problem that you set equal to it is wrong. I think you meant to type 1024/(1024+1000). (BTW, it should be 1024/(1024+999) ).


Close enough for beta. We'll refine after user testing. (Guess my specialty!)


This is a fun question. Can I look at the coin's two sides? If not...I assume you now have to start applying statistical tests (given that a fair coin will only do this once out of 1024 times, what are the chances I've got one of those 999 coins vs the 1/1000 chance that I picked the double headed coin?) or is there some simplifying assumption I'm missing.

Anyway--assume I think all that aloud in an interview. What does that tell you about the candidate?


>Can I look at the coin's two sides?

I would give points just for asking that question, because many people bound by conventional thinking wouldn't dare to ask it, accepting the default assumption that you can't. I'm not saying this says anything about your ability to solve the problem, but asking the question is a good sign of a supple mind.


You can only see the result of the flips, you can't examine the coin. Yes, it comes down to estimating the probability of having a biased coin given that you have seen it come up heads ten times in a row.


0.5005?


That's what I got.

There's a .999 chance you have a fair coin and a .001 chance you have the rigged coin.

(0.999 * 0.5) + (0.001 * 1) = 0.5005.

Seems too simple, but a coin is a coin, right?


Ask yourself a question: can you use the data (the 10 coin tosses) to update the probability of the current coin being two-headed?


I wouldn't use the data. The coin hasn't changed since I picked it out of the jar. If I flip it 1, 10, or 10e100 times, the coin would still be the same coin.

So figure the p(heads) for the coin and ignore the previous history. Overthinking it is why this makes a good FizzBuzz problem.


An example that should show this approach is wrong:

Suppose that the jar contains 500 double-head coins and 500 double-tail coins. You pull a coin from the jar, flip it 10 times, and get 10 heads. What is the probability it will come up heads next time?


That seems like a completely different problem to me, since all randomness is out of the system the moment you see the first flip.


OK, so now imagine that there are 1000000 double-headed coins, 1000000 double-tailed coins, and one fair coin. Now (1) there's still (potentially) randomness present, so it's not "completely different" from the original problem, but (2) the ignore-the-data approach gives an obviously wrong answer whereas using the data gives a believable answer.


Let's consider that it might be a fair coin, or it might be a double-headed coin.

Let's also say that every time you flip the coin and it comes up tails, you win $5. And every time you flip the coin and it comes up heads, you lose $1.

Clearly, this would be a great game to have the opportunity to play, if the coin is fair. Every time you flip you either win $5 or lose $1, so your profit, on average, is $2 per flip.

You've flipped it 10 times so far, and it's come up heads every time, and you've lost $10.

After you're $10, $100, $1000, or $10e100 in the red, without ever seeing a win, when do you change your mind about playing this game?


You are getting downvotes because you didn't follow Bayesian reasoning, but there is some justification for your instincts here http://www.stat.columbia.edu/~gelman/research/unpublished/ph...


Yes, it's still the same coin, but you don't know which coin you got. You know that you got 10 consecutive heads though. How improbable is this if you got a fair coin? How probable is this with the double-headed coin? This is the data you can use to update the probability.


So if you flipped the coin twice, once it was tails, and once it was heads... you'd ignore that info? Or look at it another way, if you flipped it a million times and it always came up heads...


I've used a similar question, but with two coins. I think if I am ever in a position where I am hiring again, I may follow up that question with this extension to 1000 coins.

What is the most common answer? Typically, I get either blank stares or a gut answer of just a little over 50%. I find that physics people are pretty good at solving the problem.

I would also REALLY hope that somebody who is applying for a job in finance understands bond prices and interest rates, but I suppose that does make it a good FizzBuzz-type question.


> I would also REALLY hope that somebody who is applying for a job in finance understands the bond prices and interest rates.

You'd be surprised at how many people can't answer instantly. Or how many people can't give a convincing description of what a share is, and what rights it gives you.

These are all easy questions, which to my mind is the point. The fact that someone can answer them doesn't tell you much, but if someone can't answer them then you need to think very hard about whether to hire them.


The Monty Hall problem (or a subtle variant) is also a great one to watch people work through. It's a chance to see if people can think about probability in a sort of asymptotic way.

Basically (if they get stuck), ask them how they would choose if there were actually 100 doors, but the same rules applied. Obviously, everyone switches. What about 99? 98? ... It turns out 3 doors is the smallest number where the strategy is optimal. But when the number is large, the answer is much more obvious.
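
A quick simulation of the n-door variant (a sketch in Python; here the host opens every non-chosen door except one and never reveals the car):

    import random

    def switch_wins(n_doors):
        car = random.randrange(n_doors)
        pick = random.randrange(n_doors)
        # The host opens every other door except one, never revealing the
        # car. The door left closed is the car itself, unless you already
        # picked the car, in which case it's a random other door.
        if car != pick:
            left_closed = car
        else:
            left_closed = random.choice([d for d in range(n_doors) if d != pick])
        return left_closed == car   # switching wins iff that door hides the car

    for n in (3, 10, 100):
        trials = 100_000
        wins = sum(switch_wins(n) for _ in range(trials))
        print(n, wins / trials)     # ≈ 0.667, 0.900, 0.990: switching wins (n-1)/n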


The Tuesday boy problem is a little less known: http://mikeschiraldi.blogspot.com/2011/11/tuesday-boy-proble...


I don't get it, even after the explanation. What's the mechanism that constrains those probabilities?


You should be able to convince yourself that the conclusion is correct by writing a quick script to run a simulation with a million or so iterations. Actually understanding the reasoning intuitively is more challenging, but I think the linked article has a good explanation (the part about how the manner in which we receive information is as important as the information itself): http://scienceblogs.com/evolutionblog/2011/11/08/the-tuesday...
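
For example, a minimal version of that simulation (a sketch in Python, assuming sexes and birth days are uniform and independent):

    import random

    matches = both_boys = 0
    for _ in range(1_000_000):
        # Each child is a (sex, weekday) pair; let day 0 stand for Tuesday.
        kids = [(random.choice('BG'), random.randrange(7)) for _ in range(2)]
        if any(kid == ('B', 0) for kid in kids):        # at least one Tuesday-born boy
            matches += 1
            both_boys += all(sex == 'B' for sex, _ in kids)

    print(both_boys / matches)   # ≈ 0.481 = 13/27, not 1/3 and not 1/2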


That helps, thank you.


The problem with asking Monty Hall in an interview is that ~75% of candidates already know the answer.


Assume they already know it, and ask them to explain it. A good explanation is almost as rare as understanding it.

If they don't know it, then ask them to solve it.

The early controversy with the Monty Hall problem was that explanations left loopholes, or weren't compelling enough, and even people who should have known better didn't understand the solution clearly enough.


Yeah, because it's in every easy undergrad stats textbook.

The problem with asking trick questions in an interview is that, although they may make the interviewer look smart, they are orthogonal to the ability to do the job. It's better to ask how to solve a current, pressing problem and see what questions the candidate asks. It's the questions that elucidate thinking style. Also, it's crowdsourcing, and you can attribute or blame the candidate as the case may be.

The other part is getting along, so if an interview doesn't include something fun, it's all boring formality that doesn't allow anyone to get to know each other. Take them to a normal lunch if possible, because much more is learned by how people eat.


There are tons of variations and few people know them.

For instance ask "What if the showman didn't know which door the car was behind?"

IIRC even if you exclude instances where the showman shows a car, it evens out the odds.


And if they didn't, they would probably get it wrong even if they're experienced statisticians.


One issue I have with Monty Hall problem as an interview question is that the problem statement is too subtle, with several hidden assumptions. Interviewers often end up posing a different puzzle without realizing just by using a slightly different language to pose it.


That's true, the rules really do have to be laid out in plain language to be a fair question.


Not only that, the problem statement has to be written down somewhere. I have gotten burnt in the past with problem statements changing along the way.


Depressing, just asked my coworkers this and can't convince them it isn't an independent event :(.


My wife and I have discussed this, and get a different answer to a few of the solutions we then googled for. I think it is as simple as: http://pastebin.com/gg5DTySG


I'm thinking this: http://pastebin.com/GW1hTXwD Dying to know if I made it through the math FizzBuzz :)


There is a mistake in the deduction in the second sentence. The chance that you picked the fair coin is approximately 49%, not 1/1024.


This is the simplest explanation, thank you.


Given that this is hacker news, I'm surprised that no one bothered to write a simulation to sanity-check their answer. Here's a quick and dirty one: http://jsfiddle.net/qc9qk/2/

The mistake that most people seem to be making is using P(biased) and P(fair) instead of P(biased|10 heads) and P(fair|10 heads).

Spoiler: The answer is ~0.75.


The bond prices question is a pretty bad filter, since it doesn't give the competent candidate much chance to express how they think, and a weak candidate has a good chance of guessing "they go down" and even a fair chance of blustering their way through followup questions/explanations.

On the flip side, you could probably unintentionally trip up someone with a generally competent grasp of economics/stats but no specific interest in bonds over mere terminology, if your followup questions start asking them to distinguish between types of yield.

Same with exchange rates (although I do remember, back in school in a competitive presentation, being complimented on my confident and plausible-sounding explanation of the effect of an interest rate rise on exchange rates that also happened to be the reverse of the correct answer :-) )


mumble mumble bayes mumble mumble I'll show myself out.


related problem:

A population has a 10% incidence of condition X. A test exists that is 90% accurate.

1) What is the expected percentage of positive test results?

2) If a person tests positive, what is the probability that they actually have the condition?

3) If a person tests negative, what is the probability that they are actually free of the condition?


By "90% accurate", you mean "10% false positive rate and 10% false negative rate", correct?


correct, I should have made that explicit
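
With that settled, the three answers follow from the same kind of conditioning as the coin problem (a sketch in Python):

    p_cond = 0.10                       # prevalence of condition X
    sens = spec = 0.90                  # P(test+ | cond) and P(test- | no cond)

    p_pos = p_cond * sens + (1 - p_cond) * (1 - spec)
    print(p_pos)                        # 1) 0.18 of tests come back positive
    print(p_cond * sens / p_pos)        # 2) 0.50, P(cond | positive)
    print((1 - p_cond) * spec / (1 - p_pos))  # 3) ≈ 0.988, P(no cond | negative)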


[deleted]


But it's very likely that you picked a fair coin to start with, because most of the coins are fair. How do you compensate for that?


with 1025 coins the numbers are nice:

1024/1025 * 1/1024 = 1/1025 (probability of unbiased coin & 10 heads)

1/1025 (probability of biased coin & 10 heads)

relative probability = .5

the answer is then precisely 3/4
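
And the "precisely" is easy to confirm with exact rationals (a sketch in Python):

    from fractions import Fraction

    p_fair_10h = Fraction(1024, 1025) * Fraction(1, 1024)     # = 1/1025
    p_biased_10h = Fraction(1, 1025) * 1                      # = 1/1025
    post_biased = p_biased_10h / (p_fair_10h + p_biased_10h)  # = 1/2 exactly
    print(post_biased + (1 - post_biased) * Fraction(1, 2))   # 3/4 exactly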


Do you mind posting a simple walkthrough for the answer?


Sure. The slick answer is

The chance of picking the biased coin is 1/1000. The chance of seeing 10 heads from a fair coin is (1/2)^10 = 1/1024. These are nearly equal, so given that you've seen 10 heads, there is a 50/50 chance of having a biased coin. So the probability the next flip shows a head is

  P(H) = P(biased) * P(H|biased) + P(fair) * P(H|fair)
       = 0.5 * 1 + 0.5 * 0.5
       = 0.75
The long answer -

You want to figure out P(biased | 10H). Using Bayes' rule this is

  P(biased | 10H) = P(10H | biased) * P(biased) / P(10H)
                  = P(10H | biased) * P(biased) / (P(10H|biased) * P(biased) + P(10H|fair) * P(fair))
                  = 1 * (1/1000) / (1 * 1/1000 + 1/1024 * 999/1000)
                  ~ 0.5
and you now compute the probability of the next toss being a head as above.


Isn't the gotcha of this test the fact that the history of previous coin flips has no effect on the next flip, given a fair coin?

The OP is only asking what the outcome of the NEXT flip is, not the probability of flipping 11 heads in a row. Or did I read this wrong?


You didn't read it wrong, but you probably did fail the test ;-) There is no gotcha in the question, it's just a math problem that you either do or do not know how to solve. This isn't really about intelligence as much as it is about whether you have taken a course on probability. If you flipped 10 heads in a row the probability of the coin you have being the double heads coin increases dramatically, so you have to take that into account for the next flip. For intuitive understanding it often helps to go to extremes. Suppose you do 1 billion flips and all come up heads. What is the probability that the next flip comes up heads? Because we had 1 billion heads it is virtually certain that we are dealing with the double heads coin, so the probability that the next flip will come up heads is close to 1.


"If you flipped 10 heads in a row the probability of the coin you have being the double heads coin increases dramatically..."

I disagree. The coin is the coin. It didn't magically transport itself or change state after flipping it 10, 100, or a billion times.

Let's change the puzzle to the simplest state: you pull a coin from the 1000-coin jar and flip it just once. What's the probability of heads?

This is why roulette and baccarat tables in Vegas have those signs showing previous outcomes. It's meant to mess with your head. Previous history has no effect on future outcomes. A fair coin could come up heads a billion times in a row as well. The next flip will still be 0.5.


The coin is the coin, but your information about the coin has changed. Probability is a fact about you, not a fact about the coin.

Here is perhaps a more visceral example: On any given day, your car has a 10% chance of having blown a wheel the previous night. If it has blown a wheel, every bump will feel jarring. If it hasn't blown a wheel, any given bump will feel jarring with 10% probability. You drive over a hundred bumps, and all feel jarring. Do you think the next bump will also feel jarring?

Yes, right? Because the fact that the last hundred bumps have been jarring means you probably blew a wheel last night. Even though 'previous history has no effect on future outcomes', previous history does give you information about the state of the world.

Edit: Or a better example, for this community. You're on a slightly flakey wifi connection, which sometimes drops a packet at random. Any given packet is dropped with probability 1/100. Also, any given time you connect to the router, the modem is down with probability 1/10, and all of your packets will get dropped. You connect, and the first hundred packets you send are dropped. What is the probability the next packet will get dropped? Very high, because now you know that the modem is probably down. The history of the connection gives you information about the state of the connection. Similarly, the history of coin toss results gives you information about the state of the coin.

(Why is this different from roulette? Because there the previous history doesn't give you information.)


So, I think the question is ambiguous, and a little more direction could clear it up. If you want the probability taking into account the 10 heads observed so far, I would tell the interviewee: "I am going to do this process repeatedly (picking a coin and flipping it 10 times), and I'm going to keep track of what percentage do the all-one-side thing. What will that percentage be, approximately?"


That's not ambiguous if the applicant has taken a basic college statistics course, which is what the question is intended to determine. The term "probability" has more precise definitions in mathematics than in everyday language.


Ah, that's the part I was missing. Thanks


Probability is a fact about you, not a fact about the coin.

Huh? It's a coin. You flip it. It has no knowledge of the past and no idea about the future. It lands and it's either heads 50% of the time or tails 50% of the time. It's all about the coin and nothing about what you've observed in the last N trials.

That's the simple logic that the OP is trying to see if you understand.


Did you miss the bit in the OP where one of the coins has heads on both sides? The coin may or may not be biased, and the results of flipping it give you information about whether or not it is biased.

Explaining the bit you quoted: if you had perfect knowledge of the wind conditions, how hard the person flipped the coin, and so on, you would know precisely which side it would land on. That fact is determined. It's just because you are missing information that there's 50% probability of heads (for an unbiased coin). The probability comes out of your imperfect knowledge, and changes depending on your knowledge: for example, if you have some reason to believe the coin is biased, then you no longer think it's going to land on heads 50% of the time. Since one of the coins is biased, getting a long string of heads is reason to think that the coin they drew and are flipping is the biased coin. This information affects the probability you assign to the coin coming up heads.


It's really interesting to watch this thread. I'm not saying that you should simply trust what the other commenters are saying, but the user crntaylor is the one that originally posted the question, and his solution is that the likelihood of the next flip being heads is 75%


It may be a useful sanity check to simulate the experiment multiple times to see the result:

  # Pick a coin at random: with probability 1/1000 it is the double-headed
  # coin (P(heads) = 1), otherwise a fair coin (P(heads) = 0.5).
  def choose_coin
    rand < 0.001 ? 1 : 0.5
  end
   
  # Flip a coin, given its P(heads).
  def flip_coin(coin)
    rand < coin ? :heads : :tails
  end
   
  def all_10_heads?(coin)
    10.times { return false if flip_coin(coin) == :tails }
    true
  end
   
  next_flip = { :heads => 0, :tails => 0 }
   
  # Keep only the trials whose first ten flips were all heads,
  # and tally the eleventh flip.
  1000000.times do
    coin = choose_coin
    next unless all_10_heads? coin
    next_flip[flip_coin(coin)] += 1
  end
   
  puts next_flip[:heads].to_f / (next_flip[:heads] + next_flip[:tails])


What's really interesting is that there are probabilistic programming languages where you can write a program that does a simulation just like you did, but the execution engine can compute the probabilities exactly and much faster too. It does this by computing along all possible paths in the program, and keeping track of the probability mass of each path, and then summing them all up in the end.

http://en.wikipedia.org/wiki/Probabilistic_programming_langu...


Completely shameless plug for a tiny probabilistic programming language that I wrote as an embedded DSL in Haskell:

https://github.com/chris-taylor/hs-probability

The code that solves this problem is:

  solve = do
    coin   <- choose (999/1000) fair biased
    tosses <- replicateM 10 coin
    condition (tosses == replicate 10 Head)
    nextToss <- coin
    return nextToss
   where
    fair   = choose (1/2) Head Tail
    biased = certainly Head


Likewise, a tire has no knowledge of the past or future. It will transfer a bump from the road to you following the laws of physics, depending on whether it is functioning properly, represented by a probability.


I think you're incorrect because your conceptual model of what constitutes "probability" is incorrect for this type of problem.

Try thinking about it in a more brute force way: imagine literally all possible outcomes of performing this experiment. In other words, create a list like this (each coin in the jar is numbered from 000 to 999 with 999 being the only biased coin, and coin flips are represented by 0 being heads and 1 being tails):

    Picked fair coin #000, flipped 00000000000 (eleven flips)
    Picked fair coin #000, flipped 00000000001
    Picked fair coin #000, flipped 00000000010 ...
    Picked fair coin #000, flipped 11111111111
    ....
    Picked biased coin #999, flipped 11111111111
Now select all of the lines above where the first ten flips are heads. Of these outcomes, how many have an eleventh flip of heads and how many have an eleventh flip of tails? Unless my idea of probability is flawed, this should be the same answer that the mathematicians in this thread are providing, so something right around 75% heads.


Well if I can count:

  00000 00000 0
  00000 00000 1
for 999 fair coins = 1998 + 1 for fake coin = 1999

1000 of which have an 11th flip of 0

So 1000/1999 ~ 50%


True. I glossed over the crucial part, which is that you have to enumerate the same number of potential outcomes for the biased coin's ten flips as you do for each fair coin, because each coin is equally as likely to be selected from the jar. Of course, all 2^10 potential outcomes for the biased coin's ten flips are the same: all heads. So we have:

    00000 00000 0
    00000 00000 1
For 999 fair coins = 1998

And 2^11 instances of 00000 00000 0 for the biased coin = 2048.

That's 1998 + 2048 = 4046 equally likely outcomes that begin with ten heads. Of those, only 1998 / 2 = 999 outcomes feature an eleventh tails. So the odds of getting an eleventh heads after seeing ten heads is (4046 - 999) / 4046 =~ 0.753.
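
That counting argument is easy to check by brute force (a sketch in Python, enumerating the same 2^11 equally weighted sequences per coin, with all of the biased coin's sequences reading as heads):

    from itertools import product

    conditioned = heads11 = 0
    for coin in range(1000):                 # coin 999 is the double-headed one
        for seq in product('HT', repeat=11):
            if coin == 999:
                seq = ('H',) * 11            # every biased outcome looks like all heads
            if seq[:10] == ('H',) * 10:      # condition on the first ten being heads
                conditioned += 1
                heads11 += seq[10] == 'H'

    print(heads11, '/', conditioned)         # 3047 / 4046 ≈ 0.753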


... because each coin is equally as likely to be selected from the jar

That's not the reason. It's because the fake coin also has two sides and therefore also 2^11 different outcomes. It just happens they all look the same.

Otherwise I agree with your argument.


True, I could have still used zeros and ones to enumerate all outcomes of the biased coin, but said that both digits represent heads.


Where is the unfair coin in your list? Without that you just get 50% heads. This method can definitely work to get the correct answer, but you have to account for all possibilities and weigh them by their probability.


Sorry, I edited coin #999 to be the biased coin.


The probability of heads is

    0.999*0.5 + 0.001*1 = 0.5005
The important thing is that as you observe heads from the coin, you learn something about the coin. As you observe more heads it is less likely to be a fair coin and more likely to be the coin with double heads. This doesn't change anything about the coin, but it changes something about what you know about the coin. See here for the correct answer: https://news.ycombinator.com/item?id=7000523


Since the question simply asked what p(heads) was on the next flip, it seems our answers match. Thanks!


Our answers to your question may match, but I very much doubt that our answers on the original question match. The crux is that your question is not equivalent to the original question. If you are interested in learning why that is I can explain it further, but it doesn't look like you are?


Okay yeah, I'm stupid for pressing on while crntaylor already gave his answer. I'll shut up now.


Not at all. The answer is about 0.75, and stated in the answer jules linked to.


I think you're missing the point of the question which seems to get at whether the interviewee is familiar with/understands bayesian probability. For an excellent explanation you should see: http://yudkowsky.net/rational/bayes.


I don't see why knowledge of Bayesian probability is necessary for this question. I don't have anything more than a passing knowledge of Bayesian probability, but I was able to produce the correct answer and a reasonable explanation using what I believe to be classical probability: https://news.ycombinator.com/item?id=7001288


>>>Previous history has no effect on future outcomes. A fair coin could come up heads a billion times in a row as well. The next flip will still be 0.5.

You hit the nail on the head here - a fair coin would. But if there are unfair coins, things start to change. If we know there's an unfair coin, and we see an unusual run, we should consider that it may be that coin.

Think of this: increase the flips to 1000, and you're still getting heads each time. Would you bet on the next being heads or tails?


More or less

"The OP is only asking what the outcome of the NEXT flip is, not the probability of flipping 11 heads in a row"

This is correct; however, the history of flips gives us information about the type of coin we have in our hands.

It would be interesting to see the probabilities if we had gotten 11, 15 or 20 heads in a row.
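
Those follow directly from the same update (a sketch in Python: posterior probability of the double-headed coin, and P(next flip is a head), after n straight heads):

    from fractions import Fraction

    def after_n_heads(n):
        prior_biased = Fraction(1, 1000)
        post_biased = prior_biased / (prior_biased + (1 - prior_biased) / 2**n)
        return post_biased, post_biased + (1 - post_biased) * Fraction(1, 2)

    for n in (10, 11, 15, 20):
        post, p_head = after_n_heads(n)
        print(n, float(post), float(p_head))
    # n=10: biased ≈ 0.506, next head ≈ 0.753
    # n=11: biased ≈ 0.672, next head ≈ 0.836
    # n=15: biased ≈ 0.970, next head ≈ 0.985
    # n=20: biased ≈ 0.999, next head ≈ 1.000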


The history of coin flips has no effect on the future tosses, BUT you can use the history of flips to try to infer which coin you are dealing with.


But that's not what the OP/Interviewer wants to know. All he asked was this:

"Given that you see 10 heads, what is the probability that the next toss of that coin is also a head?"

He didn't ask you to identify the coin. He just wants to know if the flip is going to be heads.


The probability of which coin you have affects the probability of the next toss coming up heads, so having this knowledge is implicit in determining the solution.


That probability was determined the moment you picked the coin out of the jar.

It makes no difference what you do to it after the pick. Hold it in your hand for a day, flip it 10 times, sit on it, whatever - the end result is that p(heads) for that particular coin has not changed. p(heads) will be either 0.5 for a real coin or 1.0 for the rigged one.

The probability then comes down to what coin you picked at the start of the trial. There's a 0.999 chance you have a real one, and 0.001 chance that you have the rigged one.


What if you picked a random coin from the jar, then you looked at it and saw that both sides are heads. Is the probability that this is the coin with both sides heads still 0.001? No, the probability that this is the coin with double heads is 100%.

Now if you pick a random coin from the jar, and you randomly observe one of the sides of the coin 1 billion times and every time you see heads, is the probability that this is the coin with both sides heads still 0.001? No, the probability that this is the coin with double heads is very close to 100%.

Now if you pick a random coin from the jar, and you flip the coin 1 billion times, and every time it comes up heads, is the probability that this is the coin with both sides heads still 0.001? No, the probability that this is the coin with double heads is very close to 100%.

How about if you flipped it 10 times and it came up heads 10 times? Turns out the probability that it is the coin with double heads is about 51%.

Probability quantifies the degree of uncertainty YOU have about the world. This can change even when the world doesn't change, namely when you observe something about the world.


I believe someone already presented this analogy to you, but I'm curious what your response is. Imagine the jar has only two coins, one always heads and one always tails. Choose a coin randomly, then flip it ten times. If you get ten heads, what is the probability that the next flip is heads?

According to the methodology you are advocating, the probability would be 50%, because you are only considering the initial probability of selecting a coin from the jar. But using the methodology I suggested in another comment, you would list out every possible outcome and conclude that there is a 100% chance of getting another heads.


OK, What if I flipped it 10,000,000 times and got all heads? What is the probability that I have the non-fair coin? What is the probability for the next toss?


Another way to look at it - one coin has two heads, the other 999 have two tails. You flip the coin and it comes up heads. What is the probability that the next flip will be heads?


I think you are discussing different interpretations of probability (classical vs bayesian), which is causing the disconnect.

http://en.wikipedia.org/wiki/Probability_interpretations


That's not really the problem. I believe I was able to explain the correct response using classical probability: https://news.ycombinator.com/item?id=7001288. This is no more complex than asking what the odds are of being dealt a full house in poker. You can enumerate all equally likely outcomes and simply count them.


When we pick the coin, we have a 1 in 1000 chance of getting the double-heads coin, and a 999 in 1000 chance of getting a fair coin. Let's call this P(fake) = 0.001, and P(fair) = 0.999.

When we have the double heads coin, the probability of getting 10 heads is 1: P(10 heads|fake) = 1. When we have a normal coin, the probability of getting 10 heads is P(10 heads|fair) = 0.5^10.

The quantity we want to compute is

    P(heads|10 heads) = P(fair|10 heads)*0.5 + P(fake|10 heads)*1 
                      = P(fair|10 heads)*0.5 + (1-P(fair|10 heads)) 
                      = 1 - P(fair|10 heads)*0.5.
To compute P(fair|10 heads) we use Bayes' rule:

    P(fair|10 heads) = P(10 heads|fair) * P(fair)/P(10 heads)
Here

    P(10 heads) = P(10 heads|fake)*P(fake) + P(10 heads|fair)*P(fair) 
                = 1*0.001 + 0.5^10*0.999.
We fill in the formula we got by Bayes' rule:

    P(fair|10 heads) = 0.5^10 * 0.999 / (1*0.001 + 0.5^10*0.999)
Then we fill in the original formula:

    P(heads|10 heads) = 1 - 0.5^10 * 0.999 / (1*0.001 + 0.5^10*0.999) * 0.5 
                      = 0.75308947108


Here's my walkthrough: http://pastebin.com/e2ea9XUD


The funny thing is, I imagine I'd fail this, and a lot of other "interview" questions. (Not fizzbuzz incidentally :P)

Why? Because my brain just doesn't seem to work in the way people expect "experts" brains to work in our world of tests/qualifications.

Now for context, I don't consider myself some wishy-washy "Oh I have a different KIND of intelligence" making-excuses dumb-as-nails nancy. I've worked several years now for my nation's statistics agency. I've written programs to calculate things professional statisticians couldn't (indeed that seems to be one of the reasons I get to keep my job :P) and just about every useless bloody stat there is, I've written my own probabilistic data linking software, finished in the top 10% of unrelated competitions on Kaggle, and my formal qualifications are in economics. I don't think I'm the greatest thing ever, but if I might be so blunt, I feel I'm at the stage where I can confidently claim to have "proven my capabilities".

But I haven't memorized Bayes' theorem (or any other probability or math formulas), despite having applied it about 100 gazillion times. And despite being an economist, I haven't memorized "the relationship between bond prices and interest rates". Now, I could try to reason these things out from scratch in front of you, but I imagine most people in our world would see that as a "weakness", or as trying to hide the fact that I couldn't answer the question.

With the probability one, I'd start going down the various interpretations of probability and whether your notion of probability is internally consistent/justified, etc etc etc. I doubt I could write the math on the whiteboard off the top of my head in an interview (and I know several statisticians who couldn't either), but I could probably outperform most of the candidates who could in its application in the real world (speaking from experience), or be able to question whether there is a better tool for the job in the real world.

With the bond one, I know there's an answer you expect me to give, rote-learned during my education. But I won't. I haven't memorized it (because from experience, memorising these types of things is bad form, conditions you into erroneous thinking when they eventually turn out not to be universally true, and is far worse than reasoning, questioning or thinking about problems). So I'll ask you to describe your idea of a bond to me. I'll ask you to describe your idea of interest rates. Why are the interest rates rising? Maybe then, depending on what answers YOU give, I'll say "well then obviously bond prices must move inversely to interest rate movements", but there's just as much chance I'll pick up on some mistake you've made and never reach that point.

Now I'm not attacking you specifically. I don't know you, or what we think of each other, or how we'd interview each other. But from my experience, most people/recruiters/employers/interviewers would take my behaviour as a negative sign. A sign of "stalling" or "evading the question". Questioning, or reasoning, or skeptically interrogating things or mulling over questions for long periods of time, especially if they consider the answer "known", is a "bad sign".

And the rote learner, who'd just happened to memorise a particular formula or spent most of their time in one particular context, or bought the book on interview questions for said industry will get a big fat tick.

Moral of the story: Please don't use "interview questions" :P


So, to see if I understand you:

* The person suggesting the question says that this is trivial for people suited to the job in mind;

* You say you couldn't do this question;

* You also say that you would be suitable to do the job.

Given that the proposer of the question is the one that understands the job, I'm having a hard time understanding why you claim you would be suitable for it, given that you couldn't do the question. Please understand that I'm not attacking you specifically, I'm just having a hard time understanding your reasoning.


Do you talk to your mother with that mouth?


Well, I'd disagree with at least two of your premises:

1) That the proposer of the question often understands the job (by which I mean in real-world interviews; I'm not commenting on the specific poster).

2) I am not saying that I couldn't do the question, per se.

On the contrary, my professional work shows not only that I can do the question, but that I've probably done it innumerable times and to the standard of (and sometimes greater than) other professionals.

The subtext of my response to the bond question shows that, in an absolute juvenile sense, I "know" the answer that's being looked for, because I've notionally been trained as an economist. Which is to say, I am aware of the rote-learned current models/theory.

I'm saying that I probably couldn't answer it in the interview environment in the way that most interviewers would deem "acceptable" or "right".

Why? Because I both haven't memorized Bayes' theorem, and wouldn't accept there being a simple probabilistic interpretation applicable to real world problems. Now it's true, I could probably attempt to re-derive it there in the interview from some even more basic understandings of probability, but is that what interviewers want? I'd almost certainly offer the answer of "Can I look at the two sides of the coin?" or "the probability is either 0 or 1". And then we'd go down the path of probability interpretations, and then only after they made it clear that they wanted a Bayesian approach, we could start heading in that direction. Is that what they're looking for? I don't think it is. But they are all valid responses.

I think they want someone who has remembered Bayes and its application recently and then just applied it. I think in the bond question, they want someone who has rote learnt some economic/finance theory. And I think the worst aspect is that the stronger candidate, who pauses, who thinks, who questions, and who doesn't just go down the regurgitated answer will probably, in real life, be marked as the weaker candidate.

And I'm also saying this as someone who is already employed, being paid and and working professionally on things the questions are supposed to be screening for. So while the original poster is in a position of authority to speak of such things, so am I.

Are you familiar with Project Euler? I've completed many of those problems as well, but I would never expect myself or some other candidate to be able to do them spontaneously in said interview, or think worse of them if they couldn't. And if they could do them, I don't think that would be an indicator of very much more than the fact that they happened to be working on said problem recently, had just taken a course on said problem, or had just been doing interview questions for my field.


You don't need to know Bayes' theorem to solve this problem.


What sorts of questions or tests were posed to you in interviews for jobs you ended up accepting? Or if you've never been in that situation, can you propose some interview techniques that would give the interviewer a good idea of your abilities?


Let's break down the role of interviews into two parts. The first part: to ensure the candidate is sufficiently socialised and isn't a dangerous toxic mouth-breather. For this, current interviews can work relatively well. An hour or so, face to face, just talking back and forth, mulling over problems/theory with an authority in the field or the people they'd be working with.

I am genuinely skeptical about whether people who don't know what they're talking about can actually hide that fact from people who do know what they're talking about in such a situation. The secret of course is to basically ignore qualifications/status and do it as a conversation. Perhaps in such a scenario, five minutes in, the original poster and I would be getting onto ridiculous questions of trading/probability with great big smiles on our faces, forgetting all about whether one can remember Bayes off the top of their head.

The second part: Assessing the technical ability of the candidate at hand. And for this I think current interview culture is horribly toxic.

I do not work for Facebook, nor do I claim that such techniques are without their flaws, or that they are cheap and easy to implement, but I think techniques aimed at "actually get the candidate to do the task involved in the job, and see how they go" have merit. Forget your preconceptions about what is required. Give them a real-world task you'd expect them to do in the job, in the time that would be expected of someone in the job, with the resources of the job. See how they go. See http://www.kaggle.com/c/facebook-recruiting-iii-keyword-extr... for example.

Not perfect, but better than 5 minute "interviewer questions of the year".

Funnily enough, I got into my current organisation answering questions in the interview on analyzing health statistics. Something I knew nothing about. A senior executive then snatched/let me move across to his area in the more technical side of things without a subsequent interview after I made it known I'd be interested, and after working with their guys for a bit on another project. Swings and roundabouts :P


> But from my experience, most people/recruiters/employers/interviewers would take my behaviour as a negative sign.

The negative sign is your stubborn conviction about the lack of value of rote memorization.

The stereotypical "rote learner" you describe in your before-last paragraph is just as one-sided as the opposite.

Of course, neither extreme really exists, and you probably have rote-memorized more things than you realize. You don't work out everything from scratch every time you encounter a medium-hard problem.

Unless you actually do, in which case someone reasonably clever who does not stubbornly refuse to learn a few formulas will beat you on speed. Which is a reason to hire them over you.

That's the point. You may be very smart and capable doing it this way, but learning a few basics (such as Bayes, in the context of statistics) doesn't really cost you anything. Memorizing a few things isn't hard. You won't actually run into problems like "erroneous thinking" because you're smart enough, and knowing the conditions under which something holds is easy enough when you understand the thing. Additionally, you get the knowledge of when a particular important theorem does not hold, which is higher-level knowledge that you just can't get when you reason from scratch every time (positive vs negative knowledge).

Memorizing a few of the basic important theorems and formulas in your field is such low-hanging fruit to improve yourself that a potential employer might ask why you're not doing it, even if you're so very capable without doing it.

Finally, there's communication with your peers. Maybe you can figure that stuff out by yourself easily, but when your colleagues say "ah, so we just use Bayes and poof", you'll be lagging behind figuring that out. And in reverse, when you do these things from scratch, I'm assuming you'll be doing them much quicker, speeding through your ad hoc methods, leaving your team mates confused and wondering "why doesn't he just use Bayes?" (which may or may not be what you are in fact doing, but you lack the knowledge to tell them as much).

I'm sure you do a very good job at what you're doing now, the way you're doing it. I'm just pointing out that there are various well-justified reasons why employers can be wary of your refusal to memorize some basic field-knowledge, even when you're perfectly capable figuring out that stuff when you need it.


I'm a mathematician, and sometimes I need to assess the level at which I'll pitch an explanation. Although I don't do this in a setting where it's possible to set a test, as such, I do ask this question:

    An equivalence relation is reflexive, symmetric, and
    transitive.  If E is an equivalence relation, and we
    know E(x,y) but we know !E(y,z), what do we know
    about E(x,z), and why?
It seems a great filter for people who can think and reason formally, as opposed to using intuition based on their understanding of the words.

Edited in response to lmm's perfectly reasonable question. In context I would see if they just assumed they knew what I meant, or asked for clarification. If they assumed they knew what I meant, it would be interesting to see what interpretation they used. This is like FizzBuzz - it shouldn't stop with the first draft of the code - it's an opening for discussion.


"we know not" seems like an unfairly trapped way of phrasing that question; it's not entirely clear that you mean "we know !E(y,z)" rather than "we don't know anything about E(y,z)".


I would definitely reason intuitively to solve this question, rather than formally. When I see E(x,y) I paint a picture in my head putting x and y in the same equivalence class:

    =======
    | x y |
    =======
When I see !E(y,z) I put a picture in my head where z is outside of the equivalence class of y (and x):

    =======    =====
    | x y |    | z |
    =======    =====
Hence we can conclude that !E(x,z). So while this is a nice question, I don't think it achieves your goal of testing formal as opposed to intuitive reasoning.


You know about equivalence classes and know to apply them. You can reason about partitions rather than needing to drop down to the level of logical implications (higher-level abstractions for the win). If you remove the "think" part and become a little bit more assertive, your intuition becomes a proof.

You pass.


How often do you get people who apply but cannot pass that test? I always felt that CS gets a pretty large influx of people who are very fresh to programming (as isn't everyone!) but I would have thought a math fizz buzz would need to be harder to get signal.


Re: E(x,z) is false (i.e., !E(x,z)). Otherwise, since E(y,x), transitivity would give E(y,z), which is a contradiction.


I'm not a mathematician, but it looks to me like we know E(x,z) is false. The transitive property gives us

    E(x,y) & E(y,z) => E(x,z)
but since we know !E(y,z), we can infer !E(x,z).

(I'm not a logician, either, so I hope you'll excuse what may be imprecise phrasing on my part.)


Your result is correct but your reasoning is not sound. A & B => C and !B does not imply !C.


Well, I already admitted I'm not a logician: if A & B => C, (!A | !B) => !C looks implicit to me, and while this isn't the first time I've had my nose rubbed in the fact that it's not, I never could understand why not.


That's the same as [ A => B ] and therefore [ !A => !B ]. This can easily be shown to be an invalid inference by instantiating A and B appropriately.

  A = "It's Raining"
  B = "Water is falling from the sky"
The first is clearly true. If it's raining, then water is definitely falling from the sky. The second is clearly false. Just because water is falling from the sky, that does not necessarily mean that it's raining.

You can probably invent your own, somewhat more humorous examples.


Just because A & B get you C doesn't mean that having A and B is the only way to get to C.

So you can't conclude that because you don't have A or B you don't have C.

It's like saying "Going down the A20 and the M20 (roads in the UK) gets you to Rochester (I've no idea if it does). I've not been down the A20 or the M20, thus I'm not in Rochester." The second sentence there isn't really valid, maybe you took a different route.


If !E(y,z), then also !E(z,y) (by symmetry). Also, E(y,x).

Thus !E(x,z), because E(y,x) and E(x,z) would imply E(y,z), which is false.
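
That argument mechanizes directly; a minimal sketch in Lean 4 (the theorem name and binder style are mine):

    -- From E x y and ¬E y z, symmetry and transitivity force ¬E x z:
    -- if E x z held, symmetry would give E y x, and then transitivity
    -- would give E y z, a contradiction.
    theorem not_E_xz {α : Type} {E : α → α → Prop}
        (symm : ∀ {a b}, E a b → E b a)
        (trans : ∀ {a b c}, E a b → E b c → E a c)
        {x y z : α} (hxy : E x y) (hyz : ¬ E y z) : ¬ E x z :=
      fun hxz => hyz (trans (symm hxy) hxz)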


x is "near" y and y is near x. y is not near z, so z is not near y, and therefore z is not near x nor is x near z.


Bad example. "Near" is not an equivalence relation. Every step you make keeps you near where you were, but you can eventually get thousands of miles away.


I'm a marketer, and have yet to find a way to judge a marketer's competence better than:

"Here's this app, here's what it does. How would you increase downloads?"

A quick brainstorm session shows you how they approach the situation, what tricks they have in their bag, and what methods they feel confident using (or if they're just BSing).

Interestingly enough, 50% of the time people answer "SEO," which is among the worst answers you could possibly give. The best answers often involve the person playing with the app and figuring out something you could actually build into the app itself.


There's a difference between marketing and product design. A marketer should know how to get qualified leads for lowest cost to your app, and then it's your product manager who should worry about making the best impression with new users.


This is really a false dichotomy. Many apps can market themselves by 'baking in' sharing, and I don't mean just a 'tweet/like/pin' button on every other page.

These two functions are not separate.


> There's a difference between marketing and product design. A marketer should know how to get qualified leads for lowest cost to your app, and then it's your product manager who should worry about making the best impression with new users.

In my own experience the marketer's primary job is to identify how and why your end-users might use your product, i.e. understand who these qualified leads are to begin with.


> The best answers often involve the person playing with the app and figuring out something you could actually build into the app itself.

I'm clueless when it comes to marketing. Could you elaborate here with some examples? I think the majority of apps are in a place where nobody knows they even exist, so how could improving something in them improve downloads?


And then you ask them what they mean by SEO, and it gets worse. SEO is one of my favorite buzzphrases, because it causes ignorant marketers to rapidly self-select. (Unless their explanation shows otherwise.)


Would the answer of the inverse question be interesting? Taking our successful landing page, how would you reduce downloads?


That's an interesting thought, but I can't see how it would be, simply because of the sheer number of ways that would be possible. That's kind of like saying, "Here's this working program, how could you break it?"


- Sledgehammer to CPU. Torch the data-centre. Kidnap the users.

- Roll your own crypto. Claim unbreakable. Hold contest. Post on HN.

- Army of four-year olds.

- Invite NSA to design committee.


Remove the download link.


Marketing takes all sorts - there are plenty of people who do a great job by just doing SEO, or by just buying adwords, etc.


I'm an electrical engineer who has pared my skill list down to 'troubleshooting.' For my equivalent of a fizzbuzz, the exercise is to present someone with a potential failure (the light in the room is off) and ask them how they'd go about investigating the problem.

I like to see all kinds of answers, but in particular, more theories are better, especially when justified at least loosely. Things like "change the light bulb", "flick the switch", and "check the circuit breakers" are evidence that they have the right mind for breaking a situation out into many different causes.

edit: I absolutely loathe 'out of the box' as a meme, but I love hearing responses like "is anything else in the room turned on?" as it shows a simpler, more direct way of assessment.


I misread the last quote as "is anyone else in the room turned on" and wondered whether that's not a bit far out of the box ;)


That's not a question I've ever had cause to ask on either side of an interview, to say the least.


I'm a software developer, but the code I work on does a lot of DB access. I always ask a "two joins and an aggregate" question.

Given three tables:

    authors     books       sales
    ----------  ----------  ----------
    id          id          id
    name        author_id   book_id
                title       date
                            price
find the total sales by author in November 2013. The result set should have two columns: author's name and their total sales.


I would make a couple of assumptions:

- price could be 0.0 for a book

- Total sales means the sum, not the count

- Date format will be yyyymmdd (if not, assume that we converted in Oracle or Sybase)

    SELECT a.name, SUM(s.price) FROM authors a
    INNER JOIN books b ON (a.id=b.author_id)
    LEFT OUTER JOIN sales s ON (b.id = s.book_id)
    GROUP BY a.name
    HAVING (s.date >= '20131101' AND s.date <= '20131130')


In SQL, this would be an issue due to the date column not existing in the group by. Additionally, because of the having statement, the left outer join is being treated as an inner join with respect to the results given.

Something like this could work (without going for an alternate approach such as Unions or sub-selects):

    SELECT a.name, ISNULL(SUM(s.price), 0.0) FROM authors a
    INNER JOIN books b ON (a.id=b.author_id)
    LEFT OUTER JOIN sales s ON (b.id = s.book_id)
    GROUP BY a.name, s.date
    HAVING (ISNULL(s.date, '20131101') >= '20131101' AND ISNULL(s.date, '20131101') <= '20131130')
Edit: Actually might be better to keep the null in the results like the original query. It's definitely better to know whether a book sold 1000 copies for $0.00 each or if no copies were sold.


Instead of the Left Outer Join, couldn't you also make it an Inner Join to S.BOOK_ID? The question, if I read it correctly, only requires you to return sales in Nov 2013. If there is no data for a book_id, you want to exclude it from your returned data, no?


I would imagine it is up to interpretation. It would probably be a good thing to ask the interviewer about to show that you are thinking about edge cases.

If it were me, I'd prefer to see those that made no sales based on the question. I guess if I didn't want to see them, I would ask for "where sales are greater than $0.00".


s.date is not a group column and a.name seems like a bad PK


My solution:

    SELECT authors.name, SUM(sales.price) FROM authors
      INNER JOIN books ON authors.id = books.author_id
      INNER JOIN sales ON books.id = sales.book_id
      WHERE sales.date >= Nov 2013 AND sales.date < Dec 2013
      GROUP BY authors.name
Although a coworker pointed out that inner joins would drop authors who haven't written books, and books that have no sales. If you want to include all of those, then use LEFT OUTER JOIN instead of INNER JOINs.


I think the fact that all solutions posted so far are broken and even repeat bugs I already pointed out validates this as a useful FizzBuzz test :-)


By "total sales", do you mean number sold or revenue?


I wonder if that could be part of the assessment, to tell if the candidate is likely to verify uncertainties or make assumptions.


Now that you mention it, probably so.


I mean total revenue, but I leave it unspecified in the hope that someone will ask for clarification. Same with whether they should include authors with no sales during the time period.

I like this question because the base query, however you interpret the unclear parts, is easy, but it still allows candidates to impress me by asking about the "edge" cases. And even if they just make assumptions, I can still ask them to do the other versions to see if they really know what they're talking about. I'm always amazed at how many people can't convert an inner join to a left join, or can't change a sum() to a count().
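
For the curious, here is one reading of the question made concrete (total revenue per author, November 2013, zero-sale authors excluded), checked end-to-end with SQLite from Python. The schema follows the question; the data and names are invented:

    import sqlite3

    # In-memory database with the question's three tables; rows are made up.
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE books   (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
        CREATE TABLE sales   (id INTEGER PRIMARY KEY, book_id INTEGER, date TEXT, price REAL);
        INSERT INTO authors VALUES (1, 'Knuth'), (2, 'Okasaki');
        INSERT INTO books   VALUES (1, 1, 'TAOCP'), (2, 2, 'PFDS');
        INSERT INTO sales   VALUES (1, 1, '2013-11-05', 150.0),
                                   (2, 1, '2013-11-20', 150.0),
                                   (3, 2, '2013-12-01', 40.0);
    """)

    # Inner joins drop authors with no November sales; the half-open date
    # range avoids the end-of-month boundary problems seen in the HAVING
    # versions above.
    query = """
        SELECT a.name, SUM(s.price) AS total
          FROM authors a
          JOIN books b ON b.author_id = a.id
          JOIN sales s ON s.book_id = b.id
         WHERE s.date >= '2013-11-01' AND s.date < '2013-12-01'
         GROUP BY a.id, a.name
    """
    print(con.execute(query).fetchall())   # [('Knuth', 300.0)]

Switching the join on sales to a LEFT JOIN, with the date test moved into the ON clause, is the variant that keeps authors with no November sales.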


On/off producer + audio engineer. A good discriminator might be "it's squaring out; what do you do?". A good answer would lead to an argument about the compression war.


As a producer and non-native speaker, do you mean digital clipping?


How does it sound? You might want to leave it.

If you want to clean it up, drive the compressors/limiters less. ie. Turn the input trim down or turn compressor input gain down.

Another option is to open the compressors up. Take some (or all) of the compressors in the chain and halve their ratios.

A/B and compare.


A 10-20 dB cut in the lower freqs?

(Depends on what the source of the track is.)


As a chip tuner ... "What do you expect?" :)


I do grant writing for nonprofit and public agencies. [1] I usually ask for a couple of major comma rules. Sometimes for semi-colon rules or for a description of passive voice. Chances are good that anyone who is zero for three isn't very good.

A lot of writers and would-be writers also have blogs; it only takes one to three paragraphs to figure out who can sling a coherent sentence and who can't.

Like FizzBuzz, being able to sling a coherent sentence doesn't mean that one can write a coherent 50-page document, but those who can't write a coherent sentence can't write a coherent 50-page document.

[1] Try http://blog.seliger.com if you're curious.


How well does the ability to write well correlate with the ability to articulate rules for writing well?

I'd have thought there are a lot of people who can write any quantity of text with impeccably placed commas (and semicolons, when the required tone permits them), but would freeze and stutter and gabble if asked for "a couple of major comma rules".

I'd put myself among their number, if only because it's not clear to me what sort of thing you mean by "comma rules". Descriptions of some situations where it is appropriate to use commas? "Commas are commonly used to separate items in a list when there are more than two such items" or "commas may be used, like this, to surround interpolations -- but for all but the shortest such interpolations you should consider using parentheses or dashes, or restructuring the sentence"? Descriptions of situations where it's inappropriate to use commas? "If you have two things that can function as separate sentences, don't separate them only with a comma; use a full stop or, in some contexts, a semicolon"? Or higher-level fuzzier principles? "Stops of all kind are placed to correspond roughly with pauses in speech and boundaries of grammatical units. The comma is the 'smallest' of stops, and is generally not used where the pause would be long or where the break in grammatical structure is great." Or what?

(I suspect a cultural difference at work here: I think formal analysis of grammar and punctuation may be a bigger deal in the US than it is in the UK.)


How well does the ability to write well correlate with the ability to articulate rules for writing well?

The short answer is "pretty well." Usually I ask specifically for major comma rules, which I'd tend to define as:

* Connecting independent clauses with a conjunction

* As part of a list

* To offset a word or phrase at the start of a sentence

* Like parentheses (or, in the lingo, an appositive phrase)

If they say, "I have no idea," that's a very bad sign; if they ask questions like yours, that's a good sign.

A fun, short book called Write Right! by Jan Venolia is worth reading if you're interested.


Knowing the rules is necessary but not sufficient for an A+ writer.


It's arguable (though I'm not sure it's actually true) that being able to write perfectly, and consistently, correctly is necessary for "an A+ writer".

But that is not the same thing as "knowing the rules". A writer may be able to write perfectly correctly without being able to enunciate any rules accurately at all.

Perhaps you play one or more sports. Could you write down accurate rules telling you how to hold and move a tennis racquet, or exactly how to flex the relevant joints when kicking a soccer ball? Of course writing is more deliberative than tennis or soccer, but a good writer isn't thinking about grammar and punctuation much more than a good sports player is thinking about joint angles and muscle groups.

Also, of course, there isn't universal agreement about what is and isn't correct, nor about what the best set of rules is for describing what's correct. For instance, if you compare the famous "Comprehensive Grammar of the English Language" by Quirk and Greenbaum, and the more recent "Cambridge Grammar of the English Language" by Huddleston and Pullum, you'll find that their analyses are sometimes very different. Totally different rules, even though they're describing substantially the same language.


For a data scientist I ask "what's a p-value?" Both junior and senior candidates often get this wrong. The only difference is that senior people bristle at such a simple question before also getting it wrong.


If so many candidates get this wrong, especially senior ones, it's always possible that your own understanding is incomplete, or that the question isn't being posed clearly. I often find that people have these pet questions and think they understand them when they don't really. I'm not even a data scientist or statistician and could tell you what a p-value is. I find it hard to believe experienced candidates can't.


So what do you consider to be the/an appropriate answer to this question?

This seems like a case where "why do we use it" and "how do we use it correctly" are obvious, but the definition of what it actually is is a complexity we covered once in freshman statistics and never needed again.

You don't hire a telemarketer by asking them "what is a telephone" you hire them by making sure they know how to use one correctly.


Ok, now I'm confused, because I was under the impression that this is the first thing I ever learned in statistics - that a p value is how likely it is that a value created by chance would have been as extreme as the one the experiment gave. Is this wrong?


You have the right idea.

You're assuming an underlying model for the data. You have a test statistic (which estimates a model parameter) and you have a hypothesis regarding that parameter. The p-value is the probability that you get a test statistic more extreme than the one observed, assuming that your hypothesis is true.

Ex. You have a sample of 1000 men's heights. You compute the sample average height as 5'9 and a sample standard deviation of 3 inches.

(Unlikely) hypothesis: the average height is 4 feet. Your p-value is the probability of getting a sample average more extreme than 5'9 given that your 4 ft height hypothesis is true. Given that the sample standard deviation is 3 inches and 5'9 is 7 standard deviations from 4ft... the p-value is going to be small, so you'll reject that.

Note: I'm leaving out details and assumptions
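
A quick numeric illustration of that example, with synthetic data (the numbers and the scipy call are mine, not the parent's):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    heights = rng.normal(69, 3, size=1000)   # sample: mean ~5'9", sd ~3 inches

    # Two-sided one-sample t-test of H0: true mean height = 48 inches (4 ft).
    res = stats.ttest_1samp(heights, popmean=48)
    print(res.statistic, res.pvalue)   # huge statistic, p-value effectively 0

The p-value reported here is P(a statistic at least this extreme | H0 is true), which is vanishingly small, so the 4-foot hypothesis gets rejected.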


Yeah, that's what I was taught. So what are people answering instead?


People can easily be led to misinterpret p-values even if they can define them. Most often people assume that p-values indicate something about the correctness of a model or an inference. This is the classic p(d|h) vs. p(h|d) debate.


It's not wrong, but it did open up some great questions as to what it means to have values created by chance and how that relates to "this experiment".


I work in the life sciences. You would be amazed at what proportion of researchers have no conceptual understanding of what a p-value actually is.


For sysadmin/ops jobs, I want to see debugging ability in addition to programming skill.

I usually ask about HTTP and DNS (for web-focused) or Kerberos/AD logins (for user focused).

There are a lot of moving parts to either of these, so a question like: "You type a web address into a browser, it doesn't load" can turn into questions about the DNS resolver process, system and network config, split horizon, time setup, etc.

Basically, I want to know if they have a grasp of how things all go together, and how to test the individual parts that make up the whole.
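
A rough sketch of that layered approach in Python, working up from name resolution to transport to application; example.com is just a placeholder for the address that "doesn't load":

    import socket
    import urllib.request

    host = "example.com"   # stand-in for the failing address

    # 1. Does the name resolve at all?
    print([a[4][0] for a in socket.getaddrinfo(host, 443)])

    # 2. Can we open a TCP connection to the resolved host?
    with socket.create_connection((host, 443), timeout=5):
        print("TCP connect OK")

    # 3. Does a full HTTP request succeed end to end?
    print(urllib.request.urlopen("https://" + host, timeout=5).status)

Each step isolates a different layer, which is exactly the decomposition I want to hear a candidate talk through.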


More sysadmin / troubleshooter focused person here, and I love this sort of question. It also brings out people who will throw out things like "do a network capture". Then you either get the "well, send it to the network team!" answer or the (much more desirable) "look for X, Y, Z, or something weird in N...".


Are you saying "do a network capture" isn't desirable? It's not the first thing I would do, but I wouldn't get to Z before taking one.


No, I'm saying that it's used as a blowoff answer for something that sounds important because it's outside of what one in said job would normally do, but when asked about it (to see how generalist the person is) they wave it off as something for another person to handle.


These are questions I remember being asked. But they also asked what experience with servers I have and stuff..

    * At which layer things are in the OSI model.

    * How to reset AD user passwords.

    * What different kinds of servers are for.

    * How to allow access to port 25 in VLAN1 but not VLAN2 using port forwarding in the firewall.

    * What ways of securing a DMZ are available in emergencies.


> "You type a web address into a browser, it doesn't load"

1) Check your internet connection. 2) Check if the website is broadcasting (assuming you control it).

99% of all instances of this issue solved right there. :-)


Information security. Our most basic interview demonstration question at my job is to hand the candidate a couple pages of tcpdump output and ask them to describe what was going on.

We don't use tcpdump all that often as we have more advanced tools, but if you can't understand a pretty raw representation of network traffic, you're probably not qualified to administer those advanced tools.


I build web sites. I would ask that the person complete a series of tasks that involved using Javascript to do a bit a typical DOM manipulations and CSS changes, especially using transitions and whatnot...

...without jQuery.


Just curious, why do you not allow the interviewee to use jQuery? Do you expect the candidate to know JavaScript's DOM-level APIs without having to look them up?


Hang out on forums, #javascript, etc some time and see how many people have problems because their understanding of the DOM begins and ends with the jQuery API. I see it all the time. I've seen big-name sites bug out or slow to a crawl because some dev wasn't thinking about what was going on behind their $()'s.

The DOM API isn't very large/complicated, and it shouldn't be hard to MOSTLY memorize it. It isn't a "gotcha" question though; a good interviewer is not gonna ding the heck out of you if you forget bits of it (most web devs do spend most of their time with jQuery, YUI, etc. after all) - it's about whether you understand what's going on behind the scenes.


I don't mind the need to research DOM-level APIs on the fly. In fact, I'd be just as impressed that they not only know what they need to look up to refresh their memory, but where to look. Where and how they look might say things about them as well.

After all, I often have to look up a rarely used method. Seems unfair to judge harshly if they do so.


Native APIs are probably easier to learn, just more cumbersome to use :)


That's not really fizzbuzz, that's more the equivalent of asking a C programmer to write fizzbuzz in assembly (perhaps that's a little too strong a comparison, but you get the idea).

Fizzbuzz for websites would be to have someone code two paragraphs side by side with a header and footer in CSS.


If their primary task was to build HTML emails then I could understand that train of thought.

I suppose there are people out there that are working as front-end developers for websites that only work in HTML and CSS, but I would think that's getting more rare as time goes on. Of course, that could just be bias based on my own career.

But, going with your suggestion as a thought exercise, I would go in a different direction. To make a pattern like fizzbuzz I suppose I would do something more like: build a page with a series of paragraphs so that these CSS properties apply to every third paragraph, these CSS properties apply to every fifth paragraph, and so on. Something possible with CSS but requires the ability to think things through.


The point of fizzbuzz is that it is a ridiculously easy test that any programmer who says they "understand" a language should be able to do, because they would touch things like loops and string manipulation in pretty much every program they ever write.

Nobody is going to color code repeating paragraphs in CSS on every project. You wouldn't use that javascript on every project either. (Perhaps these days you would, I guess, but I'd argue it's overkill for basic websites, to be honest.) I can guarantee you're farting around with the box model though, and odds are you could be using a framework to do it for you. Which is precisely why I'd want to test it.


I believe I would say that knowing how to do fizzbuzz for the reasons you describe and knowing how to do what I described is much the same thing within their own area of expertise.

For what I would consider a basic website I'm doing a decent amount of CSS and JS work. But I suppose that could depend upon your definition of a basic website.


Hah. I once asked an intern some random JS question and he used JQuery for everything. So I then asked him what the $ meant. That was a very awkward 5 minutes of silence.


Follow-up question: "Should you?"


Why?


Because there are times when jQuery is not available to you for whatever reason. Most of the time the tasks I stated are not really that hard to do in vanilla Javascript; it just can be more verbose than doing it with jQuery.


I was an electrical engineer before I was a computer programmer and people liked asking questions about Ohm's law and calculating total resistance of resistors in series and parallel.

EDIT: I should clarify... They weren't asking about the theory behind Ohm's law but rather to apply it to various circuits.


Interesting - thanks. I was hoping for more results, but with this falling rapidly off the "newest" page, and with only one upvote, I guess we won't get (m)any more insights.

Edit: Last minute surge and we're on the front page! Amazing - thanks to those who voted - I'm waiting to see if we get any more answers ...


That feels more like asking the difference between a class and an object. It's not really asking someone to solve nearly the simplest possible problem to prove they can code.

Would the equivalent not be making a basic circuit to do something specific and hence a specific example that has one or two potential obvious issues?


Sorry, Ohm's law is not like a class or an object in any way at all.


He didn't mean literally like a class; it was an analogy: x is to y as z is to ...


Yes but an analogy needs to have some reasonable connection to the thing you are comparing it to for the analogy to make sense.


Engineer roles or Technician?


can you give an example? tia!


  A-->   +--[1 Ω]--+--[1 Ω]--+
         |         |         |
       [1 Ω]     [1 Ω]     [1 Ω]
         |         |         |
         +--[1 Ω]--+--[1 Ω]--+
         |         |         |
       [1 Ω]     [1 Ω]     [1 Ω]
         |         |         |
         +--[1 Ω]--+--[1 Ω]--+    <-- B
Find the resistance between A and B.
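
One way to check an answer (and a decent follow-up discussion) is nodal analysis: build the graph Laplacian with unit conductances and read the effective resistance off its pseudoinverse. A numpy sketch:

    import numpy as np

    n = 3                      # 3x3 grid of nodes; A = top-left, B = bottom-right
    edges = []
    for r in range(n):
        for c in range(n):
            if c + 1 < n:
                edges.append((n * r + c, n * r + c + 1))    # horizontal 1-ohm resistor
            if r + 1 < n:
                edges.append((n * r + c, n * (r + 1) + c))  # vertical 1-ohm resistor

    # Graph Laplacian with unit conductances.
    L = np.zeros((n * n, n * n))
    for i, j in edges:
        L[i, i] += 1; L[j, j] += 1
        L[i, j] -= 1; L[j, i] -= 1

    # Effective resistance R_AB = (e_A - e_B)^T L^+ (e_A - e_B).
    v = np.zeros(n * n)
    v[0], v[-1] = 1, -1
    print(v @ np.linalg.pinv(L) @ v)   # 1.5 ohms

The symmetry argument (the center node and the anti-diagonal corners sit at the midpoint potential) gets you the same 1.5 ohms by hand.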


When I teach calculus 1, I usually put a "high discrimination" question at the beginning of every exam. Student performance on that question is usually very well correlated with the exam grade. It helps me quickly assess class performance right at the start of the grading process.

Examples of such questions:

Draw the graph of a function f, continuous on the reals, such that: f(x) > 0 always, f'(x) < 0 always, and f''(x) always has the same sign as x.

The line y=3x-2 is tangent to the graph of y=f(x) at x=4. What is f'(4)?
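
For reference, the second one turns on a single fact: the slope of the tangent line at the point of tangency is the value of the derivative there. In symbols,

    f'(4) = 3, \qquad f(4) = 3 \cdot 4 - 2 = 10,

where f(4) comes along for free because the tangent line touches the graph at x = 4.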


Hi - vfx/games programmer here. I've never been given FizzBuzz, although similar problems (reverse a string, find a loop in a linked list, etc.) generally appear on any programming test I've seen. The better questions I've seen a couple of times now are:

"How do you reflect a vector about a normal? Name some uses."

"What is the dot product? Name some uses."

These aren't just filters but an experienced real-time rendering or physics programmer is going to have quite a lot to say about the practical implementations of either in particular situations and what hardware features are available to help you with them...

if there is a good theoretical understanding then they can talk a lot about what the dot product is, its relationship to matrix multiplication and tensor inner products in general. they can explain multiple approaches to the problem of reflecting the vector and derive their solutions from first principles instead of just repeating remembered knowledge.

there are lots of little domain specific bits of knowledge as well e.g. the reflection question can lead to discussion of the Blinn half angle approximation - why it works and why it's no longer a particularly valid optimisation on today's hardware - the dot product question can lead to discussion of how to efficiently handle solving a quadratic equation on the GPU
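
for concreteness, reflecting v about a unit normal n is r = v - 2(v.n)n; a minimal numpy sketch (the function name is mine):

    import numpy as np

    def reflect(v, n):
        """Reflect v about the plane with unit normal n: r = v - 2(v.n)n."""
        n = n / np.linalg.norm(n)          # guard against a non-unit normal
        return v - 2.0 * np.dot(v, n) * n

    print(reflect(np.array([1.0, -1.0]), np.array([0.0, 1.0])))   # [1. 1.]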

really i expect a decent (games/vfx) programmer to run out of time whilst elaborating on answers to either of these questions (although not to the detriment of the rest of the test of course...).

the best filter is a strong demo though imo. never met a bad programmer who has a decent demo... none of the average, lazy or bad programmers i've worked with have one that i couldn't have crapped out in the evening before an interview.


  A nation's newest war ship is slowly disappearing. The ship's hull
  is made of Aluminium while its cannon bores are made of Stainless Steel.
  What is happening, why, and how do you stop it?
Grad school chemistry/physics candidacy exam question, sure to generate colorful answers!


'Galvanic corrosion' and applying a non-metallic layer between the cannon/deck would be my first thought. Now I'm curious as to some of the more 'colorful answers'.


Yes indeed: two different metals in electrical contact through an electrolyte solution (sea water). One way to stop this is to use a sacrificial anode (where the oxidation occurs) made of Magnesium, although this would probably be impractical in salt water. A colorful answer would be to replace the water with a fluid like 3M's Fluorinert and replace the air with a nitrogen atmosphere. But that would also be impractical.


For news reporters, there's the AP Style test, which gives you text/headlines that you have to translate to the AP style, which is its own kind of grammar (e.g. for state abbreviations, Calif. instead of CA; always use [numerical-age]-[old] in describing ages: 15-year-old). It's a test more geared toward editors though.

In terms of reporting... there's no litmus test for how well you can report an event. The education beat, however, is pretty common for new reporters (it was my first), which is ironic considering how ridiculously complicated education is as a topic (same kind of financial/budgetary documents as any business, government bureaucracy, board politics, and of course, dealing with children).


Imaging science. If I were hiring another imaging scientist:

Write your own Fourier Transform. [If your FizzBuzz questions involve actually writing/coding the FizzBuzz, I'd not request actually writing it...]


I wonder how effective a test this would be. Are the algorithmic & implementation details of the Fourier transform that important to imaging science? I would imagine that explaining the properties of the FT, its connection to imaging modalities, etc. would be far more relevant than actually coding a DFT.

How would you weight a naive implementation of the DFT vs the FFT?
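
For scale, the naive O(N^2) DFT really is only a few lines, which seems worth keeping in mind when weighing it against an FFT. A sketch:

    import numpy as np

    def dft(x):
        """Naive O(N^2) DFT: X[k] = sum_n x[n] * exp(-2*pi*i*k*n/N)."""
        N = len(x)
        ns = np.arange(N)
        return np.array([np.sum(x * np.exp(-2j * np.pi * k * ns / N))
                         for k in range(N)])

    x = np.random.rand(64)
    assert np.allclose(dft(x), np.fft.fft(x))   # matches the library FFT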


>How would you weight a naive implementation of the DFT vs the FFT?

Luckily he didn't specify the dimensional requirements, so we can get away with f^[0] = f[0].


Where I work, we have a 4 question SQL test where we sit you in front of a dummy database, give you 4 printed results, and say "make the queries to produce exactly these results." Very very few people get all of them, but it gives us an idea of where somebody is at in their ability and knowledge.

The first is the easiest, and the second one isn't bad. The 3rd and 4th trip up most people though.


I do a very similar thing at my job, but instead of showing the results I ask for a report such as "we want to get X,Y and Z for the Ws where P holds true".


Can they look stuff up on the internet? A lot of queries require some ridiculous syntax like windowing in Postgres.


Yes, the full power of the internet is at your disposal. The catch is that you're limited to an hour to get it done.


I'd be curious to see that test.


Ditto!

I'd love something like [0]You can't javascript under pressure, but You can't Query under pressure for SQL. :-)

[0]http://games.usvsth3m.com/javascript-under-pressure/


It's a shame they don't have a leader board on that site; I completed the challenge in 4:21, but I have no idea whether that's good, bad, or indifferent among everyone who's done so. A Twitter search on 'I completed "You Can't JavaScript Under Pressure"' suggests that score is merely on the good side of indifferent, but given the relatively competitive idea behind the whole endeavor, it seems like formally ranking completion times would be a useful thing to do.

(The "you won" screen is also godawful, not least because of the potential risk of triggering seizures in the epileptic, but also very much because it's incredibly ugly. I mean, I know that tasteless is the new tasteful, but really.)


Think you might be interested in applying? :-)

We're hiring...


No thanks; I'm happy with the job I've got right now. But I appreciate you asking!

More seriously, shouldn't it be possible to post a version thoroughly enough sanitized to avoid giving the game away to your potential applicants, without losing the sense of the original? You don't identify your organization in your HN profile, so it doesn't seem terribly likely anyone who's looking for hints on your hiring process will find it. And, hey, it's entirely possible you'll get feedback which helps improve the question you're actually using in your interviews!


Jumping onto this chain, I'd also like to take a stab at a sanitized version of your test. We often have to reverse engineer our clients' existing reports when creating new BI reports for them in Cognos, and I'm curious to see how challenging your 3rd and 4th problem sets are.


If you're doing BI reports, I'd guess you would probably do well.


I might see if I can come up with something approximating the test... The tricky part about making it so that people online could take it is that I'd need to give you an avenue to play around in a query editor and run queries on a database I've created. Short of making a publicly available database on my own domain (which uses a different database platform than what we use at work), is there a good way to accomplish this?


Looking at it from another angle, I'm considering the same question, because I'd like to build the "You Can't SQL Under Pressure" somebody else mentioned in this thread. But I'm not all that happy about the idea of letting random users run arbitrary queries against a database I'm hosting, because it seems like protecting that against SQL injection and similar vulnerabilities would be extremely difficult if not entirely impossible.

I'd use the Web SQL Database stuff that W3C was working on, except that they stopped doing that in 2010, which guarantees the corpse is dead and rotting; unfortunately, the only emulation libraries I've been able to find implement various key-value stores on top of WebSQL, which is the precise inverse of what I'd need.

I suppose it'd be theoretically feasible, albeit hideously ugly, to create and populate a fresh database and account per user on a MySQL engine; with careful grants and a bit of scripting to kill slow queries before they can multiply and throttle the server, that might be workable on the small to medium scale, although I don't doubt a "Show HN" post would kill it in short order.

Thinking about it, I suspect I've more or less recapitulated why there is currently no "You Can't SQL Under Pressure"; it's a more subtle problem than it seems at first blush. If you come up with any clever ideas on the subject, I'd love to hear about them!


You might be overthinking it; for my purposes, at least, a set of example schemas and required outputs would suffice -- the rest I can probably do on a MySQL instance of my own.


I'm always interested in applying :-)


Send me your resume and I'll make sure it gets into the right hands. For the purposes of this comment thread, my email address is heresmyresume@viewthesource.org

(I catch anything sent to viewthesource.org, so I like to customize it when giving my email to people)

It should be noted that we're in Bentonville, AR. That could potentially be a limiting factor for some people.


No kidding? I know at least one extremely large company which has its headquarters in that city, and I have recently been hearing good things about the modernity of their IT infrastructure.


Yep, I worked at the company you're probably speaking about for 9 years (I say probably, because there are two...Walmart and JB Hunt). I just left Walmart 3 months ago, largely because I was ready for something new, and the place I work now is just a lot more cool. Big change for me, going from giant IT building to a company that has 25 people total. They had less than 20 when I applied and 9 people a year ago.


Well, I've never heard anything about JB Hunt's IT, for good or ill, so...

That must be a weird change, yeah, but it sounds good on the whole. It's a shame the timing didn't work out on this; if I still lived in Memphis, I'd have jumped at the opportunity. (I didn't like living in Memphis very well -- Bentonville seems like it'd be much more congenial.)


I personally like Bentonville pretty well. People from larger cities often complain "there's not enough to do" but I'm a bit of a home-body myself, so that doesn't bother me. One nice thing about this area is you can get a house with 5 acres of land for between $100k and $200k that would be comparable to houses easily twice or three times as much in someplace like Seattle.

I know a little bit about JB Hunt's IT. They use COBOL and Java. They might have some other stuff, but they seem big on those two from what I've heard.


Depends on telecommute options :P


We have a guy who works from home.


My Math "FizzBuzz" question is: prove Fermat's Little Theorem and Wilson's Theorem. Even those with very rudimentary understanding of modular arithmetic (coming from non-math backgrounds) can prove both with some pointers (obviously the process is interactive)


Interesting. I think the proof of Wilson's theorem is easy enough to be accessible for a non-mathematician - for non-prime n you can find a pair of numbers in the product (n-1)! whose product is congruent to 0 (mod n), and if n is prime then you pair up numbers in the product with their inverses until you are left with ±1.

But I can't think of a similarly low-level proof of Fermat's Little Theorem. Is there an obvious one I'm missing?


Consider Q to be the product (p-1)!. Then consider the product a.2a.3a.4a...(p-1)a. Mod p, these are the same factors rearranged, because a is invertible, so multiplying by a permutes the nonzero residues.

But the product is also a^(p-1).(p-1)! = a^(p-1).Q by commutativity. Cancelling Q, which is invertible mod p, gives a^(p-1)=1 (mod p).

There's a clever idea in there: to consider prod(ka) for fixed a and k=1..(p-1). Not finding the clever idea means you won't prove it. Finding the clever idea is tricky. Once you know the clever idea, the proof is trivial.
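
Written out, for p prime and a not divisible by p:

    a^{p-1} (p-1)! = \prod_{k=1}^{p-1} ka \equiv \prod_{k=1}^{p-1} k = (p-1)! \pmod{p},

and cancelling (p-1)!, which is invertible mod p, leaves a^(p-1) = 1 (mod p).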


Applied math/scientific computing:

If hiring someone, I would describe some practical scenario problems involving the solution of finite-dimensional linear systems, and ask how they would suggest solving them in each circumstance (LU, QR, SVD, iterative methods, etc.). More discussion-oriented than rote recital of algorithm structure, probably.

I would do something similar for functional characterization/pattern extraction via FFT/DCT/SVD/wavelets/ICA/etc.

Finally, I would construct some problems involving appropriate selection of statistical inference models and associated tests of hypothesis (from the usual Fisher and Neyman-Pearson perspectives). The basics of Bayesian inference are good to know as well.


Interesting - this is exactly the kind of answer I'm looking to learn from. Are these really so utterly fundamental that someone could do them under pressure without thinking?

Is it the case that someone who can't answer these near automatically is, in effect, incompetent (in this field)?

I'm wondering if you've set the bar higher than FizzBuzz is in computing, or if this really is equivalent.

Thanks!


I would agree with the op here. SVD in particular is so fundamental to the idea of data analysis I couldn't imagine talking to someone who even flinched at hearing it. Most other things fit at roughly the same level in my mind.

I feel like the mechanics of test selection biases against people who've specialized in Bayesian stuff—which might be very interesting! It's definitely a make-or-break kind of thing as to whether you can correctly formulate all the various moving parts and relate them correctly.


For this list, I'm basically assuming that I'm interviewing a candidate for a Ph.D. level (or equivalent) numerical analyst R&D type position. If they don't feel comfortable discussing the basics of these items, then they probably can't do too much more for us than algorithm plug&chug. Which has its value, but not what I had in mind.

That said, I also wouldn't expect someone to write out implementations on a whiteboard under pressure (I'm not a Google recruiter). I'd be more interested in prodding their brain to gauge their general level of understanding, which is (I'm pretty sure) how hiring committees for mathematicians in academia operate. If they have a good foundation, I think it's less important that they've rote memorized implementation details.


Do you have any example problems to share? I've taken a numerical analysis class and these interview-like questions are appealing.


These are rough guidelines based on current best-practices that I know of, and shouldn't be treated as doctrine obviously. Numerical analysis/linear algebra is actually a pretty fast evolving field as far as applied math goes. Though statistics is a bit less dynamic at the moment, I'd say.

Honestly, after a certain amount of time, I'd expect a new hire would be able to teach me what is state-of-the-art in the field based on new literature.

But the questions I'd have in mind would be couched like this:

Linear system:

    - Is it small, square, and numerically well-conditioned? Use LU - it's pretty fast to write and pretty fast to use in practice.

    - Is it small, but rectangular (ie. overdetermined), or not as well conditioned? QR is a good choice.

    - Is it small, but terribly conditioned, or do you want to do rank revealing, or low-rank approximation while you're at it? Will you be using this matrix to solve many problems (multiple right-hand sides)? SVD would fit the bill

   - Is it large, or sparse, or implicitly-defined (ie. you don't actually have access to the elements of the matrix defining the system - you just have a surrogate function that gives you vectors in its range, or something)? Use an iterative algorithm. Krylov subspace methods (MINRES, GMRES, conjugate gradient, etc) are your friends here.
Pattern matching (more specific in question formulation):

    - If you wanted to determine the "strength" of a waveform (in a finite uniform sampling of data) that is re-occurring in a fairly regular way (like arterial pulse in an array of data taken from an oximeter), what type of transformation would you use, and how would you use the resulting information given in transform domain?

   - What if you wanted to determine the strength of a waveform that is short-lived/impulsive in nature, but re-occurs without any known periodicities (eg. eye-blink artifacts in a sample of EEG data)?

   - How would your answers to the above questions change if there are n separate channels of data collected simultaneously (ie. sampled in different locations), which may be analyzed together?
Statistical analysis (might seem vague, I'd be more interested in good discussion with a candidate here than in actual whiteboard writing):

    - What does statistical significance mean, in the context of decision making? Is it a property of the test you perform, or is it a property of your data? (sort of a trick question, this basically rehashes the Fisher vs. Neyman & Pearson debates of the 20th century stats community)

   - Some stock problems of when to use z, t, f tests. Basically, you use them when your situation matches the appropriate inference models (two normally-distributed sample comparison of means with identical variances? t test.)

   - How do you construct an optimal test from scratch, if one doesn't already exist for your particular situation? (Basically, if minimizing type II error with a fixed type I error fits your problem, can you use the Neyman-Pearson lemma to do the likelihood ratio construction correctly?)

   - What does a p-value actually mean? What if you instead wanted to have actual probabilities for your hypotheses, or you had apriori information that you wanted to use? (Bayesian inference is the winner here)

   - Probably something from point estimation: least squares, minimax criterion, Bayesian MAP estimates, general model fitting, that sort of thing. Brings it all back around to numerical (where I'm most comfortable). Like how applying statistical ridge regression is just knowing how to code up Tikhonov regularization, when you get down to implementation.
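
A minimal sketch of that first set of choices in Python/SciPy (sizes and data here are invented; the point is matching structure to solver):

    import numpy as np
    from scipy.linalg import lu_factor, lu_solve, lstsq
    from scipy.sparse import diags
    from scipy.sparse.linalg import cg

    rng = np.random.default_rng(0)

    # Small, square, well-conditioned: factor once with LU, reuse the
    # factorization for many right-hand sides.
    A = rng.standard_normal((50, 50)) + 50 * np.eye(50)
    lu, piv = lu_factor(A)
    x = lu_solve((lu, piv), rng.standard_normal(50))

    # Small but overdetermined: QR-based least squares.
    B = rng.standard_normal((200, 20))
    y, *_ = lstsq(B, rng.standard_normal(200))

    # Large, sparse, symmetric positive definite: an iterative Krylov
    # method (conjugate gradient) that never forms a dense matrix.
    n = 10_000
    T = diags([-1, 2, -1], [-1, 0, 1], shape=(n, n))   # 1-D Laplacian stencil
    z, info = cg(T, np.ones(n))
    assert info == 0   # 0 means converged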


Thanks for such extended reply!


My pet problem after FizzBuzz has been this for some time now:

    You are given a text in a file. Count all the words in the
    text and print out the words with their number of occurrences
    in descending order.
This can be depressing at times.
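
For reference, a minimal sketch of the expected shape of an answer in Python, treating a word as a whitespace-separated token; "input.txt" is a stand-in filename:

    from collections import Counter

    with open("input.txt") as f:
        counts = Counter(f.read().split())

    # most_common() already sorts by descending count.
    for word, count in counts.most_common():
        print(count, word)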


Define "word". However, strings separated by space:

Print number of words:

    wc -w <filename>
Print words with counts in decreasing order:

    cat <filename>  \
    | gawk '{for (i=1;i<=NF;i++) print $i}' \
    | sort          \
    | uniq -c       \
    | sort -rn      \
    | less
There are optimizations to be made, but that would be my go to solution in the first instance.


Quality Assurance (QA) engineer. I do QA Tools specifically but still need to be on top of most QA stuff. Other QA engineers exist in the company for in depth QA.

My initial question starts with: "build a function that takes a string and produces an integer". About 30% of candidates fail to get anywhere. 50% produce an incorrect implementation. 20% produce an implementation that is mostly correct. 9% produce an implementation that is mostly correct and well tested. 1% actually produce a complete and correct response. Which, as of right now, translates to exactly 1 candidate in 2 years. Who was rejected due to culture fit. A complete and correct answer is not required to pass the screen. ;-)

The reason this question is so effective is due to how the question is asked and the details of the solution. There is a lot more to it than just the initial problem statement. The full script assesses: the candidate's ability to determine requirements; ability to produce a complete set of unit tests; ability to handle boundary conditions.

There is a sharp division that occurs between QA engineers and non-QA engineers on the requirements gathering. All but 1 non-QA engineer I've interviewed assumed they had all the requirements before writing code. A clever solution is almost useless if it solves the wrong problem. Sometimes even worse than useless if the code adds a maintenance cost for no value.

Covering all the critical cases in testing is the next part that most candidates fail at. Even if the requirements they assumed are correct, none of the non-QA engineers have succeeded in writing a complete and good set of tests. Many add too many tests.

After those two the other parts are just seeing how close they can get to building the correct implementation. The candidates typically pass if they do well in all parts; perfection is not required. Still, I think it's interesting to note that almost nobody has ever written a 100% correct solution. Even if they are given all the requirements in detail.

This question does not assess algorithm design ability. At all, really. Well, I suppose there was somebody who immediately proposed "binary search!" and really wanted to stick with that idea. No clue why...


This would match your specification.

    def string_to_int(str):
        return 1


That would match the initial specification. The point is for the candidate to note that the initial specification is too ambiguous and determine what they do about it.

If they gave that answer and then said "which is why the specification is weak" then great! If they gave that answer and called it done... well... not so good.

