Hacker News
Fizz Buzz in Tensorflow (joelgrus.com)
378 points by joelgrus on May 23, 2016 | 84 comments

This whole thing is making fun of asking fizz-buzz of a senior programmer, and yet I've found it to be one of the best phone screen questions possible.

For someone who is truly a senior programmer, they knock it out in about 30 seconds and we move on. For the ones who pretend to be senior programmers on their resume, it trips them up and I know right away that their resume is either a pack of lies or their previous coworkers were helping them a lot more than they let on.

If I had actually had this person, I would have probably laughed as soon as he did the imports, said, "clearly you think this is a silly question" and then explained why I ask that question. Hopefully we could have moved on from there.

It's just a jape, not to be taken seriously, made in the great comic tradition of FizzBuzz Enterprise Edition: https://github.com/EnterpriseQualityCoding/FizzBuzzEnterpris...

Joking aside, this example does a great job of showing one of my complaints with common (edit: enterprise) Java practices - where is the logic?

These are common enterprise OO problems. They are neither confined to Java nor representative of Java as used outside enterprise development.

> nor representative of Java as used outside enterprise development.

It isn't? I've turned down a couple of sweet gigs, at times when I really needed the comfort of work-group structure and maybe even a pay check, because I assumed (based on my reading) that it was all like that. I knew from history, back in the days when it was called Oak, that it was a pretty delightful language. Then WS-* and Enterprise OO happened, and I assumed that's where it all went...

I write NLP software in Java and we don't have any of that crap. We have had a couple devs pass through from that world, but we have largely avoided any of the architecture astronaut crap that plagues enterprise development.

It's a shame that the name "Java" has become synonymous with it, since it really isn't a bad language (as of Java 7, at least).

Is Java 8 even worse, though? It supposedly introduces several mechanisms to reduce redundancy.


Senior programmer? ...

Shouldn't a junior programmer be able to solve fizzbuzz?

I'm light years away from being senior, but I can write functions like fizzbuzz in 10 seconds and can't find a junior job

> Shouldn't a junior programmer be able to solve fizz buzz?

Yes, which is why, when I would ask it of someone who calls themselves a senior programmer, I was dumbfounded at how many could not answer the question.

FWIW, please be mindful of your biases. I recently had an interview with a major bank conglomerate that involved some silly puzzle. I outlined a system for solving it in algorithmic notation, and the interviewer had no idea that such a response was appropriate and said sorry, they were looking for someone with programming experience; nothing specific, just "many languages"...

Not saying you do this, but what if someone provided Fizz Buzz in Coq or Elixir, or some kind of rule-based solver? Honestly, a lot of that might look like gibberish to me if I were just looking for a way to dismiss people. Or what if they had never heard of Fizz Buzz? My brain starts to fixate on all of these corner-case scenarios in hiring where you're screwed, because something "easy" never does solve the hard problem of hiring. What if the person couldn't do it because they think you're trying to trick them? Anyway... best of luck in your hiring.

I hate puzzles too, especially the gotcha ones with esoteric knowledge required.

But for FizzBuzz I explain the problem, explain that there is no trick and the simplest answer that produces the correct output will be fine. To solve it you need to know about building a loop, what the mod operator does, and maybe keeping state depending on how you build it. I tell them to write it in the language they know best. None of those things should be "gotchas" in your favorite language.
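For calibration, the "simplest answer that produces the correct output" being asked for here is just a loop plus the mod operator. A minimal sketch in Python:

```python
def fizzbuzz(i):
    # Simplest correct mapping: check 15 first, then 5, then 3.
    if i % 15 == 0:
        return "FizzBuzz"
    if i % 5 == 0:
        return "Buzz"
    if i % 3 == 0:
        return "Fizz"
    return str(i)

for i in range(1, 101):
    print(fizzbuzz(i))
```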

You don't need mod, loops or recursion to solve it. For instance:

    local remove = table.remove
    local insert = table.insert
    local print  = print
    local sequence = {
        false  , false  , 'Fizz' , false  , 'Buzz' ,
        'Fizz' , false  , false  , 'Fizz' , 'Buzz' ,
        false  , 'Fizz' , false  , false  , 'FizzBuzz'
    }
    local function o(v)
      local name = remove(sequence,1)
      insert(sequence,name)  -- recycle the pattern for the next run of 15
      print(name or v)
      return v + 1
    end
    local function t(v)
      return o(o(o(o(o(o(o(o(o(o(v))))))))))
    end
    local function h(v)
      return t(t(t(t(t(t(t(t(t(t(v))))))))))
    end
    h(1)
It does rely upon state (the sequence table) but even that could probably be worked around if Lua had a slice syntax.

You're right, it doesn't (clever solution BTW).

Although we could debate that the way you call o and t is just you manually expanding a loop. In which case I would question why you didn't just use a loop. :)

I could also debate with you that you had to at least understand what mod is to build the sequence.

I would also give you bonus points for detecting that it is a repeating pattern.

But the point is, you solved it, you wrote code that could solve it. So now I know that you're at least capable of writing basic code, and we've got a couple of things to talk about!

My team isn't exactly doing "top-tier CS work", so if someone "showed off" how far above the average they are with a Coq solution or the quirky GP you replied to or the OP's neural net I'd be delighted and want to skip the rest of the interview, recommending an immediate hire (or at least a go-to-next-stage since I only give these questions over phone screens). The question is beneath them and they have demonstrated that, our team's work is also probably beneath them. We just use that question to filter out the totally unqualified. Their potential value is high because they are overqualified -- they didn't just pass it, they passed it with style. Most dev work at large companies is dull and doesn't require much knowledge that e.g. a CS degree supposedly gives, which is part of why it's amusing/frustrating we have such a bad whiteboard hazing culture insisting candidates can invert a tree or whatever.

I head the "what if they do it this other weird way?" question off by giving them a simple problem, having them solve it (or at least convince me their solution solves it), and then asking them to solve it a different way. My screens mostly follow the ideas from https://sites.google.com/site/steveyegge2/five-essential-pho...

I allocate about 10 minutes max for "basic coding" where I just ask them (lately) to write a function that returns true if the given input int (which is assumed to be > 0) is even and false if odd. Then I ask them to do it in different ways. If they use bit-and, I figure they know enough about the "bits and bytes" section too for my team's purposes. Last time someone couldn't recall bit-and but surprised me with a way I didn't consider: convert to string and check the last character. I'm sure there are other ways to do it I haven't considered either. Maybe someone will give me a solution with TensorFlow someday, but what would be really impressive is busting that out within a few minutes. :)

All I'm really after is "can you write code?" If the answer is no, then very little time was wasted; if the answer is yes, we can go into variations and other questions to rank the other yeses / find other red flags.
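The even/odd variants mentioned (mod, bit-and, and the last-character trick) can be sketched in a few lines; this is a hypothetical illustration, not the exact phrasing used in the screens:

```python
def is_even_mod(n):
    # The obvious approach: remainder on division by 2.
    return n % 2 == 0

def is_even_bitand(n):
    # Bit trick: the lowest bit of an even number is 0.
    return n & 1 == 0

def is_even_lastchar(n):
    # The candidate's string approach: check the last decimal digit.
    return str(n)[-1] in "02468"
```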

No loops, no sequence, one horrific line. :)

    Console.Write(String.Join("\r\n",  Enumerable.Range(1, 100).Select(x => x % 15 == 0 ? "FizzBuzz" : x % 5 == 0 ? "Buzz" : x % 3 == 0 ? "Fizz" : x.ToString())));

There's a hidden loop in Enumerable.Range(), and quite possibly another one in String.Join(). Just because you avoided an explicit loop doesn't mean there isn't one (for example, an "if" statement has an implicit GOTO).

Ternaries aren't super readable.

  Array.from(Array(100).keys()).map(k => k + 1).map(i => !(i % 15) && 'fizzbuzz' || !(i % 5) && 'buzz' || !(i % 3) && 'fizz' || i)

No need for ternaries!

    Array.from(Array(100).keys()).map(k => k + 1).map(i => [i,'Fizz','Buzz','FizzBuzz'][!(i%3)+2*!(i%5)])

One of my friends failed an interview out of college where they asked him fizz buzz, and [using python] he created an array [list(range(1,101))] and then overwrote multiples of 3 with "fizz" multiples of 5 with "buzz" and multiples of 15 with "fizzbuzz."

Because it was an unconventional solution, the interviewer didn't like it.
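The overwrite approach described above can be written in a few lines of Python (a sketch of the idea, not his exact code):

```python
# Build the numbers, then overwrite multiples in place. Overwriting the
# multiples of 15 last makes the "fizzbuzz" entries come out right.
out = list(range(1, 101))
for i in range(2, 100, 3):    # indices holding multiples of 3 (value is i + 1)
    out[i] = "fizz"
for i in range(4, 100, 5):    # indices holding multiples of 5
    out[i] = "buzz"
for i in range(14, 100, 15):  # indices holding multiples of 15
    out[i] = "fizzbuzz"
for item in out:
    print(item)
```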

That's exactly how I'd do it! What's supposed to be the better way? Modulo/division by three and five?

    for x in xrange(1,101):
        if not x % 3:
            print "Fizz",
        if not x % 5:
            print "Buzz",
        if (x % 3) and (x % 5):
            print x,
        print ""

>Modulo/division by three and five?

And fifteen.

  var a = [], fizz = 0, buzz = 0;
  for(var i = 1; i <= 100; i++) {
    a[fizz += 3] = "fizz" + (a[fizz] || "");
    a[buzz += 5] = "buzz";
    print(a[i] || i);
  }

If someone answered FizzBuzz in Coq, I'd question their judgement. On the other hand, usually in situations like that I'd ask the interviewee to explain their solution. If they can explain it well, that's fine.

> what if someone provided Fizz Buzz in Coq or Elixir

[German] How would it be for you if I answered in a language you probably don't understand?

[Serbian] Depends. If you told me I can use any language, then it's on you to find a translation ;)

What are the odds my wife's language would show up in an HN comment?

[French, then Bulgarian] Help! I understood one and a half of these replies. Is it time for Fizz-Buzz?

[Arabic] You can always use Google Translate.

Last year I interviewed about 5 or 6 senior developers for a .Net role. All took 10 minutes or more to complete the fizzbuzz exercise. One couldn't remember the modulo operator. One of the fastest candidates handed me his solution and it didn't compile (this guy's CV also listed all of the latest and greatest frameworks in his experience).

Fine, I accept that there's performance anxiety in interview situations, but if I'm trying out for the Broncos I can't blame nervousness when I can't kick the ball.

10 seconds with good test coverage is quite nice.

I do warm-up problems like this even when starting hour long code interviews. I don't like FizzBuzz though, I like to use same difficulty questions that can be expanded onto other tasks, a point-in-rect sample question can turn into a real bounding box question.
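A point-in-rect warm-up of the kind mentioned might look like this (hypothetical function names; the follow-up shows how the same idea grows into a bounding-box question):

```python
def point_in_rect(px, py, x0, y0, x1, y1):
    # True if (px, py) lies inside the axis-aligned rectangle with
    # corners (x0, y0) and (x1, y1); points on the edge count as inside.
    return min(x0, x1) <= px <= max(x0, x1) and min(y0, y1) <= py <= max(y0, y1)

def bounding_box(points):
    # The natural follow-up: the smallest axis-aligned rectangle
    # containing all the given (x, y) points.
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))
```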

The thing is, there are a lot of people whose ability to code can be verified pretty easily without wasting the time of multiple people in your company. So why not do that?

interviewer: How far are you intending to take this? me: Oh, just two layers deep


I would have given him the job

If that were a real situation, I wouldn't have; he over-engineered the shit out of it, and while it's cool, I prefer devs who can solve simple problems simply. They get more work done for the price.

This is not over-engineering. Defining a couple of functions to solve it would be over-engineering, like offloading Fizz to its own function, Buzz to another one, etc. Anything (deliberately) more is just comedy.

Training a neural network to solve fizzbuzz is over-engineering: fun over-engineering, but over-engineering nonetheless.

Depends on the position no? If you are looking for a developer sure. But if you are looking for an engineer, creative solutions are a plus.

Sometimes it is better to hire for culture: someone who is intelligent but also, and this is critical, can take a joke.

Yeah, I suppose the guy wouldn't feel good in the company anyway. Personally, I wouldn't either. Sense of humour is an essential part of good dev environment, IMO.

Which is why I said if this was for real.

I prefer employers that don't treat me like a baby, so to each their own ;)

Well, when you give a hilariously bad interview question...

It's not a bad interview question; if you think it is, you perhaps haven't had enough experience with candidates applying to programming jobs who talk a good talk but can't program to save their lives.

Well, technically, as a programmer and not an entrepreneur, you shouldn't be expected to have "experience with candidates applying to programming jobs who talk a good talk but can't program to save their lives".

No one said programmers were expected to; however, not having that experience makes any judgement of what is or isn't a bad question baseless. FizzBuzz and all such trivial code tests are fantastic interview questions because most applicants can't program. That is exactly the point and purpose of FizzBuzz: weeding out liars, of which there are many due to the high salaries in comparison to other fields.

Yeah, hopefully this kind of question is handled in a phone screen before bringing them in.

This is the most off the wall and hilarious thing I've seen in a month. For calibration: I spent the whole weekend at Maker Faire.

No wonder you didn't get the job, using a shallow ANN when they were probably looking for a pointer network.

I can't tell if this guy did this during a real interview, or if that's part of the joke. Maybe that's intentional. Very funny, either way.

It reminds me of the paper where they use an SVM to "visually identify" the matrix rank:


He should have used a Recurrent Neural Network with Long Short-Term Memory so that the network wouldn't have been dependent on the maximum number of digits (NUM_DIGITS). Scalability! No wonder he didn't get the job.

Joke aside, it's funny how simple machine learning problems can reveal people who think you can just give neural networks anything and output anything, and it will work like magic.

It also reveals something that's not obvious, but is very clear to anyone who's tried (deliberately or accidentally): Teaching a neural network simply to count is surprisingly hard.

> people who think you can just give neural networks anything and output anything, and it will work like magic.

What _is_ the proper method?

Well, there is a variety of possible answers here. Giving the direct integer ("1", "23") to a simple deep network won't work.

- An RNN with LSTM might be a better approach, since it would rely more on the values of the ordered bits than on their position, and could scale to arbitrarily high numbers.

- @zardo mentioned Pointer Networks (https://arxiv.org/abs/1506.03134). They look like they solve this kind of problem, but honestly I'm just discovering them now.

- Giving the base-3 and base-5 representations of the input would be the fastest solution (but since that hardcodes part of the solution, it's more of a trick than a general answer).

Can anyone correct me if I got something wrong here? Did I miss something?
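The "hardcoded features" option in the list above can be sketched concretely: feed the network one-hot residues mod 3 and mod 5 instead of raw bits. This is a hypothetical encoding, not code from the post:

```python
def fizzbuzz_features(i):
    # One-hot encode i mod 3 (slots 0-2) and i mod 5 (slots 3-7):
    # 8 features total. With these features the Fizz/Buzz/FizzBuzz
    # classes are linearly separable, so even a tiny network learns
    # them easily -- but the hard part (the modular arithmetic) has
    # been done in the encoding, not by the network.
    feat = [0] * 8
    feat[i % 3] = 1
    feat[3 + i % 5] = 1
    return feat
```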

Not that I advocate this approach, but another way to look at FizzBuzz is as a personality test.

If a (senior) developer balks at writing FizzBuzz, perhaps it shows arrogance and a lack of humility. It could also be useful to see if the developer follows up to inquire as to why they were asked a FizzBuzz question, and about its relevance to the work. I see this kind of questioning as not only fair game, but good to ask about. Not asking about something that seems out of place could seem off.

Maybe the one-hot binary encoding wasn't the best feature set. Base 3 and Base 5 encodings might have worked better.

Stupid question: we are trying to train the network to calculate the modulus of a division, on a binary encoding. Why should we expect this to be learnable in only two layers?

Yeah, secretly I was surprised that it worked as well as it did!

What architecture do you need to get 100% accuracy?

Don't forget to use a validation set for the model and hyperparameter selection though!

You could encode the input with one-hot decimal digits. Then checking divisibility by 5 is just seeing whether input 0 or 5 of the least significant digit is high. Divisibility by 3 requires checking whether the digit sum is divisible by 3, which you could easily do by connecting a neuron to all inputs, weighing the inputs proportional to the digit they represent and activating if the sum is divisible by 3.

For the three-digit numbers from 0 to 999 this would require 30 inputs, one neuron to check divisibility by 5, twenty neurons to check divisibility by 3 and four neurons in the output layer to create the desired outputs. You could reduce the number of neurons a bit if you added another layer and did the divisibility by 3 in one additional step; I think twelve would do if you subtracted 10 or 20 with four neurons and just checked for 0, 3, 6 and 9 with eight neurons.

You could also choose the input encoding to be one-hot base-15 digits, which would make the net trivial and offload the hard part into the input encoding. Whether you could easily arrive at that design by training is of course a whole different question.
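The divisibility-by-5 part of that hand-built design is easy to write down explicitly. A sketch with hand-set weights and a hard threshold (not a trained network, and only the easy neuron):

```python
import numpy as np

def one_hot_digits(n):
    # 30 inputs: one-hot encodings of the three decimal digits of n (0-999),
    # least significant digit first (indices 0-9, 10-19, 20-29).
    x = np.zeros(30)
    for pos in range(3):
        x[pos * 10 + (n // 10 ** pos) % 10] = 1
    return x

# One neuron with fixed weights: fire iff the least significant digit
# is 0 or 5, i.e. iff n is divisible by 5.
w_div5 = np.zeros(30)
w_div5[0] = 1.0
w_div5[5] = 1.0

def divisible_by_5(n):
    return one_hot_digits(n) @ w_div5 >= 1.0
```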

I suspect that there is no way to predict in advance the smallest model which can represent any given function. However, we should be able to at least develop an intuition for depth and breadth that is sufficient on any given problem. Classically, division circuits are large and complex, so I naively expect a NN built out of add, mul, and Heaviside(x)*x to also be large and complex.

As with many models, I suspect that this network really learned some other property that has almost-but-not-quite the same pattern as divisible-by-N.

Pretty sure it's been proven that sufficiently large 3-layer networks can learn any function.

It's called the Universal Approximation Theorem [1].

[1] https://en.wikipedia.org/wiki/Universal_approximation_theore...

Technically, it's not two layers. It's only two layers of parameters. A RNN runs the two layers repeatedly at each sequential input. So if my input/output is like '1000~>fizz', it can be seen as a feedforward NN with at least 8 layers. (This is how it's possible to train a RNN in the first place: it gets 'unrolled' over time and turned into a big NN.)

The bit string representation actually isn't that bad for modulo computations: 2^n mod 3 is just 1 + (n % 2), and 2^n mod 5 is just {1,2,4,3}[n % 4].

Perfectly learnable to add them all up with the right weights? Given we're doing mod N, negative weights fit in naturally. So, pretty good seems plausible.
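The parent's per-bit residues can be checked directly: sum the residue contribution of each set bit and reduce. Plain Python, no network, just a sanity check of the arithmetic:

```python
def mod3_from_bits(i):
    # 2^n mod 3 cycles 1, 2, so bit n contributes 1 + (n % 2) when set.
    total = sum(1 + (n % 2) for n in range(i.bit_length()) if (i >> n) & 1)
    return total % 3

def mod5_from_bits(i):
    # 2^n mod 5 cycles 1, 2, 4, 3.
    cycle = (1, 2, 4, 3)
    total = sum(cycle[n % 4] for n in range(i.bit_length()) if (i >> n) & 1)
    return total % 5
```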

Please tell me you really did this. A company name would make my year, but I'll be satisfied with a simple yes/no confirmation.

It's a fun article, but solving the problem with the simplest, clearest solution possible should be the goal, and if someone provided this solution I would hesitate to hire them. Yak shaving like training a neural network would actually make me nervous that the candidate would over-engineer solutions. It's not like whiteboarding sleep sort or something for a fast, easy, clever solution (if impractical and inefficient). The guy who has to maintain this version of fizzbuzz after it's written would lose his mind. I'm probably being overly practical - it's a very fun article.

Lighten up

I think I'd start with, "Let's define two combinators S and K..." ....

Fizz Buzz in combinatory logic, that would be quite entertaining.

I'm absolutely delighted that there is more and more such developer-generated prose with context of recent events. Is there a place where it would be aggregated?

Seems like a bit of a wasted opportunity to take the input as binary digits, rather than as a sequence of hand-written numbers on the whiteboard. Then there should of course be a second whiteboard, and a robot arm with a marker for output, along with a pair of cameras for stereo input and feedback while training the robot arm in hand-writing "fizz", "buzz" and "fizzbuzz" ...

Should've trained it to generate the output strings directly, using an LSTM decoder... That would be more end-to-end.

I think this just illustrates the Alpha principle: the interviewer was obviously intimidated. This is because they are a Beta interviewing an Alpha. If the interviewer had had more imagination and confidence rather than the long pauses, it could have been a great learning opportunity!

   >>> for i in range(1,101): print "FizzBuzz"[i*i%3*4:8--i**4%5] or i
A line like the above would cost you the interview :) He might have been expecting something like this one.

All that and it still couldn't get the right answer? With a basic pattern like that?

I suspect somewhere Hofstadter is having a good laugh.

Great post. Both funny and informative.

And here I thought that the Haskell style FizzBuzz was clever.

It would be much more interesting to see a TensorFlow program actually learn to solve FizzBuzz from examples, instead of hardcoding in the logic:

    if   i % 15 == 0: return np.array([0, 0, 0, 1])
    elif i % 5  == 0: return np.array([0, 0, 1, 0])
    elif i % 3  == 0: return np.array([0, 1, 0, 0])
    else:             return np.array([1, 0, 0, 0])

The TF program does learn to solve FizzBuzz. The above code was used to generate data with which to train the network.

As far as I can see, that function was used only to generate training data. How would you like to see it trained instead? Should it output a string and only be told whether that was the correct output or not for the given input number?
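For context, the labels above get paired with a binary encoding of the input, and the network is trained on numbers above 100 so that 1-100 is held out. This is a sketch following the blog post's approach; names are approximations:

```python
import numpy as np

NUM_DIGITS = 10  # enough bits for inputs up to 1023

def binary_encode(i, num_digits=NUM_DIGITS):
    # Each training input is the little-endian bit vector of i.
    return np.array([(i >> d) & 1 for d in range(num_digits)])

def fizz_buzz_encode(i):
    # The label encoder quoted above: a one-hot class over
    # {number, Fizz, Buzz, FizzBuzz}.
    if i % 15 == 0: return np.array([0, 0, 0, 1])
    elif i % 5 == 0: return np.array([0, 0, 1, 0])
    elif i % 3 == 0: return np.array([0, 1, 0, 0])
    else:            return np.array([1, 0, 0, 0])

# Train on 101..1023; 1..100 is the held-out "test set" that the
# actual FizzBuzz run is evaluated on.
trX = np.array([binary_encode(i) for i in range(101, 2 ** NUM_DIGITS)])
trY = np.array([fizz_buzz_encode(i) for i in range(101, 2 ** NUM_DIGITS)])
```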

It's really easy to even hand-generate a "neural network" if you assume that the inputs can be preprocessed into base-3 and base-5 representations.

