
Fizz Buzz in Tensorflow - joelgrus
http://joelgrus.com/2016/05/23/fizz-buzz-in-tensorflow/
======
jedberg
This whole thing is making fun of asking fizz-buzz of a senior programmer, and
yet I've found it to be one of the best phone screen questions possible.

Someone who is truly a senior programmer knocks it out in about 30 seconds and
we move on. For the ones who pretend to be senior programmers on their resume,
it trips them up and I know right away that their resume is either a pack of
lies or their previous coworkers were helping them a lot more than they let
on.

If I had actually had this person, I would probably have laughed as soon as he
did the imports, said, "clearly you think this is a silly question", and then
explained why I ask that question. Hopefully we could have moved on from
there.

~~~
anysz
Senior programmer? ...

Shouldn't a junior programmer be able to solve fizzbuzz?

I'm light years away from being senior, but I can write functions like
fizzbuzz in 10 seconds and can't find a junior job.

~~~
jedberg
> Shouldn't a junior programmer be able to solve fizz buzz?

Yes, which is why, when I would ask it of someone who calls themselves a
senior programmer, I was dumbfounded at how many could not answer the
question.

~~~
th0ma5
FWIW, please be mindful of your biases. I recently had an interviewer with a
major bank conglomerate who posed some silly puzzle. I outlined a system for
solving it in algorithmic notation, and the interviewer had no idea that such
a response was appropriate and said sorry, they were looking for someone with
programming experience, nothing specific, just "many languages" ... Not
saying you do this, but what if someone provided Fizz Buzz in Coq or Elixir,
or some kind of rule-based solver? Honestly, a lot of that might look like
gibberish to me if I were just looking for a way to dismiss people. Or what
if they had never heard of Fizz Buzz? Anyway... my brain starts to fixate on
all of these corner-case scenarios in hiring where you're screwed, because
something "easy" never does solve the hard problem of hiring. What if the
person couldn't do it because they think you're trying to trick them?
Anyway... best of luck in your hiring.

~~~
jedberg
I hate puzzles too, especially the gotcha ones with esoteric knowledge
required.

But for FizzBuzz I explain the problem, explain that there is no trick, and
that the simplest answer that produces the correct output will be fine. To
solve it you need to know how to build a loop, what the mod operator does,
and maybe how to keep state, depending on how you build it. I tell them to
write it in the language they know best. None of those things should be
"gotchas" in your favorite language.
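
[Ed.: for reference, the baseline answer described above, one loop plus the
mod operator and nothing else, is only a few lines; a Python sketch (the
thread's examples span several languages, any would do):]

```python
# Minimal FizzBuzz: one loop, the mod operator, no state beyond the counter.
lines = []
for i in range(1, 101):
    if i % 15 == 0:
        lines.append("FizzBuzz")
    elif i % 3 == 0:
        lines.append("Fizz")
    elif i % 5 == 0:
        lines.append("Buzz")
    else:
        lines.append(str(i))
print("\n".join(lines))
```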

~~~
spc476
You don't need mod, loops or recursion to solve it. For instance:

        local remove = table.remove
        local insert = table.insert
        local print  = print
        
        local sequence = { 
            false  , false  , 'Fizz' , false  , 'Buzz',
            'Fizz' , false  , false  , 'Fizz' , 'Buzz' ,
            false  , 'Fizz' , false  , false  , 'FizzBuzz'
        }
        
        local function o(v)
          local name = remove(sequence,1)
          insert(sequence,name)
          print(name or v)
          return v + 1
        end
        
        local function t(v)
          return o(o(o(o(o(o(o(o(o(o(v))))))))))
        end
        
        local function h(v)
          t(t(t(t(t(t(t(t(t(t(v))))))))))
        end
        
        h(1)
    

It does rely upon state (the sequence table) but even that could probably be
worked around if Lua had a slice syntax.
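
[Ed.: the same table-driven idea needs no explicit mod in any language with a
lazy cycling iterator; a Python sketch of the pattern, with itertools.cycle
playing the role of the Lua table rotation:]

```python
from itertools import cycle

# The same length-15 pattern as the Lua table above;
# a falsy entry means "print the number itself".
pattern = cycle([None, None, 'Fizz', None, 'Buzz',
                 'Fizz', None, None, 'Fizz', 'Buzz',
                 None, 'Fizz', None, None, 'FizzBuzz'])
out = [name or str(i) for i, name in zip(range(1, 101), pattern)]
print("\n".join(out))
```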

~~~
Pharylon
No loops, no sequence, one horrific line. :)

        Console.Write(String.Join("\r\n",  Enumerable.Range(1, 100).Select(x => x % 15 == 0 ? "FizzBuzz" : x % 5 == 0 ? "Buzz" : x % 3 == 0 ? "Fizz" : x.ToString())));

~~~
davidy123
Ternaries aren't super readable.

      Array.from(Array(100).keys()).map(k => k + 1).map(i => !(i % 15) && 'fizzbuzz' || !(i % 5) && 'buzz' || !(i % 3) && 'fizz' || i)

~~~
ingenter
No need for ternaries!

        Array.from(Array(100).keys()).map(k => k + 1).map(i => [i,'Fizz','Buzz','FizzBuzz'][!(i%3)+2*!(i%5)])

------
fayimora
_interviewer: How far are you intending to take this?_ _me: Oh, just two
layers deep_

haha

~~~
LaurenceW1
I would have given him the job

~~~
gnaritas
If that were a real situation, I wouldn't have; he over-engineered the shit
out of it, and while it's cool, I prefer devs who can solve simple problems
simply; they get more work done for the price.

~~~
vonklaus
Sometimes it is better to hire for culture: someone who is intelligent but
also, and this is _critical_, can take a joke.

~~~
TeMPOraL
Yeah, I suppose the guy wouldn't feel good in the company anyway. Personally,
I wouldn't either. A sense of humour is an essential part of a good dev
environment, IMO.

------
dbcurtis
This is the most off-the-wall and hilarious thing I've seen in a month. For
calibration: I spent the whole weekend at Maker Faire.

------
zardo
No wonder you didn't get the job, using a shallow ANN when they were probably
looking for a pointer network.

------
chestervonwinch
I can't tell if this guy did this during a real interview, or if that's part
of the joke. Maybe that's intentional. Very funny, either way.

It reminds me of the paper where they use an SVM to "visually identify" the
matrix rank:

[http://www.oneweirdkerneltrick.com/rank.pdf](http://www.oneweirdkerneltrick.com/rank.pdf)

------
BenderV
He should have used a Recurrent Neural Network with Long Short-Term Memory so
that the neural network wouldn't have been dependent on the maximum number
given, NUM_DIGITS. Scalability! No wonder he didn't get the job.

Joke aside, it's funny how simple machine learning problems can reveal people
who think you can just give neural networks anything and output anything, and
it will work like magic.

~~~
Bromskloss
> people who think you can just give neural networks anything and output
> anything, and it will work like magic.

What _is_ the proper method?

~~~
BenderV
Well, there is a variety of possible answers here. Giving the direct integer
("1", "23") to a simple deep network won't work.

\- An RNN with LSTM might be a better approach, since it would be based more
on the values of the ordered bits than on their "disposition", and could
scale to an arbitrarily high number.

\- @zardo mentioned Pointer Networks
([https://arxiv.org/abs/1506.03134](https://arxiv.org/abs/1506.03134)). They
look like they solve this kind of problem, but honestly I'm discovering them
now.

\- Giving the base-3 and base-5 representations of the input would be the
fastest solution (but since that hardcodes part of the solution, it's more a
trick than a general solution).

Can anyone correct me if I got something wrong here? Did I miss something?
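
[Ed.: a minimal sketch of that last idea, hand-picking the features: one-hot
the remainders mod 3 and mod 5, after which the four labels are a trivial
function of the features. The encoding and label scheme here are
illustrative, not taken verbatim from the post:]

```python
def encode(i):
    """One-hot the remainders i % 3 and i % 5 (8 features total)."""
    f = [0.0] * 8
    f[i % 3] = 1.0        # positions 0..2: remainder mod 3
    f[3 + i % 5] = 1.0    # positions 3..7: remainder mod 5
    return f

def label(i):
    """0: the number itself, 1: Fizz, 2: Buzz, 3: FizzBuzz."""
    return (i % 3 == 0) + 2 * (i % 5 == 0)

# With these features the problem becomes trivial: the label depends only
# on whether f[0] and f[3] are hot, so even a single linear layer
# (no hidden layer at all) can fit the mapping exactly.
X = [encode(i) for i in range(1, 101)]
y = [label(i) for i in range(1, 101)]
```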

------
slantedview
Not that I advocate this approach, but another way to look at FizzBuzz is as a
personality test.

If a (senior) developer balks at writing FizzBuzz, perhaps it shows arrogance
and a lack of humility. It can also be useful to see whether the developer
follows up to ask why they were given a FizzBuzz question, and about its
relevance to the work. I see this kind of questioning as not only fair game
but good to ask. Not asking about something that seems out of place could
itself seem off.

------
mattmcknight
Maybe the one-hot binary encoding wasn't the best feature set. Base 3 and Base
5 encodings might have worked better.

~~~
brandmeyer
Stupid question: We are trying to train the network how to calculate modulus
of division. Of a binary encoding. Why should we expect this to be learnable
in only two layers?

~~~
joelgrus
Yeah, secretly I was surprised that it worked as well as it did!

~~~
nharada
What architecture do you need to get 100% accuracy?

Don't forget to use a validation set for the model and hyperparameter
selection though!

~~~
brandmeyer
I suspect that there is no way to predict in advance the smallest model which
can represent any given function. However, we should be able to at least
develop an intuition for depth and breadth that is sufficient on any given
problem. Classically, division circuits are large and complex, so I naively
expect a NN built out of add, mul, and Heaviside(x)*x to also be large and
complex.

As with many models, I suspect that this network really learned some other
property that has almost-but-not-quite the same pattern as divisible-by-N.

~~~
wyager
Pretty sure it's been proven that a sufficiently large network with a single
hidden layer can approximate any continuous function.

~~~
danbruc
It's called the Universal Approximation Theorem [1].

[1]
[https://en.wikipedia.org/wiki/Universal_approximation_theorem](https://en.wikipedia.org/wiki/Universal_approximation_theorem)
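
[Ed.: the intuition behind the theorem can be sketched by hand: one hidden
layer of step units, one per grid point, reproduces any continuous function
on an interval as a staircase. A pure-Python illustration, not a trained
network; the Heaviside activation and grid size are choices made here for
simplicity:]

```python
import math

def heaviside(x):
    return 1.0 if x >= 0 else 0.0

def approximate(f, a, b, n):
    """One-hidden-layer network of n step units approximating f on [a, b]."""
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    # Hidden unit k fires for x >= xs[k + 1]; its output weight is the
    # jump f makes between consecutive grid points.
    jumps = [f(xs[k]) - f(xs[k - 1]) for k in range(1, n + 1)]
    def net(x):
        return f(xs[0]) + sum(w * heaviside(x - xs[k + 1])
                              for k, w in enumerate(jumps))
    return net

# The staircase tracks sin to within (grid spacing) * (Lipschitz constant).
net = approximate(math.sin, 0.0, 2 * math.pi, 500)
err = max(abs(net(x) - math.sin(x))
          for x in [2 * math.pi * t / 2000 for t in range(2001)])
```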

------
uhhuhuh
Please tell me you really did this. A company name would make my year, but
I'll be satisfied with a simple yes/no confirmation.

------
ninjakeyboard
It's a fun article, but solving the problem with the simplest, clearest
solution possible should be the goal, and if someone provided this solution I
would hesitate to hire them. Yak shaving like training a neural network would
actually make me nervous that the candidate would over-engineer solutions.
It's not like whiteboarding sleep sort or something as a fast, easy, clever
solution (if impractical and inefficient). The guy who has to maintain this
version of fizzbuzz after it's written would lose his mind. I'm probably
being overly practical - it's a very fun article.

~~~
theoracle101
Lighten up

------
arethuza
I think I'd start with, "Let's define two combinators S and K..." ....

Fizz Buzz in combinatory logic, that would be quite entertaining.

~~~
spc476
It's been done: [https://codon.com/programming-with-nothing](https://codon.com/programming-with-nothing)

------
laacz
I'm absolutely delighted that there is more and more of this
developer-generated prose riffing on recent events. Is there a place where it
gets aggregated?

------
e12e
Seems like a bit of a wasted opportunity to take the input as binary digits,
rather than as a sequence of hand-written numbers on the whiteboard. Then
there should of course be a second whiteboard, and a robot arm with a marker
for output, along with a pair of cameras for stereo input and feedback while
training the robot arm in hand-writing "fizz", "buzz" and "fizzbuzz" ...

------
psb217
Should've trained to generate the output strings directly, using an LSTM
decoder... That would be more end-to-end.

------
astannard
I think this just illustrates the Alpha principle: the interviewer was
obviously intimidated. This is because they are a Beta interviewing an Alpha.
If the interviewer had had more imagination and confidence rather than the
long pauses, it could have been a great learning opportunity!

------
_navaneethan

       >>> for i in range(1,101): print "FizzBuzz"[i*i%3*4:8--i**4%5] or i
    

The above line would have cost you the interview :) He might have been
expecting something like this one.
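
[Ed.: for the curious, that one-liner (Python 2) leans on two number-theory
facts: i*i % 3 is zero exactly when 3 divides i (squares mod 3 are 0 or 1),
and by Fermat's little theorem i**4 % 5 is zero exactly when 5 divides i. A
Python 3 rendering with the slice arithmetic unpacked:]

```python
# start = i*i % 3 * 4    -> 0 if 3 divides i, else 4
# end   = 8 - -i**4 % 5  -> 8 if 5 divides i, else 4
# Slices of "FizzBuzz": [0:8]="FizzBuzz", [0:4]="Fizz",
# [4:8]="Buzz", [4:4]="" (falsy, so `or i` yields the number).
out = ["FizzBuzz"[i*i % 3 * 4 : 8 - -i**4 % 5] or i for i in range(1, 101)]
print("\n".join(str(x) for x in out))
```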

------
phlakaton
All that and it still couldn't get the right answer? With a basic pattern like
that?

I suspect somewhere Hofstadter is having a good laugh.

------
namelezz
Great post. Both funny and informative.

------
vvanders
And here I thought that the Haskell style FizzBuzz was clever.

------
gertef
It would be much more interesting to see a TensorFlow program actually learn
to solve FizzBuzz from examples, instead of hardcoding in the logic:

        if   i % 15 == 0: return np.array([0, 0, 0, 1])
        elif i % 5  == 0: return np.array([0, 0, 1, 0])
        elif i % 3  == 0: return np.array([0, 1, 0, 0])
        else:             return np.array([1, 0, 0, 0])

~~~
sherjilozair
The TF program does learn to solve FizzBuzz. The above code was used to
generate the data with which to train the network.

