
How HBO’s Silicon Valley Built “Not Hotdog” with TensorFlow, Keras and React Native - timanglade
https://medium.com/@timanglade/how-hbos-silicon-valley-built-not-hotdog-with-mobile-tensorflow-keras-react-native-ef03260747f3
======
timanglade
Just wanted to say thanks for the warm welcome from HN when the app was
released last month — I hope this blogpost answers the questions that were
raised back then.

I’d be happy to answer anything else you’d like to know!

Original thread:
[https://news.ycombinator.com/item?id=14347211](https://news.ycombinator.com/item?id=14347211)

Demo of the app (in the show):
[https://www.youtube.com/watch?v=ACmydtFDTGs](https://www.youtube.com/watch?v=ACmydtFDTGs)

App for iOS: [https://itunes.apple.com/app/not-hotdog/id1212457521](https://itunes.apple.com/app/not-hotdog/id1212457521)

App for Android (just released yesterday):
[https://play.google.com/store/apps/details?id=com.seefoodtec...](https://play.google.com/store/apps/details?id=com.seefoodtechnologies.nothotdog)

~~~
zitterbewegung
I am making an app that takes pictures and tries to tell you if the food in
the picture has allergens. I didn't know if I should feel humbled or just
laugh. (I decided it was hilarious in the end.) But it made me aim higher at a
hackathon last weekend. I also use your app in my elevator pitch to help
people understand.

~~~
0003
Great idea that I would be terrified to pursue from a legal perspective.

~~~
KGIII
I am going to chime in and _really_ suggest they hire a qualified legal
professional - if they intend to share this application. A simple disclaimer
may not be adequate.

Edit: Typo.

~~~
zitterbewegung
Yeah, I'm not going to release the app. I was already on the fence about it. I
could have finished it months ago, but I was worried about legal implications.
Thank you.

~~~
KGIII
No problem. I wouldn't even give it away, with any disclaimer, without
consulting a qualified legal professional. That sort of thing is just begging
for a lawsuit.

I am kind of curious how well you do. If you ever do get it going, there may
be a use for it - but, again, liability is a huge factor in something like
this.

I am not a lawyer but I have spent a whole lot of time with lawyers and in
court rooms. It was part of my business, indirectly. Thankfully, I sold and
retired years ago.

~~~
zitterbewegung
I could send you a dev build if you really want.

~~~
KGIII
Oh, no thank you. I'll just watch from afar. Also, I have a Windows phone.
Yup. I know my shame. It's the only Microsoft OS I have. I figure nobody is
writing malware for it. ;-)

------
richardkeller
What an ingenious way to covertly distribute the Pied Piper decentralized
storage app.

~~~
timanglade
Boy I sure hope no one does a static analysis of the binary…

~~~
rattray
I really hope you've hidden an Easter egg in there

~~~
timanglade
a gentleman never tells

~~~
Retr0spectrum
If such a thing existed, would it be in the Android or iOS app, or both?

~~~
timanglade
Oh neither, there is nothing to find in the binaries :)

------
x2398dh1
I don't relish saying this but the, "Not Hotdog" app does not cut the mustard
in even the most rudimentary of tests:

[https://twitter.com/iotmpls/status/879381125541613568/photo/...](https://twitter.com/iotmpls/status/879381125541613568/photo/1)

Probably only 20% of the world's hot dogs are just a basic hot dog with
mustard on it. Once you move past one or two condiments, hot dog
identification along with fixings gets confusing from a computer vision
standpoint.

Pinterest's similar images function is able to identify hotdogs with single
condiments fairly well:

[https://www.pinterest.com/pin/268175352794006376/visual-search/?x=0&y=0&w=564&h=564](https://www.pinterest.com/pin/268175352794006376/visual-search/?x=0&y=0&w=564&h=564)

They appear to be using deep CNNs.

[https://labs.pinterest.com/assets/paper/visual_search_at_pin...](https://labs.pinterest.com/assets/paper/visual_search_at_pinterest.pdf)

Having embedded tensorflow for on-site identification is all well and good for
immediacy and cost, but if I can't really properly identify whether something
is a hotdog vs. a long skinny thing with a mustard squiggle, what good does
that do me? What would be the next step up in your mind?

I ask this as someone who is sincerely interested in building low cost, fun,
projects.

~~~
timanglade
It’s fair, as I mention in the blogpost there are some failures that are a bit
obvious for sure. Mostly I think I tried to fit too many things into one
“hotdog” category including chili dogs, chicago dogs, bunless hotdogs, cut up
hotdogs, even octopus-cut hotdogs, etc. I think it caused the network to over-
generalize on shapes & colors. Definitely would do things differently next
time! And great work on the puns :)

------
timanglade
While we’re here and chatting about this, I should say most of the credit for
this app should really go towards the following people:

Mike Judge, Alec Berg, Clay Tarver, and all the awesome writers that actually
came up with the concept: Meghan Pleticha (who wrote the episode), Adam
Countee, Carrie Kemper, Dan O’Keefe (of Festivus fame), Chris Provenzano (who
wrote the amazing “Hooli-con” episode this season), Graham Wagner, Shawn
Boxee, Rachele Lynn & Andrew Law…

Todd Silverstein, Jonathan Dotan, Amy Solomon, Jim Klever-Weis and our awesome
Transmedia Producer Lisa Schomas for shepherding it through and making it
real!

Our kick-ass production designers Dorothy Street & Rich Toyon.

Meaghan, Dana, David, Jay, Jonathan and the entire crew at HBO that worked
hard to get the app published (yay! we did it!)

~~~
loader
Oh, he's going long ... and cue music.

------
bluetwo
OK, but where are the eight octopus recipes?

------
latenightcoding
I am glad I am not the only one with questions about the external GPU. I had
considered trying that, but came to the conclusion that the data transfer
between CPU and GPU would be too slow for ML tasks. So, what is your opinion
on this? If you had to do it again, would you use the eGPU, or just use AWS or
another GPU cloud service?

~~~
timanglade
My takeaway is that local development has a huge developer experience
advantage when you are going through your initial network design / data
wrangling phase. You can iterate quickly on labeling images, develop using all
your favorite tools/IDEs, and dealing with the lack of official eGPU support
is bearable. Efficiency-wise it’s not bad. As far as I could tell the
bottleneck ended up being on the GPU, even on a 2016 MacBook Pro with
Thunderbolt 2 and tons of data augmentation done on CPU. It’s also a very
lengthy phase so it helps that’s it’s a lot cheaper than cloud.

When you get into the final, long training runs, I would say the developer
experience advantages start to come down, and not having to deal with the
freezes/crashes or other eGPU disadvantages (like keeping your laptop powered
on in one place for an 80-hour run) makes moving to the cloud (or a dedicated
machine) become very appealing indeed. You will also sometimes be able to
parallelize your training in such a way that the cloud will be more time-
efficient (if still not quite money-efficient). For Cloud, I had my best
experience using Paperspace [0]. I’m very interested to give Google Cloud’s
Machine Learning API a try.

If you’re pressed for money, you can’t do better than buying a top of the line
GPU once every year or every other year, and putting it in an eGPU enclosure.

If you want the absolute best experience, I’d build a local desktop machine
with 2–4 GPUs (so you can do multiple training runs in parallel while you
design, or do a faster, parallelized run when you are finalizing).

Cloud does not quite totally make sense to me until the costs come down,
unless you are 1) pressed for time and 2) will not be doing more than 1
machine learning training in your lifetime. Building your own local cluster
becomes cost-efficient after 2 or 3 AI projects per year, I’d say.

[0]: [https://www.paperspace.com/ml](https://www.paperspace.com/ml)

~~~
latenightcoding
Awesome, thanks!

~~~
skraelingjar
I have used the AWS machine learning API and would recommend it. The time
savings using that vs running it on my hacked together ubuntu-chromebook-
mashup is worth more than what I had to pay. I have also used Paperspace. My
only issue was that whatever they use for streaming the virtual desktop to the
browser didn't work over a sub-4MB/s network connection.

------
thearn4
It's interesting how amenable image classification neural networks are to the
"take working model, peel off last layer or two, retrain for a new
application" approach. I've seen this suggested as working pretty well in a
few instances.

I guess the interpretation is that the first few
normalize->convolution->pool->dropout layers are basically achieving something
broadly analogous to the initial feature extraction steps that used to be the
mainstay in this area (PCA/ICA, HOG, SIFT/SURF, etc.), and are reasonably
problem-independent.
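The "peel off the last layer and retrain" idea can be shown in miniature with a hypothetical numpy sketch (not the app's actual code): a frozen random projection stands in for the pre-trained convolutional layers, and only a logistic output layer is retrained on top of it.

```python
import numpy as np

rng = np.random.default_rng(42)

# Frozen "feature extractor": a fixed random projection standing in for
# pre-trained convolutional layers (its weights are never updated).
W_frozen = rng.normal(size=(2, 16))

def features(x):
    return np.tanh(x @ W_frozen)

# Toy two-class data in input space.
x = np.vstack([rng.normal(-2.0, 1.0, (50, 2)),
               rng.normal(2.0, 1.0, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

# "Peel off the last layer and retrain": logistic regression on the
# frozen features, fit with plain gradient descent.
w, b = np.zeros(16), 0.0
f = features(x)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(f @ w + b)))
    grad = p - y
    w -= 0.1 * f.T @ grad / len(y)
    b -= 0.1 * grad.mean()

p = 1.0 / (1.0 + np.exp(-(f @ w + b)))
accuracy = ((p > 0.5) == y).mean()  # the retrained head separates the classes
```

The early layers never change; only the tiny output layer learns the new task, which is why this kind of retraining is so cheap.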

~~~
timanglade
For sure, although I should say, for this specific instance I ended up
training a network from scratch. I did get inspiration from the MobileNets
architecture, but I did not keep any of the weights from their ImageNet
training. That was shockingly affordable to do even on my very limited setup,
and the results were better than what I could do with a retraining (mostly has
to do with how finicky small networks can be when it comes to retraining).

~~~
thearn4
That's very cool to hear, I'm a lot more interested in the eGPUs (vs.
something like an AWS P2 instance) after reading this. Thanks again for
sharing.

------
cedric
Interesting. What is the external GPU (eGPU) enclosure you used for the Nvidia
GTX 980 Ti card? Is it this one? [https://bizon-
tech.com/us/bizonbox2s-egpu.html/](https://bizon-
tech.com/us/bizonbox2s-egpu.html/)

~~~
timanglade
Yes, that’s what you see in the picture, although as completely personal
advice, I would stop short of recommending it. For one there are arguably
better cases out there now, and you can sometimes build your own eGPU rig for
less. Finally, the Mac software integration (with any eGPU) is very hacky at
the moment despite the community’s best efforts, and I had to deal with a lot
of kernel panics and graphics crashes, so overall I’m not sure I would
recommend others attempt the same setup.

~~~
minimaxir
It's worth noting that High Sierra removes _some_ of the hackiness of an eGPU.

I am waiting until the next-gen enclosures/cards come out which play nicer
with the OS for deep learning.

------
rogerb
This should be standard 'hello world' tutorial for Pragmatic ML.

~~~
timanglade
Ha, I don’t know about all that, but I was very honored to see someone call it
the “Utah Teapot” [0] of Machine Learning… Hopefully it’s an approachable (if
dumb) example of A.I. and its limitations today :p

[0]:
[https://en.wikipedia.org/wiki/Utah_teapot](https://en.wikipedia.org/wiki/Utah_teapot)

------
tuna
Nice write-up that should become the go-to tutorial for TF and local training.
Helped me a lot with the mobile part; the transfer of the training was a bit
strange to think about when I first read it, but it became clear on a second
reading.

------
nganig
Pretty fascinating and encouraging to see how much was accomplished with a
laptop and consumer GPU. Gave me some great ideas. Also happy to see Chicago
dogs properly identified.

~~~
timanglade
Ha, you have no idea how hard Chicago hotdogs made my life! There was a joke
in the show about Dinesh having to stare at a lot of “adult” imagery for days
on end to tune his AI, but my Waterloo was Chicago hotdogs — the stupid
pickles end up hiding the sausage more often than not, which makes it hard to
differentiate them from, say, a sandwich.

For those of you like me that never knew they existed, here is what they look
like:
[http://img.food.com/img/recipes/25/97/2/large/picMKCD0m.jpg](http://img.food.com/img/recipes/25/97/2/large/picMKCD0m.jpg)

~~~
ethbro
I feel your pain. I once watched a single episode of Jeopardy 50+ times in
pursuit of timings for a prototype app demo.

I could have a good half-hour conversation about the nuances of Alex Trebek's
vocal inflections... _shudders_

~~~
timanglade
Ha, I’d love to hear more. What was the app for? I can’t imagine why you’d
have to pick up on Trebek’s elocution??

~~~
ethbro
Less interesting than you'd expect, as it was for a rapid mobile app
prototyping class.

We had a telesync'd demo that let you play along with a Jeopardy episode by
yelling answers at your phone. The app knew the timing markers for when the
question was asked + when a contestant answered, so would only give you credit
if you beat the contestant with the correct answer.

Our model user was "people who yell answers at the screen when Jeopardy is
on."

Still think it would have made a decent companion app to the show though...

Trebek's elocution is just something you pick up on after rewatching an
episode enough times. He has really interesting ways of emphasizing things,
but they seem normal if you're just listening to them once through.

------
tebica
Love the post! This explains how mobile TensorFlow can actually be used in
daily life.

~~~
timanglade
One of my primary motivators behind building this blogpost was to show how
exactly one can use TensorFlow to ship a production mobile application.
There’s certainly a lot of material out there, but a lot of it is either light
on details, or only fit for prototypes/demos. There was quite a bit of work
involved in making TensorFlow work well on a variety of devices, and I’m proud
we managed to get it down to just 50MB or so of RAM usage (network included),
and a very low crash rate. Hopefully things like CoreML on iOS and TensorFlow
Lite on Android will make things even easier for developers in the future!

~~~
metakermit
yeah, that's my main pain with the TF docs – great if you just want to try one
of the MNIST tutorial variations, but there's a lot more you need to figure
out when you get beyond these "hello world" examples…

~~~
timanglade
Yup and in fairness maybe that’s something the community (myself included)
should really step in and improve — but it’s not always clear how the
leadership of the project would like these things to improve, and I often get
the feeling they have their own production fixes & practices internally that
they’re not (yet) sharing with the public… I could be wrong though.

------
ckirksey
This is the Twitter bot I built a few days after the show (similar to Tim's
original prototype with Google Cloud): [https://hackernoon.com/building-
silicon-valleys-hot-dog-app-...](https://hackernoon.com/building-silicon-
valleys-hot-dog-app-in-one-night-aab8969cef0b)

------
laibert
This is amazing - impressed by your persistence to source the training data
yourself, that must have been tedious!

Did you try quantizing the parameters to shrink the model size some more? If
so, how did it affect the results? It also runs slightly faster on mobile from
my experience.

~~~
timanglade
Great question — I did not, because I had unfortunately spent all of my data
on that last training run, and I did not have an untainted dataset left to
measure the impact of quantization on. (Just poor planning on my part really.)

It’s also my understanding at the moment that quantization does not help with
inference speed or memory usage, which were my chief concerns. I was
comfortable with the binary size (<20MB) that was being shipped and did not
feel the need to save a few more MBs there. I was more worried about accuracy,
and did not want to ship a quantized version of my network without being able
to assess the impact.

Finally, it now seems that quantization may be best applied at training time
rather than at shipping time, according to a recent paper by the University of
Iowa & Snapchat [0], so I would probably want to bake that earlier into my
design phase next time around.

[0]: [https://arxiv.org/abs/1706.03912](https://arxiv.org/abs/1706.03912)
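For readers unfamiliar with what quantization does to the weights, here is the arithmetic idea in a hypothetical numpy sketch (not TensorFlow's actual tooling): map float weights linearly onto 8-bit integer codes, dequantize, and measure the round-trip error.

```python
import numpy as np

def quantize_roundtrip(w, bits=8):
    # Affine-quantize float weights to `bits`-bit integer codes and
    # dequantize. A sketch of the idea behind post-training quantization,
    # not any particular framework's implementation.
    lo, hi = float(w.min()), float(w.max())
    levels = 2 ** bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    q = np.round((w - lo) / scale)   # integer codes in [0, levels]
    return q * scale + lo

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.05, size=10_000)   # made-up stand-in for a weight tensor
w_hat = quantize_roundtrip(w)

# Round-to-nearest means each weight moves by at most half a quantization step.
step = (w.max() - w.min()) / 255
max_err = np.abs(w - w_hat).max()
```

The accuracy question in the comment is exactly about whether errors of this size, accumulated across layers, change the network's predictions, which is why an untainted evaluation set matters.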

~~~
laibert
Thanks! I hadn't seen that paper; I'll check it out. I think quantization only
helps with inference speed if the network is running on CPU, with negligible
gains on GPU (TensorFlow only supported CPU on mobile last I looked, which was
a while ago). However, your app is already super fast, so I don't think anyone
would notice if it was marginally faster at this point!

~~~
timanglade
Yeah I was surprised it became so fast once I started using small networks. I
actually toyed with the idea of slowing down the transition to results
artificially to provide better UX lol

------
vinum_sabbathi
At MongoDB World this past week they did a demo of Stitch where they actually
built something similar with no backend code required, using the Clarifai API
and an Angular front end. It took less than 80 minutes and could run in prod
if I wanted.

~~~
kenwalger
MongoDB Stitch is a great new BaaS offering. They even have some great
tutorials online to use it.

Have a look at their sample PlateSpace app:
[https://github.com/mongodb/platespace](https://github.com/mongodb/platespace)

Very cool new service and some excellent tutorials as well, for example for
the PlateSpace web app:
[https://docs.mongodb.com/stitch/getting-started/platespace-web/](https://docs.mongodb.com/stitch/getting-started/platespace-web/)

I'd definitely recommend having a look.

------
SmellTheGlove
Wow, great writeup. This is an area that I know nothing about but have wanted
to learn - seems like this post is a good starting point.

Any chance the full source will ever be opened up? Would be an excellent
companion to the article.

~~~
timanglade
That’s not in the cards, at least at the moment, although if we get enough
requests for it, I may be able to convince the powers that be…

In the meantime, if there are any details you’d like to see, don’t hesitate
to chime in and I’ll try to respond with details!

~~~
woodrowbarlow
what is the best avenue for making such requests?

~~~
timanglade
Just voicing your interest here is fine!

------
tmaly
Just want to say, awesome post. It's amazing how quickly you created this.

~~~
timanglade
Thanks for the kind words! To prevent impostor syndrome, I should clarify that
I worked on the app for many, many months — basically since August of last
year — as a nights/weekends thing. It’s true that the final version was built
almost from scratch in a few weeks, but it wouldn’t have been possible without
the time investment in the preceding months. Although for the most part I just
wasted a lot of time because I had no idea what I was doing lol (still don’t)

~~~
subcosmos
I finally played with it this morning. I'm blown away by the speed and how
smooth the experience is.

------
quotewall
Finally for Android! Cool to see a cross-platform implementation of this, and
how much can be done by one person and some reasonable gear.

~~~
timanglade
Yes, I was very excited we were able to release it for Android… And even
though we used React Native, there were so many native (and C++) bits, it
ended up being quite complex!

As for the gear, I think it’s really damaging that so many people think Deep
Learning is only for people with large datasets, cloud farms (and PhDs) — as
the app proves, you can do a lot with just data you curate by hand, a laptop
(and a lowly Master’s degree :p)

~~~
giantwolf
Do you think it's possible to generalize the way you handled the cross-
platform complexity into a shared component?

~~~
timanglade
I thought there might be, but most of the AI code ended up being native. The
only AI code in React Native is a single line:

    var percentage = await NativeModules.AIManager.analyzeImage(path)

… Everything below that is Java or Objective-C++, and then native libs

------
subcosmos
Love this architecture. I think I'm going to adopt some of it for HungryBot,
my nonprofit's diet-tracking research arm. I think on-phone prediction solves
a lot of my affordability issues.

[https://www.infino.me/hungrybot](https://www.infino.me/hungrybot)

Great work!

~~~
timanglade
Thanks! I definitely think executing neural networks on-device is the future
for a lot of applications. It’s just a better UX, and much cheaper to boot!

------
john_borkowski
Very informative write up. Thanks!

How did you source and categorize the initial 150K of hotdogs & not hotdogs?

~~~
timanglade
Lots of manual searching, vetting & labeling! Definitely the most actively
time-consuming part. (Passively, only the wait between training runs was
longer.)

------
tkrupicka
As someone who maintains a popular android camera library; what is this app
using to take photos on both iOS and Android? Android can be a bit tricky with
device-specific differences and Camera 1 vs. Camera 2 API changes.

~~~
timanglade
The amazing react-native-camera plugin! [0] I’m still getting a few camera-
related crashes on Android right now, but overall I would say it makes things
pretty smooth!

[0]: [https://github.com/lwansbrough/react-native-camera](https://github.com/lwansbrough/react-native-camera)

~~~
tkrupicka
Thanks for the response and the writeup! Glad to hear somebody has had success
with that library.

------
bigfish24
What kind of accuracy did you get with the transfer learning attempts?

~~~
timanglade
Well for a while I was lulled into complacency because the retrained networks
would indicate 98%+ accuracy, but really that was just an artifact of my 49:1
nothotdog:hotdog image imbalance. When I started weighing proportionately, a
lot of networks were measurably lower, although it’s obviously possible to get
Inception or VGG back to a “true” 98% accuracy given enough training time.

That would have beat what I ended up shipping, but the problem of course was
the size of those networks. So really, if we’re comparing apples to apples,
I’ll say none of the “small”, mobile-friendly neural nets (e.g. SqueezeNet,
MobileNet) I tried to retrain did anywhere near as well as my DeepDog network
trained from scratch. The training runs were really erratic and never really
reached any sort of upper bound asymptotically as they should. I think this
has to do with the fact that these very small networks contain data about a
lot of ImageNet classes, and it’s very hard to tune what they should retain
vs. what they should forget, so picking your learning rate (and possibly
adjusting it on the fly) ends up being very critical. It’s like doing
neurosurgery on a mouse vs. a human I guess — the brain is much smaller, but
the blade stays the same size :-/
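The 49:1 imbalance trap described above is easy to reproduce numerically: a degenerate classifier that always answers "not hotdog" scores 98% raw accuracy, while per-class (balanced) accuracy exposes it. A toy numpy illustration, not the app's evaluation code:

```python
import numpy as np

# 49:1 not-hotdog:hotdog split, as described above.
y_true = np.array([0] * 4900 + [1] * 100)   # 0 = not hotdog, 1 = hotdog
y_pred = np.zeros_like(y_true)              # degenerate model: always "not hotdog"

raw_accuracy = (y_true == y_pred).mean()    # 0.98 -- looks impressive

# Balanced accuracy: mean of per-class recalls, weighting both classes equally.
recalls = [(y_pred[y_true == c] == c).mean() for c in (0, 1)]
balanced_accuracy = sum(recalls) / 2        # 0.5 -- no better than a coin flip
```

Weighting the metric (or the loss) proportionately, as the comment describes, is what makes the two numbers stop disagreeing.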

~~~
bigfish24
Very interesting! If you were to make a v2, would you adjust the 49:1
imbalance and add more hot dog images?

~~~
timanglade
I’m not sure, I think I would maybe break classes into multiple labels, but
that becomes even more finicky to train. At the end of the day, there are many
more things that are not hotdogs, than things that are hotdogs, so you do have
to provide more examples of the not hotdogs to train something from scratch
properly — I don’t see a way around it.

Honestly I think the biggest gains would be to go back to a beefier, pre-
trained architecture like Inception, and see if I can quantize it to a size
that’s manageable, especially if paired with CoreML on device. You’d get the
accuracy that comes from big models, but in a package that runs well on
mobile.

------
arrspdx
What’s your biggest regret with this app? What are you most proud of?

~~~
timanglade
Biggest regret was not keeping a pristine dataset for final testing /
evaluation on device. I ended up flying blind when it came to setting the
final threshold, testing the effects of quantization, or even just measuring
the distortion introduced by cameraphones (compared to directly feeding images
during training).
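The regret above boils down to a split discipline that is cheap to enforce up front. A minimal sketch with made-up sizes: carve out the test indices before any training, tune thresholds and quantization decisions on the validation slice, and open the test slice exactly once.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000                       # hypothetical dataset size
idx = rng.permutation(n)       # shuffle once, before anything else

train = idx[:700]              # fit weights here
val = idx[700:850]             # tune thresholds / quantization decisions here
test = idx[850:]               # pristine: evaluated once, for the final number
```

Anything tuned against `test`, even a confidence threshold, quietly turns it into a second validation set, which is exactly the blindness described above.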

What I’m most proud of is the remote neural network injection [0] — which I’m
surprised no one has commented on here. I just think it’s absolutely huge to
be able to set large parts of your app’s behavior in TensorFlow code, and
replace that on the fly in your user’s app as needed. But maybe I’m the only
one excited about this :D

[0]: [https://medium.com/@timanglade/how-hbos-silicon-valley-built-not-hotdog-with-mobile-tensorflow-keras-react-native-ef03260747f3](https://medium.com/@timanglade/how-hbos-silicon-valley-built-not-hotdog-with-mobile-tensorflow-keras-react-native-ef03260747f3)

------
kennu
The iOS app is not available in the Finnish App Store, only US :-( We have
hotdogs here, too! (And not hotdogs.)

------
startupdiscuss
Okay, who is going to test this on you-know-what to see if Jian-Yang's pivot
would have worked?

------
k2xl
Tim: Any tips/online resources for someone starting out with ML? How did you
learn?

~~~
timanglade
I can’t recommend Rachel Thomas and Jeremy Howard’s FastAI course enough! I
attended it in person in SF, but the YouTube recordings and online community
around it are great! [0]

Beyond that, I would recommend making sure you have a concrete project you
pursue. ML is very theoretical otherwise, and to be honest, our shared
understanding of what works and what doesn’t is still fairly limited — with
major discoveries every other week it feels like. So without a concrete
project to anchor your thoughts, it can be hard to learn what “works” and what
doesn’t, just because different things work on different projects.

If you have any questions, I highly recommend the FastAI forums [1] or the
Machine Learning subreddit [2]!

[0]: [http://course.fast.ai/](http://course.fast.ai/)
[1]: [http://forums.fast.ai/](http://forums.fast.ai/)
[2]: [https://www.reddit.com/r/MachineLearning/](https://www.reddit.com/r/MachineLearning/)

------
longsangstan
great app! any plan to open source it?

~~~
timanglade
Not at the moment — although you’ll find the most critical aspects explained
in detail in the post. The rest I fear will age very quickly… With stuff like
CoreML and TensorFlow Lite on the immediate horizon, I can’t imagine people
will want or need to use the cumbersome manual approach I had to use to ship
this app. Anything in particular you’d like to see? I can try to share it in a
follow-up post or in comments here.

~~~
lightbyte
I have a quick question on this. The blog post mentions that you guys went
with CONV-BATCHNORM-ACTIVATION (still unsure on whether this is the better
order), but in the model code that is posted the batchnorm and activations are
the other way around. Which ordering did you end up using?

~~~
timanglade
Ooops, good catch — I had posted the wrong definition. Corrected now! It was
convolution, batch norm, then elu activation.
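The two orderings genuinely produce different numbers, which is why the posted definition mattered. A toy numpy check, using plain batch norm with no learned scale/shift, just to show the orderings are not interchangeable:

```python
import numpy as np

def batchnorm(x, eps=1e-5):
    # Normalize each feature over the batch axis (no learned gamma/beta).
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

def elu(x, alpha=1.0):
    return np.where(x > 0, x, alpha * (np.exp(x) - 1))

x = np.array([[1.0, -2.0],
              [3.0, 0.5],
              [-1.0, 4.0]])

bn_then_elu = elu(batchnorm(x))   # ordering used here: batch norm, then ELU
elu_then_bn = batchnorm(elu(x))   # swapped ordering yields different outputs
```

In a real network the convolution output would be fed in place of `x`, but the point stands: swap the two layers and every downstream activation changes.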

~~~
lightbyte
Quick followup, what type of optimizer did you guys end up using?

~~~
timanglade
SGD with Cyclical Learning Rates [0]. Honestly, it’s the closest to a Machine
Learning silver bullet I’ve found to date! That paper is _awesome_.

[0]: [https://arxiv.org/abs/1506.01186](https://arxiv.org/abs/1506.01186)
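For anyone curious what that schedule looks like, here is a sketch of the paper's "triangular" policy in plain Python; the base/max rates and step size are made-up values, not the ones used for the app:

```python
import numpy as np

def triangular_clr(iteration, base_lr=1e-4, max_lr=1e-2, step_size=2000):
    # Triangular cyclical learning rate (Smith, 2015, arXiv:1506.01186):
    # the LR climbs linearly from base_lr to max_lr over step_size
    # iterations, descends back over the next step_size, then repeats.
    cycle = np.floor(1 + iteration / (2 * step_size))
    x = abs(iteration / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1 - x)
```

In Keras this would typically be wired up as a per-batch callback that sets the optimizer's learning rate before each batch.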

~~~
lightbyte
Thanks! I'm trying out your model on a small data set I've been playing with
for identifying invasive species of flowers [0] and so far it's working way
better than my initial version that was based on resnet (though slower)!

[0] [https://www.kaggle.com/c/invasive-species-monitoring](https://www.kaggle.com/c/invasive-species-monitoring)

------
megamindbrian
LOL, I love that part. So funny.

