Ha, I didn't expect this to end up here. If anyone is interested, I'm working on a blogpost explaining how we built the app in detail… It uses embedded TensorFlow on device (better availability, better speed, better privacy, $0 cost), with a custom neural net inspired by last month's MobileNets paper, built & trained with Keras. It was loads of fun to build :)
How large is the training set / how did you get a sufficient amount of hot dog pics for a custom model, especially since you did not have a class of naive Stanford CS students at your disposal?
About 150k total images, 3k of which were hotdogs. The results are far from perfect (there are a ton of subtle — or hilarious — ways to trick the app) but it was better than using a pre-trained model or doing transfer learning, accuracy-wise (honestly it was even better than using Cloud APIs). As for the difficulty in preparing the training set, I'll just say I definitely empathize with Dinesh and Jian Yang's feelings in episode 4 :D
Can you please elaborate a bit more on the model architecture and what you tried with respect to transfer learning?
Did you use an ImageNet architecture, e.g. VGG, and retrain from scratch, or a custom architecture? Did you try chopping off the last 1/2/3 layers of a pretrained model and fine-tuning?
Bonus points:
1. How much better were your results trained from scratch vs fine-tuned?
2. How long did it take to train your model and on what hardware?
Hey, so I actually tried VGG, Inception, and SqueezeNet: out of the box, chopped, and trained from scratch (SqueezeNet only for the latter due to resource constraints).
We ended up with a custom architecture trained from scratch due to runtime constraints more so than accuracy reasons (the inference runs on phones, so we have to be efficient with CPU + memory), but that model also ended up being the most accurate model we could build in the time we had. (With more time/resources I have no doubt I could have achieved better accuracy with a heavier model!)
Training the final model took about 80 hours on a single Nvidia GTX 980 Ti (the best thing I could hook to my MacBook Pro at the time). That's for 240 epochs (150k images in an epoch) run in 3 learning-rate annealing phases, each phase being a handful of CLR (cyclical learning rate) cycles.
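For anyone wondering what a CLR cycle looks like concretely, here's a minimal sketch in plain Python of the triangular cyclical learning rate schedule; the base/max rates and step size are illustrative assumptions, not the values used for the app:

```python
def triangular_clr(iteration, base_lr=1e-4, max_lr=1e-2, step_size=2000):
    """Triangular cyclical learning rate: ramps linearly from base_lr to
    max_lr over step_size iterations, back down over the next step_size,
    then repeats."""
    cycle = iteration // (2 * step_size)            # which full cycle we're in
    x = abs(iteration / step_size - 2 * cycle - 1)  # 1 -> 0 -> 1 within a cycle
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)

def annealed_max_lr(phase, initial_max_lr=1e-2, decay=0.1):
    """One way to model the annealing phases: shrink the CLR ceiling each phase."""
    return initial_max_lr * (decay ** phase)
```

With a schedule like this, each annealing phase just re-runs a few CLR cycles against a lower ceiling.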
I'll answer in more detail in the full blogpost, it's a bit complicated to explain in a comment. I'll have charts & figures for y'all :)
As someone currently writing an app which uses a retrained Inception model, I watched the show pointing and laughing (and then crying) at the same issues and frustrations. The accuracy of the show and especially this episode has been just brilliant.
Thanks for sharing all the tech details too, it's been great to read. I'm even more amazed to see it as a real app, that I didn't expect!
It was random, I was already working on the show as what Hollywood calls a (technical) “consultant”: advising on storylines, dialogue, background assets, etc. When this idea popped up, someone suggested we build the app for real. We took a try and ended up building the entire thing in-house with the crew, as opposed to hiring an external agency to do it for us.
I should say a lot of people made the app possible, including the show's awesome producers, writers, designers, and a lot of kind folks at HBO. To answer your question, I was the only dev on the project, and I've been working on it since last Summer, on a very part-time basis (some nights and weekends). A lot of time was spent learning Deep Learning to be honest. The last revision of the neural net was designed & trained in less than a month of nights/weekend work but obviously couldn't have been achieved without the preceding months of work — but if I was starting today knowing what I know now yeah it'd probably be about a month of work. The React Native shell around the neural net was just a few weekends worth of work — mostly it was about finding the right extensions, tuning a few things for rendering/performance, and like a whole weekend dealing with the UX around iOS permissions to access the camera & photos (lol it's seriously so complicated).
In the early 90's, I had the honor of corresponding with "Uncle Frank" Webster [1], the curator of the Hot Dog Hall of Fame [2]! It boasts more than 2,500 frankfurter items, including the Lamborweenie, a dog on wheels that "Uncle Frank" hopes some day to race against the Oscar Mayer Wienermobile [3]. Unfortunately the hot dog museum is currently closed and in (hopefully refrigerated) storage [4], otherwise the museum, gallery and gift shop would be a great place to train your app.
He asked for permission to publish in his newsletter a gif [5] of a photo I'd taken and put on my web site of the Doggie Diner head [6] on Sloat Boulevard in San Francisco [7]. (This was years before John Law acquired all the Doggie Diner heads he could find for his restoration project [8], so there weren't so many photos of them on the internet at the time.)
Of course I gave him permission because he asked so politely, and although at first he seemed a little creepy, I could tell he was authentic and sincere since he signed his correspondence: "With Relish, Uncle Frank." [9] And he even delivered on his promise to send me copies of his newsletter!
He enthusiastically informed me that the highest quality hot dogs he's ever found are from Top Dog in Berkeley [10]. He admitted that he went through their dumpster to find out where they sourced them from because they wouldn't tell him, and he vouches that they are made from the finest possible ingredients. I agree with Uncle Frank that Top Dog has really excellent hot dogs (ask for them cooked butterfly style), and I take him at his word that they come from a reputable source. You can now actually order them online [11] if you would prefer not to go through their dumpster.
I honestly thought the app itself would come across as too limited — and I wasn't quite sure how HN felt about the show it's attached to. I was preparing that technical blogpost specifically for HN because I thought that would be a more hacker-centric way of looking at the same thing.
I was REALLY depressed after watching these last two episodes. HungryBot was easily the coolest thing I've done in my unfunded nonprofit. But it was too buggy to fly. Kicked it out of the nest too early, but to the best of my knowledge it was the first deep learning food recognition app.
Looking for philanthropists if anyone wants to help me study diabetes with apps like this.
It's actually written in React Native with a fair bit of C++ (TensorFlow), and some Objective-C++ to glue the two. One cool thing we added on top of React Native was a hack to let us inject new versions of our deep learning model on the fly without going through an App Store review. If you thought injecting JavaScript to change the behavior of your app was cool, you need to try injecting neural nets, it's quite a feeling :D
This looks great, haha! Shame I can't access it from the UK.
It would be really interesting to read more about your thoughts on working with RN and C++ and perhaps how you did some of it. I'm currently doing the same (but with a C++ audio engine rather than image processing stuff) and I think it's an incredibly powerful combination - but I do feel like I'm making up some interop patterns as I go and there might be better ways, so would love to hear how other people use it!
Broadly, I've created a "repository" singleton that stores a reference to both the React Native module instance (which gets set when I call its setup method from JS) and the C++ main class instance (which gets set when it starts up), so they can get a handle on each other (I bet there are better ways to do this, but I'm new to C++/ObjC and couldn't work out a better way to get a reference to the RN module).
I'm then using RCT_EXPORT_METHOD to provide hooks for JS to call into C++ via an ObjC bridge (in an RCT_EXPORT_MODULE class), and using the event emitter to communicate back to JS (so the C++ can get the RN module instance from the singleton and call methods which emit events).
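A language-agnostic sketch of that repository pattern (in Python for brevity; the real thing would be Objective-C++, and every name here is made up):

```python
class Repository:
    """Shared singleton holding references to both halves of the bridge,
    so each side can find the other regardless of startup order."""
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.rn_module = None  # set by the RN module's setup() from JS
            cls._instance.engine = None     # set by the C++ main class on startup
        return cls._instance

# Each side registers itself as it comes up, in either order:
Repository().rn_module = "rn-module-instance"  # stand-in for the real instance
Repository().engine = "engine-instance"
```

In Objective-C++ the equivalent is typically a static instance behind a class method; the point is just that both sides register with, and look each other up through, one well-known object.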
I've not done anything that really pushes the bridge performance to a point where I've seen any noticeable latency/slow down caused by the interop - have you had any issues here?
Like I say, I'm finding this a really cool way to build apps that need the power of native code but still have the ease of RN for the GUI and some logic, and I actually quite like the separation it enforces with the communication boundary.
Sounds like you're further ahead than I was with the React Native part! Not Hotdog is very simple so I just wrote a simple Native module around my TensorFlow code and let the chips fall where they may performance-wise. The snap/analyze/display sequence is slow enough that I don't need to worry about fps or anything like that. As much as I enjoyed using RN for this app, I would probably move to native code if I needed to be able to tune performance.
Can you explain to a noob how you wrote the Native module around TensorFlow? My main area of focus is in python, but I feel hindered when I think I'm ready to start developing for mobile apps. I'm looking into RN, but still not sure how that plays with TF and other python modules.
It was honestly just maybe 10 lines of code, but I was very confused about it before I got it done. The message passing is a bit counterintuitive at first. I'll try to share example code in my blogpost!
When using React Native (and also storing your network in JS) you can choose to push updates and fixes to the JS part of your code without going through the store again.
Yup, that's basically it. The hack was just in getting TensorFlow to accept/load its neural network definition from the JS bundle (what CodePush distributes for you) rather than from the main Cocoa bundle.
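Mechanically, that kind of hack boils down to a path-resolution fallback at startup: prefer a model file that arrived with the (CodePush-updatable) JS bundle, and fall back to the copy baked into the app bundle. A language-agnostic sketch (in Python; the function and file names are illustrative, not the app's actual code):

```python
import os

def resolve_model_path(js_bundle_dir, app_bundle_dir, model_name="model.pb"):
    """Prefer an over-the-air model delivered alongside the JS bundle;
    otherwise fall back to the model shipped with the app binary."""
    ota_model = os.path.join(js_bundle_dir, model_name)
    if os.path.isfile(ota_model):
        return ota_model  # updated net, no App Store review required
    return os.path.join(app_bundle_dir, model_name)  # factory default
```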
Just as a note, people/developers have had messages from Apple telling them that they need to remove code which allows them to update their app outside of the app update/review process.
Really cool stuff! I didn't even realise you've switched jobs, but that explains why I didn't see you when we popped in at Townsend the other day! Keep up the good work.
For anyone keen to learn more about object detection (and deep learning in general), I just finished working through an excellent free MOOC taught by Jeremy Howard (former chief scientist at Kaggle) - you basically learn how to fine-tune a convolutional neural network with your own data (e.g. hotdog vs not hotdog) in lesson 2!
I can't recommend that course enough! I attended it in person and got a lot out of it. Jeremy & Rachel were also enormously kind & helpful outside of class.
I always find it strange when apps like this are made unavailable in certain countries (in this case, the UK).
Video, I understand. From a consumer perspective it's crappy, but I do understand that licensing agreements for video are generally geographically restricted. But an app that goes along with a TV show? I can't see Sky (the UK network broadcasting Silicon Valley and most other HBO shows right now) distributing Not Hotdog.
It would be fun to play with it while the novelty is still fresh.
My guess is they don't want to put in the lawyer work to figure out if they can. I imagine HBO has to heavily vet things like this, unlike small startups.
Craziest thing - it works! Just detected a hotdog (off a photo). Machine learning has really come far; that this can be done for a joke app is really cool.
The term you're looking for is "Fictitious Business Name," or "Trade Name."
Generally, the legal ramifications are a requirement to register the name and pay a small fee. In Santa Clara it costs $40 to register a fictitious business name with the county and the registration is good for five years: https://www.sccgov.org/sites/rec/Fictitious%20Business%20Nam...
I thought Apple does not allow DBA entities? Don't you need to have a DUNS number to create an account these days? (It's been two years since I made one)
I mean, run it by a lawyer if you're ever tempted to use this in a commercial fashion, but you'd generally get told by a lawyer "I've got an icky feeling about saying untrue things in a fashion which could possibly be described as touching interstate commerce, but there is a surprising amount of nuance in fiction and entertainment experiences."
Do I think SeeFood Technologies, Inc. is factually incorporated in a US state? No, I do not. I checked the four states it is most likely to be registered in (Delaware, California, Wyoming, and Nevada) and it isn't registered in any of them. Corporate registrations are a public record and trivially searchable; if I had access to Lexis-Nexis or similar I could probably run the search nationwide roughly as easily.
A sibling reply suggests that getting a Fictitious Name / DBA issued might be sufficient, but this would depend on the state/county/etc; some of them prevent folks from registering fictitious names with things which suggest actual corporations (like Inc.) or various things which the state wants to protect ("Bank", "Trust", "University", etc are common).
(DBAs are available like candy and might make a lot of sense if you've got e.g. an LLC which operates a product whose name is not the LLC's name, if for no other reason than "You'll possibly need this to convince your bank to let you deposit checks made out to $PRODUCT_NAME into $COMPANY, LLC's bank account.")
I would guess HBO registered SeeFood Technologies Inc., as it definitely looks like an official stunt; otherwise it might have been taken down by now. It's not uncommon for them to do this - see http://www.piedpiper.com/ and http://raviga.com/
Most movies say they are distributed/produced by a corporation that is specific to the movie (probably for Hollywood Accounting), so it's probably really easy for Hollywood to do this.
Yup, sorry about that: the app is available only in the US (& Canada) due to some legal restrictions we couldn't avoid. FWIW I also worked on Richard's New Internet concept for this season so I definitely hear ya ;)
Can you please enlighten us as to what kind of legal restrictions apply here? It is always interesting to see how these «little» legal details get in the way of running software.
Could it be that the app is using HTTPS? iTunesConnect has all these crazy questions with very little guidance about encryption and export compliance if you say you do use HTTPS.
But seriously, the iTunes connect question specifically excludes stuff like HTTPS using the regular libraries, which is just as well as otherwise pretty much every app would be affected. Legal issue is probably boring IP rights stuff.
It's not that easy. Even the first question in iTunesConnect explicitly states you must answer "yes" even if you just make use of the built-in HTTPS libraries in the OS. If you start digging into the various guidelines you open a can of worms of recursive cross-references between documents and sections. Nowhere have I seen a statement that says "if you just use OS HTTPS, you don't have to do anything". At the very least you may have to consider whether you have to submit various annual self-classification reports.
For an app like this I could easily see a serious company deciding to skip the hassle and CYA, instead of potentially taking on a huge legal risk. Would you, as a regular worker-bee developer, be OK with personally signing off and accepting a legal risk on behalf of a large company without involving expensive lawyers to evaluate the validity of your opinion on this legalese? Would that be a responsible action to take?
And the World's Most Entitled comment award goes to...
The authors of this app don't owe you anything and are entitled to sell it in whatever countries they so choose. Additionally, the app doesn't even get listed in countries it's not available in.
The only reason you and other non-US users (such as myself) are able to see it is because the OP posted a link to the US version of the app store on a forum where the vast majority of visitors are US-based. Get off your high horse.
I had an idea for an app that would tell you if what you are about to eat is healthful or not. Basically all it has to do is determine the ratio of green to brown color. Obviously it would not really work (purple carrots, spaghetti squash), but would be a nice novelty thing nonetheless.
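For fun, the proposed heuristic fits in a few lines of Python; the color thresholds below are made-up assumptions, and as noted it would happily misclassify purple carrots:

```python
def looks_healthful(pixels, threshold=1.0):
    """Toy classifier: count green-ish vs brown-ish pixels in an image.

    `pixels` is a list of (r, g, b) tuples in 0-255. Green-ish means green
    dominates both other channels; brown-ish means red leads, green trails,
    and blue is low. Healthful if greens outnumber browns * threshold."""
    green = sum(1 for r, g, b in pixels if g > r and g > b)
    brown = sum(1 for r, g, b in pixels if r > g > b and b < 100)
    return green > brown * threshold

# A salad-ish pixel mix vs. a fried-ish one:
salad = [(40, 180, 60)] * 8 + [(150, 100, 50)] * 2
fries = [(150, 100, 50)] * 8 + [(40, 180, 60)] * 2
```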
Is that a joke, or does circumcision or its absence actually have an effect on the classifier? Or do you refer to some other characteristic? Offhand I can't really think of a more marked difference between American and non-American penises than the relatively high US circumcision rate, but I'm not so much a subject matter expert as just an interested amateur, and perhaps I overlook something here.
(If anything, I'd expect the classifier to more reliably call "not hot dog" on a penis that's circumcised, since hot dogs rarely are.)
Hey, the show did a whole scene about taking a dick joke too seriously, I figured a comment doing the same could work. I appear to have been wrong about that! Or, at the very least, I need to work on my delivery.