Looking forward to your feedback as you try it out.
Thanks Rajat. We use typical Cortex-A9/A7 SoCs running plain Linux rather than Android. We would use it for inference.
1. Platform choice
Why make TFL Android/iOS only? TF works on plain Linux. TFL even uses the NDK, and it would appear the inference part could work on plain Linux.
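For what it's worth, the inference path doesn't seem tied to anything Android-specific. A minimal sketch of running a TFLite model from Python on plain Linux, assuming you already have a converted model.tflite on disk and a TensorFlow build with the tf.lite interpreter for your platform:

    # Minimal sketch: TFLite inference on plain Linux.
    # Assumes model.tflite exists and tf.lite builds for this platform.
    import numpy as np
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path="model.tflite")
    interpreter.allocate_tensors()

    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]

    # Feed a dummy input with the model's expected shape and dtype.
    x = np.zeros(inp["shape"], dtype=inp["dtype"])
    interpreter.set_tensor(inp["index"], x)
    interpreter.invoke()
    y = interpreter.get_tensor(out["index"])
    print(y.shape)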
I did not find any info on the performance of TensorFlow Lite; I'm mainly interested in inference performance. The tagline "low-latency inference" catches my eye, and I just want to know how low "low latency" is here. Milliseconds?
2. The interpreter is designed to be low-overhead, and the kernels are better optimized, especially for ARM CPUs, currently. While performance varies by model, we have seen significant improvements on most models going from TensorFlow to TensorFlow Lite. We'll share benchmarks soon.
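Until official benchmarks land, the simplest way to put a number on "low latency" for your own model is to time invoke() yourself. A rough sketch, assuming a converted model.tflite; the actual figure depends entirely on the model and the CPU:

    # Rough per-inference latency measurement for a TFLite model.
    import time
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path="model.tflite")
    interpreter.allocate_tensors()

    interpreter.invoke()  # warm-up run, so one-time costs aren't counted

    runs = 100
    start = time.perf_counter()
    for _ in range(runs):
        interpreter.invoke()
    elapsed = time.perf_counter() - start
    print("mean latency: %.2f ms" % (elapsed / runs * 1000))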
Glad to hear that, Rajat. Since it is as easy as you say, I look forward to your upcoming release with Linux support as standard. :-)
- As mentioned below, FlatBuffers makes startup faster while trading off some flexibility (see the sketch after this list)
- Smaller code size means trading off dependencies on some libraries and broader support versus writing more things from scratch, focused on the use cases people care about
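To make the FlatBuffers point concrete: a .tflite file is a FlatBuffer, so it can be mmapped and used in place with no parse/unpack step at startup, unlike a protobuf GraphDef, which has to be deserialized first. A small sketch; the "TFL3" file identifier comes from the TFLite schema, and model.tflite is assumed to exist:

    # A .tflite model is a FlatBuffer: bytes 4-8 hold the file
    # identifier "TFL3", and the mapped buffer is usable as-is,
    # which is why startup is fast.
    import mmap

    with open("model.tflite", "rb") as f:
        buf = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        print(buf[4:8] == b"TFL3")  # True for a valid TFLite flatbuffer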
We have had huge issues trying to figure out how to save models (freeze graph, etc.) and load them on Android. If you look at my previous thread, it also mentions bugs, threads, and support requests where people are consistently confused.
petewarden (https://news.ycombinator.com/item?id=15596990) from Google is also working on this, so I'm really hopeful you guys will have something soon. This is a serious blocker for doing anything reasonable in TF.
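For anyone hitting the same wall: the TFLite converter can freeze variables for you, so the separate freeze_graph step can be skipped entirely. A sketch with TF 1.x-style APIs; the toy graph here is a placeholder for your own model:

    # Convert a live session straight to a .tflite file.
    import tensorflow.compat.v1 as tf
    tf.disable_eager_execution()

    # Toy graph standing in for your real model.
    x = tf.placeholder(tf.float32, [1, 4], name="input")
    w = tf.Variable(tf.ones([4, 2]))
    y = tf.matmul(x, w, name="output")

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        # from_session freezes the variables internally; no separate
        # freeze_graph invocation is needed.
        converter = tf.lite.TFLiteConverter.from_session(sess, [x], [y])
        open("model.tflite", "wb").write(converter.convert())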
I'm still a fan of XLA, and I expect the two will grow closer over time, but I think Lite is better for a lot of scenarios on mobile.
Google devs, could you please get yourselves together in one room and agree on ONE BUILD SYSTEM for Android?!?
Gradle stable, cmake, ndk-build, Gradle unstable plugin, GN, Bazel, ..., whatever someone else does with their 20%.
I keep collecting build systems just to build Google stuff for Android.
An example: some of my colleagues put a QP solver in tandem with a DNN, so that the neural network could 'shell out' to the solver as part of its learning, and it learned to solve small Sudoku puzzles from examples alone (https://arxiv.org/abs/1703.00443). The PyTorch code for it is one of the examples I like to use as a stress test for doing funky things in a machine learning context.
TensorFlow is a very generic dataflow library at its heart, which happens to have a lot of DNN-specific functionality as ops. It's possible to express arbitrary computations in it, whereas CoreML and similar frameworks assume the computation will fit a particular mould and optimize for that.
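As a toy illustration of that generality (my own example, nothing DNN-related): Euclid's GCD expressed as a TF dataflow loop:

    # Euclid's algorithm as a dataflow loop: TF will happily run
    # arbitrary computations, not just neural network layers.
    import tensorflow as tf

    def gcd(a, b):
        cond = lambda a, b: b > 0
        body = lambda a, b: (b, a % b)
        return tf.while_loop(cond, body, (a, b))[0]

    print(gcd(tf.constant(1071), tf.constant(462)))  # -> 21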
Disclaimer: I work for Google Cloud.
The pricing of GCP is $0.10 per thousand predictions, plus $0.40 per hour. At that rate, 1,000,000 predictions cost 1,000,000 / 1,000 x $0.10 = $100 before the hourly charge, so the total is more than 100 dollars for 1 million inferences.
UPDATE: It seems some third-party developers have built Swift-compatible APIs.
If so, does it use OpenCL or something?
Also check out this post for more info and examples: https://research.googleblog.com/2017/11/on-device-conversati...
Do let us know if you build/run on other platforms.
TF Lite addresses the segment where you need more flexibility:
- you ship single app to many types of devices
- you would like to update the model independently of the code itself, e.g. pushing a new model over the wire with no change to the Android APK (see the sketch below)
Even with this generality, TF Lite is still quite fast and lightweight, as that was the focus in building it.
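The over-the-wire update amounts to swapping a file and rebuilding the interpreter; a rough sketch, where the URL and path are made up for illustration:

    # Fetch a new model and reload it -- the app code is unchanged.
    import urllib.request
    import tensorflow as tf

    MODEL_URL = "https://example.com/models/latest.tflite"  # placeholder
    MODEL_PATH = "/data/local/model.tflite"                  # placeholder

    def update_and_reload():
        urllib.request.urlretrieve(MODEL_URL, MODEL_PATH)
        interpreter = tf.lite.Interpreter(model_path=MODEL_PATH)
        interpreter.allocate_tensors()
        return interpreter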
> “Google won't say much more about this new project. But it has revealed that TensorFlow Lite will be part of the primary TensorFlow open source project later this year”
(I'm saying that glibly, but I'm dead serious: look at what we've seen emerge just this year in Apple's Neural Engine, the Pixel Visual Core, rumored chips from Qualcomm, and the Movidius Myriad 2. The datacenter was the first place to get dedicated DNN accelerators in the form of Google's TPU, but the phones, and even smaller devices like the "Clips" camera, are the clear next spot. And this is why, for example, TensorFlow Lite can call into the Android NNAPI to take advantage of local accelerators as they evolve.)
Being able to run locally, if battery life is preserved, is a huge win for latency, privacy, and potentially bandwidth. It'll be good, though it does need advances in both the HW and the DNN techniques (things like MobileNet, but we need far more).