Model deployment is painful. Running a model on a mobile phone?
Forget it.
The frustration is real. I remember spending nights trying to export models to ONNX, only for them to fail anyway. Deploying models on mobile for edge inference used to be complex.
Not anymore.
In this post, I’m going to show you how you can pick from over 900 SOTA models on TIMM, train them using best practices with Fastai, and deploy them on Android using Flutter.
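To give a taste of the training half before the full walkthrough, here is a minimal sketch. It assumes the Oxford-IIIT Pets dataset bundled with Fastai and uses `convnext_tiny` purely as an example TIMM architecture name; any model from the zoo can be swapped in.

```python
from fastai.vision.all import *

# Example dataset: the Oxford-IIIT Pets images bundled with fastai.
path = untar_data(URLs.PETS) / "images"

# In this dataset, filenames starting with an uppercase letter are cats.
dls = ImageDataLoaders.from_name_func(
    path,
    get_image_files(path),
    valid_pct=0.2,
    seed=42,
    label_func=lambda f: f.name[0].isupper(),
    item_tfms=Resize(224),
)

# Passing a string architecture name makes fastai build the model via timm,
# so any TIMM model zoo name can be dropped in here.
learn = vision_learner(dls, "convnext_tiny", metrics=error_rate)
learn.fine_tune(3)
```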
Are there any object detection models? We went with Apple's CoreML, which works great, but this is cool. Running the PyTorch version of our model took way too long for inference.
Badass. I've spent the last week digging into ARM optimization for these models because it's really fascinating how close we are to local deployment for this stuff - writeups like these should help spread awareness.
Thanks for writing!
In academia we're getting the next step operational: training on Android. Any advice on what to watch out for?
Obviously you need a bit of patience and lots of volunteer devices. With unsupervised continuous learning this is solved, in emulation. See "G-Rank: Unsupervised Continuous Learn-to-Rank for Edge Devices in a P2P Network" [1]. Optimal learning rate is left as an exercise for the developer.
(disclaimer: our own work, I run a lab with "systems for peer-to-peer machine learning")
This is great, I've got a few days downtime and wanted to hone my skills a little so this is a great starter. I've skimmed the article and it all looks very doable for my level.
There is nothing wrong with ONNX itself; the problems come from the limitations of PyTorch.
1. PyTorch model files are neither portable nor self-contained: they are pickled Python classes containing weights, so you need the Python class code to run them (see the sketch after this list).
Because running a model requires real Python code, PyTorch suffers from numerous issues when porting to non-Python targets such as ONNX.
PyTorch offers a way to export to ONNX but you will encounter various errors. [1]
Sure, you might be lucky enough to troubleshoot a specific model and export it to ONNX, but if your objective is to export any of the 964 models in a model zoo like TIMM, it is almost impossible.
2. There are also organizational and cultural problems. Because of the above, a PyTorch model needs to be designed with portability in mind from the beginning. But porting and serving models is what engineers do, whereas researchers, who design the models, rarely care about it when writing papers. So it is often hard to use SOTA models that come from academic research.
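A minimal sketch of point 1, with a toy module standing in for a real architecture (the class name and file paths are just illustrative):

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):  # toy stand-in for a real architecture
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(8, 2)

    def forward(self, x):
        return self.fc(x)

model = TinyNet()

# Saving the whole module pickles a reference to the TinyNet class:
# loading this file elsewhere fails unless the same class definition
# is importable there.
torch.save(model, "tinynet_full.pt")

# Saving only the state_dict stores plain tensors, but you still need
# the original class code to rebuild the module before loading it.
torch.save(model.state_dict(), "tinynet_state.pt")

restored = TinyNet()  # the Python class is required again here
restored.load_state_dict(torch.load("tinynet_state.pt"))
```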
> PyTorch offers a way to export to ONNX but you will encounter various errors. [1]
I mean sure, there are limitations, but this is greatly exaggerating their impact in my experience. I'd be curious to hear from anyone for whom these have been serious blockers; I've been exporting PyTorch models to ONNX (for CV applications) for the last couple of years without any major issues (and any issues that did pop up were resolved in a matter of hours).
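For what it's worth, a typical export along those lines is a short sketch like the following (resnet18 is only an illustrative backbone, not one of the models from the article):

```python
import torch
import torchvision

# Any common CV backbone; resnet18 is only an example choice.
model = torchvision.models.resnet18(weights="IMAGENET1K_V1").eval()
dummy = torch.randn(1, 3, 224, 224)

torch.onnx.export(
    model,
    dummy,
    "resnet18.onnx",
    input_names=["input"],
    output_names=["logits"],
    dynamic_axes={"input": {0: "batch"}, "logits": {0: "batch"}},
    opset_version=17,
)
```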
How did this happen? A pickle is not a sensible storage format: it's insecure, hard to version, and not very portable. Isn't a model basically a big matrix of numbers?
Not in PyTorch. A model is a set of Python dictionaries containing state, plus Python module/class objects. I don't know why the PyTorch team did it this way, but that's how it is. Maybe it boils down to point #2 above.
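For example, loading a saved state_dict (the path here is hypothetical) shows exactly that structure: a dict mapping parameter names to tensors.

```python
import torch

# "checkpoint.pt" is a hypothetical path to weights saved with
# torch.save(model.state_dict(), "checkpoint.pt").
state = torch.load("checkpoint.pt", map_location="cpu")

print(type(state))  # typically a collections.OrderedDict
for name, tensor in state.items():
    print(name, tuple(tensor.shape), tensor.dtype)
```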
I read it and found that this applies to a mobile app (using a Flutter library). That is great, and I was wondering if there is a similar library for JavaScript to run in the browser or Node.js (I could not find one, other than ONNX).