I've worked as a software engineer in machine learning for a few years, bringing models to production in a variety of areas. Although there is an open-source ecosystem for deploying ML models, most of these tools are targeted towards infrastructure engineers -- Kubernetes, Docker, and web server frameworks. As a result, there is a gap today between the data scientists and machine learning engineers who develop these models and the infrastructure skills required to deploy them.
I built Model Zoo to address that gap. Deploy your model to an HTTP endpoint with a single line of code, from any Python environment. Plus, you get the features you'll need from a production ML system for free (monitoring of features and predictions, autoscaling, a web interface for documentation).
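To make the gap concrete: serving even a trivial model over HTTP means writing request handling, JSON parsing, and server lifecycle code yourself. The sketch below is NOT the Model Zoo API -- it's a stdlib-only illustration of the plumbing that a hosted deployment service handles for you, with a hard-coded linear function standing in for a real model.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(features):
    # Stand-in "model": a fixed linear function. A real deployment
    # would load a serialized model artifact here instead.
    return sum(w * x for w, x in zip([0.5, -0.25], features))

class PredictionHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Parse a JSON request body of the form {"features": [...]}
        length = int(self.headers["Content-Length"])
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"prediction": predict(payload["features"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging

# Serve on an ephemeral local port in a background thread.
server = HTTPServer(("127.0.0.1", 0), PredictionHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Exercise the endpoint like a client would.
req = urllib.request.Request(
    f"http://127.0.0.1:{server.server_port}/predict",
    data=json.dumps({"features": [2.0, 1.0]}).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    response = json.loads(resp.read())
print(response)  # {'prediction': 0.75}
server.shutdown()
```

And this sketch still has no monitoring, autoscaling, batching, or documentation -- which is exactly the surface area a managed service is meant to absorb.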
Test it out for free with one of our quickstarts. You can experiment with it in-browser via Google Colaboratory or in your own Python environment:
https://docs.modelzoo.dev/quickstart/tensorflow.html or https://docs.modelzoo.dev/quickstart/transformers.html
1) GPU acceleration and AWS VPC deployments are only available in the private beta (free tier is hosted on our private infrastructure). Apply here and we can set up a meeting with you asap: https://modelzoo.typeform.com/to/Y8U9Lw.
2) We've started with TensorFlow and Hugging Face Transformers. We're currently working on scikit-learn and PyTorch support (via https://github.com/PyTorchLightning/pytorch-lightning). What kind of frameworks do you use?
I'm wondering what ML model formats you use, and what the specification for inputs/results is. Do you try to use a common format, or is a proprietary one fine for you?
I'm building https://DVC.org -- Git for data. We think a lot about how to save and version ML models properly (through GitFlow). One of the biggest challenges is that there are no common formats for ML models, inputs, and scoring outputs.
I'd appreciate it if you could share your opinion on generalizing model formats.
Good to meet you -- I'm a big fan of DVC. In our implementation, we've taken the approach of conforming to the standards (typically open-source) set by the frameworks for serialization (for example https://www.tensorflow.org/guide/saved_model). In our API design, it was important to integrate at the framework level (e.g. a tf.keras.models.Model object) for our client libraries. If you're using one of the widely available frameworks that we support, this results in a simple API where the serialization / deserialization is more of an implementation detail. If you're using a custom or rarer ML framework with an unstandardized serialization format, an open-source approach might work better.
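As a toy illustration of "serialization as an implementation detail": the round trip below uses Python's built-in pickle on a hand-rolled model class, standing in for a framework's native mechanism (TensorFlow's SavedModel is far richer -- it captures the graph, weights, and signatures). The point is only that the caller works with the model object, never with the bytes on disk.

```python
import pickle

class LinearModel:
    # Toy stand-in for a framework model object
    # (e.g. a tf.keras.models.Model).
    def __init__(self, weights, bias):
        self.weights = weights
        self.bias = bias

    def predict(self, features):
        return sum(w * x for w, x in zip(self.weights, features)) + self.bias

model = LinearModel(weights=[0.5, -0.25], bias=1.0)

# Serialize and restore; the API surface stays at the object level,
# the on-disk format is hidden behind it.
blob = pickle.dumps(model)
restored = pickle.loads(blob)

print(restored.predict([2.0, 1.0]))  # 1.75
```

Conforming to each framework's own format means the service doesn't have to invent (or impose) a universal one -- which is why an unstandardized custom framework is the hard case.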
Hope that was helpful!
It would be great to come up with a common format for all these pieces, so that many levels of the ML stack could use it.
We do support deploying to a private AWS cloud in our private beta that we can help get you set up with. Azure is not yet supported for private deployments.