
Transform ML models into native code with zero dependencies - ghosthamlet
https://github.com/BayesWitnesses/m2cgen
======
jononor
Other similar projects:

[https://github.com/nok/sklearn-porter](https://github.com/nok/sklearn-porter)
Transpiles many scikit-learn models to Java/C/JavaScript/Go/Ruby, and has been
around since at least 2016.

[https://github.com/konstantint/SKompiler](https://github.com/konstantint/SKompiler)
transpiles to Excel/SQL

[https://github.com/jonnor/emlearn](https://github.com/jonnor/emlearn) To C
only, focus on microcontrollers/embedded devices. Includes feature extraction
tools also. Disclaimer: I wrote it.

~~~
rajangdavis
Thanks for sharing this! Looking at sklearn-porter now; I hope I can contribute
to either the Ruby, Golang, or PHP library.

------
sanjoy_das
Similar: tfcompile AOT compiles TensorFlow models into native code using XLA
([https://www.tensorflow.org/xla/tfcompile](https://www.tensorflow.org/xla/tfcompile)).

------
isolli
Does anyone know of a similar idea for neural networks? As far as I can tell,
you need the entire framework (which requires a heavy, 1.5 GB docker image) to
apply a trained model, even though in theory you only need matrix
multiplication and a few activation functions.

Related: [https://onnx.ai/](https://onnx.ai/)
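In plain Python, that observation looks like the sketch below: a dense layer is just a matrix-vector product plus an activation function, with no framework required. All weights and biases here are invented illustration values.

```python
import math

def relu(x):
    return [max(0.0, v) for v in x]

def sigmoid(x):
    return [1.0 / (1.0 + math.exp(-v)) for v in x]

def dense(weights, bias, inputs):
    # weights: one row per output unit, one column per input feature
    return [sum(w * i for w, i in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]

def forward(x):
    # Two-layer network with hardcoded (made-up) parameters
    h = relu(dense([[0.5, -0.2], [0.1, 0.8]], [0.0, 0.1], x))
    return sigmoid(dense([[1.0, -1.0]], [0.0], h))
```

A real deployment would read the trained weights out of the framework once and bake them into code like this, which is essentially what these transpiler projects automate.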

~~~
pplonski86
I wrote keras2cpp
[https://github.com/pplonski/keras2cpp](https://github.com/pplonski/keras2cpp)

It transforms Keras + Theano models into pure C++ with no additional packages.
It does not use the GPU.

------
wodenokoto
There are similar libraries for converting ML models into SQL queries.

However, the important part of most models is not the `estimator.fit(X, y)`
line, but all the things that are done to X before fitting or estimating.
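That point can be sketched in plain Python: the standardization fitted on X at training time has to ship alongside the estimator's coefficients, or the exported model scores garbage. All numbers below are hypothetical fitted values.

```python
# Hypothetical values learned during training
TRAIN_MEAN = [4.3, 110.0]   # per-feature means
TRAIN_STD = [1.2, 25.0]     # per-feature standard deviations
COEF = [0.8, -0.3]          # linear-model coefficients
INTERCEPT = 0.05

def preprocess(x):
    # The same standardization that was applied before estimator.fit(X, y);
    # exporting only COEF/INTERCEPT would silently drop this step.
    return [(v - m) / s for v, m, s in zip(x, TRAIN_MEAN, TRAIN_STD)]

def predict(x):
    z = preprocess(x)
    return sum(c * v for c, v in zip(COEF, z)) + INTERCEPT
```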

~~~
nkozyra
Hard to say one part is more important than the other. A working model that
does its job in code is incredibly useful.

------
cognaitiv
You might consider supporting sklearn.preprocessing objects as well to
replicate any transformations applied to training data.

------
s0ck_r4w
Hey folks! One of the authors here. Somehow this post flew under our radar.
Thank you all for your comments and feedback! We're super excited about the
amount of attention this project has gotten. It motivates us to work even
harder and deliver even more cool stuff.

Go and JS support are definitely on our agenda. They shouldn't be too hard to
add, since we first transform models into an AST and only then interpret the
AST into a specific language. I might be wrong, but this is a crucial detail
that distinguishes m2cgen from similar projects like sklearn-porter. As a
result, models are completely decoupled from languages and the two can be
worked on independently: once you implement a particular model, all languages
get support for it automatically, and vice versa. Plus we support XGBoost and
LightGBM :)

SVM support is something we'll be focusing on from the modeling side of
things. Any contributions are much appreciated!
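The model-to-AST-to-language design described above could be sketched roughly like this. The node types and C emitter below are hypothetical, not m2cgen's actual internals.

```python
# Tiny expression IR: each model converter lowers into these nodes once.
class Num:
    def __init__(self, value):
        self.value = value

class Feature:
    def __init__(self, index):
        self.index = index

class Add:
    def __init__(self, left, right):
        self.left, self.right = left, right

class Mul:
    def __init__(self, left, right):
        self.left, self.right = left, right

def to_c(node):
    # One emitter per target language; adding a language never touches
    # the model-to-AST converters, and vice versa.
    if isinstance(node, Num):
        return repr(node.value)
    if isinstance(node, Feature):
        return f"input[{node.index}]"
    if isinstance(node, Add):
        return f"({to_c(node.left)} + {to_c(node.right)})"
    if isinstance(node, Mul):
        return f"({to_c(node.left)} * {to_c(node.right)})"
    raise TypeError(node)

# A linear model w0*x0 + w1*x1 + b lowered once into the shared AST:
ast = Add(Add(Mul(Num(0.5), Feature(0)),
              Mul(Num(-0.2), Feature(1))),
          Num(0.1))
```

With this split, a `to_go` or `to_js` emitter would reuse the exact same `ast`, which is why each new model gets every language for free.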

------
wespiser_2018
One big improvement for this project would be to somehow break the model
weights out into an additional dependency, or file, to allow for very large
models and a separation between "code" and "data". Overall, pretty good; the
alternative approach is either using the underlying C++ lib, or just doing
some matrix math!
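One way such a code/data split could look, sketched in Python with a made-up JSON parameter format: the scoring logic stays fixed while parameters can be swapped without touching the code.

```python
import json

def load_params(serialized):
    # In a real deployment this JSON would live in its own file,
    # separate from the generated scoring code. The schema is invented.
    params = json.loads(serialized)
    return params["coef"], params["intercept"]

def score(coef, intercept, x):
    # Fixed "code" half: a plain dot product plus intercept
    return sum(c * v for c, v in zip(coef, x)) + intercept

# The "data" half, shipped and versioned independently:
PARAMS_JSON = '{"coef": [0.4, -0.7], "intercept": 1.5}'
coef, intercept = load_params(PARAMS_JSON)
```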

~~~
darius_morawiec
The package sklearn-porter supports the separation between inference (code)
and model data (parameters) by passing `export_data=True` while transpiling
the trained estimator.

~~~
wespiser_2018
Thanks, I'm coming over from R and will certainly check that out!

------
nerfhammer
Now do it for tensorflow. Let's say I want to generate AI faces on a small ARM
processor.

~~~
jononor
TensorFlow is working on TFLite, which runs on small ARM processors (with a
runtime). And tfcompile is their compile-to-native approach (no runtime).

------
rajangdavis
This is a pretty cool idea! Would there be any possibility of trying to
convert a model to JavaScript/Node?

I haven't looked through the source, but it looks like the generated code is
essentially the weights from a trained model transformed into a function in
the target language.
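That is roughly the shape of it. For a tree model, for instance, the generated artifact might amount to a standalone function with the fitted splits inlined as literals; the thresholds and leaf values below are invented for illustration.

```python
def predict(x):
    # A hardcoded decision tree: every threshold and leaf value here
    # is a made-up stand-in for a fitted parameter.
    if x[2] <= 2.45:
        return 0        # leaf: class 0
    if x[3] <= 1.75:
        return 1        # leaf: class 1
    return 2            # leaf: class 2
```

A transpiler would emit the same structure in JavaScript, C, or Go, since it is just nested conditionals and constants.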

~~~
FridgeSeal
If your model framework supports it, you can export it to ONNX and there’s JS
frameworks that support serving ONNX models.

------
tixocloud
Thanks for sharing. This is quite interesting as we’ve been mulling over
library support for our platform.

------
mahomedalid
What are the advantages?

~~~
leetbulb
Trained ML models that run as native code with zero dependencies.

~~~
danieldk
Note though that e.g. liblinear models are trivial to load and apply yourself:
most classifiers just compute the matrix-vector product between the weight
matrix and an instance vector and take the class with the highest activation.
That route has the benefit that you do not hardcode a model, but can easily
load new models.

Not to criticize this project. This looks nice and has many useful
applications.
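The load-and-apply route described above could be sketched in plain Python; the weight matrix here stands in for parameters parsed from a model file at runtime.

```python
# Hypothetical weights, standing in for parameters read from a
# liblinear-style model file (one row per class).
WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
BIAS = [0.0, 0.0, 0.0]

def predict_class(weights, bias, x):
    # Matrix-vector product, then argmax over the class activations
    activations = [sum(w * v for w, v in zip(row, x)) + b
                   for row, b in zip(weights, bias)]
    return max(range(len(activations)), key=activations.__getitem__)
```

Swapping in a newly trained model means replacing `WEIGHTS`/`BIAS`, with no regeneration or recompilation of code.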

