Show HN: Tired of checking different ML algorithms? Try: mljar-supervised

pplonski86 · on May 16, 2020

mljar-supervised is an AutoML tool that works with tabular data. It can handle binary classification, multiclass classification, and regression problems. What's good about it:

- it can produce markdown reports that you can commit to the repository, example: https://github.com/mljar/mljar-examples/tree/master/House_pr...

- it computes the Baseline, so you can check if you need ML or not

- it trains simple Decision Trees with max_depth <= 5, so you can easily visualize them with the amazing dtreeviz to better understand your data.

- it uses simple linear regression and includes its coefficients in the summary report.

- it has a vast set of algorithms: Random Forest, Extra Trees, LightGBM, XGBoost, CatBoost (Neural Networks will be added soon).

- it can do feature preprocessing, such as missing value imputation and converting categoricals. What is more, it can also preprocess target values (you won't believe how often that is needed!), for example by converting a categorical target into numeric values.

- it tunes hyper-parameters with a not-so-random-search algorithm (random search over a defined set of values) and uses hill-climbing to fine-tune the final models.

- it cares about the explainability of models: for every algorithm, feature importance is computed based on permutation. Additionally, for every algorithm, SHAP explanations are computed: feature importance, dependence plots, and decision plots (explanations can be switched off with the explain_level parameter).
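
To make the Baseline point concrete: the idea is to score a predictor that ignores the features and see whether a trained model beats it. Here is a minimal sketch in plain Python. It is not the package's own code; the helper names `rmse` and `baseline_rmse` and the toy numbers are made up for illustration, and it assumes a regression task where the baseline always predicts the training mean.

```python
import math
import statistics


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def baseline_rmse(y_train, y_test):
    """Score a feature-free baseline: always predict the training mean."""
    mean = statistics.fmean(y_train)
    return rmse(y_test, [mean] * len(y_test))


# Toy data (hypothetical). If a trained model cannot beat this number,
# ML is probably not adding much on this problem.
y_train = [10.0, 12.0, 14.0]
y_test = [11.0, 13.0]
print(baseline_rmse(y_train, y_test))
```

For classification, the analogous baseline would predict the majority class or the class priors.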
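
The tuning step (random search over a defined set of values, then hill-climbing) can be sketched roughly like this. It only illustrates the general idea and is not how the package implements it; the search space, the `toy_score` function (a stand-in for a validation score), and all function names are assumptions for the example.

```python
import random

# Hypothetical search space: every hyper-parameter has a defined, ordered set of values.
SPACE = {
    "max_depth": [2, 3, 4, 5, 6, 7, 8],
    "learning_rate": [0.01, 0.025, 0.05, 0.1, 0.2],
}


def random_search(space, score, n_trials, rng):
    """Sample n_trials configurations from the defined values and return the best one."""
    trials = [{k: rng.choice(v) for k, v in space.items()} for _ in range(n_trials)]
    return max(trials, key=score)


def neighbors(space, params):
    """Yield configurations that differ from params by one step in one hyper-parameter."""
    for name, values in space.items():
        i = values.index(params[name])
        for j in (i - 1, i + 1):
            if 0 <= j < len(values):
                yield {**params, name: values[j]}


def hill_climb(space, score, start):
    """Move to the best neighbor until no neighbor improves the score."""
    current = start
    while True:
        best = max(neighbors(space, current), key=score)
        if score(best) <= score(current):
            return current
        current = best


def toy_score(params):
    """Stand-in for a validation score (higher is better); peaks at depth 5, rate 0.05."""
    return -abs(params["max_depth"] - 5) - 10 * abs(params["learning_rate"] - 0.05)


rng = random.Random(0)
start = random_search(SPACE, toy_score, n_trials=3, rng=rng)
best = hill_climb(SPACE, toy_score, start)
print(start, "->", best)
```

In a real run `toy_score` would train a model and return its validation metric, so each call is expensive; that is why a few random trials first and local refinement afterwards is a reasonable trade-off.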
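
For readers who have not seen permutation-based feature importance before: shuffle one feature column, re-score the model, and treat the increase in error as that feature's importance. Below is a small plain-Python sketch of the technique. It is not the package's implementation; `toy_predict`, the toy data, and the parameter names `n_repeats` and `seed` are assumptions for illustration.

```python
import random


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, metric, n_repeats=5, seed=0):
    """Importance of feature j = mean increase in error after shuffling column j."""
    rng = random.Random(seed)
    base_error = metric(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j replaced by its shuffled values.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            error = metric(y, [predict(row) for row in X_perm])
            increases.append(error - base_error)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_predict(row):
    """Hypothetical 'model' that uses only feature 0 and ignores feature 1."""
    return 2.0 * row[0]


X = [[float(i), float(i % 3)] for i in range(20)]
y = [2.0 * row[0] for row in X]
imp = permutation_importance(toy_predict, X, y, mse)
print(imp)
```

The ignored feature gets an importance of exactly zero here, because shuffling it never changes the predictions.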