
Show HN: Automated Machine Learning with Explanations and Markdown Reports - pplonski86
https://github.com/mljar/mljar-supervised
======
pplonski86
mljar-supervised is an AutoML tool for tabular data. It handles binary
classification, multiclass classification, and regression problems.

What's good about it:

\- it can produce markdown reports which you can commit to a repository,
example:
[https://github.com/mljar/mljar-examples/tree/master/House_pr...](https://github.com/mljar/mljar-examples/tree/master/House_price_regression)

\- it computes a baseline, so you can check whether you need ML at all

\- it trains simple Decision Trees with max_depth <= 5, so you can easily
visualize them with the amazing dtreeviz to better understand your data.

\- it uses simple linear regression and includes its coefficients in the
summary report

\- it has a vast set of algorithms: Random Forest, Extra Trees, LightGBM,
XGBoost, CatBoost (Neural Networks will be added soon).

\- it can do feature preprocessing, such as missing-value imputation and
converting categoricals. What is more, it can also preprocess target values
(you won't believe how often that is needed!), for example by converting a
categorical target into a numeric one.

\- it tunes hyper-parameters with a not-so-random-search algorithm (random
search over a defined set of values) and uses hill-climbing to fine-tune the
final models.

\- it cares about the explainability of models: for every algorithm, feature
importance is computed based on permutation. Additionally, for every
algorithm, SHAP explanations are computed: feature importance, dependence
plots, and decision plots (explanations can be switched off with the
explain_level parameter).
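
To make the tuning bullet more concrete, here is a minimal sketch of the idea in plain Python: random search restricted to a defined set of values, followed by hill-climbing around the best candidate. This is not the package's actual code. The search space, `toy_score`, `random_search`, `neighbors`, and `hill_climb` are all hypothetical names made up for illustration, and the score is a stand-in for a real validation loss.

```python
import random

# Hypothetical search space: each hyper-parameter has a fixed list of allowed values.
SEARCH_SPACE = {
    "max_depth": [2, 3, 4, 5],
    "learning_rate": [0.01, 0.05, 0.1, 0.2],
    "n_estimators": [50, 100, 200, 400],
}


def toy_score(params):
    """Stand-in for a validation loss (lower is better); no real model is trained."""
    return (
        abs(params["max_depth"] - 4)
        + abs(params["learning_rate"] - 0.1) * 10
        + abs(params["n_estimators"] - 200) / 100
    )


def random_search(space, score, n_trials, rng):
    """Draw every value from its predefined list and keep the best candidate."""
    candidates = [
        {name: rng.choice(values) for name, values in space.items()}
        for _ in range(n_trials)
    ]
    return min(candidates, key=score)


def neighbors(space, params):
    """Yield configurations that differ by one step in exactly one parameter."""
    for name, values in space.items():
        idx = values.index(params[name])
        for j in (idx - 1, idx + 1):
            if 0 <= j < len(values):
                yield dict(params, **{name: values[j]})


def hill_climb(space, score, start):
    """Repeatedly move to the best neighbor until no neighbor improves the score."""
    best = dict(start)
    while True:
        challenger = min(neighbors(space, best), key=score)
        if score(challenger) >= score(best):
            return best
        best = challenger


if __name__ == "__main__":
    rng = random.Random(0)
    start = random_search(SEARCH_SPACE, toy_score, n_trials=5, rng=rng)
    tuned = hill_climb(SEARCH_SPACE, toy_score, start)
    print("random search:", start, toy_score(start))
    print("hill climbing:", tuned, toy_score(tuned))
```

In a real setting the score function would train and validate a model, so each call is expensive. That is why a cheap coarse search followed by a local refinement can be attractive.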

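For readers unfamiliar with permutation-based feature importance, the sketch below shows the general technique in plain Python: shuffle one column at a time and measure how much the error grows. It is an illustration under simple assumptions (mean squared error, a list-of-lists dataset, a toy `predict` function), not the package's implementation, and the function names are hypothetical.

```python
import random


def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Return, per feature, the mean increase in MSE after shuffling that column."""
    rng = random.Random(seed)
    baseline = mean_squared_error(y, [predict(row) for row in X])
    importances = []
    for col in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle a copy of one column; all other columns stay untouched.
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
            error = mean_squared_error(y, [predict(row) for row in X_perm])
            increases.append(error - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_predict(row):
    """Toy model (an assumption for this demo): uses feature 0, ignores feature 1."""
    return 3.0 * row[0]


if __name__ == "__main__":
    X = [[float(i), float((i * 7) % 10)] for i in range(10)]
    y = [3.0 * row[0] for row in X]
    print(permutation_importance(toy_predict, X, y))
```

A feature the model never reads gets an importance of zero, because shuffling it leaves every prediction unchanged. A feature the model relies on gets a positive value.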
