
Show HN: Easily-configurable machine learning dataset pipelines - mason_jake
https://github.com/jake-mason/ml-pipeline
======
mason_jake
A few years ago, I grew tired of working with dataset pre-processing
scripts/"libraries" for machine learning and began to create my own
"pipelines" of sorts. Last year, I stumbled upon `sklearn.compose`, a
relatively new module within the scikit-learn ecosystem.

I have had a lot of success with this module since then, and wanted to share a
tutorial I put together. It explores the idea of managing your machine
learning dataset creation steps entirely through a configuration.
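
To give a rough idea of what "managed via a configuration" can look like, here is a minimal sketch built on `sklearn.compose.ColumnTransformer`. The config layout, the `REGISTRY` lookup, the `build_transformer` helper, and the column names are all made up for illustration; they are not necessarily what the linked repository uses.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical config format: one entry per group of columns, each with an
# ordered list of (step name, transformer name, keyword arguments).
CONFIG = {
    "numeric": {
        "columns": ["age", "income"],
        "steps": [
            ("impute", "SimpleImputer", {"strategy": "median"}),
            ("scale", "StandardScaler", {}),
        ],
    },
    "categorical": {
        "columns": ["city"],
        "steps": [
            ("onehot", "OneHotEncoder", {"handle_unknown": "ignore"}),
        ],
    },
}

# Maps the names used in the config to real scikit-learn classes.
REGISTRY = {
    "SimpleImputer": SimpleImputer,
    "StandardScaler": StandardScaler,
    "OneHotEncoder": OneHotEncoder,
}


def build_transformer(config):
    """Turn the config dict into a ColumnTransformer (illustrative helper)."""
    transformers = []
    for group_name, spec in config.items():
        steps = [
            (step_name, REGISTRY[cls_name](**params))
            for step_name, cls_name, params in spec["steps"]
        ]
        transformers.append((group_name, Pipeline(steps), spec["columns"]))
    # sparse_threshold=0 forces a dense array so the result is easy to inspect.
    return ColumnTransformer(transformers, sparse_threshold=0)


df = pd.DataFrame(
    {
        "age": [25, None, 40],
        "income": [50000.0, 62000.0, None],
        "city": ["NYC", "LA", "NYC"],
    }
)

features = build_transformer(CONFIG).fit_transform(df)
print(features.shape)
```

With this kind of setup, changing how a column is processed means editing the config rather than the code, and the config could just as well live in a JSON or YAML file.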

