
MLflow v0.8.0 Features Improved Experiment UI and Deployment Tools - dmatrix
https://databricks.com/blog/2018/11/21/mlflow-v0-8-0-features-improved-experiment-ui-and-deployment-tools.html
======
mlthoughts2018
As an ML engineer, I’ve found MLflow to be a disastrously bad way to
look at the problem. It’s something that managers or executives buy into
without understanding it, and my team of engineers (myself included) has
hated it.

There are many feature-specific reasons, but the biggest is that
reproduction of experiments needs to be synonymous with code review and the
exact same version control system you use for other code or projects.

This way reproducibility becomes a genuine constraint on deployment:
deploying an experiment, whether training a toy model, incorporating new
data, or launching a live experiment, is conditional on reproducibility and
code review of the code, settings, runtime configs, etc., that fully embody
it.

This is much better solved with containers, so that both runtime details and
software details are located in the same branch / change set, and a full
runtime artifact like a container can be built from them.
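Concretely, the pattern the parent describes is just a Dockerfile living in the same branch as the experiment code, so one PR reviews both. A minimal sketch (file names and base image are illustrative, not from the thread):

```dockerfile
# Hypothetical experiment image: runtime details and code in one change set.
FROM python:3.7-slim

# Pin the software environment alongside the code it supports.
COPY requirements.txt /app/requirements.txt
RUN pip install --no-cache-dir -r /app/requirements.txt

# The experiment code itself, reviewed in the same PR.
COPY train.py /app/train.py
WORKDIR /app

CMD ["python", "train.py"]
```

Building this image from the experiment branch yields a single deployable artifact whose runtime and software details were both code-reviewed together.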

Then deployment is just whatever production deployment already is, usually
some CI tool that determines where a container (built from a PR of your
experiment branch, for example) is deployed to run, along with whatever
monitoring or probe-tracking tools you already use.

You can treat experiments just like any other deployable artifact, and monitor
their health or progress exactly the same.

Once you think of it this way, you realize that tools like MLflow are
_categorically_ the wrong tool for the job, almost by definition, and that
they exist mostly to foster vendor lock-in or reliance on some commercial
entity, in this case Databricks.

~~~
mateiz
Don't MLflow Projects exactly meet this use case? A project lives in a Git
repo, which can include both code and data, and specifies its software
environment (currently Conda but will eventually also support Docker):
[https://www.mlflow.org/docs/latest/projects.html](https://www.mlflow.org/docs/latest/projects.html).
You can then run it wherever you want to run code: CI system, Kubernetes,
cloud, etc. The reason MLflow doesn't _force_ people to use Projects is
because many users like to develop ML in notebooks, but we definitely expect
engineering teams to use it with Projects.
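For reference, an MLproject file is a small YAML descriptor at the repo root that declares the environment and entry points; a minimal sketch per the docs linked above (project name, parameter, and script are made-up examples):

```yaml
name: example-project            # hypothetical project name
conda_env: conda.yaml            # pins the software environment

entry_points:
  main:
    parameters:
      learning_rate: {type: float, default: 0.01}
    command: "python train.py --learning-rate {learning_rate}"
```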

~~~
mlthoughts2018
I could go on at length about why the MLflow/Databricks understanding of ML
projects is bad to a bonkers degree. I’ll give just one example, which has
mattered considerably for several production projects my team works on and
that we tried to manage in MLflow for a while.

The project was a suite of neural network models that provided face and
object detection results in a low-latency web interface where customers
manipulate photos and get automated metadata about people or objects.

In our case, to optimize for performance we need to frequently experiment with
compile-time details of the runtime environment (in our case a container)
where the application will run in production.

So the axis of our experiments was not usually anything to do with neural
network layers or data or parameters. It was different compiler optimization
flags, different precision approximations and GPU settings that needed to be
rolled into a huge number of different underlying runtime environments, and
then for each distinct runtime environment the more mundane experiments would
be carried out for layer topology, number of neurons, width of CNN filters,
etc.
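Sticking with plain containers, one way to make those compile-time details an explicit experiment axis is Docker build arguments, so each runtime variant is just a different build invocation. A hedged sketch (image tag, flag values, and the build script are illustrative, not from the thread):

```dockerfile
FROM nvidia/cuda:10.0-devel

# Each experiment variant is a different set of build arguments, e.g.
#   docker build --build-arg CFLAGS="-O3 -march=native" --build-arg PRECISION=fp16 .
ARG CFLAGS="-O2"
ARG PRECISION="fp32"

COPY inference/ /src/
# Compile the inference runtime with this variant's optimization flags and
# precision setting (build.sh is a hypothetical project build script).
RUN CFLAGS="${CFLAGS}" /src/build.sh --precision "${PRECISION}"
```

Each resulting image is then a distinct runtime environment inside which the more mundane layer/parameter experiments can run.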

We found that unless you basically build your own entire “meta” version of
MLflow that wraps around MLflow, it falls apart on use cases where custom
compile-time details of the runtime are themselves aspects of the experiment.
Not to mention that the Projects format violates good practices, like
12-factor conventions for injecting settings from the environment, which
again leads to wasted effort making special-case deployment handling for
MLflow jobs.
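The 12-factor pattern alluded to above is just reading settings from the environment at startup, so any deployer (a CI system, Kubernetes, a desktop shell) injects configuration the same way with no special packaging. A minimal Python sketch (the setting names are hypothetical):

```python
import os

def load_config(env=os.environ):
    """Read experiment settings from the environment, with defaults."""
    return {
        "batch_size": int(env.get("BATCH_SIZE", "32")),
        "learning_rate": float(env.get("LEARNING_RATE", "0.01")),
        "model_dir": env.get("MODEL_DIR", "/tmp/models"),
    }

# Any orchestrator can override a setting without special-case packaging:
config = load_config({"BATCH_SIZE": "64"})
```

Because the job reads only its environment, the same container runs unchanged under any deployment tool.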

Whatever deploys and measures your tasks should not also impose any
special-case packaging structure, which is a big reason why MLflow
_conceptually_ fails. Any attempt to make anything at all like a DSL
packaging layer for experiments that causes them to diverge from “regular
deployment of any old job” is immediately a failed idea. The only thing it’s
good for is creating unwitting vendor lock-in once you’re highly dependent on
this bespoke packaging template for Projects that makes your ML jobs
needlessly different from other deployment tasks.

------
m_ke
I'm looking into switching over to MLflow or Polyaxon for experiment
management and tracking. We currently use a custom-built Django app for
experiment tracking and run experiments by hand on desktop workstations, but
we're starting to move some of that over to GCP.

For people who have used either of the projects, what are your opinions and
are there any hidden issues that you ran into?

Ideally we'd like to have a platform that makes it easy to schedule runs on
the desktops or GCP depending on requirements and available resources. Seems
like kubernetes might be the best option for that and it doesn't look like
MLflow supports it out of the box yet.

~~~
Voloskaya
Polyaxon is really great in terms of functionality and UX. It's still pretty
early-stage, so there are some bugs, but overall I am very impressed by it.
We have been using it for a few months now with a couple of ML researchers.

------
antisocial
We are evaluating MLflow. I would like to know if there are any plans for
making this an Apache project?

