
CometML wants to do for machine learning what GitHub did for code - randomerr
https://techcrunch.com/2018/04/05/cometml-wants-to-do-for-machine-learning-what-github-did-for-code/
======
ellisv
Data science has 3 areas for "versioning":

1. code versioning
2. data versioning
3. model versioning

Code versioning is dominated by GitHub and is a fairly saturated space
(Bitbucket, GitLab). Data versioning is either not happening, or is being
done through ad hoc data pulls, database snapshots, etc.; it is not well
standardized or widely adopted. CometML is tackling model versioning.

It would be really nice to have a single solution for all of these but that is
unlikely. Hopefully new standards evolve from this.

~~~
alexilliamson
Nice breakdown. I agree that data versioning is the one area with limited
standardized options. I would add that in addition to versioning the data,
there is also the related problem of integrating the 3 areas of versioning...
tying the "data version" to the "model version" and the "code version". That
seems to me like it might be a good place to start in tackling data
versioning, or is that too trivial? Is there a product out there that already
does this?

~~~
jdoliner
Pachyderm, a project I work on, is probably as close as you'll find to
something that ties all 3 together. In my mind the major unsolved problem here
was the data versioning so that's the first thing we tackled. Code versioning
is already quite well solved so we just integrate with existing tools for
that. I'm not convinced that model versioning is actually distinct from data
versioning; models are just data, after all. So without an established system
for versioning models, the way Git + GitHub is for code, I think treating
models as data and versioning them that way is good enough for government
work.

From what I can tell, CometML isn't quite versioning models so much as
tracking versions of models. It expects models to be stored and versioned
elsewhere, but it gives you a way to get deeper insight into how those models
are performing, how they're changing, the hyper-parameters used to train them,
etc. Tracking this is also a very important problem, and one that CometML
seems to solve quite elegantly.

------
yodon
Can you talk about how CometML fits into the real world state of ML training
tracking, which is a pretty terrible Wild West of non-reproducible practices
and processes?

There have been articles and comments here on HN about the sorry state of ML
trackability, with papers being published on models whose training is not
reproducible because no one really knows how they were trained. One in particular
(I apologize for not having retained the link) described researchers starting
with partially trained models they had lying around (with undocumented and
unknown prior training applied), manually changing hyperparameters mid-training
while watching the learning progress, swapping different training sets in and
out, and so on.

From what I see, the problems in ML reproducibility aren’t in the code, they
are in the external human processes that are used to drive and train the
models (essentially bad DevOps practices more than bad dev practices). Do you
help with these kinds of real-world trackability and reproducibility scenarios?

~~~
gidim
That's a great point! Machine learning reproducibility is a huge problem, both
in academia and within companies. Comet.ml tracks every run of your script,
the hyperparameters (pulled automatically from the ML library when possible)
and your results. So in the example you gave, we would be able to track the
data scientist loading pre-trained weights, manually changing hyperparameters,
and the dataset's hash.

In such complex cases there's still a little discipline required on the user's
behalf, but overall it just means including a few calls to our Experiment object.
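
For concreteness, a minimal sketch of what those Experiment calls can look
like with the comet_ml Python client (the parameter names, values, and the
train_one_epoch function here are just illustrative):

```python
# Minimal, illustrative sketch of manual tracking with comet_ml's Experiment
# object; hyperparameter names/values are placeholders and train_one_epoch()
# is a hypothetical training function.
from comet_ml import Experiment

experiment = Experiment(api_key="YOUR_API_KEY", project_name="my-project")

# Record the hyperparameters actually used for this run, including any
# that were changed by hand mid-training.
experiment.log_parameters({"learning_rate": 1e-3, "batch_size": 64})

for epoch in range(10):
    val_loss = train_one_epoch()
    experiment.log_metric("val_loss", val_loss, step=epoch)
```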

------
tedivm
Their pricing is ridiculous. $50 per user per month for their cheapest plan
that works with private spaces, $149 per user per month for self hosted.

If you're a company with 15 people working on a single model you're going to
be paying more than a team with five people who have 20 different models they
are working on. The actual load and cost for the service is completely
detached from the price.

I was strongly considering looking at this for a project, but not at that
cost.

------
Macuyiko
Great to see ML "governance" work being done on the training part of the
pipeline. Seems like this provides Domino Data Lab-style dashboards but
without the walled-garden environment.

I've yet to see similarly great initiatives tackling the deployment part:
e.g. something you can stick on top of your model's API (or scheduled batch
prediction outputs), as well as the incoming instances, to monitor usage
patterns, population shifts over time, probability distributions, newly
appearing missing values or categorical levels, logs, etc., in order to
provide warning lights indicating that retraining might be in order.

Google's "What's your ML test score" paper provides some great insights, but I
hope someone will tackle this with a turnkey solution as well.
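
As a rough sketch of one such warning light (purely illustrative, not tied to
any particular product), a two-sample Kolmogorov-Smirnov test can flag when an
incoming feature's distribution has drifted away from the training
distribution:

```python
# Illustrative drift check (not a CometML feature): compare a production
# feature column against its training-time distribution with a two-sample
# Kolmogorov-Smirnov test and raise a warning light when they diverge.
import numpy as np
from scipy.stats import ks_2samp

def drift_warning(train_values, recent_values, alpha=0.01):
    """Return True if the recent values look like they come from a
    different distribution than the training values."""
    _, p_value = ks_2samp(train_values, recent_values)
    return p_value < alpha

rng = np.random.default_rng(0)
train_ages = rng.normal(40, 10, size=10_000)    # training-time population
recent_ages = rng.normal(48, 10, size=1_000)    # shifted production population

if drift_warning(train_ages, recent_ages):
    print("Population shift detected: retraining might be in order.")
```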

~~~
gidim
Thanks! We indeed solve a similar pain point to Domino, but unlike them we
allow you to train your models on your own infra/laptop.

As for monitoring production models, that's something we're also working on.
It was important to get the training part out first so that we can measure
those distributions changing.

------
gidim
Hi, I’m one of the founders of Comet.ml. We built comet.ml to allow machine
learning teams to automatically track their machine learning code,
experiments, hyperparameters and results. We think that reproducibility is
really important so we’re also giving free access to students, academics and
open source projects.

Feedback is welcome. Ask me anything.

~~~
ah-
Are you planning to open source it?

A lot of your competitors have, like
[http://pipeline.ai/](http://pipeline.ai/),
[https://github.com/pachyderm/pachyderm](https://github.com/pachyderm/pachyderm)
and recently
[https://github.com/polyaxon/polyaxon](https://github.com/polyaxon/polyaxon).

~~~
mmq
@ah- thanks for mentioning Polyaxon, and congrats to the CometML team for
building this nice tool. It's good to see that many projects are trying to
solve problems related to reproducibility in ML/DL; many people have had to
build an internal tool at the companies they work for to solve this issue,
and many got frustrated after joining a new team and not being able to
reproduce any results.

I would like to outline a couple of differences between CometML and Polyaxon.
As mentioned before, we are also trying to solve issues related to technical
debt in ML, but not only that: Polyaxon also tries to simplify training and
the scheduling of parallel and distributed learning. There are a couple of
other differences. I see CometML as a dashboard; Polyaxon does not have as
extensive a dashboard as CometML, but it leverages TensorBoard for most of
the visualisations, and we use the CLI or the API for programmatic access to
the platform. Most importantly, Polyaxon aims to be open source and to be
installed on premise or in the cloud. It solves code tracking with an
internal git server and a Docker registry, and since, as someone else
mentioned, the resources used to run an experiment can be an issue for future
reproducibility, Polyaxon restarts experiments with the same resources and
Dockerfiles. It also tracks hyperparameters as part of the configuration.

For hyperparameter tuning and suggestion, Polyaxon can also do hyperparameter
search based on a couple of algorithms, and the next release will include a
service similar to Vizier for suggesting more experiments, or groups of
experiments, based on a given search space.

Disclaimer: I am the author of Polyaxon

------
sbraden
Since a few people commented on the cost of using CometML (or its
competitors), I wanted to suggest an open source project (that's been around
for a while) that I found helpful for organizing and tracking ML experiments.
It has two different frontends to choose from (I like SacredBoard). If you
like open source, this might be the ML experiment tracker for you!

edit (forgot the link):
[https://github.com/IDSIA/sacred](https://github.com/IDSIA/sacred)
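
For the curious, a minimal Sacred setup looks roughly like this (assuming a
local MongoDB for the observer, which is what SacredBoard reads from; the
config values and training loop are placeholders):

```python
# Rough sketch of a Sacred experiment; the MongoObserver assumes a local
# MongoDB (which is what SacredBoard reads from) and the config values are
# placeholders standing in for real training settings.
from sacred import Experiment
from sacred.observers import MongoObserver

ex = Experiment("my_experiment")
ex.observers.append(MongoObserver())  # records config, metrics, and run info

@ex.config
def config():
    learning_rate = 0.01  # captured automatically per run
    epochs = 10

@ex.automain
def main(learning_rate, epochs):
    for epoch in range(epochs):
        loss = 1.0 / (epoch + 1)          # stand-in for a real training step
        ex.log_scalar("loss", loss, epoch)
```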

~~~
softawre
Did you forget to suggest the project? Is it "sacred"?

[https://github.com/IDSIA/sacred](https://github.com/IDSIA/sacred)

------
henripal
Cool, thanks.

For those of you who want to tinker, there's a much rougher, open source
library based on Vuejs, postgres, and Flask with some momentum on GitHub right
now, LabNotebook
[https://github.com/henripal/labnotebook](https://github.com/henripal/labnotebook)

(Disclaimer: I'm one of the authors)

------
spraak
I get so excited when I see ___ML but then I'm let down when I realize the
context is Machine Learning instead of Meta Language. Not that Machine
Learning isn't cool, but I really like ML family languages.

------
erk__
When I saw that name, the first thing I thought of was some new SML compiler,
or something like MoscowML.

------
jdpigeon
I have been anxiously waiting for something like this! Very excited to try it
out.

How does it handle data storage? Could we use CometML to store our
continuously growing set of labeled data or is there a smart way to link it to
Google Cloud Platform?

~~~
gidim
We do not store your data, as it's usually very sensitive. You can host your
data on GCloud or AWS and use Comet.ml to track where it came from and
whether it changed between experiments.

------
jorgemf
I would like to use it, but I think the price doesn't justify the tool. For a
team of 5 people, GitHub is $25 a month; you are $745 a month. I can
understand a price a bit higher than GitHub's, but not 30 times more expensive.

~~~
gidim
Thanks @jorgemf. Keep in mind that $745 also includes unlimited usage of our
hyper-parameter optimization service.

~~~
XnoiVeX
Are there any similarities or distinctions between this article (link below)
and how your system works to tune hyper-parameters?

[https://blog.coast.ai/lets-evolve-a-neural-network-with-a-ge...](https://blog.coast.ai/lets-evolve-a-neural-network-with-a-genetic-algorithm-code-included-8809bece164)

~~~
gidim
This article discusses genetic algorithms, which can be used for
hyperparameter optimization. We use another method: Bayesian (GP)
hyperparameter optimization. According to our internal benchmarks and the
academic research, Bayesian methods outperform genetic algorithms. Another
thing to keep in mind is that we automate the entire process for you; you
only need to provide a list of parameters you'd like to tune.
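
For readers who want to see the general technique, here is a minimal sketch
of GP-based (Bayesian) hyperparameter search using scikit-optimize; this
illustrates the method itself, not Comet.ml's optimizer API, and
train_and_validate is a hypothetical function returning validation loss:

```python
# Generic Bayesian (GP) hyperparameter search with scikit-optimize; this
# illustrates the technique itself, not Comet.ml's optimizer API.
# train_and_validate is a hypothetical function returning validation loss.
from skopt import gp_minimize
from skopt.space import Real, Integer

def objective(params):
    learning_rate, hidden_units = params
    return train_and_validate(learning_rate=learning_rate,
                               hidden_units=hidden_units)

search_space = [
    Real(1e-5, 1e-1, prior="log-uniform", name="learning_rate"),
    Integer(32, 512, name="hidden_units"),
]

result = gp_minimize(objective, search_space, n_calls=30, random_state=0)
print("best parameters:", result.x, "best validation loss:", result.fun)
```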

------
p1esk
How do you handle dataset/checkpoint management and versioning? Ideally with
powerful dataset filtering options. Right now we're using Excel and pandas
dataframes, but we're interested if your tool does it well.

~~~
gidim
Since we do not host your data, we cannot provide filtering on the actual
dataset content. We do allow you to track where the data came from and
whether it changed (by hash). The same goes for checkpoints: you can log
their location (S3/local path) and hash.
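
As an illustration of that workflow (the paths and project name are
hypothetical, and the hashing helper is not part of the Comet.ml client), one
way to log a dataset or checkpoint location together with its content hash:

```python
# Illustrative helper (not part of the Comet.ml client): hash a dataset or
# checkpoint file and log its location and hash so that changes between
# experiments are detectable. Paths and the project name are hypothetical.
import hashlib
from comet_ml import Experiment

def file_sha256(path, chunk_size=1 << 20):
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

experiment = Experiment(api_key="YOUR_API_KEY", project_name="my-project")
experiment.log_other("train_data_location", "s3://my-bucket/train.parquet")
experiment.log_other("train_data_sha256", file_sha256("/local/cache/train.parquet"))
experiment.log_other("checkpoint_sha256", file_sha256("/local/ckpts/model.pt"))
```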

------
XnoiVeX
CometML vs Tensorboard? Thanks in advance for your feedback.

~~~
gidim
Both CometML and Tensorboard help track metrics/weights during training in a
similar way. CometML also tracks your hyperparameters, code, and
dependencies. We also allow you to compare models, collaborate by sharing
projects and experiments, and much more. TB is an amazing tool, but it
doesn't help with reproducibility.

If you're already using Tensorboard, just throw in our one-liner,
comet_ml.Experiment(api_key="your-key"), and you'll get everything TB gives
you plus our added value.

------
nicodjimenez
That's cool! I also wrote my own service
[https://losswise.com](https://losswise.com)

~~~
henripal
Cool. I'm guessing from trying it out that you're using Highcharts. I've run
into really unpleasant memory leaks/slowness when streaming data (especially
when the existing chart already has thousands of data points). Are you seeing
something similar?

~~~
nicodjimenez
Yes, we're using Highcharts. You've had issues with it? It's not designed to
stream data extremely rapidly, but it's a great "good enough" product,
especially for something like Losswise, where the differentiation is the
overall design, architecture, and developer experience, not the prettiest
possible graphs.

~~~
henripal
Yes... For example, if I'm running three experiments at the same time and
auto-refreshing the chart every two seconds, it essentially slows the app to
a crawl after a thousand points or so. So we reverted to manual updates.

If you know of any better alternatives for data streaming, I'm curious. I
tried benchmarking a couple libs recently:
[https://github.com/henripal/ChartingLibBenchmark](https://github.com/henripal/ChartingLibBenchmark)

~~~
TorsteinHonsi
As a Highcharts developer, I had a look at your benchmarking, and have some
thoughts about optimizing for Highcharts. The first step is to turn off
animation, which helps a lot. The default Highcharts animation on addPoint is
250ms, so with a refresh rate of 100ms you will get a lot of redrawing going
on for nothing. A second thing that helps a bit is to use hard-coded axis
values so that it doesn't have to recompute them for each iteration.

With those modifications the performance is much better:
[http://jsfiddle.net/highcharts/1o5ghqc8/](http://jsfiddle.net/highcharts/1o5ghqc8/)

------
suff
Great idea if AWS had not already solved this with the release of SageMaker.

~~~
tedivm
SageMaker is awesome, but it's a very different product. There is nothing in
SageMaker that lets you track performance over time, and there's no fancy
dashboard. That being said, SageMaker's pricing is actually reasonable, and
adding a dashboard on top of it shouldn't be that complicated.

------
nightski
Sorry if this is off topic but does anyone else just instantly close the page
or spew profanity when those stupid chat bubbles pop up on a page? No I don't
want to talk to your ridiculous sales chat.

------
euler_
Is there an explore option like on GitHub?

