
Show HN: Valohai – GitHub of machine learning - Tailgunneri
https://valohai.com/
======
Tailgunneri
Hi HN!

We’re Eero, Otso, Aarni and Ruksi from Finland! We are the founders of
Valohai, a machine learning infrastructure-as-a-service startup. We support
existing frameworks like TensorFlow, Keras, Torch and Caffe – actually,
anything you can package into a Docker image. Our platform helps with the
process of training machine learning models at scale, with a focus on
collaboration, real-time results, record keeping and repeatability. We are
doing to machine learning what continuous integration and version control have
done to programming. We just went into open beta. It’s still early, so all
feedback is super welcome!
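
To make the “anything you can package into a Docker image” point concrete, a training environment can be as simple as a Dockerfile along these lines (purely illustrative; the image tag and script name are made up):

```dockerfile
# Illustrative only: wrap a training script into an image the platform can run.
# Base image with the framework preinstalled (tag is hypothetical).
FROM tensorflow/tensorflow:1.0.0
# Copy your training script into the image.
COPY train.py /app/train.py
WORKDIR /app
# Default command when the container starts.
CMD ["python", "train.py"]
```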

~~~
jtraffic
Your pricing is close to 1/2 of AWS. Assuming you scale, would you stay
profitable? How?

I think the front page lacks clarity. "GitHub for Machine Learning" suggests a
framework for hosting and cloning machine learning models, datasets, and
training scripts. I may be just stupid, but it didn't come through as clearly
what you actually offer. I see the term "experiments" up front and I wonder
what you mean exactly. Then in collaboration I see "projects." What does a
typical project consist of?

Don't get me wrong, the idea seems to have huge potential. My advice is to
clean up the front page and develop a very clear explanation of what a typical
person would use this for. You might want to take all of the features that say
"Coming Soon" and move them into a separate page to avoid overwhelming users.

Perhaps try to differentiate this against Floyd in some way.

~~~
Tailgunneri
Hi jtraffic!

And thanks for the feedback! Here are some answers :)

> Your pricing is close to 1/2 of AWS. Assuming you scale, would you stay
> profitable? How?

Exact pricing is currently a work in progress, as it is for most startups, but
we do feel confident that we can make this model work. First of all, current
Amazon pricing is very high. Once we scale, we can cut costs way below Amazon's
normal pricing by reserving instances for longer periods. Then it becomes a
problem of keeping demand stable versus optimizing how many instances we
reserve. Of course there are other providers too, and prices are dropping as we
speak. We just don't think that keeping this level of pricing will be an issue.

Computational resources are also not the only revenue stream. In fact, our
first pilot customers were enterprises with their own hardware, where we plan
on having a per-seat licensing fee, much like GitHub Enterprise works.

> I think the front page lacks clarity. "GitHub for Machine Learning" suggests
> a framework for hosting and cloning machine learning models, datasets, and
> training scripts. I may be just stupid, but it didn't come through as
> clearly what you actually offer. I see the term "experiments" up front and I
> wonder what you mean exactly. Then in collaboration I see "projects." What
> does a typical project consist of?

Thanks for the feedback. Getting feedback on the website was one of the major
reasons for posting here :) But yup! You got our vision! So in that sense our
website copy communicates our intention. Then again, we don’t have as many of
those collaborative features as we’d like yet, so this critique is well
deserved. I’ll try to explain how it works currently and what is on the
roadmap going forward.

Currently you can fork any Valohai-enabled ML project on GitHub and start from
there, but it doesn’t yet copy previous executions or outputs (such as model
weights and biases). It only “forks” how the environment is set up, the
training scripts and how they are run, so you don’t need to worry about any of
that.

Features like making projects public, more full-fledged forking of projects
(executions, models, datasets), commenting on entities, watching an aspect of
a project, starring projects, pull requests (using VCS integrations) and other
social features are on the roadmap.

A project is basically the same kind of entity as a “repository” on GitHub:

\- A single project is meant to be a “namespace” for solving and collaborating
on a specific ML problem. For example, you might be on a machine learning team
in an organization where you have multiple problems to solve but don’t want
to mix the executions between the projects.

\- Projects have executions, which are like “runs” in some other systems I’ve
used.

\- Each project has a link to one git repository+branch that contains your
training scripts and a YAML file that defines how different kinds of
“experiments” are run. A single project may be able to link multiple branches
in the future, even multiple git repositories if we find a use case for it.

\- The term “experiment” itself has no special meaning in our system at the
moment. It's an umbrella term for "anything that you might need computing
for", such as training or feature extraction.

\- A task is a collection of executions that are meant to tackle a
sub-problem, such as applying grid search hyperparameter optimization to a
specific training setup to find the best network.
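
The grid-search case is easy to picture in code: a task enumerates every combination of hyperparameter values and launches one execution per combination. A minimal sketch in Python (the parameter names and values are hypothetical, not from our system):

```python
from itertools import product

# Hypothetical hyperparameter grid; names and values are illustrative.
grid = {
    "learning_rate": [0.1, 0.01, 0.001],
    "batch_size": [32, 64],
}

# One combination per execution: 3 * 2 = 6 in total.
combinations = [dict(zip(grid, values)) for values in product(*grid.values())]

for params in combinations:
    # In Valohai terms, each combination would become one execution in a task.
    print(params)
```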

Here are some examples of how a project is defined on the GitHub side; the
Valohai integration lives in the valohai.yaml file:

\- [https://github.com/valohai/tensorflow-example](https://github.com/valohai/tensorflow-example)

\- [https://github.com/valohai/darknet-example](https://github.com/valohai/darknet-example)

\- [https://github.com/valohai/keras-example](https://github.com/valohai/keras-example)

If you use any of those repositories as a source, you can run those pre-
defined experiments with a click or two.
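
To give a rough idea, a single training step in such a file might look something like this sketch (illustrative only — the exact valohai.yaml schema may differ, so check the example repositories for the real thing):

```yaml
# Hypothetical sketch of a valohai.yaml training step; names are made up.
- step:
    name: train-model
    # Docker image the step runs in
    image: tensorflow/tensorflow:1.0.0
    # Training script from the linked git repository
    command: python train.py {parameters}
    parameters:
      - name: learning-rate
        type: float
        default: 0.001
```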

We also have other things right around the corner, such as a command-line
client, which should be released this week or early next week.

> Don't get me wrong, the idea seems to have huge potential. My advice is to
> clean up the front page and develop a very clear explanation of what a
> typical person would use this for. You might want to take all of the
> features that say "Coming Soon" and move them into a separate page to avoid
> overwhelming users.

Haha, don’t worry; I’ve been lurking on HN long enough to know that there is a
difference between constructive and outright negative feedback.

Good advice on the clarity. We have been working on this for some months now,
and not everything that is clear to us is clear to a first-time visitor to our
website. More detailed use cases or user stories do make a lot of sense. Maybe
we'll even add an example to the pricing page comparing what a typical machine
learning training project might cost per month on AWS versus on our
infrastructure.

Having a separate coming-soon page might be an option; we’ll have to try it
out.

> Perhaps try to differentiate this against Floyd in some way.

So far Floyd seems to concentrate on being a computational platform for
individual users by “eliminating engineering bottlenecks in deep learning”,
while we are focusing on creating collaborative workflows and supporting
private hardware. But we do have a lot of similarities. We should highlight
these differences, thanks for the tip! If FloydHub keeps focusing on the
engineering side of machine learning, I wouldn’t be surprised to see FloydHub
as one of our future backends after AWS and Google Compute Engine.

~~~
jtraffic
Cool. Good answer. Just FYI, the first link you provided, about TensorFlow, is
broken.

~~~
Tailgunneri
Thanks, fixed!

