Launch HN: FloydHub (YC W17) – Heroku for Deep Learning
Hi HN! I’m Sai, one of the cofounders at FloydHub (https://www.floydhub.com). We're building FloydHub to be a “Heroku for deep learning”. We are in the current batch (W17) at YC, but I still like to think of FloydHub as an HN-incubated startup.

10 months ago, I was working at Microsoft and doing a lot of deep learning (DL) there. While the DL community is terrific, I was often frustrated by how difficult it was to get started and build upon others’ work. For example, running any popular GitHub project often started with an exercise in dependency hell. As I untangled these for myself, I wrote up some notes on setting up popular DL frameworks, which, unexpectedly, started trending on HN after someone posted them there (https://news.ycombinator.com/item?id=11697571). That's when I realized that engineering was a huge bottleneck in deep learning and a problem worth solving after all.

I’ve since quit my job and have been working full-time for the last 9 months on building FloydHub to make deep learning easier. Our goal is to let data scientists focus on the science, while we handle the engineering grunt work (provisioning and scaling infra, running reproducible experiments, enabling sharing and collaboration, supporting DL frameworks with zero setup, shipping trained models to production easily, etc.). Lots of interesting challenges here, and I'm happy to talk about them!

We have a lot of work ahead, but we’re excited to share with you what we have so far! Looking forward to your feedback.






Skimming through the site, this looks very polished for a startup offering.

Thanks! We've been iterating on it for a while. My sister is a UX designer and helped a ton. I still think there's room for improvement, but it's great to hear a positive comment about it!

Hi! I'm Naren, the other co-founder of FloydHub. I'll be happy to answer any questions and really appreciate any feedback you can provide. Thanks!

The first link to the pricing page on the FAQ points to localhost.

Everything else looks really slick!

You said you worked at MS, so you're presumably pretty familiar with Azure's offering. I'm no expert (at all) but I played with it briefly and it was all pretty slick and easy to get up and running. How does FloydHub compare to that?

I assume you're talking about AzureML Studio. It's a pretty neat UI-centric tool for building machine learning workflows! It's great if you're starting out with ML, but offers little in terms of customizability. For example, it only supports R and Python, has no GPUs, no CLI, no container support for managing reproducible environments, etc. I think these are kind of deal breakers for doing deep learning :)

FWIW, I worked as a data scientist in Bing for 6 years and haven't seen/heard any other data scientist use it internally. We ended up building our own GPU clusters and going through the regular drill.

Curious how Microsoft's approach compares to what you have heard about Google's approach to ML as a service. My impression is that it looks kinda/sorta like its internal approach/infrastructure...if you squint.

Which also makes me curious about FloydHub's infrastructure. Any gory details?

Floyd’s infrastructure runs entirely on Docker. That makes it backend agnostic (our cloud offering currently runs on AWS). FloydHub uses nvidia-docker for the deep learning jobs that require GPUs. We also version the entire pipeline (code, data, params and environment) for exact reproducibility.
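To make the "version the entire pipeline" idea concrete, here is a minimal sketch of one way to fingerprint a run from its four inputs. This is only an illustration and not FloydHub's actual implementation: `fingerprint_run` and its arguments are hypothetical names, and the environment is assumed to be identified by a string such as a Docker image tag or digest.

```python
import hashlib
import json


def fingerprint_run(code: bytes, data: bytes, params: dict, environment: str) -> str:
    """Return a deterministic SHA-256 fingerprint for one experiment run.

    Hypothetical helper: two runs share a fingerprint only if their code,
    data, params and environment identifier are all identical.
    """
    digest = hashlib.sha256()
    # Serialize params with sorted keys so dict ordering does not change the hash.
    params_blob = json.dumps(params, sort_keys=True).encode("utf-8")
    components = (
        ("code", code),
        ("data", data),
        ("params", params_blob),
        ("environment", environment.encode("utf-8")),
    )
    for label, blob in components:
        # Hash each component on its own, then fold in the label and the
        # fixed-length component hash, so bytes cannot shift between
        # components without changing the final fingerprint.
        digest.update(label.encode("utf-8"))
        digest.update(hashlib.sha256(blob).digest())
    return digest.hexdigest()
```

Changing any one of the four inputs yields a different fingerprint, so a stored fingerprint can be used to check whether a rerun really used the same code, data, params and environment.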

GPU instances are really expensive. One of our biggest challenges at the moment is reducing this cost, e.g. by using Spot Instances and Spot Blocks. There are still some challenges to be solved there.

We also want Floyd to be an end-to-end solution for building, training and deploying deep learning models. In that vein, we are also investing in adding support for TensorFlow Serving, but it has been a rough ride so far. Getting a generic solution that can host any TensorFlow model has not been straightforward.

The ML-as-a-service offerings from many companies (Microsoft Cognitive Services, Google Cloud Prediction, IBM Watson, etc.) are fairly similar. They're great out-of-the-box for some domains, say English speech recognition. For others (text/image), they’re fairly easy to get started with (don’t need much training data, no managing infra, etc.). However, they are mostly black boxes and set a fairly low bar in terms of quality. Anyone doing serious AI will hit the limits of what they offer fairly quickly.

The DL community is awesome in its openness and contributions. Our goal with FloydHub, in contrast to the ML APIs, is to provide the tools for data scientists to effectively leverage this. We want to solve the engineering hurdles that get in the way of doing some cool science.

