
So there is a difference between ML research and application. Being a practitioner doesn't require the deep math knowledge that research might. Jeremy Howard's fastai course is a great example of how someone with a solid programming background can effectively transition into being a deep learning practitioner. Given that production ML and deep learning are still the wild west, as a practitioner you can also contribute to the research around effective training, scaling, and application of these models. The math and intuition required are definitely acquirable.

I think when you shift into pure research, yes, a deep background in probability, information theory, linear algebra, and calculus is needed. But at that level, you're rarely writing code and more likely working at a theoretical level.

I recently got the assignment to "do ML" on some data. I hadn't done anything in the area before, and a couple of things surprised me:

1. Most of your time is spent transforming data. Very little is spent building models.

2. Most of the eye-grabbing stuff that makes headlines is inapplicable. My application involves decisions that are expensive and can be safety critical. The models themselves have to be simple enough to be reasoned about, or they're no use.

You might argue that this means what I'm actually doing is statistics.

The longer you work with ML, the more you discover that it's almost exclusively about handling data.

It's also one critique I have of the world of academia. When learning ML in academia, 9 times out of 10 you work with clean and neat toy datasets.

Then you go out in the "real world" and instantly get hit with reality: You're gonna spend 80% of the time fixing data.

With that said, I think that 10 years from now, ML is going to be almost exclusively SaaS with very high levels of abstraction and very little coding for the average user. Maybe some light scripting here and there, but mostly just drag-and-drop.
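To make the "fixing data" point concrete, here is a minimal sketch of the kind of cleanup that tends to eat the time. The field names (`sensor`, `temp_c`), the list of missing-value markers, and the sample rows are all hypothetical, made up for illustration, and real pipelines will need rules specific to their own data.

```python
import math
from typing import Dict, List, Optional

# Hypothetical markers for "no value"; real datasets usually have their own zoo of these.
MISSING = {"", "na", "n/a", "null", "none", "-"}


def parse_number(raw: str) -> Optional[float]:
    """Turn a messy string into a float, or None if it is missing or unparseable."""
    text = raw.strip().lower()
    if text in MISSING:
        return None
    text = text.replace(",", "")  # drop thousands separators such as "1,234.5"
    try:
        value = float(text)
    except ValueError:
        return None
    # float() accepts "nan" and "inf"; treat those as missing too.
    return value if math.isfinite(value) else None


def clean_rows(rows: List[Dict[str, str]]) -> List[Dict[str, object]]:
    """Normalize sensor names and drop rows without a usable temperature."""
    cleaned = []
    for row in rows:
        temp = parse_number(row.get("temp_c", ""))
        if temp is None:
            continue  # skip rows we cannot trust
        sensor = row.get("sensor", "").strip().upper()
        cleaned.append({"sensor": sensor, "temp_c": temp})
    return cleaned


if __name__ == "__main__":
    raw_rows = [
        {"sensor": " a1 ", "temp_c": "21.5"},
        {"sensor": "b2", "temp_c": "n/a"},
        {"sensor": "c3", "temp_c": "1,021.0"},
    ]
    print(clean_rows(raw_rows))
```

Dropping bad rows is only one possible policy; imputing or flagging them may be the better call depending on the application.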

> You might argue that this means what I'm actually doing is statistics.

What's the difference?

ML conferences have way bigger budgets.

Also, note that Greg's goal was to contribute to OpenAI's flagship project. That's a rather ambitious goal!

Also, most folks I know who are making practical deep learning contributions are doing so by combining their pre-existing domain expertise with their new deep learning skills. E.g. a journalist analyzing a large corpus of text for a story, or an oil and gas analyst building models from well plots, etc.

As a side note, I love that you highlight regex in your new NLP course. There is an inherent tension between the probabilistic nature of models and the need for deterministic outputs in most production settings. Often, if we can uncover linguistic rules or regex patterns that guarantee a minimum level of precision (or as our VP puts it, "don't look stupid"), we'll eschew the model in the short term or use the model to augment the rules.
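A rough sketch of what "rules first, model to augment" can look like, under my own assumptions: the patterns, labels, confidence threshold, and the `toy_model` stand-in are all hypothetical, not anything from the course or a real system.

```python
import re
from typing import Callable, Optional, Tuple

# Hypothetical high-precision rules: if one of these fires, trust it over the model.
RULES = [
    (re.compile(r"\b(refund|money back)\b", re.IGNORECASE), "refund_request"),
    (
        re.compile(r"\bcancel(?:l?ed|l?ing)?\b.*\b(subscription|account)\b", re.IGNORECASE),
        "cancellation",
    ),
]

# A model here is any callable returning (label, confidence in [0, 1]).
Model = Callable[[str], Tuple[str, float]]


def classify(text: str, model: Model, threshold: float = 0.9) -> Tuple[Optional[str], str]:
    """Return (label, source), where source is 'rule', 'model', or 'abstain'."""
    # 1. Deterministic rules take priority.
    for pattern, label in RULES:
        if pattern.search(text):
            return label, "rule"
    # 2. Fall back to the model, but only when it is confident enough.
    label, confidence = model(text)
    if confidence >= threshold:
        return label, "model"
    # 3. Otherwise abstain rather than risk a bad answer.
    return None, "abstain"


def toy_model(text: str) -> Tuple[str, float]:
    """Stand-in for a real classifier: more 'confident' on longer inputs."""
    return ("other", 0.95) if len(text) > 20 else ("other", 0.5)


if __name__ == "__main__":
    for message in ["I want my money back", "Please cancel my subscription", "hi"]:
        print(message, "->", classify(message, toy_model))
```

The abstain branch is the "don't look stupid" part: anything the rules miss and the model is unsure about goes to a human or a default path instead of getting a guess.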

Also, I really appreciated that one of the training goals for ULMFiT was to be trainable on a single GPU. With these large-capacity models, training is getting crazy expensive and out of hand. Any chance that your future work will still keep the single-GPU training goal?
