
Ask HN: Things You Wish You Knew Before Getting into Machine Learning - onuralp
Especially for those who switched careers to become a machine learning practitioner, data scientist, data engineer, etc.
======
deepsun
From my own experience (switched to ML 1.5 years ago):

1\. That software engineering skills are way more important than ML skills.

2\. That you'd be spending more time on making presentations than doing ML
(and it makes sense, it's very important to present statistics properly).

3\. That most problems don't need good ML models. Something cheap and easy is
often good enough. What does need to be good is the data pipelines around them
(see 1).
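
As a rough illustration of point 3, here is a minimal sketch of a "cheap and
easy" baseline: always predict the most common label. The toy labels and the
function names are hypothetical, made up purely for this example. A fancier
model is only worth its upkeep if it clearly beats a number like this one.

```python
from collections import Counter


def fit_majority_baseline(labels):
    """Return the most frequent label seen in the training data."""
    return Counter(labels).most_common(1)[0][0]


def accuracy(predictions, targets):
    """Fraction of predictions that match the targets."""
    correct = sum(p == t for p, t in zip(predictions, targets))
    return correct / len(targets)


# Hypothetical toy labels (illustration only, not real data).
train_labels = ["ok", "ok", "ok", "spam", "ok", "spam"]
test_labels = ["ok", "spam", "ok", "ok"]

majority = fit_majority_baseline(train_labels)
baseline_acc = accuracy([majority] * len(test_labels), test_labels)
print(f"majority label: {majority}, baseline accuracy: {baseline_acc:.2f}")
```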

In my case, I learned enough ML to feel "senior" compared to other people in
the company and online in less than a year. The same path to Senior SWE took
me much longer (way larger mandatory knowledge base, probably because ML is a
young field). So I'd say ML is definitely easier.

~~~
luhego
Those are great points. Could you comment on what resources you used to
learn ML?

~~~
deepsun
Mostly Kaggle -- reading others' solutions and notebooks and integrating them
into my code.

Also there's a great Coursera course on ML for Kaggle:
[https://www.coursera.org/learn/competitive-data-science](https://www.coursera.org/learn/competitive-data-science)

I think once you finish it, you're better than 60% of Silicon Valley data
scientists, no kidding.

------
AznHisoka
It's starting to become a cliche (which might be a good thing), but building
datasets, cleaning that data and validating that data is the hard part, by
far. The actual machine learning is quickly becoming a commodity.

~~~
natalyarostova
Our code base ratio of data cleaning/APIs/pre-processing : API calls to ML
packages is like 98:1
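
To give a flavor of what the cleaning and validation side of that ratio can
look like, here is a minimal sketch. The record fields (`age`, `email`), the
validation rules, and the sample rows are all hypothetical, chosen only for
illustration; real pipelines have many more rules like these.

```python
def clean_record(raw):
    """Normalize one raw record; return None if it fails validation."""
    try:
        age = int(str(raw.get("age", "")).strip())
    except ValueError:
        return None  # missing or non-numeric age
    if not 0 <= age <= 120:
        return None  # implausible value
    email = str(raw.get("email", "")).strip().lower()
    if "@" not in email:
        return None  # crude sanity check, not real email validation
    return {"age": age, "email": email}


# Hypothetical raw rows (illustration only, not real data).
raw_rows = [
    {"age": " 34 ", "email": "A@Example.com"},
    {"age": "n/a", "email": "b@example.com"},
    {"age": "250", "email": "c@example.com"},
    {"age": "41", "email": "not-an-email"},
]

cleaned = [r for r in (clean_record(row) for row in raw_rows) if r is not None]
print(f"kept {len(cleaned)} of {len(raw_rows)} rows")
```

Only after steps like these does the single call into an ML package happen.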

------
natalyarostova
Except in rare cases, or specific teams tackling problems that are both
exceptionally hard and exceptionally well-suited for deep learning, I would
take someone with mid-level stats skills and advanced Python/pandas coding
ability over a PhD in ML.

------
ChrisAntaki
Sometimes people you work with, like team members or PMs, will really want to
understand ML and be involved but will have a hard time grasping the concepts
being discussed. I found it really helped to draw out and illustrate the
different components and data flows!

------
altairiumblue
The best places to start for a complete beginner are Precalculus and
Hello World in C.

I'm serious about this. Ultimately the job is just software development plus
statistics.

If you are a software developer, work on your statistics.

If you're a statistician, learn to program.

Most people will have gaps in both of these sub-fields.

Do not, under any circumstances, take any online courses that include the
phrases "data science" or "machine learning" in the title.

------
jriot
It isn't as interesting as we are led to believe.

------
tixocloud
Your ability to develop an amazing ML model is limited by your organization's
ability to collect and clean data. However, the great news is that most
problems do not need an incredible model. Small uplifts in performance could
still result in substantial outcomes.

In industry, you also need to balance the amount of time and effort it takes
to build your model against the incremental benefit.

------
Asafp
In industry it's much more important to build or obtain a great data set for
training and testing than to build the perfect model.

