
The Math Required for Machine Learning - hsikka
https://medium.com/technomancy/the-math-required-for-machine-learning-af0d90db3903
======
samfisher83
I am not sure how much knowing the underlying math helps you when you are
using neural nets. Yes, we are doing matrix multiplies and using gradient
descent. That is stuff you probably learned in high school. Usually you just
learn partial derivatives in two dimensions instead of 10s or 100s of
dimensions.

Even if you understand the math, you are training tens of thousands of
parameters inside your model. You are also passing them through nonlinear
elements like sigmoids or ReLUs. I am not sure what insight knowing calculus
or linear algebra will provide you in building the model unless you can
process nonlinear elements in more than 3 dimensions in your head. I am sure
there are people that can do it, but how many can do it?

~~~
screye
Actually very much. Think of it from an industry perspective.

If you are a software engineer who needs to use a network as a plug-in
module, then you may not need much understanding of linear algebra. But then
you also don't need to know much ML either. It is simply a software
engineering job.

However, things change for anyone who gets their hands dirty or actually
builds networks (or even just implements known networks) from scratch.

It is common to find that an error in transferring a known model to your
problem came from key mistakes in how the question was posed mathematically.
I am currently implementing neural nets in my job, and have made many errors
because I misunderstood the mathematical operation of a layer.

PGMs are almost entirely math, and so are embeddings and matrix factorization
models. VAEs and GANs also need a solid grasp of the math behind them. Want to
touch your loss function? -> Math. Want to change optimization methods? ->
Math.
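
As a minimal sketch of why touching the loss function is a math question (the
data and helper names below are made up for illustration): for a model that
predicts a single constant, mean squared error is minimized by the mean of the
targets, while mean absolute error is minimized by the median. Swapping the
loss changes what the model ends up estimating.

```python
from statistics import mean, median

def mse(c, ys):
    """Mean squared error of the constant prediction c."""
    return sum((y - c) ** 2 for y in ys) / len(ys)

def mae(c, ys):
    """Mean absolute error of the constant prediction c."""
    return sum(abs(y - c) for y in ys) / len(ys)

# Hypothetical targets with one outlier.
ys = [1.0, 2.0, 3.0, 4.0, 100.0]

# Brute-force search over constant predictions 0.0, 0.1, ..., 100.0.
candidates = [i / 10 for i in range(1001)]
best_for_mse = min(candidates, key=lambda c: mse(c, ys))
best_for_mae = min(candidates, key=lambda c: mae(c, ys))

print(best_for_mse, mean(ys))    # the MSE minimizer matches the mean
print(best_for_mae, median(ys))  # the MAE minimizer matches the median
```

The outlier drags the MSE-optimal prediction far away from most of the data,
while the MAE-optimal prediction stays with the bulk of it.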

The visual designs of almost all neural nets are grounded in math, which
makes it vital to be good at math if one wishes to gain anything beyond a
surface-level understanding of them.

~~~
samfisher83
I am not arguing that there is no math involved, but gradient descent is basic
calculus, and so are MSE, MAE, etc. This is something you take in high school.
This stuff is pretty basic. Even in gradient descent you might have multiple
minima. The minimum you get will depend on which point you start at. Imagine a
3D surface: you might have different valleys.
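
A minimal sketch of the "different valleys" point, using a one-dimensional
curve instead of a surface. The function, learning rate, and step count are
arbitrary choices for illustration, not anything from the article.

```python
def f(x):
    """A curve with two valleys (two local minima)."""
    return x**4 - 3 * x**2 + x

def df(x):
    """Derivative of f, worked out by hand."""
    return 4 * x**3 - 6 * x + 1

def gradient_descent(x0, lr=0.01, steps=2000):
    """Plain gradient descent: repeatedly step against the derivative."""
    x = x0
    for _ in range(steps):
        x -= lr * df(x)
    return x

left = gradient_descent(-2.0)   # start on the left slope
right = gradient_descent(2.0)   # start on the right slope

print(left, f(left))
print(right, f(right))
```

The two runs settle in different valleys, and only one of them is the global
minimum, even though the update rule is identical in both cases.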

Let me quote François Chollet: "Neural networks" are a sad misnomer. They're
neither neural nor even networks. They're chains of differentiable,
parameterized geometric functions, trained with gradient descent (with
gradients obtained via the chain rule). A small set of highschool-level ideas
put together.
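
To make "gradients obtained via the chain rule" concrete, here is a small
sketch of a two-function chain with two parameters. The model, the names `w1`
and `w2`, and the input values are assumptions chosen for illustration. The
hand-derived gradients are checked against finite differences.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w1, w2, x, t):
    h = sigmoid(w1 * x)   # first function in the chain
    y = w2 * h            # second function in the chain
    return (y - t) ** 2   # squared error against the target t

def grads(w1, w2, x, t):
    """Gradients of the loss w.r.t. w1 and w2 via the chain rule."""
    h = sigmoid(w1 * x)
    y = w2 * h
    dL_dy = 2.0 * (y - t)                # d(loss)/dy
    dL_dw2 = dL_dy * h                   # dy/dw2 = h
    dL_dh = dL_dy * w2                   # dy/dh = w2
    dL_dw1 = dL_dh * h * (1.0 - h) * x   # sigmoid'(z) = h * (1 - h), dz/dw1 = x
    return dL_dw1, dL_dw2

def numeric_grads(w1, w2, x, t, eps=1e-6):
    """Central finite differences, used only as a sanity check."""
    g1 = (loss(w1 + eps, w2, x, t) - loss(w1 - eps, w2, x, t)) / (2 * eps)
    g2 = (loss(w1, w2 + eps, x, t) - loss(w1, w2 - eps, x, t)) / (2 * eps)
    return g1, g2

print(grads(0.5, -1.5, 2.0, 1.0))
print(numeric_grads(0.5, -1.5, 2.0, 1.0))
```

The two printed pairs should agree to several decimal places. Frameworks
automate exactly this bookkeeping across many more parameters.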

~~~
stevofolife
You need a great knowledge of math to internalize the complexity of neural
nets. One particular area of mathematics that is not in the set of highschool-
level ideas is topology.

------
gaius
I have been getting more into machine/deep learning et al recently and have
been pleasantly surprised that I just need to review the calc and linalg I
learnt 25 years ago; it all still works the same! So refreshing after the
short shelf/half life of many tech skills. These are the skills that matter
(and that if I’d been entirely self-taught I never would have learnt).

~~~
hsikka
Well said! It's pretty funny, back when I was studying neuroscience I used to
constantly loathe the fact that my undergrad made me study linear algebra,
calc, etc. Now, a few years later, I'm neck deep in all this theoretical ML
and boy am I thankful for it haha

~~~
whatch
As someone who took calculus, linear algebra and a couple of CS-related
courses in my economics program in university 6 years ago, I wish I had had
someone around who could explain to me (better) why and how all those courses
could be helpful to me. ML and AI were not big in those days (at least I
didn't hear as much about them as I do now). Now, looking for my first Python
web developer job and dreaming about switching to something more ML-related
eventually, I am thankful I didn't completely fail those courses, but feel
extremely sorry I didn't study them properly.

~~~
gaius
You can get a fair way these days without knowing any maths at all; the tools
and libraries are very good and there are a ton of tutorials and sample code
out there. But what a grasp of the maths really gives you is intuition, which
you will really, really need as soon as you go off the beaten track. Without
it you will just get bogged down in hyperparameter sweeps and other tasks that
just don’t scale.

------
urmish
Aren't statistics and optimization techniques required too? There was a
similar post here a couple of weeks ago that covered quite a bit more. Here's
the
link: [https://blog.ycombinator.com/learning-math-for-machine-
learn...](https://blog.ycombinator.com/learning-math-for-machine-learning/)

~~~
ChristianGeek
The author links to a statistics course.

~~~
urmish
I checked again. Maybe I am missing something, but I only found a probability
course in OP's blog post. Statistics is quite different.

------
ivan_ah
There was an interesting HN thread about this topic two weeks ago:
[https://news.ycombinator.com/item?id=17664084](https://news.ycombinator.com/item?id=17664084)

Check ^ for other useful recommendations and links.

------
hodder
It is concerning to me how many people are deploying these techniques without
first understanding the principles and limitations of multivariate regression
and time series analysis.

I'm not sure why people are diving into modern techniques without knowing how
to properly specify a simple (but powerful) regression model.

~~~
mindcrime
One might reply by saying:

 _It is concerning to me that people are driving cars without first
understanding the principles of mechanical engineering, thermodynamics, and
fluid dynamics. I'm not sure why people are diving into modern cars without
knowing how to properly build a simple (but powerful) steam engine._

The point being, there's a distinction between "machine learning research" and
"applied machine learning". Of course those points are on a continuum, not
separated by a bright line. But the point is, there are different roles and
those different roles have different goals and requirements.

Of course knowing more math and theoretical foundations enables you to do more
in some senses... just like a basic knowledge of fluid dynamics would be
useful if you want to "port and polish" the cylinder heads in your car. But in
reality, a minuscule portion of the population of car owners will ever want to
do this. OTOH, it is essential if you're the person designing the car engine
in the first place.

No, it's not a perfect analogy, but I think the overall point stands: some
people need to know the deep, deep details of linear algebra, multivariable
calculus, probability theory, measure theory, topology, etc., etc., for their
goals in machine learning, while other people can achieve quite a lot without
all of that knowledge.

