
Ask HN: Full-on machine learning for 2020, what are the best resources? - jamesxv7
I want to focus on Machine Learning for 2020, but I see too many options: Deep Learning, AI, Statistical Theory, Computational Cognitive Science, and more... but to focus just on ML, where should I start? I work mostly as a data analyst in pharma, where the focus is batch processes.
======
codingslave
Honestly, skip all of the courses. Pick a problem to solve, start googling for
common models that are used to solve the problem, then go on GitHub and find
code that solves that problem or a similar one. Download the code and start
working with it; change it, experiment. All of the theory and such is mostly
worthless: it's too much to learn from scratch and you will probably use very
little of it. There is so much ML code on GitHub to learn from that it's really
the best way. When you encounter a concept you need to understand, google the
concept and learn the background info. This will give you a highly applied and
intuitive understanding of solving ML problems, but you will have large gaps.
Which is fine, unless you are going in for job interviews.

Also bear in mind that courses like fast.ai (as you see plastered on here)
aggressively market themselves by answering questions all over the internet.
It's a form of SEO.

EDIT (Adding this here to explain my point better):

My opinion is that the theory starts to make sense after you know how to use
the models and have seen different models produce different results.

Very few people can read about the bias-variance trade-off and then, in the
course of using a model, take that concept and apply it directly to the
problem they are solving. In retrospect, they can look back and understand
the outcomes. Also, most theory is useless in the application of ML, and only
useful in the active research of new machine learning methods and paradigms.
Courses make the mistake of mixing in that useless information.
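To make that concrete, here is a toy sketch (my own, not from any course) of the trade-off showing up in practice: fit polynomials of different degrees to noisy data and watch the held-out error.

```python
# Toy sketch of the bias-variance trade-off: fit polynomials of increasing
# degree to noisy samples of a sine wave and compare held-out error.
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 40)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, size=x.shape)

# Interleave points into train and validation halves.
x_tr, y_tr = x[::2], y[::2]
x_va, y_va = x[1::2], y[1::2]

def val_mse(degree):
    # Fit on the training half, score on the held-out half.
    p = Polynomial.fit(x_tr, y_tr, degree)
    return float(np.mean((p(x_va) - y_va) ** 2))

errors = {d: val_mse(d) for d in (1, 3, 15)}
# Degree 1 underfits (high bias), degree 15 overfits (high variance),
# and degree 3 sits in between on this toy problem.
```

Reading the definitions after running something like this is exactly the "look back and understand the outcomes" step.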

The same thing is true of the million different optimizers for neural
networks. Why different ones work better in different cases is something you
would learn when trying to squeeze out performance on a neural network. Who
here is intelligent enough to read a bunch about SGD and optimization theory
(Adam etc), understand the implications, and then use different optimizers in
different situations? No one.
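For what it's worth, you can see why optimizer choice matters in a ten-line toy experiment (a sketch, not any particular paper's setup): plain gradient descent versus momentum on an ill-conditioned quadratic.

```python
# Why optimizer choice matters: on an ill-conditioned loss surface, plain
# gradient descent crawls along the flat direction, while momentum (one of
# the ideas behind fancier optimizers) accelerates through it.
import numpy as np

def loss(w):
    # Ill-conditioned quadratic: curvature 100 in one direction, 1 in the other.
    return 0.5 * (100 * w[0] ** 2 + w[1] ** 2)

def grad(w):
    return np.array([100 * w[0], w[1]])

w_gd = np.array([1.0, 1.0])
w_mom = np.array([1.0, 1.0])
velocity = np.zeros(2)

for _ in range(200):
    w_gd = w_gd - 0.019 * grad(w_gd)              # near the largest stable step size
    velocity = 0.67 * velocity - 0.033 * grad(w_mom)
    w_mom = w_mom + velocity

# After the same number of steps, momentum reaches a far lower loss.
```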

I'm much better off having a mediocre NN, googling, "How to improve my VGG
image model accuracy", and then finding out that I should tweak learning
rates. Then I google learning rate, read a bit, try it on my model. Rinse and
repeat.
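That rinse-and-repeat loop looks something like this in miniature (a toy sketch with a made-up one-parameter "model"):

```python
# Same "model" and data, three learning rates, very different outcomes.
def final_loss(lr, steps=50):
    # Minimize f(w) = w^2 by gradient descent; the gradient is 2w.
    w = 1.0
    for _ in range(steps):
        w = w - lr * 2 * w
    return w ** 2

too_small = final_loss(0.001)   # barely moves
decent    = final_loss(0.1)     # converges nicely
too_big   = final_loss(1.1)     # overshoots and diverges
```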

Also, I will throw in my conspiracy theory that most ML researchers and such
push the theory/deep-stats requirement as a form of gatekeeping. Modern deep
learning results are extremely thin when it comes to theoretical backing.

~~~
echelon
This.

Learn top down, not bottom up.

Watch maybe one or two short videos on back propagation. You don't need to be
muddled in the theory and the math - you can become productive right away.

Once you start playing with pytorch and tensorflow models (train them yourself
or do transfer learning), you'll start to develop an intuition for how the
network graphs fit together. You'll also pick up tools like tensorboard.

Also, do transfer learning. It's so awesome to train on a publicly-available
high quality and large data set, train for a lot of epochs for good problem
domain fit, then swap out your own smaller data set. It's magical.

I have a feeling that ML in the future will be like engineering today. You can
learn by doing and don't need a degree or formal background to be productive
and eventually design your own networks.

I have no formal training (save one undergrad course that was way outdated in
"general AI"), and I've designed my own TTS and voice conversion networks. I
have real time models that run on the CPU for both of these, and as far as I
know they're more performant than anything else out there (on CPU).

Eventually you might start reading papers. (You'll be productive long before
you need to do this.) Most ML papers are open access, but review (broad
survey) articles might need pirating. Thankfully there are websites that can
help you get these. The papers aren't hard to read if you've spent some time
playing with the networks they pertain to. Read the summary, abstract, and
figures before diving into the paper. It may take a few reads and some
googling.

You do not need to be a data scientist. Anybody can do it. That said, a good
GPU will help a lot. I'm using two 1080Ti in SLI and they're pretty decent.

~~~
1996
> I have no formal training (...) I have real time models that run on the CPU
> (...) and as far as I know they're more performant than anything else out
> there
>
> You do not need to be a data scientist. Anybody can do it. That said, a good
> GPU will help a lot. I'm using two 1080Ti in SLI and they're pretty decent

An alternative is that, by not knowing what you are doing, you may not see all
the options that exist -- and when you hit a problem too hard, you just throw
more hardware (GPUs) at it.

This is not to say it isn't sometimes a valid approach, but I'd be wary of
someone who, say, hasn't had any formal training in C and claims that his
stuff is more performant than anything out there, just because a lack of
training means not knowing the stuff that already exists.

~~~
echelon
> An alternative is that, by not knowing what you are doing, you may not see
> all the options that exist -- and when you hit a problem too hard, you just
> throw more hardware (GPUs) at it.

Maybe some will. I just explained that I'm running my models on CPUs, so I'm
actually developing sparse and efficient resource constrained models that
evaluate quickly.

I've been working with libtorch's JIT engine in Rust (tch.rs bindings).

I'm currently trying to adapt Melgan to the Voice Conversion problem domain so
I can get real time, high-fidelity VC without using a classical vocoder. WORLD
works great and quickly, but it's a poor substitute for the real thing as it
only maps the fundamental frequency, spectral envelope, and aperiodicity.
Melgan is super high quality and faaast.

~~~
woodson
Are you working on VC (input: speech of one speaker, output: the same spoken
content, but sounds like another speaker) or speaker-adaptive speech synthesis
(input: text, output: speech)?

Also check out ParallelWaveGAN, another high-quality and very fast (on CPU)
neural vocoder.

------
blululu
The answer to this question depends on your level of computer & math
proficiency. Some folks here have been debating about the relative merits of
practice vs. theoretical foundations, but this dispute makes some assumptions
about where you are starting from and where you are most comfortable. The
fastest way to learn something is to fit it into a framework that you already
understand. If you have a PhD in theoretical physics/abstract mathematics
(like a lot of ML researchers), then the more mathematical (theoretical)
frameworks will be a good way to build deep intuitions. If, on the other hand,
you are more into applied data analysis, then you will probably find that
working on applications will be the easiest way to go.

Personally, I enjoyed both Andrew Ng's and Geoffrey Hinton's respective
courses on ML and Neural Networks on Coursera. You may also want to check out
Michael Nielsen's online essay on deep learning
([http://neuralnetworksanddeeplearning.com](http://neuralnetworksanddeeplearning.com)).
Ultimately I would also encourage you to supplement your understanding by
applying this work to your own applications. The universe is often the best
teacher.

------
1996
Whoever reads this: please, please, please ignore the posts that suggest just
playing with numbers. This is the equivalent of suggesting that someone who
wants to learn how to code copy-paste formulas into Excel. Just don't be that
person.

To be very blunt, in 2020 most ML is still glorified statistics, except you
lose the insights and explanations. The only tangible improvement can be
random forests, sometimes. 99% of the stuff you can do with basic statistics.
99% of the coders I know just don't know statistics beyond the mean (and even
with that, they do senseless things like taking means of means)
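A tiny example of the means-of-means trap, with my own toy numbers:

```python
# Averaging per-group averages ignores group sizes and gives the wrong
# overall mean.
group_a = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]   # 10 measurements
group_b = [11, 11]                          # only 2 measurements

mean_a = sum(group_a) / len(group_a)        # 1.0
mean_b = sum(group_b) / len(group_b)        # 11.0

mean_of_means = (mean_a + mean_b) / 2       # 6.0 -- wrong
true_mean = sum(group_a + group_b) / 12     # 32 / 12 = 2.67 -- right
```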

So learn statistics - basic statistics, like in the "for dummies" book series.

If you want to be a little more practical, stats "for dummies" is often found
in disciplines that depend on stats but are not very good at math; biology,
psychology, and economics are great candidates.

So just pick up basic biology stats (to learn how to compare means; this
gives you the A/B-test superpower), then psychology's factor analysis (to
learn PCA; this gives you the dimension-reduction superpower), then basic
econometrics regression (to learn linear regression)
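Each of those three superpowers fits in a couple of lines of Python with scipy/scikit-learn; here is an illustrative sketch on made-up data:

```python
# The three "superpowers": comparing means (A/B test), PCA (dimension
# reduction), and linear regression.
import numpy as np
from scipy import stats
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)

# 1. A/B test: is group B's mean really higher than group A's?
a = rng.normal(10.0, 2.0, 200)
b = rng.normal(10.8, 2.0, 200)
t_stat, p_value = stats.ttest_ind(a, b)

# 2. PCA: compress 5 correlated features down to 2 components.
X = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 5))  # rank-2 data in 5-D
X_reduced = PCA(n_components=2).fit_transform(X)

# 3. Linear regression: fit y = 3x + 1 + noise and recover the slope.
x = rng.uniform(0, 10, (200, 1))
y = 3 * x[:, 0] + 1 + rng.normal(0, 0.5, 200)
model = LinearRegression().fit(x, y)
```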

With these 3 superpowers, you will be able to do more than most of the
"machine learning" people. When you have mastered that, try stuff like random
forest, and see if you still think it's as cool as it's hyped to be.

~~~
rckoepke
You can do pose estimation with basic statistics?

~~~
hcho3
A lot of business data is tabular (possibly with a time component), and if
you are working with tabular data, the OP’s advice is sound.

------
lettergram
I’d suggest:

[https://fast.ai](https://fast.ai) \- a good intro to practical neural networks.

I wrote a guide to ML based NLP. We identify if a sentence is a question,
statement or command using neural networks:

[https://github.com/lettergram/sentence-classification](https://github.com/lettergram/sentence-classification)

The truth is you don’t need to understand all the math right away with neural
networks. Mostly it’s getting an understanding of why and when you use a
given layer, bias, etc. Once you get some intuition, then I’d learn the math.

That’s at least how I instruct others. In any case, there are lots of guides
for any flavor. I’d start with deep learning and focus on the “practical” then
move to the “theoretical”.

~~~
aliveupstairs
[https://www.fast.ai/](https://www.fast.ai/)

[https://fast.ai](https://fast.ai) is unsafe.

------
joaogui1
Machine Learning:

* [https://www.youtube.com/watch?v=UzxYlbK2c7E](https://www.youtube.com/watch?v=UzxYlbK2c7E): Andrew Ng's Machine Learning course, the entry point recommended by most people

* [https://mlcourse.ai/](https://mlcourse.ai/) : More kaggle focused, but also more modern and has interesting projects

Do both courses simultaneously, take good notes, write useful flashcards, and
above all do all the exercises and projects

Deep Learning

* [https://www.fast.ai/](https://www.fast.ai/) \- Very hands-on; begin with "Practical Deep Learning for Coders" and then "Advanced Deep Learning for Coders"

* [https://www.coursera.org/specializations/deep-learning](https://www.coursera.org/specializations/deep-learning) : More bottom-up approach, helps to understand the theory better

Do those two courses in parallel (you can try 2 weeks of coursera followed by
one of fastai in the beginning, and then just alternate between them), take
notes, write good flashcards and above all do the exercises and projects.

After that you will be done with the beginning, your next step will depend on
what area interested you the most, and getting way too many resources right
now can be extremely confusing, so I would recommend doing a follow-up post
after you worked through the above resources. Also as non-ML stuff I recommend
Scott Young's Ultralearning and Azeria's self improvement posts
([https://azeria-labs.com/the-importance-of-deep-work-the-30-hour-method-for-learning-a-new-skill/](https://azeria-labs.com/the-importance-of-deep-work-the-30-hour-method-for-learning-a-new-skill/))

------
psv1
Good free resources:

\- MIT: Big Picture of Calculus

\- Harvard: Stats 110

\- MIT: Matrix Methods in Data Analysis, Signal Processing, and Machine
Learning

If any of these seem too difficult - Khan Academy Precalculus (they also have
Linear Algebra and Calculus material).

This gives you a math foundation. Some books more specific to ML:

\- Foundations of Data Science - Blum et al.

\- Elements of Statistical Learning - Hastie et al. The simpler version of
this book - Introduction to Statistical Learning - also has a free companion
course on Stanford's website.

\- Machine Learning: A Probabilistic Perspective - Murphy

That's _a lot_ of material to cover. And at some point you should start
experimenting and building things yourself, of course. If you're already
familiar with Python, the Python Data Science Handbook (Jake VanderPlas) is a
good guide through the ecosystem of libraries that you would commonly use.

Things I don't recommend - Fast.ai, Goodfellow's Deep Learning Book, Bishop's
Pattern Recognition and ML book, Andrew Ng's ML course, Coursera, Udacity,
Udemy, Kaggle.

~~~
joshvm
Bear in mind Elements of Statistical Learning is a grad-level text. I would
never recommend it to a beginner over An Introduction to Statistical
Learning, by the same authors.

Aurélien Géron's O'Reilly book is great: _Hands-On Machine Learning with
Scikit-Learn and TensorFlow_. Get the second edition, which covers TensorFlow
2.

~~~
psv1
You're right about ESL, that's why I started the list with some more
fundamental material. Also, +1 for Aurelien's book, it's really good; I didn't
know he had a revised edition for TensorFlow 2.

------
e_ameisen
A lot of the resources proposed in the comments focus on theoretical
knowledge, or a particular sub-domain (Reinforcement Learning, or Deep
Learning). I recommend a top-down approach where you pick a project and learn
by building it. This can be easier said than done, however, and after mentoring
dozens of junior data scientists I wrote a how-to guide for people interested
in using ML for practical topics.

You can find it from O'Reilly here
([http://shop.oreilly.com/product/0636920215912.do](http://shop.oreilly.com/product/0636920215912.do))
or on Amazon here ([https://www.amazon.com/Building-Machine-Learning-Powered-Applications/dp/149204511X/](https://www.amazon.com/Building-Machine-Learning-Powered-Applications/dp/149204511X/)).

------
eachro
I think it depends on what you want to focus on. If you want to do deep
learning, fast.ai is probably the best resource available. Jeremy Howard and
Rachel Thomas (the two founders) have poured quite a lot into fostering a
positive, supportive community around fast.ai which really does add quite a
lot of value.

If you want to really understand the fundamentals of machine learning (deep
learning is just one subset of ML!), there is no substitute for picking up one
of the classic texts like: Elements of Statistical Learning
([https://web.stanford.edu/~hastie/ElemStatLearn/](https://web.stanford.edu/~hastie/ElemStatLearn/)),
Machine Learning: A Probabilistic Perspective
([https://www.cs.ubc.ca/~murphyk/MLbook/](https://www.cs.ubc.ca/~murphyk/MLbook/))
and going through it slowly.

I'd recommend a two-pronged approach: dig into fast.ai while reading a chapter
a week (or whatever pace matches your schedule) of whichever ML textbook you end up
choosing. Despite all of the hype of deep learning, you really can do some
pretty sweet things (ex: classify images/text) with neural nets within a day
or two of getting started. Machine learning is a broad field, and you'll find
that you will never know as much as you think you should, and that's okay. The
most important thing is to stick to a schedule and be consistent with your
learning. Good luck on this journey :)

~~~
jamesxv7
Excellent recommendation. I really appreciate all the recommendations
proposed. Happy New Year eachro.

------
zyl1n
Be sure to check out 3Blue1Brown's linear algebra series as well (maybe after
you've built your own MNIST network). It blew my mind when I made the
connection that each layer in a dense NN is learning a linear transformation
plus a non-linear "activation" function.
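That connection, spelled out in a toy sketch (my own, not from the videos): one dense layer is just a matrix multiply plus a bias, followed by a pointwise non-linearity.

```python
# A dense layer from scratch: linear transformation + activation.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))   # weights: maps 3 inputs to 4 outputs
b = rng.normal(size=4)        # bias

def dense_layer(x):
    z = W @ x + b             # the linear transformation
    return np.maximum(z, 0)   # the non-linear "activation" (ReLU)

out = dense_layer(np.array([1.0, -2.0, 0.5]))
```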

------
sytelus
In following order:

1\. Michael Nielsen's book:
[http://neuralnetworksanddeeplearning.com/](http://neuralnetworksanddeeplearning.com/)

2\. Stanford CS231n course:
[http://cs231n.stanford.edu/](http://cs231n.stanford.edu/)

3\. DRL hands-on book: [https://www.amazon.com/Deep-Reinforcement-Learning-Hands-Q-networks/dp/1788834240](https://www.amazon.com/Deep-Reinforcement-Learning-Hands-Q-networks/dp/1788834240)

After this, churn through research papers or Medium articles on conv-net
architecture surveys, batch norm, LSTMs, RNNs, transformers, and BERT. Write
lots of code, try things out.

~~~
fantispug
This may make sense if you want to do image processing and deep reinforcement
learning. But there are lots of other domains.

For tabular data (which is probably most relevant in pharma, and probably the
best place to start), Introduction to Statistical Learning by Hastie et al.
and Max Kuhn's Applied Predictive Modeling cover a lot of the classical
techniques.

For univariate time series forecasting, "Forecasting: Principles and
Practice" is great.

For natural language processing foundations Jurafsky's Speech and Language
Processing is broadly recommended; for cutting edge natural language
processing Stanford's CS224n is great:
[http://web.stanford.edu/class/cs224n/](http://web.stanford.edu/class/cs224n/)

~~~
Breza
I can't suggest Introduction to Statistical Learning enough, it's a fantastic
book! I loaned my copy to another data scientist because I didn't want to hog
such a valuable resource.

------
vjktyu
Study calculus, from the definition of the real numbers up to taking complex
integrals via residues; then study linear algebra up to some theorems about
eigenvectors. One month total, assuming you're somewhat talented and determined
to spend 12 hours a day learning proofs of boring theorems. After that you'll
realise that most of the ML papers out there are just ad-hoc composed matrix
multiplications with some formulas used as fillers. At that point I think it's
more useful to learn what ML models work in practice (although nobody will be
able to explain why they work, including the authors) and mix this practical
knowledge with the math theory to develop good intuition.

I'd compare ML with weather models: we understand physics driving individual
particles, we understand the high level diff equations, but as complexity
builds up, we have to resort to intuition to develop at least somewhat working
weather models.

~~~
mkl
What are complex integrals used for in machine learning?

~~~
vjktyu
They aren't. It's just a very rough marker of where to stop.

------
olalonde
I started with the machine learning course[0] on Coursera, followed by the
deep learning specialization[1]. The former is a bit more theoretical while
the latter is more applied. I would recommend both although you could jump
straight to the deep learning specialization if you're mostly interested in
neural networks.

[0] [https://www.coursera.org/learn/machine-
learning](https://www.coursera.org/learn/machine-learning)

[1] [https://www.coursera.org/specializations/deep-
learning](https://www.coursera.org/specializations/deep-learning)

------
bigmit37
Is C/C++ still worth learning if I want to create some models from scratch
(new layers or different paradigms)?

I hear that C++ is a nightmare to work with and was wondering if Rust, Julia,
or even Swift would be worth learning instead.

I know Python, but deep learning frameworks seem to be written in C++, so to
come up with new layers I need to understand C++, which I'm told has a lot of
peculiarities that take time to pick up. The compiler also isn't very user
friendly (from what I've read).

~~~
ChrisRackauckas
Julia is a blast to do research on this stuff in, if you want to go beyond
the basics that TensorFlow and PyTorch allow. The 2020s are going to be the
decade of mixing numerical PDEs with machine learning, IMO, and Julia already
has a lot of features along these lines that are missing from "traditional
ML" libraries.

~~~
bigmit37
Interesting. I was going to go through their yearly conference talks to get a
sense of Julia’s capabilities: JuliaCon 2019, etc., on YouTube. Is that the
best way?

~~~
ChrisRackauckas
Possibly. On this topic (machine learning, differentiable programming, GPU and
parallel computing) I'd recommend the following videos:

[https://youtu.be/FGfx8CQHdQA](https://youtu.be/FGfx8CQHdQA)

[https://youtu.be/OcUXjk7DFvU](https://youtu.be/OcUXjk7DFvU)

[https://arxiv.org/abs/1907.07587](https://arxiv.org/abs/1907.07587)

[https://youtu.be/7Yq1UyncDNc](https://youtu.be/7Yq1UyncDNc)

[https://youtu.be/_E2zEzNEy-8](https://youtu.be/_E2zEzNEy-8)

[https://youtu.be/6ntJ_al4oXA](https://youtu.be/6ntJ_al4oXA)

[https://youtu.be/HfiRnfKxI64](https://youtu.be/HfiRnfKxI64)

------
Tenoke
Honestly, I would start with fast.ai; if you don't like it by lesson 3,
switch to another resource. If you do like it, fast.ai is probably the
biggest bang for your buck (time).

------
StClaire
I was in the same boat in 2014. I went a more traditional route by getting a
degree in statistics and doing as much machine learning as my professors could
stand (they went from groaning about machine learning to downright giddy over
those two years). I worked as a data scientist for an oil-and-gas firm, and
now work as a machine learning engineer (same thing, basically) for a defense
contractor.

I’ve seen some really bad machine learning work in my short career. Don’t
listen to the people saying “ignore the theory,” because the worst machine
learning people say that and they know enough deep learning to build a model
but can’t get good results. I’m also unimpressed with Fast AI for the reasons
some other people mentioned, they just wrapped PyTorch. But also don’t read a
theory book cover-to-cover before you write some code, that won’t help either.
You won’t remember the bias-variance trade-off or Gini impurity or batch-norm
or skip connections by the time you go to use them. Learn the software and the
theory in tandem. I like to read about a new technique, get as much
understanding as I think I can from reading, then try it out.

If I would do it all-over again I would:

1\. Get a solid foundation in linear algebra. A lot of machine learning can be
formulated in terms of a series of matrix operations, and sometimes it makes
more sense to. I thought Coding the Matrix was pretty good, especially the
first few chapters.

2\. Read up on some basic optimization. Most of the time it makes the most
sense to formulate the algorithm in terms of optimization. Usually you want
to minimize some loss function, and that's simple, but regularization terms
make things tricky. It's also helpful to learn why you would regularize.

3\. Learn a little bit of probability. The further you go the more helpful it
will be when you want to run simulations or something like that. Jaynes has a
good book but I wouldn’t say it’s elementary.

4\. Learn statistical distributions: Gaussian, Poisson, Exponential, and beta
are the big ones that I see a lot. You don’t have to memorize the formulas (I
also look them up) but know when to use them.

While you’re learning this, play with linear regression and its variants:
polynomial, lasso, logistic, etc. For tabular data, I always reach for the
appropriate regression before I do anything more complicated. It’s
straightforward, fast, you get to see what’s happening with the data (like
what transformations you should perform or where you’re missing data), and
it’s interpretable. It’s nice having some preliminary results to show and
discuss while everyone else is struggling to get not-awful results from their
neural networks.
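A sketch of that reach-for-regression-first habit on made-up tabular data (illustrative only):

```python
# A plain linear model as the baseline to beat, plus Lasso for
# interpretability: irrelevant features get exactly-zero coefficients.
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 10))
# Only the first 3 of 10 features actually matter.
y = X[:, 0] * 2 + X[:, 1] - X[:, 2] + rng.normal(0, 0.5, 300)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

ols = LinearRegression().fit(X_tr, y_tr)
lasso = Lasso(alpha=0.1).fit(X_tr, y_tr)

# Count how many features the Lasso model actually uses.
n_used = int(np.sum(lasso.coef_ != 0))
```

Looking at which coefficients survive is exactly the "see what's happening with the data" step: it tells you which columns matter before you reach for anything fancier.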

Then you can really get into the meat with machine learning. I’d start with
tree-based models first. They’re more straightforward and forgiving than
neural networks. You can explore how the complexity of your models affects the
predictions and start to get a feel for hyper-parameter optimization. Start
with basic trees and then get into random forests in scikit-learn. Then
explore gradient boosted trees with XGBoost. And you can get some really good
results with trees. In my group, we rarely see neural networks outperform
models built in XGBoost on tabular data.
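A toy version of that progression (using scikit-learn's GradientBoostingRegressor as a stand-in for XGBoost, on made-up data):

```python
# Single tree -> random forest -> gradient boosting on synthetic tabular data.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(500, 5))
y = np.sin(X[:, 0]) + X[:, 1] ** 2 + rng.normal(0, 0.1, 500)  # non-linear target

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

scores = {
    "single tree": DecisionTreeRegressor(random_state=0).fit(X_tr, y_tr).score(X_te, y_te),
    "random forest": RandomForestRegressor(random_state=0).fit(X_tr, y_tr).score(X_te, y_te),
    "gradient boosting": GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr).score(X_te, y_te),
}
# The ensembles typically beat the single tree; tweaking max_depth,
# n_estimators, and learning_rate is a good first taste of hyper-parameter
# optimization.
```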

Most blog posts suck. Most papers are useless. I recommend Geron’s Hands-On
Machine Learning.

Then I’d explore the wide world of neural networks. Start with Keras, which
really emphasizes the model building in a friendly way, and then get going
with PyTorch as you get comfortable debugging Keras. Attack some object
classification problems with-and-without pretrained backends, then get into
detection and NLP. Play with weight regularization, batch norm and group norm,
different learning rates, etc. If you really want to get deep into things,
learn some CUDA programming too.
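A minimal sketch of that kind of playing around in PyTorch (toy data and a made-up tiny architecture, just to show where the knobs live):

```python
# A tiny network with batch norm, weight decay (L2 regularization), and an
# adjustable learning rate, trained on synthetic data.
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(256, 10)
y = (X[:, 0] + X[:, 1] > 0).float().unsqueeze(1)  # simple synthetic labels

model = nn.Sequential(
    nn.Linear(10, 16),
    nn.BatchNorm1d(16),   # batch norm, one of the things to experiment with
    nn.ReLU(),
    nn.Linear(16, 1),
)
# weight_decay is the L2-regularization knob; lr is the learning rate to play with.
opt = torch.optim.Adam(model.parameters(), lr=1e-2, weight_decay=1e-4)
loss_fn = nn.BCEWithLogitsLoss()

first_loss = None
for step in range(200):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    if first_loss is None:
        first_loss = loss.item()
    loss.backward()
    opt.step()
final_loss = loss.item()
```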

I really like Chollet’s Deep Learning with Python.

After that, do what you want to do. Time series, graphical models,
reinforcement learning— the field’s exploded beyond simple image
classification. Good luck!

~~~
laichzeit0
This is the correct progression, IMHO. I can tell you’ve been in industry
because it mirrors my experience.

Always start with a simple model and see how far you can get. Most of the
improvements I’ve seen come from “working the data” anyway. You will be
surprised how much you can improve model performance just by working the data,
or improving the quality of the underlying data alone. Also simple models give
you a “baseline”. What is the point of reaching for neural networks if you
don’t have a baseline performance metric to compare against? XGBoost is a
godsend. It trains extremely quickly and is surprisingly difficult to beat in
practice.

As you say, constantly sharpen your saw with regards to probability theory and
mathematics in general. There is simply no way around this in the long run.

------
orware
I'm not an expert, but I had heard lots of good things about Fast.ai's online
course/content: [https://course.fast.ai/](https://course.fast.ai/)

I've started/stopped a few courses in Georgia Tech's OMSCS program as well,
which might have been useful, but I still feel like I'm missing some of the
mathematical foundation that would let me make sense of those courses. So
Fast.ai's approach seems like it could be a better fit for someone like me
who is more interested in the practical aspects (I just haven't made the
effort to go through their content myself).

------
SrslyJosh
Ask yourself: Do you really need ML to solve the problems you're interested in
solving?

If you're learning it for career purposes, keep in mind that many corporate ML
use-cases are problematic at best. At worst, you will produce something that
kills someone inadvertently, possibly more than one person.

Learn about the many pitfalls and limitations of ML. Learn about inadvertent
bias in datasets. Learn about the issues with inputs not represented (or not
adequately represented) in your training dataset.
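One classic pitfall, in a few lines with toy numbers: on imbalanced data, raw accuracy can look great while the model catches nothing that matters.

```python
# The imbalanced-data accuracy trap.
import numpy as np

# 1000 cases, only 2% are the "bad outcome" we actually care about.
labels = np.array([1] * 20 + [0] * 980)

# A "model" that always predicts the majority class.
predictions = np.zeros_like(labels)

accuracy = float(np.mean(predictions == labels))        # 0.98 -- looks great
recall = float(np.mean(predictions[labels == 1] == 1))  # 0.0  -- catches nothing
```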

Most importantly, understand that ML is not magic and without significant
guardrails in place, there's a good chance something will fuck up.

------
vowelless
A lot of good advice here.

One thing I would add is replicate a couple of ML papers. It can help develop
a lot of intuition about the specific area.

~~~
jamesxv7
Actually, this is a great idea. It seems I'll try this approach for Q1 2020.

------
pinouchon
No one has suggested Stanford's CS231n:
[http://cs231n.github.io/](http://cs231n.github.io/). I'd recommend the winter
2016 lectures (by Fei-Fei Li, Karpathy and Johnson). For getting started with
convnets / deep learning, I think this is one of the best hands-on resources
out there.

------
lymitshn
AFAIK the fast.ai courses are well recommended for their deep learning
material, but they also have an ML course[0]. Another usual recommendation is
the Elements of Statistical Learning book. Another option is finding a MOOC
that you enjoy and following it.

[0] [http://course18.fast.ai/ml](http://course18.fast.ai/ml)

~~~
pjmorris
There's a MOOC that uses 'Introduction to Statistical Learning' by the authors
of 'Elements of Statistical Learning', here:
[https://lagunita.stanford.edu/courses/HumanitiesSciences/Sta...](https://lagunita.stanford.edu/courses/HumanitiesSciences/StatLearning/Winter2016/about)

------
forgingahead
I had a nice experience with Adam Geitgey's Machine Learning is Fun course.

He published a lot of free ML blog posts in easy-to-understand writing with
nice examples, so nothing ever seemed out of reach. I found that a lot of
other material was a little too abstract, so his stuff was great.

The blog posts are here: [https://medium.com/@ageitgey/machine-learning-is-fun-80ea3ec3c471](https://medium.com/@ageitgey/machine-learning-is-fun-80ea3ec3c471)

And I also bought his paid course with code samples -- it's affordable and
good value.

------
autokad
Does anyone have any resources for people with more advanced ML experience?

~~~
boltzmannbrain
1\. Find a paper you like/admire

2\. Implement their methods from scratch (i.e. numpy not pytorch)

3\. Experiment a bit, tweaking the models/algs to gain intuition

4\. Repeat 1-3

~~~
throwlaplace
> Implement their methods from scratch (i.e. numpy not pytorch)

lol, this is basically impossible and completely pointless. Please show me a
numpy implementation of BERT or CycleGAN or deformable convolutions (note
that jax != numpy). It's like suggesting that someone who wants to learn
about virtual memory or scheduling implement a kernel.

Better advice would be to take a paper, implement the model in pytorch
without looking at their implementation, and fiddle with that.

------
Buttons840
Another suggestion: I like
[https://spinningup.openai.com](https://spinningup.openai.com) for learning
reinforcement learning.

------
jamesxv7
I'm impressed by the responses generated in this conversation. My expectation
was to get several links and start browsing each one of them. However, many
have agreed that the best way is to start with a specific example and start
creating a model. Many times I have tried to answer that same question:
"which model should I apply?" How do I know I'm not re-inventing the wheel?

~~~
mkl
> How do I know I'm not re-inventing the wheel?

You probably are, but for learning purposes that doesn't matter at all.

------
jansc
Humblebundle has a bundle of machine learning books right now:
[https://www.humblebundle.com/books/python-machine-learning-packt-books](https://www.humblebundle.com/books/python-machine-learning-packt-books)
I'm considering buying this bundle. Any of these books you would
recommend?

~~~
jansc
I'm not affiliated with humblebundle in any way, and this was a genuine
question. I know that the Packt books are not the best quality, but if one of these
books is a good introduction to practical ML, I would consider it a good deal.
In my opinion much better than googling algorithms and tutorials and visiting
10s or 100s of sites full of ads and ad trackers to find a suitable algorithm
for a given problem. Reading an EPUB on my daily commute sounds much better
and works offline.

------
DrNuke
> where should I start? I work mostly as a data analyst on pharma where the
> focus is batch process.

Any tool needs an applied field but any applied field does not need all the
tools. You have an applied field already (pharma), so start looking for one or
two state-of-the-art ML papers for that? Happy 2020 and good luck, it’s going
to be fun!

------
6ak74rfy
I am currently going through fast.ai's Deep Learning course and would totally
recommend it because of its top-down approach.

Has anyone done the non-DL courses on their website? E.g., any thoughts on
Rachel's Computational Linear Algebra?

------
thosakwe
Does anybody have resources on the _math_ behind ML? I hit a dead end using
Python frameworks because it was a black box, and I simply lacked the
underlying knowledge.

~~~
olalonde
Week 1-5 of [https://www.coursera.org/learn/machine-
learning](https://www.coursera.org/learn/machine-learning)

------
asfarley
I'm using machine learning to solve some computer-vision problems. If you're
interested in joining my project, email me at alex at roadometry.com

------
atregir
Excellent thread - I have the same goal and currently am working mostly with
databases. Thanks for asking this question!

------
jamesxv7
There is a question I have been asking for quite some time. It is known that
Python is the language of choice when practicing ML. But can similar results
be achieved using PowerShell? What makes Python superior to PowerShell when
building models for ML?

~~~
rckoepke
Technically you can do it in any language, but in software engineering we tend
to stand on the shoulders of giants in order to get the job done on time.

A lot of the original excellent data processing, statistical analysis, and
ML libraries were built in Python and R, so all the deep learning stuff was
built on top of those. R is somewhat harder to integrate into a production
pipeline due to its typical reliance on something like RStudio, so Python
ended up being the de facto standard, as it is also well supported in cloud
computing environments.

With TensorFlow APIs being written for Swift, we might start to see Swift
competing with Python.

