
I made similar lists 4 years ago. I quit my job, and applied to a PhD program in ML (still doing that). Looking back, I can offer the following advice if your goal is to become an ML Engineer:


Start with (re)learning math. Take those boring university level full length courses in calculus, linear algebra, and probability/statistics. No, unless you're a fresh STEM graduate, 10 page math refreshers won't do. If you're self-studying, make sure you do all exercises and take practice exams to test yourself.


1. (Re)learn C/C++, as well as a linear algebra library, such as Numpy or MATLAB. You will also have to learn parallel and distributed programming at some point (CUDA, MPI, OpenMP, etc). Next take a boring university level full length course on algorithms and data structures.

2. Get a book describing ML algorithms, and implement them yourself, first using plain C, then with MPI or CUDA, and finally using plain Numpy/MATLAB, or one of the low-level ML frameworks (Theano or TensorFlow).
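As a rough illustration of what "implement them yourself" can look like at the NumPy level, here is a minimal sketch of linear regression fitted by batch gradient descent. The function name `fit_linear_regression`, the learning rate, the step count, and the synthetic data are all assumptions made for this example, not something taken from a particular book.

```python
import numpy as np


def fit_linear_regression(X, y, lr=0.1, n_steps=2000):
    """Fit y ~ X @ w + b by batch gradient descent on mean squared error."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_steps):
        residual = X @ w + b - y                    # shape (n_samples,)
        grad_w = 2.0 * X.T @ residual / n_samples   # d(MSE)/dw
        grad_b = 2.0 * residual.mean()              # d(MSE)/db
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Synthetic, noise-free data (an assumption for illustration):
# y = 3*x0 - 2*x1 + 1
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = X @ np.array([3.0, -2.0]) + 1.0

w, b = fit_linear_regression(X, y)
```

On this noise-free data the fitted `w` and `b` should land very close to the coefficients used to generate `y`. Porting the same loop to plain C is then mostly a matter of writing the matrix-vector products by hand.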


Finally, start doing ML. Not learning about it, doing it. Choose an application that interests you (computer vision, NLP, speech recognition, etc), and start learning what you need to make something work. Focus on specific, practical tasks. If you don't have any particular application in mind, go to Kaggle, choose a competition, and read what models/tricks the winners used. Then jump right in and start competing.

The first two requirements might take years to master, but if you skip them, you won't be able to do any serious work in ML, or even understand the latest papers. You will be a script kiddie, not a hacker.

Your comment is littered with advice that applies to machine learning research, not engineering. It is as applicable to machine learning as "learn deep data structures and algorithms" is to CRUD web app development for an internal "enterprise" application that will see simultaneous usage in the high dozens of users at best. In other words: it's only correct in a very narrow context.

And you end with patronizing insults. Do you feel better?

Security folks do the same thing. I don't get it.

It's pervasive in this industry, no matter the sub-discipline.

That sort of crap doesn't happen in other, mature engineering fields. At least not to the extent it does in computing. It's as if there is some unaddressed need to be considered among the intellectual elite that festers and ferments into casual, ego-driven aggression.

You must not have dealt with other engineering disciplines. At least in software engineering you don't have to pass exams and get a "Professional Engineer" certification from the government in order to get a job.

First, it's NCEES who creates and grades the PE certification, not the government.

Second, with the exception of civil engineers, other disciplines have no problems getting jobs without licenses. The license just opens more doors and makes finding work easier.

My comment has nothing to do with PE certifications.

I work full-time as an ML Engineer at a big tech company in Silicon Valley. I think most ML Engineers do not read papers and do not do low level programming. They mostly do feature engineering.

Actually I feel like I forgot a lot of math since university and I would like to pick it up again. Does anyone have a resource they can recommend either for self-study or in person?

Khan Academy is good. Also http://patrickjmt.com/

I'm reading this one math book by Richard Hamming: https://www.amazon.com/Methods-Mathematics-Calculus-Probabil...

Not sure if that's what you're looking for, but it seems to have all the stuff I forgot from college.

Disclaimer: I am the author of this Top-down learning path. Could you please take a look and give me some advice? Or if you have any opportunities, please drop me a line.

My advice would be to figure out what you want to do more concretely. Machine Learning / Data Science is a big field; if you have a more concrete goal, I think it would be easier.

And yet you're here, looking for math refreshers :)

I personally don't do a lot of feature engineering and I do regularly read papers, I just don't think that I am typical.

CUDA, MPI, C++: these are the deep internal layers that most software engineers working on Machine Learning would never need to interact with. Please learn Python instead. R and Matlab (Octave) might also be helpful.

My comment is intended for people who want to become ML Engineers, not software engineers curious about ML.

No, your comment is intended for people who want to become ML infrastructure engineers of a very specific type.

If you don't know that stuff but know generally how ML processes work (measurements, models, data partitioning and cleaning), and you start getting to know the frameworks (e.g. Caffe/TensorFlow/Torch), you'll do just fine.

Like, you can know what MapReduce is and how it works and just use abstractions over it, never having implemented it yourself, and you're still a data engineer.
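For anyone who has only ever used the abstractions, the underlying idea fits in a few lines. This is a toy, single-process sketch of the map, shuffle, and reduce phases applied to word counting. The function names and the sample documents are assumptions for illustration, and nothing here is distributed.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Group all emitted values by key, as the framework would do between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


# Hypothetical input documents.
documents = ["the cat sat", "the dog sat", "The end"]

mapped = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(k, v) for k, v in shuffle_phase(mapped).items())
```

A real framework runs the map and reduce calls on many machines and handles the shuffle over the network, which is exactly the part most data engineers never implement themselves.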

It's a myth that X engineer needs to know everything about field X. There are different levels of everything, and different ways to get there.

Or better yet, learn C, C++, CUDA, MPI, Python, and R simultaneously.

> Get a book describing ML algorithms, and implement them yourself, first using plain C, then with MPI or CUDA, and finally using plain Numpy/MATLAB, or one of the low-level ML frameworks (Theano or TensorFlow).

You don't seem to understand what high-level and low-level mean. Why on earth would you write something in C, and then re-write it in MATLAB? The point of the higher level languages is rapid development at the expense of performance.

Re-writing code in a different language is quite a basic skill and not something to waste time perfecting. Instead the focus should be on understanding the language/systems themselves on a deeper level.

No offense, but it sounds like you have a lot to learn before you should be giving out advice. Seriously, you're talking about Kaggle? Yet you don't make a single mention of computer science?

Let me give you an example.

I first learned about neural networks from deeplearning.net tutorials. They use Theano, so when I got a convolutional network running, I was left wondering about two critical pieces: backpropagation, and the actual convolution operation. Theano completely hides them from the user. So I decided to implement a simple convnet using only Python and NumPy. That was when I realized that I had no idea how backprop works for convnets. I'm guessing most people who have only used Theano or a similar framework to run convnets don't know how backprop works there.
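To give a sense of the two pieces a framework hides, here is a minimal sketch of a single-channel 2-D convolution layer with a hand-written backward pass in NumPy. It is not the code described in this comment: the function names, the "valid" padding, the toy sum-of-outputs loss, and the random inputs are assumptions for illustration. As in most deep learning libraries, the "convolution" here is technically a cross-correlation (the kernel is not flipped).

```python
import numpy as np


def conv2d_forward(x, k):
    """Valid cross-correlation of a 2-D input x with a 2-D kernel k."""
    kh, kw = k.shape
    out_h = x.shape[0] - kh + 1
    out_w = x.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out


def conv2d_backward(x, k, grad_out):
    """Gradients of a scalar loss w.r.t. x and k, given dLoss/dOut."""
    kh, kw = k.shape
    grad_x = np.zeros_like(x)
    grad_k = np.zeros_like(k)
    for i in range(grad_out.shape[0]):
        for j in range(grad_out.shape[1]):
            # Each output pixel saw one input window and the whole kernel.
            grad_k += grad_out[i, j] * x[i:i + kh, j:j + kw]
            grad_x[i:i + kh, j:j + kw] += grad_out[i, j] * k
    return grad_x, grad_k


rng = np.random.default_rng(0)
x = rng.normal(size=(5, 5))
k = rng.normal(size=(3, 3))

# Toy loss: the sum of all outputs, so dLoss/dOut is all ones.
out = conv2d_forward(x, k)
grad_x, grad_k = conv2d_backward(x, k, np.ones_like(out))

# Finite-difference check of one kernel gradient entry.
eps = 1e-6
k_plus = k.copy()
k_plus[1, 2] += eps
numeric = (conv2d_forward(x, k_plus).sum() - out.sum()) / eps
```

Comparing `numeric` against `grad_k[1, 2]` is a cheap way to check that a hand-written backward pass agrees with the forward pass. The nested Python loops also make it obvious why a version like this is too slow for practical use.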

Next, I realized that the code I wrote was too slow for any practical purposes, and I decided to rewrite it in pure C. Well, actually I gave up halfway and used the Armadillo package (a NumPy analog for C/C++), but the end result was that my convnet became 5 times faster. It was still too slow, and I noticed that my code was using only one of the 4 cores available on my computer, so I decided to parallelize it using Cilk+. It was actually easier than I expected, and soon I got almost linear speedup using all 4 cores.

The final exercise was to implement it with CUDA, and I got another factor of 5 speedup running it on my GPU.

The benefit of this experience is enormous - I learned a lot about convnets, I learned a lot about tools and libraries useful for ML, and I learned a lot about making my code faster.

p.s. I don't quite understand what you meant about Kaggle and CS.

p.p.s. And yes, I do still have a lot to learn, but I wish someone gave me this advice when I was starting out.

You had suggested learning an algorithm in a different order than what you've just written; that's why I suggested it was incorrect. (You put a MATLAB implementation after a C one.)

Computer science is key to all of this. Understanding how to write performant code isn't just about programming skill; it's also about understanding complexity theory, graph theory and computer architecture.

When going for a job at these places, (you did specify 'ML Engineer', this is different from a researcher), they will grill you to make sure you understand how to use a computer. They want you to know it inside out, not just having done some tutorials, but actually able to debug deep complex issues and performance tune.

Getting familiar with modifying the Linux kernel is much more important than having done some silly competitions/tutorials. You have to separate the marketing material from reality.

It's not about being able to "write something in C", it's about understanding the conception of every bit of code, and what is happening underneath, to engineer a quality solution.

The exact order does not really matter. Some people prefer a top-down approach, others want to start from the bottom, then gradually abstract the details. My point is, you need to know how to work at each of these abstraction levels.

I did mention a course on algorithms and data structures as a requirement, that's where you learn about complexity, graphs, and other things.

I'm guessing you don't know what Kaggle is. It's not a silly competition. It's a place where you get exposed to real-world problems, using real-world data. This is a good alternative to an actual internship as an ML/Data engineer.

You expect ML engineers to know how to modify the Linux kernel? Do you also want them to know how to design a superscalar processor in Verilog? How about simulating circuits with SPICE? I don't think so.

That's ridiculous, why would you need to do a MATLAB implementation after writing it in C? Do you even know what MATLAB is? Do you even understand what abstractions are? If I wrote something in C, I understand it on the MATLAB level. There is no gain from doing that.

Kaggle is just a hobbyist competition. How many engineers do you think have used Kaggle to gain their employment? I get that it's hard to understand from the ivory tower of academia what is actually required in the real world.

You seem to keep missing the point: an engineer should be comfortable working at the low level, should understand performance tuning, and should have a solid understanding of low-level architecture. Experimenting with the Linux kernel is one way to get an intuitive feel for how things work, not a requirement.

You also seem to keep misunderstanding the domain space. Do you even know what an engineer does? What do you think they're doing all day? They're writing high performance code, not solving stupid riddles. They will almost never write anything in MATLAB; they engineer things to specifications.

I hope your university has an internship, after you complete it maybe you'll understand then.

> why would you need to do a MATLAB implementation after writing it in C?

Because knowing how to implement something in C does not mean you know how to implement it in MATLAB, and the best way to learn to do both is to implement the same thing using both languages.

> How many engineers do you think have used Kaggle to gain their employment?

I know two people personally who were asked about, and bragged about, their Kaggle experience during ML engineer job interviews. Moreover, several of the ML positions I was interested in had Kaggle experience mentioned as a desired qualification in the job description.

> an engineer should be comfortable with working on the low level, understanding performance tuning and they should have a solid understanding of architecture on the low level

If you reread my comments, you will see that this was kind of my point (and a lot of people here disagreed with me on this). However, learning the details of the Linux kernel is not the best way to learn about computing, and is definitely not the best way to learn the skills needed to do machine learning.

> do you even know what an engineer does?

Before enrolling in a PhD program, I worked as an engineer and an engineering manager for 12 years. Since I went back to school, I did two ML related internships, and I occasionally work on freelance ML projects for local companies. What are your credentials, relevant to this discussion?

Honest question - C or C++? I thought the general language suggested was Python+NumPy along with R.


The parent seems to be coming from a skewed background. Most of his suggestions are way too low level. It's like telling a software engineer they must first write an operating system before they can write a web app.

Learn Theano, TensorFlow, and maybe Torch. They'll handle the low level details for you. You just have to know some math to use them.

If you just want to dive in and be immediately productive, you'd be hard pressed to do better than scikit-learn and Keras (for machine learning and deep learning, respectively).

You should be equally comfortable with both C and C++.

Sorry I wasn't clear - my question was why do you recommend C and C++ over Python? Is it specifically because of the distributed/GPU programming aspect?

I think his meaning was if you're looking to create new ML algorithms, rather than just apply the algorithms of others, you'll need C and C++. If you're just looking to apply the algorithms of others, or if you're 100% sure you'll only be (for example) using TensorFlow to develop new algorithms, Python should work great.

That, and also sometimes writing something in C is the optimal solution (faster than Python, and simpler than trying to modify the TensorFlow or Theano backend).

> 2. Get a book describing ML algorithms

Which one would you recommend?

Pretty much any book on ML will describe all basic algorithms in sufficient detail for you to implement them yourself. Don't waste time trying to find the best book.
