I am a Computer Systems Engineer who is good at front end development. I want to learn machine learning next. How can a front end developer utilize machine learning?
In Feb 2015 I quit my job as a front end developer to learn more about machine learning.
First, I went through the Recurse Center, which is a 3 month program sort of like a writing retreat for programmers. I learned a lot about Python and AWS in that time, and got an internship as a data engineer.
In that Fall, I started a computer science master's. I've taken mostly courses in machine learning including: Machine Learning Theory, Deep Learning, Probabilistic Graphical Models, NLP, and GPUs. I've collaborated with two professors on research papers, which has definitely been the highlight of my degree although I definitely think the courses were necessary as I continue to use the information that was covered.
Finally, I'll be starting this summer as a research engineer doing deep learning! This process took me 2.5 years, but I feel very prepared for my new role. It is probably possible to do this faster by joining a program like Metis or Insight, which prepare you for data-science-like jobs within 3 months. I would say that approach is slightly more challenging and higher risk. If you really want to go into machine learning, I'd say doing the degree is a more surefire approach, granted it's more expensive in time and money.
I work in machine learning and I don't think a 3-month course is going to transform you into a machine learning engineer. I think you need at least the time you took. You need to learn the maths, statistics, etc., apply those tools to real problems, and get some experience with different problems and techniques. And, finally, be involved in some type of research. Because machine learning is evolving so fast, you need to learn how to read a paper, how to understand it, and how to apply it to your problem. I see no way to do all of it in 3 months. You need at least a couple of years, and some luck finding a related job, to get your first foot in the industry.
And congratulations on your achievement and your bravery in quitting the job!
As a computer vision/ML applications engineer I disagree with this. What you describe is someone who is actively implementing cutting edge tech.
That is VERY different than what 99% of people should be doing with ML which is: Spinning up some K80s on Azure, installing TF/CUDA/OpenCL, pulling existing pre-trained models off the shelf, and running inference on a novel data set.
That's how you get into it as a garden variety dev.
Otherwise, go for the PhD if you want to actually make new stuff.
You are missing a lot of things that you don't know. If you want to do machine learning at some point you have to train a model. You need to know how to clean the data, how to create the train/validation/test set, how to measure how good your model is, how to compare to other models you trained previously. If the model is not performing correctly you need to know why. You need to know the trade offs between precision and recall. This is like 95% of your work, the other 5% is running the training in Amazon or whatever you want to use.
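To make the precision/recall trade-off concrete, here is a minimal sketch in plain Python. The labels and scores are made up for illustration, and returning a precision of 1.0 when nothing is predicted positive is just one common convention, not the only one.

```python
def precision_recall(labels, scores, threshold):
    """Return (precision, recall) when predicting positive for score >= threshold."""
    tp = fp = fn = 0
    for label, score in zip(labels, scores):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif not predicted_positive and label == 1:
            fn += 1
    # Convention (an assumption): precision is 1.0 if nothing was predicted positive.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up held-out labels and model scores, for illustration only.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

for threshold in (0.25, 0.5, 0.75):
    p, r = precision_recall(labels, scores, threshold)
    print(f"threshold={threshold:.2f} precision={p:.2f} recall={r:.2f}")
```

Raising the threshold here trades recall for precision, which is the kind of decision you have to make deliberately for a production model.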
I have worked with people who take some example training code and apply it to a dataset. A few weeks later they were still pulling their hair out because the model wasn't working in production even though they had such great results on their test set. I took a look at how they were doing the training and could point to many errors that meant the model would never work in production.
That is not cutting edge, but at some point there is a new model that works better, and you should understand why in order to improve your current model. So you will probably have to read the paper and understand it.
If you're trying to build or train new models then you probably need to go to school for ML or at least math.
The garden variety dev shouldn't be trying to implement a research paper or train new models - that's the point. There are enough proven tools out there to do good work and more are being put out there every day.
> Spinning up some K80s on Azure, installing TF/CUDA/OpenCL
I think a single k80 instance is roughly ~$1/hr. If you had an experiment running 24hrs a day for a year, you'd spend a little over $8.5k. You can build an equivalent desktop machine for less than $2k [1], which might be slightly more convenient (once it's built), although I haven't really factored in energy costs.
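The arithmetic behind that comparison, as a quick sketch. The two dollar figures are the rough numbers quoted above, and energy costs are ignored here as well.

```python
HOURLY_RATE_USD = 1.00      # rough price of a single K80 instance (from the comment)
DESKTOP_COST_USD = 2000.00  # upper bound quoted for an equivalent desktop build

hours_per_year = 24 * 365
cloud_cost_per_year = HOURLY_RATE_USD * hours_per_year

# Hours of continuous cloud use that would cost as much as the desktop.
break_even_hours = DESKTOP_COST_USD / HOURLY_RATE_USD
break_even_days = break_even_hours / 24

print(f"cloud: ${cloud_cost_per_year:,.0f}/year")
print(f"break-even after about {break_even_days:.1f} days of continuous use")
```

So under these assumptions the desktop pays for itself in under three months of round-the-clock experiments.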
> That's how you get into it as a garden variety dev.
Btw, you don't really need a GPU to start learning about deep learning. You can train a SotA model on MNIST using Caffe I think in roughly 10m on CPU (maybe 1m on GPU). You can also train a reasonable sentiment classifier or natural language inference classifier in less than an hour on CPU. My perception is that these types of tasks are really solid for someone who is beginning to learn about machine learning or deep learning, as they'll provide a playground to mess around with different optimization techniques (SGD vs. SGD+Momentum vs. Adam vs. etc.), regularization (L1, L2, dropout, batch norm, etc.), data augmentation, error analysis, and so on. If you do an ML interview for an entry level position, chances are these are the types of things they will ask about.
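If you want to see what the optimizer comparison looks like before touching a framework, here is a toy sketch in plain Python. The one-dimensional loss f(w) = (w - 3)^2, the learning rates, and the step counts are all assumptions picked for illustration; the update rules are the standard textbook forms of SGD, SGD with momentum, and Adam.

```python
import math


def grad(w):
    """Gradient of the toy loss f(w) = (w - 3)^2, minimized at w = 3."""
    return 2.0 * (w - 3.0)


def sgd(w, steps, lr=0.1):
    """Plain gradient descent (no minibatch noise in this toy setting)."""
    for _ in range(steps):
        w -= lr * grad(w)
    return w


def sgd_momentum(w, steps, lr=0.1, beta=0.9):
    """Gradient descent with a running velocity term."""
    velocity = 0.0
    for _ in range(steps):
        velocity = beta * velocity + grad(w)
        w -= lr * velocity
    return w


def adam(w, steps, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam: bias-corrected running averages of the gradient and its square."""
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


for name, optimizer in (("sgd", sgd), ("momentum", sgd_momentum), ("adam", adam)):
    print(name, optimizer(0.0, steps=500))
```

Printing the intermediate values of w instead of just the final one is a cheap way to see the overshoot and oscillation that momentum-style methods introduce.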
I guess deploying ML solutions for a company you are working at is a different story.
> Otherwise, go for the PhD if you want to actually make new stuff.
There's some truth to this! PhD (like a Master's) probably doesn't make sense most of the time as a dollar-efficient career move. Rather, it's something you should pursue if you find being in an academic environment personally satisfying. You definitely don't need to be in a PhD program to work on new stuff (although it might make things easier because you will hopefully be surrounded by lots of fresh ideas). I've heard about people in bootcamps working on novel research. Now that so many powerful tools are open source and easy to use (Pytorch, Tensorflow, etc.), it's pretty easy for anyone to put together a novel model.
I would definitely extend this to running training as well, but I agree with the concept - for most people, it should be either transfer learning to adapt existing models to their data, or running training from scratch with currently known best practice methods, NN architectures and hyperparameters, but doing it on their particular datasets. Possibly by using mostly existing code and modifying mostly the data input/output routines.
Cleaning the data, comparing the models, and understanding why the results are what they are: those are huge things in machine learning. Actually, that is like 90% of my job. Training the model is nothing compared to it. As I said in another comment, I have seen people make so many mistakes before training or when comparing models. They spent weeks seeing good results in their tests while the model performed like a random classifier in production, just because their training setup was wrong, they didn't know how to compare models, etc. Machine learning is not like learning a new framework. You can learn the framework and use it, but you are going to make many mistakes because of all the other machine learning knowledge you need.
I think you need a bit more competence to get into the training realm though, because it's a bigger step to create a new model - especially the hard step of data labeling.
Unless you have a novel data set and a way to quickly train you're probably better off using existing trained models in most cases.
I agree with the transfer learning piece wholeheartedly though.
Data labeling isn't hard, it's labor intensive, which is an entirely different resource. If the business goal is valuable enough, then a non-tech manager without any special expertise can organize twenty man-months of grunts to do the labeling, three man-months of cookie-cutter junior dev work for labeling and data management tools, and a single man-month of an external consultant with proper expertise to write sensible guidelines on how the labeling should be done and supervise the process. All of which will cost something comparable to the annual cost of a single ML developer.
Training models often is tricky, but it's not that hard, my experience shows that decent undergrads learn to train standard models on their own datasets after a single one semester course, and train quite difficult models after two semesters; so teaching/learning basic ML takes comparable time and effort to e.g. teaching/learning basic JS frontend development.
So if some company's IT department has some minimum ML skills, lack of expertise shouldn't be preventing them from training models. And even more so, using your own data (IMHO) is the whole point of adopting ML; if the problem is so generic that you don't need to adapt it to your data, then you shouldn't be learning to use ML but rather buying and integrating a SaaS API run by someone else.
Which is a form of hard... for example, if you need 60,000 semantically labeled images, you need to train people to do that specific kind of labeling and then have them do it, then QC the data, break it up into training and validation sets, etc.
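The "break it up" step is the easy part, for what it's worth. A minimal sketch, assuming the labeled images are referred to by integer ids and that a 10%/10% validation/test holdout is acceptable (both are assumptions for illustration):

```python
import random


def split_dataset(items, val_percent=10, test_percent=10, seed=0):
    """Shuffle once with a fixed seed, then cut into train/validation/test."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_val = len(shuffled) * val_percent // 100
    n_test = len(shuffled) * test_percent // 100
    val = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]
    return train, val, test


# Hypothetical ids standing in for 60,000 labeled images.
image_ids = list(range(60000))
train, val, test = split_dataset(image_ids)
print(len(train), len(val), len(test))  # 48000 6000 6000
```

The fixed seed keeps the split reproducible, so results from different training runs stay comparable. The labeling and QC before this point are where the real cost is.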
Don't forget that this advice is for a front end dev who hasn't ever touched caffe or torch or whatever. In many cases it takes new people a week to set up drivers and an environment on a GPU.
Any insight for someone that already works as a bioinformatics scientist that wants to move into deep learning?
I've got the programming, math, some stats, and am currently involved in research, but I only know a little about deep learning. I'm currently working my way through the course.fast.ai deep learning courses and am going to do Part 2 when it is released.
Any other resources that would be useful for getting a job in this area? Best to just work on my own projects?
I would tell you to go to Kaggle and get your hands dirty in any competition related to bioinformatics. As you already know the domain better than other people there, you can focus on learning deep learning.
As for the job, I don't know. You can post your CV here in the monthly jobs thread, and also take a look at the companies searching for people with bioinformatics experience who want to apply deep learning. I think a startup is a good place to find a job and improve your deep learning skills.
My undergrad was in computer science, although I would say this wasn't the case for most of my classmates. There were a lot of students from other engineering fields (like Mech E or civil), Math, Stats, physics, bio, and business.
I did my undergrad at U. of Michigan, and my master's at NYU. I would highly recommend both programs, although there are many great ones out there. I probably have a list of ~30 schools that I think would be excellent for a master's in CS with a focus on machine learning (although it is worth considering a master's in Data Science, as this makes sense if ML is your primary interest).
I am happy to give more targeted advice on grad school. Please send me an email at andrew [at] mrdrozdov.com.
On a general level, did you or your classmates generally fund these master's degrees yourselves? And what is the career path / expected compensation after completion of the program? Asking as a web dev with only a few years' experience but making ok startup-world money. I'm curious, being interested in such a route but also a little older and trying to catch up financially after a pre-tech career.
I believe it is similar compensation as a software engineer with a specialty (front end, data engineering, databases, etc.). I would not recommend this career path to anyone that is in it for the money, but rather to do it because they find machine learning personally fulfilling. These other specialties can be personally fulfilling as well, and many of my close friends have a specialty that is not machine learning and are very happy with their career.
I received a handful of emails specifically asking for the list of schools I was referring to. Here is a list of 31. There are certainly schools that I have missed, but I think that any of these programs offer a strong curriculum and community that would prepare a student for a career working in data science.
Liberal arts troll reporting. Starting my prereqs for a masters in CS this summer. It'll take me quite a bit longer to complete than someone who started with a BS in CS (basically I'm taking enough undergrad courses to fill several semesters, before even getting to MS work). However I was a developer for several years before this leap, and I feel pretty comfy with math things. Also working on reproducing papers to develop chops, in my spare time so my dev skills don't go to nil. Excites me since I've always wanted to grow to this level as a dev, I just never had an excuse to go for the CS degree till ML came along.
Consider a MS in Math (or another BS) vs. CS. The Math will transcend the pace of CS...machine learning today, what will it be tomorrow? Whatever it will be it will need to leverage math (at least initially so that others can stand on their shoulders).
I'd also like to know more about your Masters program. My first degree was in graphic design and I've been working as a developer for a few years. I'm currently taking undergrad courses part time in CS and have been looking through Masters programs. Thanks for sharing and also best of luck!
The response makes it sound like they were accepted to a MS program under the condition they complete undergrad courses. I am curious which MS program this is.
Your question says "utilize" machine learning. I was in the same position. It struck me that rather than go off and try to catch up with half a century's research, and contribute nothing, I'd be better off utilizing machine learning in my front-end work.
So I have started using IBM's Watson platform, and some of Google's AI tools. I was specifically interested in speech processing applications (I have some background in signal processing and audio, which helps a little), and I've found the Watson stuff particularly useful.
At the end of the day, it depends on your motivation. If you really want to become a true expert, stop reading this and start studying. Otherwise, I think there does exist a significant "gap in the market", as it were, to build useful front ends to these technologies, which currently exist as raw APIs.
In terms of career prospects, I have already met several Watson consultants, who do exactly that, and charge top dollar for it. The plain fact is, it doesn't take very much to be considered an AI "expert" in the current climate. And you're probably more likely to get there quickly by standing on the shoulders of giants than by dedicating your life to a PhD.
I have no clue about Watson and machine learning; I mean, I've talked to people and quickly glanced at the platforms. But in terms of the speech processing, does that not require any back-end at all? I'm assuming you're using the browser's API to capture sound and then passing this off to Watson in some back-end method to analyze? Or can this seriously be done all front-end? Pretty amazing if so.
It's not that the actual ML computation is done client side -- that would be way too slow. With Watson and many other ML platforms, you're leveraging someone else's (IBM, Google, Microsoft, Amazon, etc.) computing power.
Some people have made javascript-based ML models to run in the browser (I think some were made for this course[1]), but these are for educational purposes rather than actual use.
Well, I am currently developing a full-blown back-end service using AWS containers, but for a prototype I got it going as a simple Python script, based on an outline I found on github. Took a few hours.
Basically you get your source speech as an uncompressed WAV file, create an IBM Bluemix account (free trial), create a Watson "app" on the site (basically gives you some credentials for calling the API), and then write a script to upload your WAV file to the API and decode the JSON response.
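Roughly, the script has the shape below. This is a sketch only: the URL, the API key, the bearer-token header, and the JSON layout of the response are all placeholders I'm assuming for illustration, not the documented Watson interface, so check the service's own docs for the real endpoint, auth scheme, and response fields.

```python
import json
import urllib.request

# Hypothetical endpoint and credential; substitute what your account gives you.
API_URL = "https://example.invalid/speech-to-text/recognize"
API_KEY = "replace-me"


def build_request(wav_bytes):
    """Build (but do not send) a POST request carrying raw WAV audio."""
    return urllib.request.Request(
        API_URL,
        data=wav_bytes,
        headers={
            "Content-Type": "audio/wav",
            "Authorization": f"Bearer {API_KEY}",  # assumed auth scheme
        },
        method="POST",
    )


def extract_transcripts(response_text):
    """Pull the top transcript out of each result in an assumed JSON shape."""
    payload = json.loads(response_text)
    return [
        result["alternatives"][0]["transcript"]
        for result in payload.get("results", [])
    ]


# Made-up response in the assumed shape, so the parsing can be tried offline.
sample_response = json.dumps(
    {"results": [{"alternatives": [{"transcript": "hello world", "confidence": 0.9}]}]}
)
print(extract_transcripts(sample_response))  # ['hello world']
```

To actually call a service you would read the WAV file in binary mode, pass the bytes to `build_request`, and hand the result to `urllib.request.urlopen`.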
It gets more complex when you want to start parallelizing the process to make it faster, and dealing with the results in an intelligent manner, but the initial proof of concept is remarkably easy.
If I recall, the Google one was even easier - no script at all, did it all with curl I think.
I am a long time developer who is trying to move into data science / ML.
What I've found after taking off 12 months for self-study is that it quickly dissolves into: you must know math. As far as I can tell, it's: take a problem, map it into a vector space, then use the full power of mathematical analysis on it.
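A tiny illustration of what "map it into a vector space" can mean in practice: turn each document into a vector of word counts, then compare documents with cosine similarity. The three sentences are made up, and bag-of-words is only one of many possible mappings.

```python
import math
from collections import Counter


def vectorize(text, vocabulary):
    """Represent a text as word counts, one dimension per vocabulary word."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up documents for illustration.
docs = [
    "the cat sat on the mat",
    "the cat lay on the rug",
    "stocks fell sharply today",
]
vocabulary = sorted({word for doc in docs for word in doc.split()})
vectors = [vectorize(doc, vocabulary) for doc in docs]

print(cosine_similarity(vectors[0], vectors[1]))
print(cosine_similarity(vectors[0], vectors[2]))
```

Once the text is a vector, "similar meaning" becomes "small angle", and the usual linear algebra applies.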
There is a huge push by large companies to make AI as a service though, and for that, you only really need to know how to use the APIs.
Which makes me wonder, why in hell have I burnt through all my savings for this. Sure I have a new found love for math, but I'm not going to be accepted as a mathematician ever without the rigor of a formal education, and if I just wanted to use APIs ... I could have continued to do what I was doing.
> Sure I have a new found love for math, but I'm not going to be accepted as a mathematician ever without the rigor of a formal education
I think there are two separate things here:
1. Being a research mathematician requires a degree of expertise which is easier to come by with formal education.
2. Knowing how to use APIs and make correct distributional assumptions; despite the bullshit fed by our industry, it is neither easy nor trivial to design a completely idiot-proof API. So knowing how the math works under the hood is helpful even if you are just going to use the API.
You don't have to be a mathematician to do machine learning. Jeff Dean said that almost every engineer working at Google should be capable of picking up the math for machine learning [1]. In particular with deep learning the math is quite manageable.
I would encourage you to take part in some Kaggle competitions to get a better feel for the practical aspect of machine learning.
For sure work through a linear algebra book, a stats book and some probability theory. Some basic calculus as well. Work through all of the simple examples. Just like programming..
That level of math helps to model the problem domain. The part of modeling the problem is to see that everything in ML is a graph. So you can look at it from that point of view as well, at least computational wise. Mapping the math to the graph is the heart of it all.
Kidding aside, I've seen these kinds of posts so many times, and for those who are thinking "strategically" about their profession, career and passions, I would advise buckling down with a good BS in Math at a minimum (or CS).
But why? I'm your older self telling you that you will grow to really really like and enjoy programming, computers, tech etc. and may want to continuously dive deeper. And when you attempt to do that it all comes down to Math. So save yourself a ton of time and money and just do it, close all your browser tabs, cobble together all your transcripts and get into a Math program (if you already have, go get a MS in Math at a uni that has a strong CS program).
That's exactly how I feel. I dropped out of school at 19, taught myself to code, went to a hack school, and have been working as a Rails dev for the last 4-ish years. I realized I wanted to go into ML, and instead of trying to take shortcuts, at 24 I enrolled in my local community college to finish a degree in comp-sci. Looking at 8+ math classes (from Geometry to Calculus 3), but the only reason why I'm doing it is because I believe it will be worth it in the long run.
The only thing that sucks about it is all the GE classes I have to take before I can even start my fun Comp-Sci classes.
Hey it's me, your past. Really though, I'm currently 20 and dropped out of school last year (GE's kicked my ass), and started working full time as a prototyper. I know I am probably going to have to go back at some point, but what really made you decide to go back? Was it for a higher salary, or better fulfillment?
I want to master my craft, and I believe the way to do that is to have the application and the theory. Being a Rails + mobile dev teaches a bunch about the application side of programming and I find it fascinating.
In terms of the age thing, I get it. At 19 I hated being at school because I wanted to be making $20+/hour working on websites. But now at 25 and right under 6-figures, I see that I can simply work during the day and then go to school at night. It just means I have to do that for 4+ years. I started at 24, but I will not be done until I am 30. So, if I would have done it right the first time, and not dropped out, I __might__ have been further.
I don't regret it tho. I rolled the dice and although I was trying to hit the start-up lottery at that age (why I got into programming), I ended up on a great career with incredibly valuable experiences that I would not have had if I didn't drop out at 19.
I'm not sure if I agree with this. This is like saying that the very essence of cooking is chemistry and that in order to fully understand and appreciate how cooking works, a chef should get a MS in Chemistry.
Thanks for sharing this. I immediately like his energy... I find KNN and TSP very interesting problems and I see he has videos on those... bon weekend!
Thanks! There is a live session now (started 34 minutes ago), as every friday. He's apparently coding a Minesweeper. It seems like a really good resource.
A lot of bootcamp / self-trained devs went into front-end JS for the money. Now that it's saturated, time to move onto the next buzz word. A foundation of CS fundamentals will still be lacking.
It's great to do jobs for money, I just think programming isn't something you can pick up in an afternoon and be good at, like mowing lawns or making sandwiches. I can't trust the journeymen for high tech jobs, maybe for editing wordpress or something.
The same thing happened in the 90's: my coworker's sister got a "Certificate in HTML" and promptly found a job for $80k in 1998. These people wash out. I've been burned enough times by bootcamp grads that I won't hire them anymore.
I've never found myself not knowing what I should do, or threatened by a hiring/skills bubble.
When I stopped feeling like I was getting anywhere with front-end I started to take on more back-end projects and sell myself as a full-stack developer.
I've never been in a situation where I've felt "man, I really should have gotten a CS degree": but what do I know; I'm just a web developer.
Maybe we're not talking about web developers?
No advanced engineering skills have been required to make the frontends/backends/architect a system for Fortune 500s and local businesses that I've been involved with. (I'm speaking of things I would have learned in school with CS and math principles had I been able to get past CS 101).
-- college dropout with successful programming career.
I am a bootcamp grad who specialized in electrical engineering as an undergrad. I worked for a few years as a researcher for one of the top tech universities in the country; I have found that I have a stronger math and physics background than many CS grads. However, many times when doing something I have been stopped by someone saying "wait aren't you a bootcamp grad? can you even understand this?"
I'm not alone; my bootcamp had plenty of STEM majors. One of my colleagues was a biomedical engineer who worked in a research lab and now works on the front end at a top firm in SF. Another was a math major at an Ivy before working full stack. The top guy at my bootcamp went to Berkeley in Biochem and was way way smarter than me.
So stop generalizing. I understand that they are unaccredited institutions, so you get a wide variance of talent, but you can't shit on everyone.
To be honest, your issue is probably in how you screen talent. My current company has found success in hiring from bootcamps and top schools in the area (Berkeley, Stanford).
It's just a very simple fact that 90% of the bootcamp grads can't program their way out of a paper bag. In fact, most self-trained programmers are vastly better than even the best bootcamp graduates.
The problem with bootcamp grads is that they don't know what they don't know. And they don't know a lot. Undoubtedly some of them will turn out to be great programmers, but not after 3 months. Or even 6. Or even a year.
Again, I think you are overgeneralizing, since I follow the careers of my class and at least 1/3 of it is doing extremely well. I think you should be able to program your way out of a "paper bag" if you work at companies like Pivotal Labs, Microsoft, Pinterest, Airbnb, etc., which is where my classmates work. Maybe I'm not interacting with the right grads, or don't have a representative sample, but then again I know you pulled that 90% number out of thin air.
And you know how I know what I don't know? I constantly read, get mentorship from senior engineers at my company, build side projects, etc. The learning process hasn't stopped and it hasn't for many of my colleagues.
I do agree self trained programmers are better, because frankly that's way harder.
Yeah people with hard science degrees and/or from top schools, I'm less skeptical about. The guy who was a waiter 3 months ago and now saying he's a software engineer, very concerning.
It is 100% possible to be an ace developer without a CS degree. However years of hiring experience has made my bullshit detector ring alarms with self-taught / bootcamp guys. If they have a hard-science degree it's less risky. Honestly I don't bother interviewing these people anymore without a warm introduction, it's hard enough to find good devs and the ratio of good to bad is so much worse with the self-taught crowd.
I'm a self trained dev (aka horrible person?) who's spent the last 8 years doing back end dev work - mostly C#, Java, Python and now Scala in everything from SV startups to a government research lab. Either I'm an outlier, or you're overestimating the importance of those cherished fundamentals (which can be picked up independently of a degree anyhow).
I think 8 years of experience plus solid references and projects speaks for itself. It's a resume that lists a 3 month boot camp, a github full of forked projects with no original commits (plus the obligatory ToDo app in rails), Senior Software Engineer title, $100k salary demand, leaving their first professional programming job after 6 months, can't speak knowledgeably about basic shit like IPC. I don't... have time... for this anymore.
Can't you just filter this out with an hour coding challenge?
I don't even know why you are screening these candidates, the profile you are talking about doesn't even hit our hiring team usually or gets binned within 5 minutes.
I really think you have a recruiting process problem.
Posts like this definitely give me a nice flare-up of imposter syndrome. Had not heard the acronym IPC. So I guess I don't even know basic shit.
As a bootcamp graduate, what do you want me to ... do? I can't go back in time and major in CS. My employer is satisfied with my work and I build things I'm asked to build independently. Should I give up a startup salary, inflated as it may be, and ship myself off to a CS monastery?
I'm a self-taught developer with significant programming experience who has worked as CTO of a successful startup (multimillion dollar exit) founded by others, has sold one of my own businesses for six figures, and who currently runs several independent, profitable projects.
In more than a dozen years of programming, never once have I had to talk to someone about IPC. And if someone pulled that sort of holier-than-thou test-by-acronym-recognition at a job interview I would laugh at them unless the salary is literally 200k and in an area which involves kernel hacking, which is pretty much the only situation in which I think it would be reasonable.
Reality is that there is a bias against hiring self-taught people and many common practices like these are basically screeners against that as opposed to measures of actual skill and expertise. One tip if you are job-hunting is to ask potential employers to review your code before you jump through their interview hoops. The places/people who are serious about hiring will do it. Those who will not are more likely to filter using this sort of arcane minutiae and are not worth your time unless they are paying for you to attend the interview.
Your salary expectations are also quite reasonable fwiw.
IPC is used fairly commonly by desktop programmers. IPC is just communicating with another process, and technically web developers are doing it as well.
What most people mean is something like COM (Windows), named pipes, memory mapped files, etc. that allows processes on the same machine to communicate with each other.
If I was hiring for desktop software or pure server software, I would expect someone to know some IPC mechanisms.
It's awesome. I have a library stacked full of software and CS books. Maybe try to get your bachelors at night? I got my master's at night while working.
This is a craft. If you want to be an expert, train like an expert. Realize 3 months of a bootcamp isn't going to cut it, not at least until you've been working for 4, 6, 8 years.
If you can't tell me what inter process communication is, what it's used for, pipes, signals, etc. then I think you have some pretty big gaps in knowledge that preclude you from being an expert at this point.
Don't worry about it. Keep working and learning and focus on doing great work that makes people happy. As long as you take the time to learn why something works, rather than just cutting-and-pasting from StackOverflow, you'll be fine.
Frankly, the recommendations @seibelj is making are in a particular niche--OS fundamentals--and one that I'm guessing makes him feel smart knowing about. But they aren't necessarily relevant to you, or important to know. It depends on what you work on. Some people have trouble realizing that their pet interview question isn't actually as universal as they think it is.
PS: IPC is "Interprocess communication," and it's how you can have multiple processes coordinate with each other. You may have heard of pipes or sockets--those are for IPC. (Technically, so are files.) If not, don't worry about it. I have over 20 years' experience as a professional developer and while IPC primitives like pipes and sockets have come up from time to time, it's hardly central to my work.
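If it helps to see one, here is about the smallest pipe-based IPC example I can think of: a parent Python process sends text to a child process over the child's stdin pipe and reads the reply from its stdout pipe. The child program and the function name are made up for illustration.

```python
import subprocess
import sys

# Child program: read everything from stdin, write it back upper-cased.
CHILD_CODE = "import sys; sys.stdout.write(sys.stdin.read().upper())"


def shout_via_child(message):
    """Send a message to a child process through a pipe and return its reply."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        input=message,        # written to the child's stdin pipe
        capture_output=True,  # child's stdout/stderr are pipes back to us
        text=True,
        check=True,
    )
    return result.stdout


print(shout_via_child("hello from the parent"))
```

Anyone who has piped one shell command into another has already done the same thing without calling it IPC.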
Thanks for the very nice words. I realize, while still learning and progressing, at this stage in my coding life I'm an electrician, not an electrical engineer. Which can be ok - the world needs electricians too.
Not trying to feel self important, I'm saying why I'm not hiring bootcamp people anymore. I could go through a litany of topics from OS, algorithms, networking, architecture, languages... if you are a magic box programmer, don't be shocked if I ask you how the magic box works and you have no clue.
Um. You do know that a lot of folks manage to graduate with a degree and basically have only ever programmed in one language like Java and wouldn't recognize a unit test, build system, or a version control system if it bit them, right?
Not to mention starting every task with implementing a linked-list or sorting algorithm from scratch.
Eh, try to settle the impostor syndrome down. It's hard, I know, but not particularly warranted in this case.
Day to day, I don't think I know anyone who would actually use the acronym IPC over the expanded term "interprocess communication", assuming the subject even comes up.
The generic term just isn't particularly useful, given the wide variety of mechanisms included. Instead, you will hear developers talk about pipes, sockets, ports, connections, queues, and so on as may be warranted.
Knowing when to use a pipe and when to use a socket is important, but the fact that both are grouped together as "interprocess communication" mechanisms along with a bad idea like shared memory really isn't.
I'm about a year into a transition from traditional software dev / engineering leadership to ML / robotics engineering, starting with 6 months off of personal study and now continuing study while on the job as a research engineer. I wrote about my experience and give advice on how to approach learning ML here:
It's hands down the best course for getting your hands dirty with the latest state-of-the-art stuff and then learning how it works. It takes a completely different approach from most courses: it is top-down.
Do this first; you can immediately apply it to cool stuff like image classification, NLP, etc.
The assignments have additional resources where you can get into more detailed math and dive even deeper (the course doesn't dumb things down; it just gives more intuitive explanations).
I have worked with AI and NLP guys. How I have seen this work out: there is a problem X. They get the best, most recent, respected research on problem X. They implement it; most of the time it's already on GitHub.
If it doesn't solve the problem at hand, they shrug their shoulders and say something like "it's the Stanford NLP parser, can't do better than that!"
The concept of "getting into AI" confuses me. Do we need more people to git clone AI repos? Or are these people truly interested in AI research? At that point they should be looking at a PhD.
Then people pile on: "learn how an NN works!" Uh, why? Anyone can git clone and set up nodes. I am missing something. Please help.
So I have been doing what I shall call applied machine learning since I was in college when I built an ad classifier for a web crawler I was building at the time. I made the real transition while working in the search team of a web company almost 10 years ago now.
Let me first say that I am unlikely to ever design a new novel algorithm like an SVM kernel. I have however studied ML theory extensively and have a good grasp of the underlying math. I also had the advantage of working in medical research starting in high school and even before college I had learned a lot about statistics and was comfortable using a tool like SPSS to perform ROC analysis as well as gaining a solid understanding of what real statistical rigor was.
I, and those I know and work with, do a lot more than clone some repos from GitHub and see if they work. Typically there is some sort of business problem that needs solving. Sometimes we know of an approach that will work, but often a literature survey needs to be conducted to see if anyone has solved a similar enough problem and written about it. I am comfortable reading ML/NLP literature and evaluating the methodologies described. Often there is some open source stuff to get us started, but rarely (I can't think of any cases, but it's early in the morning) have I been able to put together a complete solution without solving some difficult problems on my own.
If I were to give someone advice, it would probably not be the advice they want, but here goes. I assume that the person already has a solid mathematical foundation like engineering calculus.
1. Start by getting a solid foundation in statistics and probability.
2. You will need a foundation in linear algebra.
3. Find a mentor(s) that can help you with both the theoretical side of ML and the applied side. In my case they were different people.
4. Implement some learning algorithms from scratch. I built a NN library a long time ago. I never used it in a production application but the learnings it gave me are still invaluable.
5. Read the research. You need to feel comfortable picking up a paper, understanding it, and evaluating whether you should believe the authors or not.
Maybe there are shorter roads. Personally I don't believe so. I was lucky to be paid to learn these skills through my career. I am sure there are people who are smarter than me or who can just learn by reading. I learn by doing. But this has led to success for me and I think gave me the ability to succeed in different environments, using different technologies, and long before the entire world was so enamored with deep learning.
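To make item 4 a bit more concrete, here is a minimal sketch of what "implementing a learning algorithm from scratch" can look like: logistic regression on a single feature, trained with plain gradient descent using only the Python standard library. The function names, the learning rate, and the toy data are all made up for illustration; this is not the NN library mentioned above.

```python
import math


def sigmoid(z):
    """Logistic function: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, lr=0.5, epochs=2000):
    """Fit p(y=1 | x) = sigmoid(w * x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            # Derivative of the log loss with respect to the linear score
            err = sigmoid(w * x + b) - y
            grad_w += err * x
            grad_b += err
        # Step against the mean gradient
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Predicted probability that x belongs to class 1."""
    return sigmoid(w * x + b)


# Toy, hand-made data: small x values are class 0, large ones class 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0, 0, 1, 1]
w, b = train_logistic(xs, ys)
print([round(predict(w, b, x), 3) for x in xs])
```

Writing even something this small forces you to work out where the gradient comes from, which is the kind of learning the list is pointing at.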
Do you have advice on whether it is worth going back to school if your goal is to build novel and useful software/AI tools using deep learning, not necessarily improving the algorithms themselves? Would you still expect to need those 5 things you listed as advice?
Yes, to actually improve the state of the art, you start a PhD, I agree.
But there's still also a lot of work that people can do applying the "Github repositories" to new problems. And to do that effectively, you also have to know stuff (e.g. you need to be able to read the most recent research, know when tool X is appropriate over tool Y, know what preprocessing makes sense in a given situation, etc). There's money to be made there and people want to do that work.
Stanford parser is very good for preprocessing data: things like part of speech tagging, named entity recognition, and dependency parsing. If you want to do something fun and interesting with your data, you will probably need to implement it yourself. Note, there are lots of other go-to tools nowadays besides the Stanford parser. Things like GloVe embeddings, open source translation systems (harvardseq2seq), and open source sentence encoders (facebook fasttext) are probably necessary in many NLP pipelines.
When things "just work" with off the shelf tools then you probably don't need the researcher (although sometimes you will need them to just find the right solution/tool). When things don't work, you will need them. I guess this can be said about many fields though? (Databases, front end development, etc)
The big gains now are taking these pre-processing tools in speech, vision, NLP etc and using that as input to a NN for some problem domain. Every one of these is a startup.
http://www.fast.ai/ is a MOOC, Coursera has MOOCs on the topic, and there is also Udemy. The knowledge you get out depends on the effort you put in, just like front end development.
I did Udacity's Deep Learning and Machine Learning courses and found them to be a really great introduction, and enough for me to continue on my own. I'm not a front-end dev, but also definitely not a mathematician or computer scientist. You should be able to learn enough to do some cool stuff.
- Make sure you know Python well - pretty much everything interesting in ML is in Python. If you already know JS this shouldn't be too difficult.
- Learn you some classic ML using scikit-learn and some online course.
- Learn Deep Learning using TensorFlow, Keras and/or PyTorch with one of the online courses.
- Get in the habit of reading new papers in ML (which are out on a daily basis) and replicating the results.
Cornell University Library [https://arxiv.org/] has a lot of publicly available papers. You need to find the domain that you are interested in and start looking. Usually, papers describe related work and state of the art algorithms. So you can start with one algorithm and go deep.
How can you utilize it? It's not "front end" per se, but you can use it to build recommendation systems and predictive models for users. Basically, anytime you need or want to make a decision involving some amount of uncertainty.
It depends on whether you want to be a data engineer (lots of need for that), a software engineer who develops code for data scientists (Software Engineer: Machine Learner - also lots of demand), or a data scientist.
The latter 'might' be a challenge. I started looking into LinkedIn profiles of data scientists at top tech companies after I realized there were more Wharton MBAs working as data scientists than there were people with a master's in CIS from my program.
So far, 50% had a masters/phd in statistics, 25% Information / Data Science, 25% business background. However, I am still looking into it as my sample is small and I have selection bias in my sampling.
That's probably because "data scientist" isn't a job description, it's a trendy title for a whole constellation of jobs that involve looking at data. I would expect that the Wharton grads are probably doing work that is fundamentally different from most people with hardcore academic stats backgrounds.
I think this sentiment is roughly correct. It's basically the same for programmer as there are many jobs that involve programming. Note that data science doesn't necessarily mean machine learning, and there are lots of data scientists who do powerful work with stats that doesn't involve anything that looks like ml.
The jobs for people working in machine learning tend to fall into: software engineer (machine learning), research engineer, and research scientist.
Yann LeCun has an excellent Quora Q&A where he delves into the differences of the roles. I do not have the link on hand but it is probably easy to find.
Just like you use the socketio API without knowing about TCP/IP or network routing protocols, you can treat machine learning as an API which does things.
You don't need to take a machine learning class. Just treat it as an API.
And you never know, you might be already using some api which does machine learning.
I literally just grabbed the last (at the time) machine learning paper that'd crossed my feed, and started reading.
Whenever I encountered something I didn't know, I googled around until I had even a vague idea.
That said, a) I don't know how do-able that would have been without the math background I've already got, and b) I haven't gotten much further in that reading than that one paper ...yet.
Regarding the math - The only thing I know for sure to point you at is multivariable calculus, which is also (probably) the most useful math class I ever took. (It helps that the teacher was completely amazing). The set of concepts it introduced me to are amongst the most useful things I ever learned, and it's that understanding that has always given me a leg up on jumping into things at the deep end.
(I took multivar from UC berkeley in, I think, spring 2002? I went looking for video lectures, but did not find what I was looking for...)
From the description: These tutorials have been chosen to maximize learning curve, i.e. learn the most in the shortest amount of time and cover topics from basic deep learning all the way to research done within the last 1 year.
They cover significantly more material than a typical deep learning course and took me less time. Good luck!
I would suggest running through a tutorial or two using one of the hosted machine learning options (Amazon has one I am working with right now).
This will let you focus on the essence of machine learning (data gathering and cleansing, interpretation) rather than the mechanics. The gathering of data and building of intuition about results are by far the hardest parts of machine learning, in my experience and reading. This is especially true if you are just getting started.
Plenty of time to focus on the mechanics later.
(Full disclosure: I am working on an ebook about Amazon machine learning; link in profile.)
If you want to get more hands on, I'd look into Tensorflow. You can search github for popular projects using it to get some ideas. I haven't played with it in awhile.
Find a company with a data-science department, which needs engineers to industrialize their proof-of-concepts (POCs). The POCs might often include front-end work. Once you are in, learn machine-learning on the side and try to transition into a machine-learning role.
EDIT: On top of it, if software engineering is your strength (testing, automatic deployment, etc), data-scientists will also benefit, since you can show them how to professionally develop a sw product.
I have a friend applying for a job doing just this: being the software developer to help the data science folks operationalize their models.
Not sure how applicable this is to a straight up front end developer, though. I would definitely suggest learning Python if you head this route. R would be good as well.
This doesn't answer your question, but it is related in some way. A few days ago I saw a JS library for implementing neural networks. I didn't dive into it, but it looks like something for doing simple things in the browser: https://synaptic.juancazala.com/#/
I can't recommend [Practical Deep Learning for Coders](http://course.fast.ai/index.html) enough. I am currently going through it and it is extremely educational and practical.
Hey, I'm in a similar boat. Is your motivation more curiosity or career-related? Do you really like front-end code? I'm finding that it's an easier path to get into ML through conversational UX.
Related question: If I only had an MIS undergrad degree (some 200 level stats and no other math) and wanted to get into machine learning what kind of math courses would I need to pick up to become proficient?
A front end developer can most easily jump into machine learning by reading the docs for a machine learning API then querying it. Won't take years and years and no PhD needed either.
I have not finished it, but it seems good, so you could check it out. This tutorial uses Processing to teach neural networks and relies on Processing's graphical feedback to visualize what is happening: https://medium.com/typeme/lets-code-a-neural-network-from-sc...
If you can get through it on independent study, Bayesian Reasoning and Machine Learning by David Barber is incredible, dense, but very accessible. It's also free online!
One is very programmer heavy. There's a lot of data processing that goes into machine learning, but you can't quite separate the two. Often, the programmer who prepares and processes the data needs to write the code that actually runs the model and parses the result, and this means you benefit from more understanding of machine learning. That role is most likely the best one available to programmers who don't have much mathematics background in this area.
To really get into machine learning itself as a data scientist, though... I do think it requires some math. There's a reason a large percentage of people who work in this field have an MS or PhD in a very quantitative field. And I don't just mean algorithm designers - to really be able to explain the difference between naive bayes, random forest, neural nets, and logistic regression, it helps enormously to have a background in math.
To illustrate this, I've taken two coursera courses on data science. They were both excellent, but approached from different angles. Bill Howe's data science class involved an exercise to use a random forest to do some classification, but the focus was on calling the scikit-learn library. We did of course review the algorithm, but not in mathematical depth.
Andrew Ng's course on machine learning got into implementing the algorithms (with a language called Octave, which honestly I didn't like much, but that's a completely different topic where plenty of people would disagree with me). To do that class, honestly, I'd just ask if the terms "vector calculus", "matrix of second order partial derivatives" or "logistic function" mean something to you. It's ok if you can't define these things on the spot, but was there a time when you could? You can get up to speed, but I'd say if you haven't taken basic calculus through differential equations (with linear algebra), then you won't be able to understand this material.
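For anyone checking themselves against those terms, here are two of them written out. These are the standard textbook definitions, not something taken from the course materials: the logistic function, and the "matrix of second order partial derivatives" (usually called the Hessian) of a function f of variables x_1, ..., x_n.

```latex
\sigma(z) = \frac{1}{1 + e^{-z}},
\qquad
H_{ij} = \frac{\partial^2 f}{\partial x_i \, \partial x_j},
\quad i, j = 1, \dots, n
```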
I've been impressed with how well people learn on their own, picking up a lot of math as they go along. And I don't think you need to be able to implement these algorithms yourself to use them meaningfully. But if you're going to be deciding what kind of model to use, even if you're using libraries to do it (and most people who can implement these algorithms would still use a library), I think that you do need to be able to describe how a neural net works vs random forests vs logistic regression vs naive bayes. There is a side of this that is very math-y as well.
On the bright side, we live in an era of amazingly available learning material. I personally think a dedicated person can probably learn calc, linear algebra, and differential equations through web-based coursework now.
So overall, I'd say: start on the data side as much as possible, leveraging your programming skills. While doing this, keep getting more exposure to ML algorithms, and make sure you are taking a Coursera or other web-based class on the side.
If you really want to learn machine learning you should pursue a PhD. You might be able to do very simple stuff and follow tutorials because there are very high level tools around, but to really know what's going on and how to use it to produce new products and services, you need to master it properly.
While it may not require a Ph.D., effective use of machine learning does require Ph.D.-like scientific skills. Even the link you mention talks about reading research papers, and building good models does require scientific rigor.
Hmm... for this statement, I guess it depends on what you would classify as "machine learning".
From what I've read on machine learning, a lot of the more basic techniques are statistical methods (linear regression, logistic regression, random forests, Bayesian statistics) that are taught in master's-level statistics courses at most, not at doctorate level. If I remember right, basic linear regression even showed up in stat 101.
I realize that many of these techniques can't solve some of the problems the deeper, more complex machine learning techniques can (for which your Ph.D statement might be right). But not every problem needs a very complex solution.
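To show how little machinery the basic end of that list needs, here is a minimal sketch of simple linear regression (ordinary least squares with one predictor) in plain Python. The function name and the numbers are invented for illustration; in practice you would normally reach for a library.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of x and y, and variation of x, around their means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data that lies exactly on the line y = 2x + 1
hours = [1.0, 2.0, 3.0, 4.0]
scores = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(hours, scores)
print(slope, intercept)
```

The same closed-form idea generalizes to several predictors, at which point the linear algebra mentioned elsewhere in this thread starts to matter.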
Well the article builds around a very superficial view of ML.
If you want to do simple recommendation systems or spam filters, then OK. Those are solved problems, hence commoditized.
If you want to build novel things, you really need academic-grade ML.
If you want another argument, I came from working in VC and startups, and they think they understand ML. Boy, they really don't. They are like kids pretending to play a guitar that can't strike a single chord right.
Different approaches suit different people, and a PhD is a relatively specialized route. It's good to have people targeting similar goals with different approaches.
For an anecdote, I recall hearing one of the Kaggle founders mention that many of their bounties are won by non-statisticians/ML-ists. Producing novel (in the academic sense) stuff is unlikely outside of an academic setting, but producing products or solving problems is do-able.
Edit/comment: no need to downvote rsrsrs86 people. He's putting forward a position and defending it, not trolling. If you disagree, then disagree. The whole point of a thread like this is hearing people's take. Surely, PHD is a valid suggestion.
Kaggle competitions are very restricted in the sense that they are supervised learning problems. This typically results in applications in analytics. That should be easier to get into.
But ML can do much more than analytics, and much more than supervised problems. And the great problems to be solved are not supervised problems. They involve learning as you go, without a clean database with examples to learn from. They are adaptive problems.
You might optimize prices in an online retail player by trying to estimate supply and demand curves, but you will fail, and the best way to do it is not much different than teaching a neural network to play video games, but is fundamentally different from supervised learning and regressions.
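As a toy illustration of that "learning as you go" flavor, here is a minimal sketch of an epsilon-greedy bandit choosing among a few candidate prices. To be clear about the assumptions: the demand model, the prices, and every name below are hypothetical, and a bandit is a much simpler technique than the game-playing neural networks mentioned above. The only point is that there is no labeled dataset; the learner has to try prices and observe revenue.

```python
import random


def epsilon_greedy(prices, revenue_fn, rounds=1000, epsilon=0.1, seed=0):
    """Pick prices over time, mostly exploiting the best average so far."""
    rng = random.Random(seed)
    counts = [0] * len(prices)
    means = [0.0] * len(prices)
    for t in range(rounds):
        if t < len(prices):
            arm = t  # try every price once before trusting any estimate
        elif rng.random() < epsilon:
            arm = rng.randrange(len(prices))  # explore a random price
        else:
            arm = max(range(len(prices)), key=lambda i: means[i])  # exploit
        reward = revenue_fn(prices[arm], rng)
        counts[arm] += 1
        # Incremental update of the running average revenue for this price
        means[arm] += (reward - means[arm]) / counts[arm]
    best = max(range(len(prices)), key=lambda i: means[i])
    return prices[best], means


def simulated_revenue(price, rng):
    """Hypothetical market: the chance of a sale falls linearly with price."""
    buy_probability = max(0.0, 1.0 - price / 20.0)
    return price if rng.random() < buy_probability else 0.0


best_price, estimates = epsilon_greedy([5.0, 10.0, 15.0], simulated_revenue)
print(best_price, [round(m, 2) for m in estimates])
```

With a noisy market like this one, the chosen price can vary with the seed and the number of rounds, which is part of why these problems are harder to reason about than a regression on a fixed dataset.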
ML can do self-driving cars, it can build drones that learn to fly, it can translate horses to zebras, it can defeat humans at Go, it can make guitars sound like pianos.
There is a lot of technique and theory into framing any problem as a problem that can be solved by machine learning. Machine learning is generally not feasible unless you restrict the problem properly.
I think if the problem is figuring out an ML solution for an already existing large dataset (as in the case of Kaggle and big companies), you do not need an advanced education. However, if you are doing a start-up, you need to answer questions like: when should I stop collecting data, when should I give up trying this algorithm, when should I conclude this problem is impossible in its current form, etc. These questions require a crazy amount of experience solving novel problems. During your PhD, you try to solve many novel problems, and that makes you an expert at answering these questions. I also think you need an advisor/mentor to develop these skills. This is the real knowledge you can gain in a PhD, and that's why PhDs are valuable and hard to replace.
"If you want to build novel things, you really need academic-grade ML." is a bit tricky.
If you want to achieve novel (better than yesterday's state of art) results on existing problems, then yes, you really need academic grade ML. Especially for "solved" (i.e. well researched) problems - if the current solution isn't good enough for your needs, then you're going to need serious work to improve on that.
However, if you want to attack novel business problems, then it's quite likely that you can solve them without needing to solve any new ML problems. You have to know what "instruments" are available, and you have to be able to read and learn how to implement the particular solution you choose, but generally you just need to squint hard enough to map your business problem to one or more ML tasks that have a known solution.
For starters, what's your math background like? If you're interested in seriously pursuing ML theory, I'd recommend having at least 2 semesters of stats and 1 semester of linear algebra fresh under your belt. Don't short change yourself here, because any remotely advanced material will be incredibly frustrating (aka constant backpedaling) without the requisite math intuition.
One approach might be to tackle a MOOC like Andrew Ng's coursera offering (which isn't very math heavy) while simultaneously brushing up on your stats and linear algebra (mostly stats tbh). Even if you end up just focusing on implementation, I think this will be time well spent.