Ask HN: Engineers from non-CS background, how did you pivot into ML/AI?
390 points by ultrasounder on Dec 31, 2018 | 132 comments
I am an EE, a hardware engineer with about a decade of experience in PCB electronics and systems engineering, including a brief stint at a FAANG that also happens to be an e-tailer. I have been "dabbling" in Python for about a year now, and just recently started with DL using PyTorch and find it quite interesting. To be clear, I don't write code at work, at least not until now. I intend to use any free resources (MOOCs) to teach myself the latest techniques in DL for CV. What I am not clear on is the next logical step. A part of me wants to bootstrap a SaaS on a Python stack, building something that I can market through the traditional channels (PH, Show HN, Reddit, answering SO questions) and show to potential employers, but I am not sure I will even get to the interview stage with a resume that doesn't look anything like that of a programmer with a traditional CS background and work experience to boot. Sorry about the long and winding question, but what should I do to get noticed by recruiters at FAANG and non-FAANG companies and stand apart from the CS crowd?

Stop focusing on MOOCs and YouTube videos and study textbooks. Do exercises. Treat it like academic studying, and you'll end up with a decent education. It's important, because it's often easier to make a thing work okay than to understand why it works, so you'll get false confidence working through a tutorial. Then, when you want to apply that to something else and it doesn't work quite right, you won't know why it doesn't work or how to fix it.

Get some textbook suggestions and set a minimum of reading 5-10 pages per day. In about a month or two, you're done with a 300 page book. Repeat that for a few years and you're an expert. Once you have the foundations, read papers too, but don't skip straight to trying to use AlphaZero to solve a curve fitting problem.

> study textbooks. Do exercises. Treat it like academic studying

This. Highly recommend Russell & Norvig [1] for high-level intuition and motivation. Then Bishop's "Pattern Recognition and Machine Learning" [2] and Koller's PGM book [3] for the fundamentals.

Avoid MOOCs, but there are useful lecture videos, e.g. Hugo Larochelle on belief propagation [4].

FWIW this is coming from a mechanical engineer by training, but self-taught programmer and AI researcher. I've been working in industry as an AI research engineer for ~6 years.

[1] https://www.amazon.com/Artificial-Intelligence-Modern-Approa...

[2] https://www.amazon.com/Pattern-Recognition-Learning-Informat...

[3] https://www.amazon.com/Probabilistic-Graphical-Models-Princi...

[4] https://youtu.be/-z5lKPHcumo

I would also include some books about statistics. Two excellent introductory books are:

Statistical Rethinking https://www.amazon.com/Statistical-Rethinking-Bayesian-Examp...

An Introduction to Statistical Learning http://www-bcf.usc.edu/~gareth/ISL/

Oof, those are all dense reads for a newcomer... For a first dip into the waters I usually suggest Introduction to Statistical Learning. Then from there move into PRML or ESL. Were you first introduced to core ML through Bishop? +1 for a solid reading list.

PGMs were in fashion in 2012, but by 2014 when Deep Learning had become all the rage, I think PGMs almost disappeared from the picture. Do people even remember PGMs exist now in 2019?

You'll find plate models, PGM junk, etc in modern papers on explicit density generative models and factorizing latents on such models.

Fashion is relevant only if you want to approach it as a fashion industry.

PGMs also provide the intuition behind GANs and variational autoencoders.

Hands up for Bishop and Russell & Norvig.

Russell & Norvig should be treated as a gentle intro to AI.

Then start Bishop to understand the concepts.

You're not going to learn much of anything by just reading an ML/AI book. You have to be pretty good at math to understand everything, so my suggestion is to enroll in some ML/stats course and start working on the basics. It's a long and hard road, and if you're not comfortable with math in general I'd reconsider investing too much time in it. I'm doing it, yet I'd much rather do software engineering. It's much more practical, without math to solve.

So do exercises, and spend time digesting and trying to explain things to others. If you feel it's hard, well, you are correct. Get comfortable feeling that way. Hopefully there's light at the end of the tunnel. Don't buy into the hype. Know the basics.

Edit: so I didn't see that OP said exactly this. My bad, new year and all.

I'd need a lot more remedial mathematics before I could get any mileage out of those ML/AI book recommendations. I haven't touched mathematics in a strict sense since my senior year of high school Calculus (I majored in the humanities), so I think you're absolutely right. People like me would need to spend a lot more time learning things like discrete math and statistics before moving up to these books.

This is exactly my experience. I started out with blogs and sites. They were good when I was just starting out, but after a point I failed to build a coherent, systematic and deeper understanding of the topic. I felt like I was getting knowledge in bits and pieces that weren't forming a complete package.

Finally I started studying serious books in a disciplined manner. I wish I had done this earlier.

> Stop focusing on MOOCs and youtube videos and study textbooks.

I'd be ecstatic if I never again see a comment about how folks suddenly and completely understand a class they failed years ago after watching a 3blue1brown video.

Why does everyone act like resources must exist in a vacuum? Use both, they're complementary to one another.

I'd love for people to use 3blue1brown's videos in conjunction with reading a textbook.

I think MOOCs are useful to jump-start you, but after that you need to do more. I studied CS and did find the deeplearning.ai course helpful in getting started. Without it, much of the ML content was not easy to grok. But now that I've gone through that, I get the gist of what papers are talking about. After you do a MOOC, though, you have to continue with real work: exercises, Kaggle competitions and just playing with things.

I agree that MOOCs, particularly in areas such as math and physics, are woefully insufficient. Nevertheless, videos and MOOCs still have their place, because they help you formulate the questions you need to answer. They help you know what unknowns you don't know.

My experience has been different. "Academic Studying" can be hard with a full time job, and there is often a wide gap between what you learn in books and what you do at the job.

For purely professional purposes, it's probably a better idea to take a non-academic approach.

Yeah, it's super hard with a full time job. That's why I recommend setting a low X pages per day target. Making sure to make just a few pages of progress per day really adds up quickly. You likely won't progress as fast as if you were a full time student, but that's okay. It still adds up fast enough.

A minimum of 5-10 pages a day for 6-12 months seems pretty reasonable for a career change into a competitive field.

> textbook

I recommend Kevin Murphy's ML a probabilistic approach and Ian Goodfellow's Deep Learning.

Those are the books used in most of the ML courses I took in grad school.

There is also Chris Bishop's Pattern Recognition and Machine Learning, but I think it is less popular now, than it was before.

Thanks for the recommendation. Murphy's book looks solid, better than half the books above. I personally own ISLR (Hastie) and Learning from Data (Yaser). Both are beginner friendly. What were the math requirements for the classes you took at uni that used Murphy's book? And how would one go about acquiring those math/stats/probability skills in order to work through a tome like Murphy? Thanks


Patience. Spending that extra time (it helps if you really enjoy it or can program yourself to really enjoy it).

https://www.youtube.com/playlist?list=PLZHQObOWTQDNU6R1_6700... for inspiration.

To answer the question though: I'm not sure what you produce other than maybe blog or publish some analysis using your data skills. And maybe: https://github.com/MaximAbramchuck/awesome-interview-questio...

(source: non-CS engineer at amazon who watched the amazon videos internally before they were made public. I'm not a data scientist yet but sometimes, esp. when people talk about the challenges of AGI, I think about transitioning.)

Thanks for the links to the AWS videos. These were posted here on HN some time back. Will definitely bookmark them and come back to them. At the moment (literally) I am still finishing up Udacity PyTorch and hope to continue with the venerable DL4Coders part 1 and part 2. As someone else posted, coming up with useful implementations of popular arXiv papers seems to be one sure-shot way of building up a personal brand on GitHub.

Andrew Ng's courses on Coursera (Machine Learning) are the best ones IMO. Gives you the introductions to the math behind and some practice.

Cool yeah! I listened to the first three courses (Math for ML / Linear and Logistic Regression / Elements of Data Science) on my commute, and even though I thought I knew things, just persisting in "re-learning" the material helped a lot to fill in the gaps for me. I was just browsing the links again too, there's a whole lot of other courses/specialities after that too (if not posted there, maybe elsewhere).

I am not an expert, but from what I've heard/seen, being really solid on the fundamentals of regression and feature modeling (and not being afraid to read and apply ArXiv papers) are all key. And eagerness and statistics go a long way and are valuable to companies.

I'm not sure how this will be received, but I'm learning a lot through following Jeremy Howard. He's a huge PyTorch fan and he's spent the last 3 years trying to figure out what people like you need. He launched a course called FastAI and a DL library by the same name. His aim is to help anybody do it that wants to, with or without code.

MOOC: http://course.fast.ai

I just found a resource a few months ago that I'd love to recommend, but haven't started yet. It's mentorship you pay for, but not up front. You sign a contract to pay a certain percentage after you're hired. I plan on going through this program if my current job leads don't pan out.

Mentorship: https://sharpestminds.com

I'm interested in comments about either program in general. Speaking as someone who also has an EE degree, went through a web development bootcamp, and was disappointed by the help in getting hired that both offered after the curriculum was finished, I am also interested in your findings and results.

Hey there - I'm one of the cofounders of SharpestMinds. AMA!

EDIT: Also, I strongly concur with the fast.ai recommendation for deep learning, especially if you're starting from a background in software.

Hi Edouard, interesting concept. Who are the mentors and why don't you list or profile a few of them on the website? (beyond the company logos)


Some stats about our mentors:

- There are about 60 of them now

- Geographic distribution is ~1/3 in the Bay Area, ~1/3 in the Toronto region, the rest across the USA and Canada

- About 50% are deep learning engineers, the other half are a combination of ML devops, data eng, traditional ML (clustering, boosted trees, etc.)

- About half work in (or are alums of) the AI labs of major companies such as the ones whose logos are on the website

Why we haven't listed some of them on our website yet: no good reason. We'll probably do this soon. It's a good idea.

Thanks for the response!

My background is that of an econometrician (ie quantitative economist), and I now work as a Research Engineer at one of the FAANG research divisions.

I think the advice about getting in as a hardware engineer is solid. At my workplace, there's a ton of need for people working on specialized hardware for DL, and for people working on the software that works with it (optimizing compilers, etc).

If you are looking to break into the software side of DL, the first two thirds of the Deep Learning book [1] contains all the math you need to know to pass the interviews. Then, it's just a matter of getting interviews; I found that I needed professional experience deploying DL/ML to do that. I got that by doing side projects at work. For instance, we had a long standing operations research problem, and I spent some free time at work implementing a RL algorithm to solve it. I didn't get too far, but I was able to talk coherently about the papers involved and about how I planned to conduct the project, which went a long way.

[1]: https://www.deeplearningbook.org/

> Then, it's just a matter of getting interviews

Are you implying that, once prepared well enough, the contents of the interviews are simpler than getting actually noticed in the pile of applicants ?

You need to think like an interviewer - what can you reasonably make someone do in half an hour (plus time for chat and questions after)? Apart from being able to parrot deep learning theory, implementing things is tricky. Do you learn anything from making someone implement VGG in their pet framework? Training models also takes more time than you have to spare.

Much easier to quiz the applicant how they would solve a problem, or to discuss a previous project or paper they've published (or are interested in). Some people will find that much easier than whiteboard coding, others will hate it.

It really depends where you apply and if you want an applied or research role. Some places won't touch you unless you've got a publication in somewhere like CVPR. Others will go _hard_ on the stats questions. Other places want to see a strong Kaggle rank or some personal projects. It's really useful to have a portfolio here.

Thanks, that is helpful advice.

Performing well on the interviews is a skill that you can acquire through practice. If you do 100 Leetcode questions, read through all of Cracking the Coding Interview, and suffer through 30 phone screens, by the end of it, you'll be a hardened interviewee capable of passing an interview anywhere in tech (you can probably get by with much less practice; I'm being purposefully hyperbolic).

Does this mean you'll be good at the job? No. Is this very wasteful? Yes.

Getting interviews, on the other hand, requires you to read the recruiter's mind, and can vary depending on what the recruiter had for breakfast, or if they fought with their significant other that morning. It's much less formulaic.

Thanks and I just subscribed to your Byte-sized videos at aiworkbox.com.

Thanks! Let me know if anything's unclear :)

Maybe someone who actually works at FAANG can weigh in, but I would think that one of your best bets would be getting into one as a general SWE and then transitioning to AI/ML internally after a year. I recall Google even having some sort of internal program that encouraged this. Getting into Google is a moonshot, but it's possible to do so with no prior professional programming experience if you put in a ton of effort AND get lucky. Amazon seems willing to train too, based on the following experience I had:

I'm an iOS engineer without a STEM background, and I've been contacted by Amazon recruiters for entry-level ML/AI positions. I thought it was weird, but they said they've hired a few people with iOS backgrounds and no prior ML/AI experience who are now excellent ML engineers. I backed out because I knew I would fail the interview process at this point, but it's something for me to think about for the future.

Yes. I have been contacted by Google recruiters multiple times for SWE role(though my resume has no programming experience at work). Apparently, their entry ticket involves reading up Skiena cover to cover and Leetcoding your way through their interview process, which I am absolutely open to.

This is the most direct and predictable path forward, _if_ that is something you are up for, I certainly recommend that route.

There are tons of youtube videos and books (Cracking, Dynamic Programming for Interviews, etc). Definitely do research into questions that will be asked.

Internal transition is always easier. I have several colleagues who went from SWE to ML-related roles this year. One thing worth noting is that most of them work on building ML infra/platform rather than direct user-facing ML features, so it is not drastically different from what they did previously.

this. if you get in as a SWE, you'll end up doing infra or data management/cleaning stuff imho

Random side note, but when is the 'FAANG' acronym going to die? MSFT is killing it, prob the top tech company around these days. Needs to be included in that list.

I agree, Microsoft is probably the #2 top company in AI after Alphabet, should be included

Microsoft Research does a lot in AI but don't forget FAIR, Nvidia, Baidu, and Amazon. Smaller companies like OpenAI are making strides too.

i agree 100%

How may I ask is MSFT a number 2 company in AI?

I'd replace Netflix with Microsoft.

FAANGMUA is the latest I've seen...Microsoft Uber AirBnB, I believe.


Linkedin, Cloudera, Redhat, Quora, Robinhood, Asana, Salesforce, Dropbox

How may I ask is MSFT a top tech company?


I'd strongly recommend considering hardware companies in the ML space. They are trying to acquire ML talent but will also value your EE background. This is probably best approached at the usual big ML hardware vendors (NVDA, INTC and AMD) but there are also numerous start ups in the space that may be credible.

Lots of good advice about acquiring skills, I don't have much to add beyond that. I'll just mention that before you jump into the advanced stuff, please understand the terminology and basics very strongly. I've interviewed over twenty people for roles in ML the last year and many (despite having ML on their resume or even some experience in it) could not even explain the difference between training/inference, the meaning of validation, etc. The field is so hot right now that many unqualified folks are trying to get in, often by faking more experience than they really have. In response, I've created a simple 'fizzbuzz' test just so I can quickly screen people.
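For readers unsure of the terms in that screening question, here is a minimal sketch of the distinction, using only the Python standard library. The synthetic data, the 80/20 split, and the helper names (`fit_line`, `mean_squared_error`, `predict`) are assumptions made for illustration; this is not the commenter's actual 'fizzbuzz' test.

```python
import random


def fit_line(points):
    """Training: estimate slope and intercept by ordinary least squares."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def predict(model, x):
    """Inference: apply the already-fitted parameters to an input."""
    slope, intercept = model
    return slope * x + intercept


def mean_squared_error(model, points):
    """Average squared gap between predictions and known targets."""
    return sum((predict(model, x) - y) ** 2 for x, y in points) / len(points)


# Synthetic data (an assumption for this sketch): y = 2x + 1 plus small noise.
rng = random.Random(0)
data = [(i / 10, 2.0 * (i / 10) + 1.0 + rng.gauss(0, 0.1)) for i in range(100)]
rng.shuffle(data)

# Hold out 20% of the labelled data; the model never sees it during training.
train, validation = data[:80], data[80:]

model = fit_line(train)                                  # training
validation_mse = mean_squared_error(model, validation)   # validation
prediction = predict(model, 20.0)                        # inference on a new input

print("slope, intercept:", model)
print("validation MSE:", validation_mse)
print("prediction at x=20:", prediction)
```

Training fits parameters to labelled data, validation measures error on labelled data that was held out of the fit, and inference applies the fitted model to inputs whose answer is not known in advance.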

I've hired about half a dozen ML engineers/architects in the past six months. Several of them have EE backgrounds. There's a good bit of ML that touches on hardware (think integrated cameras and similar) so it can be really helpful.

You're mostly on track with your plan to build something. You do need to demonstrate that you have the skill set, but building one giant thing isn't the answer. There's so much that goes into building a giant thing that I can't accurately assess your ML skills.

Ideally I like to see a lot of small things over a reasonable amount of time. Someone with a solid GitHub showing 6-12 months of paper implementations, weekend gee-whiz hacks and various other projects would go right to the top of my call back list.

Hope that helps, good luck with the job search.

Thanks!. I think https://fomoro.com/projects/project/reverse-image-search this has provided me some motivation. Also just subscribed to your meetup. I will try to make it to the next meetup on the 8th as i am just about starting with GANs in my udacity PyTorch course. Thanks for organizing this meetup. I will leave the link for folks from Bay Area to follow-up. https://www.meetup.com/deep-learning-sf/

Your work looks interesting, would you mind if I sent you an email via the Contact Us page on your website? I'd love a chance to try to get on that call back list.

Speaking from experience, almost all FAANG positions I've seen require a degree for ML/AI, and even require a degree for less research-oriented positions like a Data Analyst/Scientist.

Non-FAANGs may be less picky but the competition in the field is too great at the moment (due to MOOCs/Bootcamps increasing supply), and even with an excellent portfolio it may be impossible to stand out. (in my case, despite my data science "fame" most recruiters tossed my resume out immediately during my job hunt a year ago; the only interviews I got were by going above the recruiters. And that was for data science, not even ML/AI)

Even after working as a Data Scientist for over a year, I've received practically no recruiter spam.

After reading most of the comments I can try to provide a different perspective.

I am a Director of Data Science and Software Engineering for a mid sized firm (~1000 employees and $150-200MM revenue). I started with a Finance degree then shifted into an analysis position at a FAANG (lots of excel, SQL, learning how to query big data). This eventually led to learning more about tech (python, AWS cloud stack, messaging queues) and after 8 years in the industry giving me enough experience to manage teams of data scientists, software engineers and data analysts.

Although it is so important to know all the software engineering stack, many companies will benefit from simple business intelligence and data analyst roles. I guess my recommendation is to also keep an open mind in looking for these types of roles in the market (data analyst, business intelligence engineer), because given your desire to learn and existing background, its clear you can make a big impact in those companies as well. And it will be much less competitive than traditional CS crowd.

Some food for thought

I think this is great advice but there’s an important caveat which is career goals in AI / ML.

A lot of companies that say “data scientist” when they really mean “spreadsheet analyst” are places to avoid if you have career aspirations in ML. In the worst cases it can be a bait and switch (very common) to get overqualified people to babysit rudimentary analytics. Especially avoid places that might do this to pad their staff for any type of acqui-hire or investor signalling reasons, because your career goals will not be acknowledged.

In the best cases, it can be some befuddled IT manager who vaguely thinks they need “AI” but really they don’t have projects that would actually benefit from it. They might be sympathetic to your dissatisfaction in the reality of the job, but will have little power to do anything about it.

Somewhere inbetween is another very frustrating case: situations where the business or product clearly can materially benefit from “real” machine learning, and from the perspective of making customers happy & making money it’s a no brainer to invest time to research implementations, but risk averse management, often with no ability to gain an understanding of the benefits of investing in machine learning, or who want to act as credit / politics gate-keepers for an existing system, puts the brakes on it and retasks you on things that just waste your talent.

We gotta stop saying 'FAANG' when MSFT is arguably the top tech company around these days.

Not disagreeing with you (not informed enough to), but top tech company by what measure? Market valuation? Impact of products/services in 2018/2019? Workplace rating?

I’ve always felt that Microsoft fits in with GFAA much better than Netflix.

It seems FAANG has transcended its original meaning from being an acronym to becoming a generalized word for "any highly traded growth tech stock"

Yeah FAANG definitely has that connotation in finance rather than being an acronym suggesting the top tech companies

Okay, but repeatedly making this point in this thread is derailing conversation. Does it really matter that much? Were you unaware of the parent comment's point being made because MSFT isn't represented?

Amazing that this has-been company is so desperate to inflate its reputation that it posts such laughable conjecture on Hacker News.


...or GANFAM to better emphasize a more healthy spirit of competition and collaboration.

I am also an EE and have been able to apply ML to my field, wireless comms. It turns out people skilled in the intersection of two fields are very rare indeed.

I'd suggest you look for an opportunity to apply ML/CV/AI in your industry (deep learning for PCB inspection maybe?). Show the possibilities, get some research funding, do a pilot program or similar. Lead and drag your company (kicking and screaming if need be) into the 21st century.

Then you will have ML/AI on your resume, and recruiters will come looking for you.

Wireline EE here. ML in the context of wireline communications has been widely studied (grant money works that way). However, at least in wireline communications there isn’t a lot of gain to be had. This is because we have good physical models of the underlying impairments and our ASICs are optimized using those physical models, so ML isn’t able to improve upon that much. Where it can help is where we don’t have good data on the underlying physical parameters of the channel. In that case, you can get a bit more capacity than you would otherwise. So for wireless, where you have signal fading and huge variation in multipath interference, maybe there would be benefit, as these phenomena are hard to model.

Also, you may not realize it, but we’ve been using ML techniques for decades in communications. Gradient descent is used to optimize equalizers; maximum likelihood estimation (and equalizer optimization) is used for phase estimation in high-order QAM. There are plenty of other examples. So you are probably already familiar with much of the basic tool kit. I had a wannabe startup founder in ML tell me that there’s no way I could possibly understand the stuff if I didn’t have a PhD in that area in CS (I am physics). I just smiled and nodded.
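To make the equalizer remark concrete, here is a minimal sketch of a least-mean-squares (LMS) equalizer, which is stochastic gradient descent on the squared symbol error. Everything specific in it is an assumption chosen for illustration and is not taken from the comment: the two-tap channel with 40% leakage from the previous symbol, the BPSK training symbols, the five taps, the step size, and the function name `lms_equalizer_demo`.

```python
import random


def lms_equalizer_demo(num_symbols=3000, num_taps=5, step_size=0.05, seed=0):
    """Train an LMS equalizer on a known symbol sequence.

    Returns (taps, raw_mse, equalized_mse), with both errors measured
    over the last 500 symbols.
    """
    rng = random.Random(seed)
    symbols = [rng.choice((-1.0, 1.0)) for _ in range(num_symbols)]

    # Hypothetical channel: each sample carries 40% of the previous symbol
    # as inter-symbol interference. No noise is added in this sketch.
    received = [
        symbols[n] + 0.4 * (symbols[n - 1] if n > 0 else 0.0)
        for n in range(num_symbols)
    ]

    taps = [0.0] * num_taps
    squared_errors = []
    for n in range(num_symbols):
        # Most recent received samples, newest first, zero-padded at the start.
        window = [received[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        estimate = sum(w * x for w, x in zip(taps, window))
        error = symbols[n] - estimate
        # The gradient of 0.5 * error**2 with respect to the taps is
        # -error * window, so stepping against it adds step_size * error * window.
        taps = [w + step_size * error * x for w, x in zip(taps, window)]
        squared_errors.append(error * error)

    tail = 500
    raw_mse = sum(
        (received[n] - symbols[n]) ** 2 for n in range(num_symbols - tail, num_symbols)
    ) / tail
    equalized_mse = sum(squared_errors[-tail:]) / tail
    return taps, raw_mse, equalized_mse


taps, raw_mse, equalized_mse = lms_equalizer_demo()
print("taps:", [round(w, 3) for w in taps])
print("MSE without equalizer:", raw_mse)
print("MSE with equalizer:", equalized_mse)
```

The update rule is the same one used to train a single linear neuron, which is the sense in which the basic tool kit is shared between the two fields.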

Thanks for the response. Wireless comms seems ripe for ML enhancement. PCB inspection is something that has indeed crossed my mind, but it is quite involved due to the sheer number of features (traces, components, vias, connectors), none of which have pretrained models. Having said that, automation, especially FAI (first article inspection), is quite possible, as demonstrated by landing.ai (Andrew Ng's company). Being an employee at a publicly traded company, I am not sure how I can secure funding. Perhaps my focus should be to apply ML directly to something that I do day to day at work.

Would you like to write an auto router for PCB design using ML? A smart one that can identify components and their properties and choose the right connections between them? One that routes a fast signal directly and leaves some shitty LED with 10 vias, or places power parts with wide tracks and uses thin ones for digital signals.

Auto routers already exist, and I always route my own PCBs. I am pretty sure the tool vendors are considering or already building up AI expertise to level up their auto routers. I keep hearing that, at least in Altium, they have come a long way. Gotta try it one of these days.

I'd say pivoting to ML/AI is much easier for Math/Physics folks compared to pure CS ones.

> what should I do to get noticed by recruiters at FAANG and non-FAANG to stand apart from the CS crowd?

The usual answer here is look for suitable business / r&d cases within your own EE industrial domain and use ML/AI as any other tool instead of as a black box or a magic wand. Good luck.

I'm just trying to get my first programming job.

I cant seem to get past HR. My resume has that I'm a Chem Engineer BS, Industrial MS, 7 years in engineering, 2 years of Electrical Engineering.

The first page of my resume is my 10 years of non-career programming experience. Built a Dishwasher(embedded C++), full stack app(RN JS, Mysql PHP laravel), and smaller projects.

I cannot get past HR.

Every real life programmer I show my work to, knows I'm capable. Heck even some got me in touch with HR. Nothing came of it.

Off topic, but wanted to reply regarding "The first page of my resume is my 10 years of non-career programming experience."

This must be part or most of the problem. Cut your resume down to 1 page, if possible. Include a meaningful cover letter catered to the opportunity and the specific company you're applying to. Shove the last ten years' stuff into the very end, and start that first page with your software knowledge and related projects. Ping me someday here if this ends up getting your foot in the door.

Why not use a pitch or capability cv? and ditch the traditional cv format

Hi. Why are you looking for your “first programming job” with your different background and mature resume? You are at great disadvantage against tons of CS naturals in the usual recruiting and interview routine. Also, your programmers network thinks you are good enough which is encouraging, so why do they not refer you internally or through their own network? Ask them for sincere feedback, they are not helping you enough. Happy New Year and good luck!

Hi, loved your dishwasher robot post the other day. IMHO, see if you can gang up with your co-conspirator and raise seed money. Heck, if you can afford it, consider moving to the Valley and try to get into the next YC cohort. These days they seem to have a predilection for funding HW ventures. If all else fails, you will still have all the experience of an entrepreneur, which will carry you places. As an aside, have you considered "Industrial Automation" jobs? Companies like Tesla might be interested in your profile.

Have you tried cutting down on experience? With age discrimination in hiring you might be better off including 7 years as opposed to 10. Just my two cents.

And don't forget to stock up on Just for Men for the in-person interview. You want to look like a seasoned 32, not a past-his-prime 40-something.

Not sure if this is sarcasm, but people aren’t machines. My reply rate rose considerably by changing from my first(unique) name, to my middle(common) name. People judge, if you’re going for a new job why give them any chances to ding you?

So, should black candidates bleach their skin because "people are not machines"? Making excuses for wrongful prejudice only perpetuates the prejudice.

Depends on how badly that black candidate wants the job. I’m not making excuses for anyone, I’m just giving advice if your goal is to get the job. If your goal is to stop wrongful prejudice ignore my advice. Though one could argue the best way to stop this kind of prejudice is from the inside.

But...but Hackernews keeps telling me any halfway decent programmer should have FAANG-tier companies begging them to work for them!

This. I pivoted by finding applications of ML in my day-to-day engineering role, until I got enough internal traction to justify a title change from "____ Engineer" to "Data Scientist." About a year later, I jumped ship to a startup applying ML to my domain, and I am contemplating making the move to FAANG next.

This is the kind of success story that I hope to emulate. Good luck with Your FAANG quest. That definitely will open more doors for you!

I built an ML application in my domain, which helped demonstrate my capabilities. I networked a fair bit within my company and then got the chance to lead an ML team. The whole process took about 2 years. Meanwhile I educated myself continuously over the last 3 years. I spent a minimum of 20 hours a week coding, reading, learning and discussing ML. I joined ML learning groups, helped other folks and learned through their journeys as well.

ML is unfortunately better done in a big company due to data but also a b*Ch due to tremendous friction within org to get things done.

Another key strategy is to commit yourself to building an end-to-end ML application and structure your learning around it. I found this a tremendous technique to turbocharge my learning.

I'm a fellow EE grad who has been doing software for the past 20 years or so. I started studying ML a couple of years ago. I started with a couple of Andrew Ng's courses on Coursera. I found it was a good mix of theory and practice. It's a really exciting field right now (a bit over-hyped, but still lots of room for growth).

BTW, I think you may have a bit of an advantage because of the math background you presumably have with a BSEE (linear algebra & differential equations).

It is extremely tough. I left my Civil Engineering career path in 2015 in search of a career in tech. I then applied to graduate school at SMU for an online masters in data science. I am still struggling to find meaningful work, but currently mentoring a data science bootcamp.

This is a very tough track, but if you have software development experience it should be easier to get a role in ML or Data Science.

How did you find that SMU program? I was debating that and a similar program at UC Berkeley.

I can't help but read the title as something akin to "musicians, how did you become comfortable painting with watercolors?"

I know ML/AI is all the rage. I just feel that targeting it so heavily is a bit shortsighted.

I read somewhere that learning ML/AI isn't the hard part. It is having enough data science background to be able to tell what problems fit ML. ML isn't the hard part; finding a problem ML can approximate is.

Maybe that is too simplistic, but after my one-semester ML course I can't help thinking that most of the problems people are solving with ML aren't really suited to it at all. It's like SWEs see this cool hammer and now everything is a nail. Maybe I should read up on startups using it successfully for anything, but I haven't seen many of those on the frontpage.

Industrial engineer by training here, I honed my chops on Kaggle competitions, eventually winning one. I was recently brought into a FAANG in a ML/AI engineering capacity.

I can’t recommend Kaggle enough for those who are looking to prove their abilities in the field.

Nice! Another pivot story. How did you make it past the recruiters and their filters? Just with your Kaggle portfolio? Any chance you can post a link to your Kaggle profile? Congrats on winning Kaggle; that's no mean feat, considering the competition (pun intended).

Naive question. Is ML/AI the "real deal" and here to stay, or is it still kind of just the hype du jour?

Or put another way: are there plenty of problems where ML/AI are valid tools or are they largely cool tech looking for problems to fit into?

ML/AI is here to stay, but currently the marketing and potential impact of ML/AI is a bit exaggerated.

There's some hype, but there's also good reason to be wary if you're occupying a job that can be automated in the short- or mid-term.


Not sure it's the best path for you (it could be, though!), but I'm an EE and applied for PhD programs in control theory (robotics specifically). Now my research focuses on applying deep learning to control under-actuated robotic systems.

Although I went right after my undergrad, there are several in my cohort who were in industry for as long as you have been (or in one case much, much longer) before starting their PhD.

There are certainly ways to merge your hardware experience with learning: either applying AI to hardware design, or applying hardware design to speed up or otherwise improve learning. There is lots of research going on in both areas.

This is my issue with the phrase "AI/ML" as a catch-all. You have an excellent general skillset, but "AI/ML" encapsulates a wide spectrum of jobs.

ML-SWE: SWE with ML focus - building architecture around models, feature engineering, distributed training, etc. Relatively limited ML knowledge needed (IMO). The math won't be helpful for this role. Much more important to have SWE background. If you want this, keep building your programming knowledge (Python) and read books. Would focus on understanding the popular frameworks PyTorch and TensorFlow b/c your work will likely interface with those.

Research engineer: Mostly for MS/PhD background. Farther away from the product and closer to actual research. This doesn't sound like what you want to do.

Data Scientist: ML is a subset of the knowledge needed. Applied statistics is as important, if not more so. Doesn't sound like you want this.

A path forward:

(1) Program a lot. On what? Anything at all, b/c you need programming skill to work as a SWE.

(2) If you want to do ML-SWE, program with an eye towards ML applications. Maybe do a simple cloud project that leverages ML - Google Cloud makes this particularly easy for classification tasks. Focus on breadth here, not depth. No sane person outside of academia can keep up with state-of-the-art and truly understand it. Far too much material, so focus on fundamentals.

(3) Work towards your strengths. You aren't some hotshot kid out of college proclaiming to be an AI guru. That would be silly and no competent recruiter would believe it. You know hardware - and AI (neural networks) leverages a lot of hardware. Why not focus on the hardware side of AI? Demonstrate your knowledge of how/why TensorFlow is so effective across distributed hardware, or how CUDA accelerates NN computation, or why TPU claims vs. Nvidia may be up to interpretation, etc. This should be a natural transition given your background.

TL;DR: Know what you really want to do. Your background is valuable. Play to your strengths. Don't ring the bell.

Can you let me know a bit more about the SWE-focused ML path? While I'm interested in ML, I'm slightly more interested in the systems that are built around it. In particular, are there resources that can help me get up to speed in designing (distributed and HW-accelerated) ML systems?

Not a full answer, just an observation: most ML people I come across don't have CS backgrounds. Many have backgrounds in physics, math and other STEM fields that have a strong computational component (my own background is in control systems, math modeling and numerical computation).

From your question, it sounds like you want to be a software engineer rather than an ML/AI engineer -- is that a fair assessment?

It's certainly good to know that ML doesn't require a CS background. In my own case, my thinking was that because ML/AI has a strong programming component in addition to math and stats, my strong hardware background might work against me. In an ideal scenario, I would like to develop AI/ML applications and wouldn't mind morphing into a SW engineer if the role requires me to, though SWE by itself would be another steep hill to climb.

I went back to school for Calc 2, Multivariable Calculus, Differential Equations and Linear Algebra. Took an ML class from a local university after that.

Hi, I am in a similar boat, and would love to network with you either in person or online.

I have a PhD in EE, working in semiconductors. I have done a couple of MOOC specializations on Coursera, and am trying to get some data science projects on my resume. Also trying to do some Kernels / Scripts on Kaggle to build up a basic portfolio.

Hi, Email is in my profile and I am always looking to network. Shoot me an email and we can take it from there.

The overarching theme of that thread reflects Max Woolf's comments: companies (FANGMA) place a lot of importance on accreditation, and that is pushing a lot of capable folks out of applied ML/AI roles (not talking about inventing the next capsule network, which requires a PhD).

So basically a Luddite?

Become strong in the background of ML/AI and then learn a language where you can use it. Demonstrate your achievements in it. Help with projects associated with it. Participate in the community.

I am curious: Why do people want to pivot? Is the work more interesting? Is the pay higher? Are there more opportunities? Is there some other reason?

bad management.

At FAANGs, the teams are usually huge: 100+ people doing what a nimbler company does with 3 or fewer, to the point that the employees don't even see it, because the product is now broken into several pieces to give the illusion of complexity. Middle managers then break it down so far that engineers start being known as the "person that writes the Java files in that one directory" and nothing else. This creates constant fear of becoming irrelevant. All the while you see 2~10% pay raises while hearing about undergrads making the same as you make now with the "new tech du jour". This creates even more pressure.

And because this cycle (stagnate, fear, learn, relief) repeats often, engineers start to associate learning a new tech with happiness, just because it offsets the psychological fear for a while.

EE -> embedded programming -> programming -> financial software -> predictive modeling -> ML.

Wow! That's got to be the most scenic route to ML, though one might actually get exposed to a lot of core technologies on the way.

By the time the OP gets there along this path the new hype will be on GQ/JP already.

It's not like I planned that route. Each step in isolation looked reasonable at the time.

LOL! WTH is GQ/JP btw?

We don't know yet.

I basically did what you’re talking about. Masters in physics, then went into semiconductors as an engineer and materials scientist, then switched to data scientist at a bank for 18 months, and now have been a research scientist at AWS for almost two years.

At Amazon, it’s easy to move around, but not between job families. I think it’s a bad idea to join as a SWE and try to transfer because they want people that have done it before, and you’re unlikely to do that sort of work as an SWE. I think it’s better to get experience in the role you want at a less prestigious company. You’ll learn a ton. Pick the best company that will have you.

My personal turning point was when I did free work for a local startup in exchange for them letting me take the Data Scientist title. To a recruiter, it’s totally obvious to hire a data scientist for a data scientist role, and isn’t clear at all what physics has to do with it. Recruiters are the first step when you’re starting fresh, so make it easy for them.

I also somewhat disagree with many of the comments here that textbooks are better than tutorials. If you buy these 1000 page graduate level texts with the idea that you need to read them cover to cover, you’re likely going to give up and fail. Instead, buy the books and put them on the shelf, and then work through tutorials and examples. Then reference particular sections of the book that are relevant to your work to add depth.

Finally, I recommend against starting with deep learning. There’s a whole helluva lot to learn with basic techniques. Very few companies are actually using deep learning in production systems. Start with linear and tree-based methods to learn all the stuff about how to frame the problem and build robust systems. Then you’ll have a deeper appreciation for DL.

A reasonable person could disagree and say that there’s so much domain specific stuff around the art of DL that it really behooves you to start there ASAP. I would counter that you’re unlikely to be considered for positions using DL unless you’re pursuing your PhD in it, or have proven yourself in industry. Since that isn’t your situation, I’d wait until you got your foot in the door somewhere and then pursue DL on the side. That’s what I did, and then you look like a hero to your boss. This strategy led to my first publication in the field and I’m now working on DL almost exclusively.

Edit: one more thing. Think carefully about the type of work you want to do. My advice is assuming you’d like to be a person that trains/deploys ML models to solve problems in industry. This is much different than an ML Engineer, who’s implementing algorithms in low level languages and squeezing out efficiency. Obviously that would require a much deeper understanding of SWE. And a totally different person is an academic researcher that’s developing theory or technique. It’ll be hard to do that without a PhD.

Great reply. My question based on this comment:

>My advice is assuming you’d like to be a person that trains/deploys ML models to solve problems in industry. This is much different than an ML Engineer, who’s implementing algorithms in low level languages and squeezing out efficiency. Obviously that would require a much deeper understanding of SWE. And a totally different person is an academic researcher that’s developing theory or technique. It’ll be hard to do that without a PhD

Can one not only train/deploy ML models, but in addition to that be able to implement the algorithms in low level languages and also be able to develop theory?

I’d imagine these are all skill sets that someone in PhD program could pick up.

If they could do all three, what kind of job should they be looking for?

I think it’s unlikely you’d become an expert in all of those things. If you do, it’s over the course of an entire career, not to get started. I guess it comes down to how much expertise is “enough” for you. Naturally, if you split your time across 3 domains you won’t be as expert as someone who dedicated all their time to going deep in one.

In the context of a big company, I think it makes sense to have a specialized workforce. Why look for the one in a billion person that can publish top quality theoretical papers and then implement them on distributed GPUs in an optimal way while also building simple Random Forest models for your business? I’d rather that person do more of the most valuable thing, and then hire someone else to do the rest.

This answer makes sense.

I suppose my question is more along the lines of, if someone is specializing in deep learning in a PhD program then shouldn’t they at the very least be able to implement models and also know optimization tricks?

In other words, shouldn’t they be able to develop enough skills to go deep in one area but also know enough to be dangerous in the other two domains?

I think I agree with you with the caveat that it would depend on what they're researching. If they're researching new model architectures, I don't think it makes sense for them to try to implement the algorithms from scratch in C++/CUDA to do distributed GPU training--why not just use TensorFlow? But if you're researching distributed tensor computation, then that's your bread and butter.

Great reply. Just out of curiosity, did you end up giving up your semiconductor job before joining the startup with a data scientist title?

I am deep into semiconductors, and am facing the dilemma of giving up my expertise so far, to join a startup as an entry level engineer.

I have done a couple of MOOC specializations and am trying to find projects within my industry to gain some credibility. Also trying to stay active on Kaggle to build some basic data analysis portfolio.

In my particular case, I did quit my job before I got my "real" Data Science position. I can't recall exact timelines, but I think I had already lined up the relationship with the startup.

The reason I did this is that there just weren't enough hours in the day, and my job was taking ~10 hours a day with commuting, etc. It was a risk, but the idea was that I would be able to transition much more quickly if I worked full-time towards it. I also had the financial savings to support myself for 6-9 months and was willing to get a part time job if necessary. Once it became clear that my job's only purpose was to pay the bills in the context of my goals, and I had enough to pay the bills for the near future, it was clear that quitting was the easiest way to free up a lot of time.

This turned out to be the best decision of my career, but YMMV. I doubled my salary in less than 2 years. It's also nice to be part of an industry that isn't so cost-sensitive. I also have a skill set that's in much higher demand, so you can live almost anywhere and there's a ton of companies that want/need it. With semiconductors, you're much more limited.

It's true that you're giving up some expertise and will start in a less senior position in a different field than if you stayed in semi. Sticking around because you have experience is classic Sunk Cost Fallacy. Think 5 years down the line. If you leave now, you'll have 5 years experience in ML. You'll definitely be giving something up if you leave, but there's huge opportunity cost if you don't leave.

Would you be open to have a brief conversation on the phone for advice? My email is knariks@gmail.com.

Hi, OP here. I don't think you need to give up your experience to join a startup as a data scientist. What won't be 100% transferable is your semiconductor-specific skills (think SPICE, process technology, etc.). If you have been coding in your current job, that is a skill that is transferable. Your PhD is a massive foot in the door. Have you considered something like The Data Incubator or the Insight Data Science fellowship, which require a PhD at minimum, to transition?

+1 fast.ai
