What I learned from looking at every AI/ML tool I could find (huyenchip.com)
459 points by amrrs on June 23, 2020 | 106 comments

>A large portion of AI investment is in self-driving cars, and as fully autonomous vehicles are still far from being a commodity, some hypothesize that investors will lose hope in AI altogether. Google has frozen hiring for ML researchers. Uber laid off the research half of their AI team. There's a rumor that, due to the large number of people taking ML courses, there will be far more people with ML skills than ML jobs.

Ouch >.<

As someone who does ML for a living, this hits. When I switched companies for the first time in 2019, I had a hard time despite having 10+ years of experience. LinkedIn maxed out, showing that over 200 people had applied to every job I was looking at.

Thankfully I have a rare skill set. It's not found in classes, books, or boot camps. On the downside, there are not many jobs to choose from, as companies don't know that what I do is even possible. (Time series classification and prediction, usually on sensor data.)

Hopefully the industry matures. I've bumped into data scientists who don't know how to do feature engineering as well as ones who can't grok ML. It makes the whole industry look bad.

As with data science, the number of people who say they do "data science" and "machine learning" is far greater than the number of opportunities.

Likewise, people who are able to apply their domain knowledge to actually help their company's business, product, or problems will always have work.

I've often used an analogy comparing this to cooking. Just because people say they know how to "cook" a dish doesn't mean they'll be able to function or be effective in a restaurant or other commercial kitchen. Yes, there's an element of domain knowledge required, which is table stakes, but often the factors that distinguish success have more to do with understanding the context of the work and the surrounding business.

To update this for recent fads, learning to make sourdough bread does not mean one has learned how to run a bakery.

If you're able to be a pure researcher, or a pure specialist, you're very lucky and working somewhere where they can afford deep specializations.

The plethora of emerging tools and practices related to AI/ML "workflow" are related to the collective realization that training ML models is just one piece of the puzzle:

https://papers.nips.cc/paper/5656-hidden-technical-debt-in-m... (See: Figure 1)

But keep in mind, the VAST majority of those people have only toyed around with ML libraries or frameworks on the most trivial data. There are A LOT of script kiddies right now who just copy/paste the same tutorials, maybe even publish yet another "How to classify the MNIST dataset with NN / SVM / Naive Bayes / etc." on Medium or whatever, just to come off as an authority.

So while there's a lot more competition, the majority of these people don't even meet the minimum standards to get called into interviews.

If you have solid work experience to show, then that alone should differentiate you from the masses. Again, a lot of these people have NEVER worked on anything relevant and can only show you a bunch of tutorials they've followed. If you're lucky, they've actually taken on more interesting datasets from sources like Kaggle and tried to come up with something.

edit: Not to mention the data - I get extremely impressed when someone actually shows a project which involves them collecting data from the wild, wrangling it, and then doing meaningful work.

Too many beginners only work on neat / ready datasets. You'd be surprised how many candidates get completely blindsided when asked to do some trivial tasks on pre-processing data.

I have been doing ML engineering professionally for the last 6 years and have been involved in about one hiring round per year for the last 4. If you filter resumes down to “has at least 1 year of professional experience on a deployed ML product”, then you’re lucky to have one resume out of 100, and that’s after HR has had their crack at the applicant pool. We still have to hire people who don’t really have much actual professional ML experience (which imo is fine). We invite literally anyone who has shipped ML code professionally for interviews, and we get ghosted a quarter of the time. It’s not like we pay below market either; it’s just that the demographic of people who already know what they’re doing can still afford to be really picky.

> Too many beginners only work on neat / ready datasets. You'd be surprised how many candidates get completely blindsided when asked to do some trivial tasks on pre-processing data.

I would argue that beginners only *work* on neat/ready datasets and lack an appreciation or understanding that most of the time the data does not already exist in a usable form, and sometimes does not exist at all.
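
To make that concrete, here is a minimal, purely hypothetical sketch of the kind of "trivial" pre-processing task that trips candidates up: a raw numeric column with blanks, sentinel strings, stray whitespace, and European decimal commas. The function name and sample data are invented for illustration.

```python
from statistics import median

def clean_numeric_column(raw):
    """Parse messy numeric strings; impute unparseable entries with the median."""
    parsed = []
    for value in raw:
        try:
            # Strip whitespace and normalize decimal commas ("13,0" -> "13.0").
            parsed.append(float(value.strip().replace(",", ".")))
        except ValueError:
            parsed.append(None)  # blanks, "N/A", etc. become missing values
    fill = median(v for v in parsed if v is not None)
    return [fill if v is None else v for v in parsed]

print(clean_numeric_column(["12.5", "", "N/A", "13,0", " 14.2 "]))
# → [12.5, 13.0, 13.0, 13.0, 14.2]
```

Nothing here is hard, but candidates who have only ever called `fit()` on a pre-cleaned dataset often freeze on exactly this step.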

The process of making model training possible, and the process of applying models in a useful way, are not taught in curricula or tutorials, and are generally not addressed in other literature (e.g., academic/research pubs).

"Amateurs and beginners focus on trying to get things right. Professionals focus on making sure things don't go wrong."

> minimum standards to get called into interviews.

I am a bit curious: what is the minimum standard? It does feel to me (an outsider who took one ML class in college) that you need at least a Master's in ML to get intuition for the probability and linear algebra theory behind ML concepts.

>I am a bit curious: what is the minimum standard? It does feel to me (an outsider who took one ML class in college) that you need at least a Master's in ML to get intuition for the probability and linear algebra theory behind ML concepts.

On the computer science side it makes sense to understand how to build a tree, not just how to use one; in computer science you're learning how to write algorithms, not just use them. On the MLE, or Machine Learning Software Engineer, side, the same rule applies, and knowing linear algebra as well as probability theory is helpful so you can write your own ML.

On the data science side you're rarely inventing new ML. What you want to specialize in is: 1) data mining using ML, 2) data cleaning, by knowing what kind of input ML needs, 3) feature engineering, also by knowing what kind of input ML needs, and 4) choosing which kind of ML is ideal for the problem, e.g., based on the bias/variance trade-off.

A lot of these boot camps, classes, and books teach the underlying structure of ML needed to become an MLE, though for some reason they often advertise it as data science. I find this odd, because MLE pays better and is in higher demand.

On the data science side we rarely need to know these finer details. We just need to know the ML's characteristics so we know when it is the right tool for the job, similar to knowing how to use a data structure but not needing to know how to invent new data structures.

A data scientist needs to be a research specialist, which is more of a PhD skill than a Master's skill, so knowing the underlying math also doesn't matter as much, because we know when it is necessary to go research it. It's not that a data scientist can't know it, and many do out of hobby or coursework, but it's far from mandatory. Knowing probability theory and how to digest a problem into multiple paths forward, like how to collect data, is a far more valuable skill.

And finally, many data scientists barely know how to write code. They're a kind of analyst. I feel like the job title isn't sufficiently explained, so software engineers make a lot of assumptions, mixing up MLE with DS.

edit: Also, linear algebra isn't that bad and is an undergrad class. Probability theory is taught lightly in DS101 freshman year, but in the first year of a master's program it is often taught again at a much more rigorous level. This can get a bit harder, but if you understand the basics it's not bad.

As for Uber, the AI researchers that were laid off were not from the self driving division.

Google's freeze, as far as I know, is for all engineers (only backfilling attrition is permitted), not limited to ML.

I personally feel this slump in the ML job market is only temporary and related to a temporary loss of risk taking appetite.

As for legit vs. non-legit ML practitioners, I don't see how the situation is going to improve any time soon. It's not just random engineers fooling themselves into saying they are great at ML; there are entire companies, new and old, with increasing focus/pressure to sell AI-powered $XYZ snake oil. They don't care who they hire, and most likely they don't even know who or how to hire when it comes to ML/AI, because the whole process, right from the top, is "sell it first and wing it later".

Also these hiring freezes may be due to covid, not loss of hope in ML. Google's AI residency program seems to be delayed but still going ahead, almost all the others were cancelled due to covid logistics (to my knowledge, at least FAIR, MS and Uber).

That doesn't detract from the article's point though - lots more interest in those programs than there are places.

I'd also add that the cuts to ML teams seem to reflect more of a freeze on R&D in general--totally normal in a downturn--and that ML just happens to be a feature in a lot of R&D at this point.

Most of these companies have substantial investments in ML that aren't research-focused, they're powering their flagship products. At this point, ETA prediction isn't a speculative experiment for Uber or Google Maps, just like RankBrain, Smart Compose, spam filtering, etc.

There’s a real lack of good engineering in machine learning and too much emphasis on using bleeding edge research.

There are a lot of boring models and practices that get the job done for the 80% case, that don't require as much talent to do well, that scale, are quick to develop, and can be deployed to prod with ease.

But there’s a problem in tech culture of reaching first for the most sophisticated ML, not the most practical.

So my perspective is that 2020:AI is as 2000:Internet. In 2000, the idea that all commerce is going to be on the Internet was obviously true, just as obviously true as the fact that all of the companies that IPOed at the time had no way of getting there (except for Amazon). AI one day is going to be the foundation of successful money-making products and services, but AlexNet and GPT-3 are not even stepping stones in that direction. Self-driving cars are a distraction like virtual worlds.

It's entirely possible, but also AI is older than the internet, and Google was the first big ML company, gaining their success as early as '99. It seems like ML has a longer adoption curve to it, moving slower.

Also, "AI" is whatever is cutting edge. The fill tool in Paint was once considered AI, so AI will continue on as long as we have new tech.

> The fill tool in Paint was once considered AI

This Wikipedia page has a lot of filling algorithms, and none of them looks like it's AI-based: https://en.wikipedia.org/wiki/Flood_fill

Can anyone explain to me why the fill tool in Paint was considered AI? Has the definition of "AI" changed?

It would be AI by the standards of A*, maze solvers, and similar. It isn't ML and isn't related to that branch of AI.

Should graph search be considered AI? This gets us to the root of the issue: how do we decide whether something is AI, and to what extent has that definition changed as some problems have become trivial or commonplace? Machine learning is a bit easier, since you can build much stronger classifications around the need to teach it, to feed it data with expected results, before it is useful (or otherwise give it a way to evaluate data it generates and then feed that back to itself). However, there are a number of techniques that predate ML, or at least predate computers fast enough to have worthwhile success at ML, and thus we are left to find some objective way to classify those as AI or not.

It is technically not AI, but it used to be called AI.

AI today is the study of optimizing NP problems through a series of educated guesses.

> Has the definition of "AI" changed?


It would seem that the AI goalposts are always being moved.

They weren't saying the fill tool was an early form of what we think of as an "AI algorithm" today, but that the leading edge of machine intelligence / intuitive capability has changed, and it will continue to change.

Also you can probably implement fill with a graph search algorithm.
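
Indeed, flood fill is just a breadth-first search over the pixel grid, the same family of search that classic AI textbooks covered alongside A* and maze solvers. A minimal sketch (the grid and the colors are invented for illustration):

```python
from collections import deque

def flood_fill(grid, start, new_color):
    """BFS flood fill: recolor the 4-connected region containing `start`."""
    rows, cols = len(grid), len(grid[0])
    r0, c0 = start
    old_color = grid[r0][c0]
    if old_color == new_color:
        return grid
    queue = deque([start])
    grid[r0][c0] = new_color
    while queue:
        r, c = queue.popleft()
        # Visit the four orthogonal neighbors still holding the old color.
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == old_color:
                grid[nr][nc] = new_color
                queue.append((nr, nc))
    return grid

image = [[1, 1, 0],
         [1, 0, 0],
         [0, 0, 1]]
print(flood_fill(image, (0, 0), 2))  # → [[2, 2, 0], [2, 0, 0], [0, 0, 1]]
```

Same algorithmic skeleton as a maze solver, which is presumably why it once fell under the "AI" umbrella.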

Yep, people forget that PageRank was Google's secret sauce, and it was literally ML. It looked at data to compute the parameters of a model (edge probabilities) and rank a set of candidate results.

My first salaried job was reverse engineering PageRank. It was a lot of fun. /random

I'm having a hard time seeing PageRank '99 as ML just because it uses a lot of data and is linear algebra at the core. It's one large eigenvalue problem.
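
For anyone curious what "one large eigenvalue problem" means concretely: PageRank is the principal eigenvector of a damped, column-stochastic link matrix, and the standard way to compute it is power iteration. A toy sketch, with a made-up 3-page link graph:

```python
def pagerank(links, n, damping=0.85, iters=100):
    """Power iteration for PageRank. `links` maps a page to the pages it links to."""
    ranks = [1.0 / n] * n
    for _ in range(iters):
        # Each page gets the baseline (1-d)/n plus damped shares from its in-links.
        new = [(1.0 - damping) / n] * n
        for src, outs in links.items():
            share = damping * ranks[src] / len(outs)
            for dst in outs:
                new[dst] += share
        ranks = new
    return ranks

# Pages: 0 links to 1 and 2, 1 links to 2, 2 links back to 0.
ranks = pagerank({0: [1, 2], 1: [2], 2: [0]}, n=3)
print(ranks)  # page 2 ranks highest (two in-links), then 0, then 1
```

Whether you call that "learning" or "just linear algebra" is exactly the debate in this thread: the parameters do come from data, but nothing is updated online.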

It's just because you're having a hard time seeing that most ML is linear algebra at the core.


I mean, it's true that ML uses linear algebra. But the difference is that ML is Machine Learning. The learning part (storing and updating based on new data) is different from statically calculating PageRank using Map/Reduce and applying it as a ranking function.

Isn't updating parameters (in ML) the same as updating some ranking-function parameter? In my opinion, any algorithm that updates its model parameters based on data is ML.

This is true, but in this case it was months between updates (in the early days of Google).

I take a pretty broad view of ML, but there certainly is a difference in degree if not principle.

Most machine learning in use today is "train once (and maybe fine-tune), then deploy", not online, continuous learning. I don't think the frequency of model updates is generally a good indicator of whether or not something counts as machine learning.

More to the original point, nowadays when most people think of machine learning they are thinking of deep neural networks, whereas Google's original PageRank was very simple and shallow by comparison. But they built an algorithm that allowed machines to learn which pages were high value and which were low value. If that seems simple by today's standards, it's evidence of the AI goalposts moving more than anything else.

I empathize. I've been doing ML for a similar amount of time, and people who claim to know ML but lack the basics abound. I don't think they are to blame (after all, learning isn't bad, and everyone wants to capitalize on a hot market); we need to adapt to the fact that the definition of "ML practitioner" has considerably broadened: it now spans everyone from someone who only knows how to use vgg16 from PyTorch, without understanding the optimization, the loss, or alternative algorithms, to someone with a much broader and deeper view of the area. Problems arise when people expect candidates of the latter type but instead run into a steady stream of candidates of the former type.

Hiring managers probably need to get used to the fact that the term "ML practitioner" alone doesn't mean much anymore, and to set expectations accordingly.

# of people who want to do "machine learning" > # of ML jobs > # of people good at ML

Yes, I was going to say: taking courses in "ML" is a great way to get to write a hyped-up term on your resume. Actually understanding, and being able to write code for, statistical learning is something else entirely.

Edit: On the other hand, I think Carmack observed that as long as you get the sign right, you'll get something out of it, even if it won't learn very quickly. So maybe speed of implementation and learning is how skilled people will differentiate themselves?

Research tasks take a long time. If you're going off of speed, it's no longer data science; it's software engineering. (Though, to be fair, with all the new software engineers becoming data scientists, the data science title is becoming more like software engineering.)

imho metaphysics and metalearning might be a better way to differentiate.

Right. There's room for research, and then there's room for commoditizing existing research. I think the latter is where we currently have the most to gain from expanding our efforts, and this was the context in which I made my comment.

I see now that I made that very unclear. Sorry!

No need to be sorry. I'm equally to blame for my own lack of communication skills. ^_^

There should be a law requiring all ML and DS classes, books, and bootcamps to start by manually labeling data.

Exactly why as a grad student I have decided to go into Quantitative Finance instead of Data Science.

Literally anyone can become a data scientist, which is not a bad thing by any means. But for me, it makes it hard to justify going to an elite French school and dealing with theoretical material that will not be useful on the job (think VC dimension and the like) for over a year.

Good luck!

I started with quant research and was successful at it before moving over to data science. It's a lot of fun! The technical challenges are more difficult too.

Part of it is the area: I'm in the SF/Bay Area. The other part is schooling, as the industry is a bit more ivory-tower. I got my first tech job when I was 17.

> Hopefully the industry matures. I've bumped into data scientists who don't know how to do feature engineering as well as ones who can't grok ML. It makes the whole industry look bad.


I suspect that for the 9 in 10 ML "experts" in the field there are a corresponding 9 in 10 ML "jobs" that are just hiring for the buzzword and seeing what happens.

Have you ever interviewed applicants for software jobs? At least 9/10 applicants often have no idea how to code. They can write down some JavaScript, but don't know why it works.

>As someone who does ML for a living this hits. When switching companies in 2019 for the first time I had a hard time despite having 10+ years experience. Linkedin maxed out saying over 200 people had applied to every job I was looking at.

Are you sure you're not just getting old? My experience is that once I got past 10+ years of experience in development, people generally didn't want to hire me full-time, only as a consultant.

>Hopefully the industry matures. I've bumped into data scientists who don't know how to do feature engineering as well as ones who can't grok ML. It makes the whole industry look bad

This is what happens when a tech industry matures! Before it's mature and lucrative, only the dedicated are into it; once it has matured, there are layers of skillfulness, with the totally unskilled somewhere at the bottom trying to get in.

I suppose you will be receiving an ML take home project for your next job shift.

>Are you sure you're not just getting old? My experience is that once I got past 10+ years of experience in development, people generally didn't want to hire me full-time, only as a consultant.

I don't know. I don't know how to break into consulting but I want to do it.

I've been through three IPOs in the last 11 years, and we didn't really hire consultants (except people out in the field collecting data sometimes, but those were contract employees), so I'm a bit out of the loop. I did everything wearing multiple hats, even going so far as to deploy models I'd made onto embedded hardware, so there has been little reason to hire outside help.

I must have been only halfway awake. Not three IPOs. I went through one of those, and three acquisitions.

There's always a land rush of pretenders in any new sector of tech. Not sure where I heard it, but there's a definition of expert that I think a lot of people believe: "an expert is a person who knows one more thing than everyone else." It takes a while for experience to smoke out the paper-thin experts.

Honestly, I got my foot in the door and now get to work on pretty cool ML stuff as a DS.

I'm gonna stick it out for a few years (I like my job) and pivot to ML-PM right as there is a glut of technical people in the field.

In my experience, product/program managers that understand ML are a very rare breed, and often end up being the bottleneck when it comes to taking an ML model from research to users.

At the same time, becoming a PM too soon would leave me all talk and no substance. Don't want that.

Good luck! ^_^

Google will happily hire and pay a batshit-crazy salary to any capable engineer with an ML background.

I'm interested to know more about your skill set. If you don't mind.

Ah sorry. I think you already mentioned timeseries classification.


I've done quite a few out there things on previous projects. Some of the more extreme are:

- I invented a new kind of ML for quant work (day trading bot) that ended up being quite successful. It's not published anywhere for hopefully obvious reasons.

- I've written models that I've productionized to embedded. I also wrote a script that converts between languages, automating the process.

- I reverse engineered Google's PageRank tech (my first DS job, actually) and ended up getting higher accuracy on website classification than a team of experts manually classifying web pages could achieve.

That sounds impressive!

I'm a student who just started a real-life project (basic visual computing). Any practical tips on what I should look out for / focus on in the future?

I love this kind of software meta-analysis. I have idly dreamed of setting up a site that essentially did literature review of existing solutions and approaches.

I thought it would be a good contribution for a language with strong academic roots like Racket. I suspect a lot of good libraries remain unwritten in part because ecosystems tend to grow organically. Organic is good, but I'm positive that even basic meta-analysis like the linked article would lead to better libraries in more languages, with less labor expended.

Same thing for enterprise solutions. How does Redshift stack up against Snowflake? What about choosing a dashboard tool within the various ones that exist today? Etc.

I do this stuff for my day job, and everything here rings true.

Every part of the "build and use ML in production" workflow is horrible (unless maybe you work at Google).

First, the data science workflow is NOT the same as software engineering. Things like version control tools don't work properly (at every step: git on Jupyter notebooks doesn't work without hacks, versioning data is horrible, versioning models is horrible).

Deployment is horrible. Sagemaker (and equivalents) provide the very base level of functionality needed, but are so separated from the feature engineering side that everyone ends up doing vast amounts of work to get something useful.

Frameworks are horrible. TensorFlow did the upgrade to TF2, so half the examples on the web don't work anymore. The TF data loading abstractions are great if you work at Google, but so complicated that it's hard to get basic examples going.

PyTorch has a horrible deployment story.

Every other framework is either an experimental research thing that takes months to make progress with (JAX) or so far behind modern work that it's useless (MXNet).

But the thing is, dealing with all these issues is worth it, because ultimately it does actually work.

I think that notebooks have been a bit of a sideways step in a data science workflow. They're accessible, but they're tremendously brittle.

Notebooks are so much better than anything else I've used for data science.

I have a traditional SWEng background and came into data science never having used them. I'd never go back.

I'm not saying that they are impossible to improve, but as a general approach they are exactly right.

They are "brittle" when viewed as a software artefact. But that's not really what they are (or should be).

Not to be too self-promotion-y, but I work on an open source ML deployment platform that we built specifically because of how incongruous the data science workflow is with software engineering: https://github.com/cortexlabs/cortex

I'm curious to get your input on the Model Monitor, Debugger, and Experiments features on Amazon SageMaker. Have you had a chance to play around with them?

I've tried Experiments. It's great at the easy part of the ML workflow: optimising a working model. But it doesn't really help with the hard part - the debugging at the interface of the model and the data.

Say you are building a car detector or something. Building the CNN is ML101, and SageMaker experiments helps with optimising the training parameters to get the best out of the model.

But that's not really the hard part. The hard part is working out that your model is failing on cars with reflections of people in the windscreen, or that your dataset's coordinate space is "negative = up", so your in-memory data augmentations are teaching the model upside-down cars.

I don't know what Debugger gives me over a notebook, but I've only read the blog post.

I haven't tried Model Monitor but I do think that could be useful.

Any experience with ML in MATLAB?

I'll chime in, I did my thesis in MATLAB (specifically ML for MRI): While matlab itself wasn't the most fun to work with, they honestly have built a fantastic suite of tools. For example the Classifier App is amazing for brute forcing through a bunch of stuff.

Even went through a couple of hackathons with it and got some SoTA results.

I wouldn't ever go back to it, especially outside of academia. But it's not the worst thing out there.


I mean it's fine, but I don't see any reason to use it instead of Python, and lots of reasons not to. But I'm not a mathematician by training.

I do quite like RStudio, though, and I do see places where it is useful. So maybe MATLAB fits somewhere in between: less stats than R, less programming than Python.

I am a mathematician by training (+CS), I've worked in Matlab quite a bit, I've taught hundreds of students in it (having no say in the language, though we're about to switch to Python), and I probably see even more reasons not to use it than you do. (I haven't done ML in it, but I'd be astounded if it weren't terrible compared to Python and its ecosystem.)

I recommend avoiding Matlab for every use case unless they've got you trapped with a huge existing code base or reliance on a proprietary toolbox.

IMO, whenever RStudio decides to support Jupyter notebooks, it's game over for everyone else. It's such a great piece of software for data analysis, and I hope they continue to go broader than just the R language.

What do you think of Julia?

> Julia

The Haskell of Machine Learning.


A great survey of the current landscape of ML tool innovation.

This reminds me of the explosion of "big data" tech, which feels like it started exploding in 2010 and peaked maybe five years later.

If ML follows a similar cycle, then in perhaps five years most tools will have receded into obscurity, while a few frameworks and approaches become dominant. The stakes are big, which justifies the prevalence of, and investment in, OSS.

Comparison of "big data" and "machine learning" in Google Trends -- https://trends.google.com/trends/explore?date=all&geo=US&q=b...

I literally worked at a company that did a major project called "Big Data". It was in 2013/2014. It used Hadoop. It was pretty useless :)

It will be called ML winter, not AI winter.

Aside from the irrational but probably inevitable over-hype, I find this whole AI/ML space quite fascinating and frustrating at the same time.

Most of the successful algorithms are much less complex/sophisticated than you might believe from the outside, and I believe that's a very good sign because simplicity scales.

But on the other hand, there is a huge amount of brute-forcing; you can't really do serious research in this field without a few dozen million dollars to spare, and I think that puts a limit on the innovation potential.

Unfortunately writing good code/finding new algorithms is only a small part of the problem.

I also think that many people don't completely understand how weak these models are at processing real-world data when there is no reliable way for the system to self-train on it.

I agree with you. I was in the Michael Jordan camp, as in: machine learning is just statistical inference, rebranded and with better marketing. There are lots of cool applications, for sure, but there is no "magic sauce" beyond your usual models obtained by error minimization. But lately, especially seeing the progress with Attention/Transformer models, I have slowly been letting myself be convinced that maybe that is all there is behind human intelligence: lots of priors encoded by evolution in the brain, plus a general quasi-brute-force mechanism for inference, deduction, and imagination. Of course this is probably a naive view, but I think the main point will remain: intelligence can be created or simulated by giving an agent the capability to extrapolate in a way that "makes sense" in the environment the agent will live in.

Attention based models are still extremely data inefficient and fail to generalize. I recommend reading the "On the Measure of Intelligence" paper by Chollet if this topic interests you.

I read the paper, and I like Chollet, but he comes across as too pessimistic in it. I agree that current models are still very data inefficient, but despite this I think they are not that far off the mark if you consider two things. 1) The brain probably has a lot of information hardcoded: things like a universal grammar a la Chomsky, or the concept of the continuity of time and space. Just add water and boom, a 2-year-old understands. 2) Humans are great at extrapolating from a few data points for a limited set of cognitive tasks, not all of them.

Add that to the third Michael Jordan I've heard of

> I also think that many people don't completely understand how weak these models are at processing real-world data when there is no reliable way for the system to self-train on it.

True, but probably any data can be used in self-training if you have captured enough of it. ML models can handle sequences, grids, trees, graphs, point clouds and sets as input formats.

As for needing self-training in general: humans need it too, even though we have well-attuned priors on the world from billions of years of evolution. Even so, it takes years before a human can utter a phrase or accomplish a complex task.

Basically, today ML works by interpolation: if you have enough data points to cover your space, you're good. But if you want to extrapolate outside it, you're screwed.
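
A tiny illustration of the interpolation-vs-extrapolation point: fit an ordinary least-squares line to y = x² on x in [0, 3], and the error inside the training range stays small while the error at x = 10 explodes. The function and numbers below are invented purely for this demonstration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a 1-D line y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return lambda x: slope * x + intercept

xs = [0.0, 1.0, 2.0, 3.0]
model = fit_line(xs, [x ** 2 for x in xs])   # true function is quadratic

err_inside = abs(model(1.5) - 1.5 ** 2)      # interpolation: 1.25
err_outside = abs(model(10.0) - 10.0 ** 2)   # extrapolation: 71.0
print(err_inside, err_outside)
```

Inside the covered region the linear model is a tolerable approximation; outside it, the error is more than fifty times larger. Deep models fail the same way, just in higher dimensions.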

<rant>What we need is powerful simulators (physical or even neural based) to expose ML models to a variety of tasks inside the target domain. A simulator is like a dynamic dataset. A ML agent inside an environment (a simulator) is much more similar to humans than a model + dataset. Of course simulation is just as hard as AI.</>

most machine learning research does not require millions of dollars...

if you’re thinking about GPT-3, it uses the transformer architecture.

But the paper that introduced the transformer architecture trained for a few days on 8 GPUs — surely this was still serious research?

OpenAI sure has a good PR team, but all their spectacular results were built on previous papers (sometimes from their own researchers) that introduced new techniques without spending millions of dollars.

Awesome work; it must have been super interesting to dive into so many different companies! The current state of the field that you describe is super fascinating.

Be wary of survivorship bias when you're using today's landscape to draw historical conclusions. If you're only looking at companies/tools that exist today, taking their founding date, and using that as a sign of what existed circa 2012, then you're going to miss the plethora of companies/tools that failed prior to today. Similarly, much of the recent explosion may only be because they haven't existed long enough to have failed.

I'm a data engineer; I stopped trying to "upgrade" into ML years ago when I realised my skills would be in far more demand over the long term than any ML specialist's.

This is a good overview of the landscape. From what I see, even if there's a burst of the AI funding bubble, a lot of ML techniques and algorithms can still be really useful for general computing on a smaller budget. Deep learning with neural nets might be too costly to run for most projects, but shallow learning algorithms are a lot more cost effective to add to general computing. Algorithms like Support Vector Machines or decision tree learning work surprisingly well in targeted areas.

They are OK if your problem is based on tabular data, but not for images and text. The current paradigm there is to reuse a pre-trained model, which is far less data intensive. For some tasks, if you have a good backbone model, you only need one or a few training examples.

SVMs absolutely work on text, TF-IDF + SVM is a very classic (and pretty solid) approach for classification. It's easily explainable and has known classes of problems.
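The classic pipeline mentioned here is short enough to sketch. Below is a minimal, illustrative version using scikit-learn (the toy documents and labels are made up for this example, not from the thread):

```python
# TF-IDF features + linear SVM: the classic baseline for text classification.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy corpus: two classes, a handful of short documents.
docs = [
    "the game ended with a late goal",
    "the striker scored twice in the match",
    "the election results were announced today",
    "parliament passed the new budget bill",
]
labels = ["sports", "sports", "politics", "politics"]

# Pipeline: vectorize with TF-IDF, then fit a linear SVM on the sparse features.
clf = make_pipeline(TfidfVectorizer(), LinearSVC())
clf.fit(docs, labels)

print(clf.predict(["a goal in the final match"])[0])
```

Explainability here comes cheap: the SVM's coefficients over the TF-IDF vocabulary tell you directly which terms push a document toward each class.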

Great write up. Thanks for posting this.

There’s rumor that due to a large number of people taking ML courses, there will be far more people with ML skills than ML jobs.

I hope this is true. There are so many areas where ML skills could be useful. The sad part would be that some industries would be changed forever. The animation industry for example might not even exist the way it is today.

I don't think that's the case - there will be more desired ML projects than people capable of implementing them. ML is like electricity 100 years ago or programming 40 years ago: we haven't applied it yet to most problems of society.

The problem is it's not as useful as many people seem to think. I often hear my colleagues suggest ML for anything remotely complicated, even something like "measuring body fat percentage using electricity" that in reality only needs a physical equation.

I've even heard people suggest it for web scraping which seems absolutely crazy to me.

It can make a lot of sense for web scraping. If you have lots of target sites, you can either build strict extraction rules and update them constantly, hand-build something generic (often very hard), or train classifiers for the content you want.

I could actually see a use case for web scraping. If you're after particular pieces of content that aren't accessed in a structured way, on a site that rate limits you to the point of being restrictive, maybe using a bit of NLP could help you rank links to click.
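As a rough sketch of that idea, here is a stdlib-only link ranker that scores anchor text against a target query by token overlap. It's purely hypothetical (the URLs and scoring function are mine, not from the thread); a real system would likely use embeddings or a trained classifier instead:

```python
# Rank candidate links by Jaccard similarity between the query and
# the link's anchor text, so a rate-limited crawler visits the most
# promising pages first.
def score(query: str, anchor_text: str) -> float:
    q = set(query.lower().split())
    a = set(anchor_text.lower().split())
    return len(q & a) / (len(q | a) or 1)  # Jaccard similarity

def rank_links(query, links):
    # links: list of (url, anchor_text) pairs, best match first
    return sorted(links, key=lambda l: score(query, l[1]), reverse=True)

links = [
    ("/about", "About us"),
    ("/pricing", "Pricing and plans"),
    ("/blog/ml", "Machine learning pricing models"),
]
best_url = rank_links("pricing models", links)[0][0]
print(best_url)
```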

I tried using OCR to scrape Facebook profiles by simulating web browsing behavior. It helps a lot in avoiding account blocking, but it's still too slow to be practical.

What's your reason for scraping so many people's personal data?

Really I'm just curious about this approach and want to test it, since most old scraping methods fail on Facebook. My take is that it's possible with enough resources, since it's actually pretty hard to separate this from real usage.

Try simulating it using headless Chrome (directly accessing the page); it works fast.

I did an ML/AI course at university in 2011 or 2012. After the course, I had a much greater appreciation for how clever and complex the decision-making process in our brain is, and how difficult it is to get a computer to do even basic decision making based on a set of inputs. There's human creativity as well, which a machine cannot truly replicate.

I'm always cautious and a bit pessimistic when I hear companies advertising AI skills or functionality built into their product. Most of them are probably driven by hype and by people hoping to be the next Facebook, rather than by real technology.

That same part of me is rejoicing in seeing the industry pay for this hype train.

I personally think the AI revolution will come from a hardware innovation (quantum computing or something like that) rather than building these large datasets to train models. We don't fully understand how our brains work so trying to replicate this with a computer program is a tough ask.

The problem with investors is that they've been oversold on the "AI" part, and are expecting magic robots that will replace humans in a few years.

Remember, the key motivation for investing in AI or ML is to become a more competitive business, by reducing (or eliminating) human labor costs. That's it.

Luckily, ML has become deeply incorporated into the modern workforce, but mostly as part of specific tools that eliminate easy and tedious tasks, just as intended.

So yeah, it's sad that the big investor bucks only flow to the more "exciting" projects, because there's a vast landscape of problems that can be aided by ML, even though they're not as exciting as self-driving cars, robot doctors, or whatever - and don't require some monstrous deep learning model to be solved.

> If you have to choose between engineering and ML, choose engineering. It’s easier for great engineers to pick up ML knowledge, but it’s a lot harder for ML experts to become great engineers. If you become an engineer who builds great tools for ML, I’d forever be in your debt.

Is this true, or at least the majority view? Maybe it is because the comparison is unbalanced, because "great engineer" is on both sides but the ML part is "pick up ML knowledge" and "ML experts".

I'd choose engineering because the employment opportunities are so much more favourable, not because it's easy to 'pick up ML knowledge'. I'm in the beginning of many hours of mathematics and statistics study so that I can become competent at ML, and in comparison to say, becoming good at git and python and AWS, this feels harder.

I agree 100%, and I started a PhD in ML 15 years ago.

DL, thanks to its "lego aspect", is getting ML commoditized fast for most applications out there. It narrows the gap between applying existing models more or less directly and doing pure research. There is a glut of engineers who think ML engineering is playing with TF or PyTorch all day, retraining models on well-defined datasets.

The reality: you want to use computer vision to do something cool server-side? Just taking ResNet or similar, fine-tuning on your data, or maybe using it as an embedding space, will get you 90% there.

What's hard? Identifying opportunities, and especially data. I know it's a cliché, but identifying what kinds of problems you can solve with the data you have, and being able to convince the business side that what you do is useful: all of that is very hard. The same goes for solving "big problems" while still shipping regular deliverables so that you're not seen as a cost center. How do you link model improvements to business impact?

The system side of it is still immature as well.

I say this to all my reports, direct or indirect: hone your SWE skills, and make sure you understand how you bring value to your company. Unless you're in the top 1%, you're not going to survive on ML skills alone.

I do think it's true, and likely the majority view too. The saying has many, many alternatives (for quantitative finance, it's "you can teach a math PhD finance but you can't teach a finance PhD math [at the same level that a math PhD is at]") in different studies.

I think it just boils down to knowing the fundamentals or not, and arguably a person who only studied ML will lack a wide range of tools in their toolkit to tackle tangential problems that an engineer could (whereas the engineer already has the tools to build up their ML toolkit). In my anecdotal experience with teaching, knowing how to learn is heavily correlated with how general (as opposed to specialized) a person's field is.

Thanks for the list. In my case, when I started doing ML/DL projects, I was looking for tools that could help me organize/train/deploy my models. I remember looking at Kedro for days, which looks great, as well as Metaflow from Netflix.

But in the end, I built my own project structure as a Docker monolith that includes MLflow. It's the only tool I use (along with sklearn) for anything other than training and serving my model.

I think it's nice to do all of this yourself at first and later adopt some of the tools listed there to enhance your workflow. It was really hard for me to get started with Kedro when my project wasn't as simple as MNIST or CIFAR.

When is "tooling" worth buying? Perhaps it's a reflection of the kind of person that I am, but I would always rather build my pipeline internally rather than outsource it. That pipeline will probably become part of the critical path.

> Perhaps it's a reflection of the kind of person that I am


Lots of companies just want the shortest path from here to profitability. If the cost/benefit equations justify it, then eventually they can justify the outlay of building their own tooling to replace outsourced pieces a bit at a time.

I'm thinking of things like AWS Lambda. Once, I built a slim version that was enough to support my own company's use case. I was told "that's good. save the git repo. one day it may be worth it to add the redundancy and scaling we already get from aws. until then, there's more profitable things for you to be working on."

> until then, there's more profitable things for you to be working on.

Sorry, but this gave me a slight chuckle. I think I can empathize, but I found your wording hilariously blunt.

I've been working in software engineering, primarily web development, over the last few years. It always felt like the use of AI in industry was hyped up, and that the hype would soon wind down, leaving only a few very specific domains where AI does exceptionally well.

But recently I've been looking into the work of OpenAI, and I can say that my assumptions were outright wrong. The rate of growth in AI is definitely increasing. Just consider the voice tech landscape: with the new generative models around speech, I can totally foresee the number of industries that are going to be disrupted from the ground up.

A good read for beginners, though the description of the state of ML seems a little dated now. People and companies are at the phase of actually implementing ML, rather than figuring out how. The field gives you options for every user and level of expertise. Advanced users: TF or PyTorch. Preprocessing: Beam or Spark. Accelerators: GPUs. Tools such as BigQuery ML and AutoML have helped users get into ML by generating an MVP the easy way. I would love to hear more from people in the field.

What a great write up. The high level survey of the landscape was insightful, and directly comparing the challenges of ML/AI with their "solved" counterparts was great.

Excellent read. Bookmarked.

> The pretrained large BERT model has 340M parameters and is 1.35GB. Even if it can fit on a consumer device (e.g. your phone), the time it takes for BERT to run inference on a new sample makes it useless for many real world applications.

This is true, but a fine-tuned DistilBERT model will give close to that accuracy with a ~260MB file size.

I don't think AI is overhyped, though there are certainly many bubble companies within it. I believe the value and applications of cutting-edge models are still being discovered, and I would only feel relief if fewer people piled onto the bandwagon, out of a selfish impulse: more for me.

Interesting article! However as a software dev working in this field it is missing quite a few tools.

The first that comes to mind is Dataiku which is already a pretty big startup (~$150M raised) so I'm surprised the author missed it.

I am kinda irritated that this analysis was not done using ML...

I'd consume this style of outline for almost any topic

Regardless of my interest in it

So useful

The most important AI company nowadays is almighty Nvidia.
