As someone who does ML for a living, this hits home. When I switched companies for the first time in 2019, I had a hard time despite 10+ years of experience. LinkedIn maxed out, showing 200+ applicants for every job I looked at.
Thankfully I have a rare skill set. It's not found in classes, books, or boot camps. On the downside, there are not many jobs to choose from, as companies don't know that what I do can be done. (Time series classification and prediction, usually on sensor data.)
Hopefully the industry matures. I've bumped into data scientists who don't know how to do feature engineering as well as ones who can't grok ML. It makes the whole industry look bad.
Likewise, people who are able to apply their domain knowledge to actually help their company's business, product, or problems will always have work.
I've often used a cooking analogy for this. Just because someone says they know how to "cook" a dish doesn't mean they'll be able to function or be effective in a restaurant or other commercial kitchen. Yes, there's an element of domain knowledge required, and that's table stakes, but the factors that distinguish success often have more to do with understanding the context of the work and the surrounding business.
To update this for recent fads, learning to make sourdough bread does not mean one has learned how to run a bakery.
If you're able to be a pure researcher, or a pure specialist, you're very lucky and working somewhere where they can afford deep specialization.
The plethora of emerging tools and practices around the AI/ML "workflow" reflects the collective realization that training ML models is just one piece of the puzzle:
https://papers.nips.cc/paper/5656-hidden-technical-debt-in-m... (See: Figure 1)
So while there's a lot more competition, the majority of these applicants don't even meet the minimum standard to get called in for interviews.
If you have solid work experience to show for it, then that alone should differentiate you from the masses. Again, a lot of these people have NEVER worked with anything relevant, and can only show you a bunch of tutorials they've followed. If you're lucky, they've actually taken on more interesting datasets from sources like Kaggle and tried to come up with something.
edit: Not to mention the data. I'm extremely impressed when someone shows a project that involved collecting data from the wild, wrangling it, and then doing meaningful work with it.
Too many beginners only work on neat, ready-made datasets. You'd be surprised how many candidates get completely blindsided when asked to do some trivial data pre-processing task.
I would argue that beginners only *work* on neat/ready datasets, and lack the appreciation or understanding that most of the time the data does not already exist in a usable form, and sometimes does not exist at all.
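To make "trivial pre-processing" concrete, here's a minimal pandas sketch of the kind of cleanup real data always needs (the column names, file name, and sentinel values are made up for illustration):

    import pandas as pd
    import numpy as np

    # Hypothetical raw sensor export: mixed types, duplicates, missing values.
    df = pd.read_csv("raw_export.csv")

    # Timestamps arrive as strings in mixed formats; coerce failures to NaT.
    df["ts"] = pd.to_datetime(df["ts"], errors="coerce")

    # Numeric column polluted with sentinel strings like "N/A" and "-999".
    df["temp"] = pd.to_numeric(df["temp"], errors="coerce")
    df.loc[df["temp"] < -100, "temp"] = np.nan

    # Drop exact duplicates, sort by time, forward-fill short gaps only.
    df = df.drop_duplicates().sort_values("ts")
    df["temp"] = df["temp"].ffill(limit=3)

None of this is clever, but candidates who have only ever called fit() on Kaggle CSVs have never had to do any of it.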
The process of making model training possible, and the process of applying models in a useful way, are not taught in curricula or tutorials, and are generally not addressed in other literature (e.g. academic/research publications) either.
"Amateurs and beginners focus on trying to get things right. Professionals focus on making sure things aren't/don't go wrong."
I'm a bit curious: what is the minimum standard? It feels to me (an outsider who took one ML class in college) that you need at least a Master's in ML to build intuition for the probability and linear algebra theory behind ML concepts.
On the computer science side it makes sense to understand how to build a tree, not just how to use one; you're learning how to write algorithms, not just apply them. On the MLE (Machine Learning Engineer) side, the same rule applies: knowing linear algebra and probability theory is helpful so you can write your own ML.
On the data science side you're rarely inventing new ML. What you want to specialize in is: 1) data mining using ML, 2) data cleaning, by knowing what kind of input ML needs, 3) feature engineering, also by knowing what kind of input ML needs, and 4) choosing which kind of ML is ideal, e.g. due to the bias/variance trade-off.
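To illustrate point 4, a toy sklearn comparison on synthetic data (the models and numbers are just for illustration): a straight line is too biased for a sine-shaped signal, an unpruned tree memorizes the noise, and cross-validation makes the trade-off visible:

    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.linear_model import LinearRegression
    from sklearn.tree import DecisionTreeRegressor

    rng = np.random.default_rng(0)
    X = rng.uniform(-3, 3, size=(200, 1))
    y = np.sin(X[:, 0]) + rng.normal(0, 0.3, size=200)

    # High bias: a straight line can't capture the sine shape.
    # High variance: an unpruned tree fits the noise as well as the signal.
    for name, model in [("linear", LinearRegression()),
                        ("deep tree", DecisionTreeRegressor()),
                        ("shallow tree", DecisionTreeRegressor(max_depth=3))]:
        scores = cross_val_score(model, X, y, cv=5, scoring="r2")
        print(f"{name}: mean R^2 = {scores.mean():.2f}")

Knowing *why* the capped tree tends to win here is exactly the kind of characteristic-level knowledge I mean.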
A lot of these boot camps, classes, and books teach the underlying structure of ML needed to become an MLE, though for some reason they often advertise it as data science. I find this odd, because MLE pays better and is in higher demand.
On the data science side we rarely need to know these finer details. We just need to know each ML method's characteristics so we know when it's the right tool for the job, similar to knowing how to use a data structure without needing to know how to invent new ones.
A data scientist needs to be a research specialist, which is more of a PhD skill than a Master's one, so knowing the underlying math matters less; what matters is knowing when it's necessary to go research it. It's not that a data scientist can't know it, and many do out of hobby or coursework, but it's far from mandatory. Knowing probability theory, and how to break a problem down into multiple paths forward (like how to collect data), is a far more valuable skill.
And finally, many data scientists barely know how to write code. They're a kind of analyst. I feel like the job title isn't sufficiently well explained, so software engineers make a lot of assumptions and mix up MLE with DS.
edit: Also, linear algebra isn't that bad and is an undergrad class. Probability theory is taught lightly in DS101 freshman year, but in the first year of a Master's it's often taught again at a much more rigorous level. That can get a bit harder, but if you understand the basics it's not bad.
Google's freeze, as far as I know, is for all engineers (only backfilling attrition is permitted), not limited to ML.
I personally feel this slump in the ML job market is only temporary, related to a temporary loss of risk-taking appetite.
As for legit vs. non-legit ML practitioners, I don't see how the situation is going to improve any time soon. It's not just random engineers fooling themselves into saying they're great at ML; there are entire companies, new and old, under increasing pressure to sell AI-powered $XYZ snake oil. They don't care who they hire, and most likely they don't even know who or how to hire when it comes to ML/AI, because the whole process, right from the top, is "sell it first and wing it later."
That doesn't detract from the article's point though - lots more interest in those programs than there are places.
Most of these companies have substantial investments in ML that aren't research-focused; they're powering their flagship products. At this point, ETA prediction isn't a speculative experiment for Uber or Google Maps, and neither are RankBrain, Smart Compose, spam filtering, etc.
There are a lot of boring models and practices that get the job done for the 80% case. They don't require as much talent to do well. They scale, they're quick to develop, and they can be deployed to prod with ease.
But there’s a problem in tech culture of reaching first for the most sophisticated ML, not the most practical.
Also, "AI" is whatever is cutting edge. The fill tool in Paint was once considered AI, so AI will live on as long as we have new tech.
This Wikipedia page lists a lot of filling algorithms, and none of them look AI-based: https://en.wikipedia.org/wiki/Flood_fill
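For reference, the classic version is just a breadth-first graph traversal; nothing is learned from data anywhere. A minimal Python sketch:

    from collections import deque

    def flood_fill(grid, start, new_color):
        # Classic BFS flood fill: recolor the connected region containing
        # `start`. Plain graph search, no statistics or learning involved.
        rows, cols = len(grid), len(grid[0])
        r0, c0 = start
        old_color = grid[r0][c0]
        if old_color == new_color:
            return grid
        queue = deque([start])
        while queue:
            r, c = queue.popleft()
            if 0 <= r < rows and 0 <= c < cols and grid[r][c] == old_color:
                grid[r][c] = new_color
                queue.extend([(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)])
        return grid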
Can anyone explain to me why the fill tool in Paint was considered AI? Has the definition of "AI" changed?
Should graph search be considered AI? This gets to the root of the issue: how do we decide whether something is AI, and to what extent has that definition changed as some problems have become trivial or commonplace? Machine learning is a bit easier, since you can draw a much stronger line around the need to teach it, to feed it data with expected results before it's useful (or otherwise give it a way to evaluate the data it generates and feed itself). But a number of techniques predate ML, or at least predate computers fast enough to have worthwhile success at ML, and so we're left finding some way to objectively classify those as AI or not.
AI today is the study of optimizing NP problems through a series of educated guesses.
It would seem that the AI goalposts are always being moved.
I mean, it's true that ML uses linear algebra. But the difference is the learning part: an ML system stores and updates its parameters as new data arrives, which is different from statically calculating PageRank with Map/Reduce and applying it as a ranking function.
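For the curious, static PageRank really is just a fixed power iteration, with no parameters learned from labeled data. A toy sketch (the 3-page "web" is made up):

    import numpy as np

    def pagerank(adj, d=0.85, iters=50):
        # Static power iteration over a link matrix: adj[i, j] = 1 if
        # page j links to page i. Fixed computation, nothing "learned".
        n = adj.shape[0]
        out = adj.sum(axis=0)
        out[out == 0] = 1  # avoid division by zero for dangling pages
        M = adj / out      # each page splits its vote among its out-links
        rank = np.full(n, 1.0 / n)
        for _ in range(iters):
            rank = (1 - d) / n + d * M @ rank
        return rank

    # Toy 3-page web: 0 -> 1, 1 -> 2, 2 -> 0 and 2 -> 1.
    adj = np.array([[0, 0, 1],
                    [1, 0, 1],
                    [0, 1, 0]], dtype=float)
    print(pagerank(adj))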
I take a pretty broad view of ML, but there certainly is a difference in degree if not principle.
More to the original point: nowadays when most people think of machine learning they're thinking of deep neural networks, whereas Google's original PageRank was very simple and shallow by comparison. But they built an algorithm that allowed machines to learn which pages were high value and which were low value. If that seems simple by today's standards, it's evidence of the AI goalposts moving more than anything else.
Hiring Managers probably need to get used to the fact the term "ML practitioner" alone doesn't mean much anymore, and they need to set expectations accordingly.
Edit: On the other hand, I think Carmack observed that as long as you get the sign of the gradient right, you'll get something out of it, even if it won't learn very quickly. So maybe speed of implementation and learning is how skilled people will differentiate themselves?
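A toy illustration of the sign point (my own sketch, not Carmack's code): fit y = 3x by gradient descent, but throw away everything except the gradient's sign. It still converges, just slowly:

    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.normal(size=(100, 1))
    y = 3.0 * X[:, 0] + rng.normal(0, 0.1, size=100)

    w, lr = 0.0, 0.01
    for _ in range(2000):
        grad = np.mean((w * X[:, 0] - y) * X[:, 0])  # d/dw of MSE/2
        # Keep only the sign: fixed-size steps in the right direction.
        w -= lr * np.sign(grad)
    print(w)  # creeps toward ~3.0, then oscillates within one step size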
imho metaphysics and metalearning might be a better way to differentiate.
I see now that I made that very unclear. Sorry!
Literally anyone can become a data scientist, which is not a bad thing by any means. But for me, it makes it hard to justify having gone to an elite French school and dealt with theoretical material (that won't be useful on the job; think VC dimension and the like) for over a year.
I started with quant research and was successful at it before moving over to data science. It's a lot of fun! The technical challenges are more difficult too.
Part of it is the area; I'm in the SF/Bay Area. The other part is schooling, as the industry is a bit more ivory-tower. I got my first tech job when I was 17.
I suspect that for the 9 in 10 ML "experts" in the field, there are 9 in 10 corresponding ML "jobs" that are just hiring for the buzzword and seeing what happens.
Sure you're not just getting old? My experience is that once I got past 10+ years of experience in development, people generally didn't want to hire me full-time, only for consulting.
>Hopefully the industry matures. I've bumped into data scientists who don't know how to do feature engineering as well as ones who can't grok ML. It makes the whole industry look bad
This is what happens when a tech industry matures! Before it's mature and lucrative, only the dedicated are into it. Once it has matured, there are layers of skillfulness, with the totally unskilled somewhere at the bottom trying to get something out of it.
I suppose you will be receiving an ML take home project for your next job shift.
I don't know. I don't know how to break into consulting but I want to do it.
I've been through three IPOs in the last 11 years, and we didn't really hire consultants (except people out in the field collecting data sometimes, but those were contract employees), so I'm a bit out of the loop. I did everything wearing multiple hats, even going so far as productionizing models I'd made onto embedded hardware, so there was little reason to hire outside help.
I'm gonna stick it out for a few years (I like my job) and pivot to ML-PM right as there is a glut of technical people in the field.
In my experience, product/program managers that understand ML are a very rare breed, and often end up being the bottleneck when it comes to taking an ML model from research to users.
At the same time, becoming a PM too soon would mean being all talk and no substance. Don't want that.
I've done quite a few out there things on previous projects. Some of the more extreme are:
- I invented a new kind of ML for quant work (day trading bot) that ended up being quite successful. It's not published anywhere for hopefully obvious reasons.
- I've written models that I productionized onto embedded hardware. I also wrote a script that converts between languages, automating the process.
- I reverse engineered Google's PageRank tech (my first DS job, actually) and ended up getting higher accuracy on website classification than a team of experts classifying web pages manually.
I'm a student who just started a real life project (basic Visual Computing). Any practical tips of what I should look out for / focus on in the future?
I thought it would be a good contribution for a language with strong academic roots like Racket. I suspect a lot of good libraries remain unwritten in part because ecosystems tend to grow organically. Organic is good, but I'm positive that even basic meta-analysis like the linked article would lead to better libraries in more languages, with less labor expended.
Every part of the "build and use ML in production" workflow is horrible (unless maybe you work at Google).
Firstly, the data science workflow is NOT the same as software engineering. Version control tools don't work properly at any level: git on Jupyter notebooks doesn't work without hacks, versioning data is horrible, versioning models is horrible.
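(For the record, the usual notebook hack is stripping outputs before commit, either with nbstripout or a few lines of nbformat. A minimal sketch, with "analysis.ipynb" as a placeholder path:)

    import nbformat

    # Strip outputs so git diffs show code changes, not base64 blobs.
    nb = nbformat.read("analysis.ipynb", as_version=4)
    for cell in nb.cells:
        if cell.cell_type == "code":
            cell.outputs = []
            cell.execution_count = None
    nbformat.write(nb, "analysis.ipynb")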
Deployment is horrible. Sagemaker (and equivalents) provide the very base level of functionality needed, but are so separated from the feature engineering side that everyone ends up doing vast amounts of work to get something useful.
Frameworks are horrible. TensorFlow did the upgrade to TF2, so half the examples on the web don't work anymore. The TF data loading abstractions are great if you work at Google, but they're so complicated that getting basic examples going is a struggle.
PyTorch has a horrible deployment story.
Every other framework is either an experimental research thing that takes months to make progress with (JAX) or so far behind modern work as to be useless (MXNet).
But the thing is, dealing with all these issues is worth it, because ultimately it does actually work.
I have a traditional SWEng background and came into data science never having used them. I'd never go back.
I'm not saying that they are impossible to improve, but as a general approach they are exactly right.
They are "brittle" when viewed as a software artefact. But that's not really what they are (or should be).
Say you're building a car detector or something. Building the CNN is ML101, and SageMaker Experiments helps with optimizing the training parameters to get the best out of the model.
But that's not really the hard part. The hard part is working out that your model is failing on cars with reflections of people in the windscreen, or that your dataset's coordinate space is "negative = up", so your in-memory data augmentations are teaching the model upside-down cars.
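The kind of silent failure I mean, sketched with made-up conventions: a vertical-flip augmentation that's correct for the usual image convention (y = 0 at the top, increasing downward), and silently wrong if your labels were exported with the opposite axis direction:

    import numpy as np

    def flip_vertical(image, boxes):
        # Flip an image and its bounding boxes top-to-bottom.
        # boxes are (y_top, x_left, y_bottom, x_right) in pixel coords,
        # y increasing downward -- the usual image convention.
        h = image.shape[0]
        flipped = image[::-1, :]
        # If the labels were exported with y increasing *upward*
        # ("negative = up"), this mapping is wrong, and every augmented
        # sample trains the model on corrupted annotations:
        new_boxes = np.stack([h - boxes[:, 2], boxes[:, 1],
                              h - boxes[:, 0], boxes[:, 3]], axis=1)
        return flipped, new_boxes

No tool catches that for you; the loss still goes down, the model just quietly gets worse.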
I don't know what Debugger gives me over a notebook, but I've only read the blog post.
I haven't tried Model Monitor but I do think that could be useful.
Even went through a couple of hackathons with it and got some SoTA results.
I wouldn't ever go back to it, especially outside of academia. But it's not the worst thing out there.
I mean it's fine, but I don't see any reason to use it instead of Python, and lots of reasons not to. But I'm not a mathematician by training.
I do quite like RStudio though, and I do see places where that is useful. So maybe Matlab fits somewhere in between: less stats than R, less programming than Python.
I recommend avoiding Matlab for every use case unless they've got you trapped with a huge existing code base or reliance on a proprietary toolbox.
The Haskell of Machine Learning.
This reminds me of the explosion of "big data" tech, which feels like it started exploding in 2010 and peaked maybe five years later.
If ML follows a similar cycle, then in perhaps five years most tools will have receded into obscurity while a few frameworks and approaches become dominant. The stakes are big enough to justify the prevalence of, and investment in, OSS.
Comparison of "big data" and "machine learning" in Google Trends -- https://trends.google.com/trends/explore?date=all&geo=US&q=b...
Most of the successful algorithms are much less complex/sophisticated than you might believe from the outside, and I believe that's a very good sign because simplicity scales.
But on the other hand, there is a huge amount of brute-forcing; you can't really do serious research in this field without a few dozen million dollars to spare, and I think that will always limit the innovation potential.
Unfortunately writing good code/finding new algorithms is only a small part of the problem.
I also think that many people don't fully appreciate how weak this kind of model is at processing real-world data when there's no reliable way for the system to self-train on it.
True, but probably any data can be used for self-training if you've captured enough of it. ML models can handle sequences, grids, trees, graphs, point clouds, and sets as input formats.
As for needing self-training in general: humans need it too, even though we have well-attuned priors on the world from billions of years of evolution. Even so, it takes years before a human can utter a phrase or accomplish a complex task.
Basically, today's ML works by interpolation: if you have enough data points to cover your space, you're good. But if you want to extrapolate outside it, you're screwed.
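A tiny demonstration (sklearn, synthetic data): train on x in [0, 10], then ask for a prediction far outside that range; a tree ensemble just saturates at the largest value it ever saw:

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(0)
    X_train = rng.uniform(0, 10, size=(500, 1))
    y_train = 2.0 * X_train[:, 0]  # simple linear trend

    model = RandomForestRegressor().fit(X_train, y_train)

    print(model.predict([[5.0]]))   # inside training range: ~10, fine
    print(model.predict([[50.0]]))  # outside: stuck near ~20, the max it saw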
<rant>What we need is powerful simulators (physical or even neural-based) to expose ML models to a variety of tasks inside the target domain. A simulator is like a dynamic dataset. An ML agent inside an environment (a simulator) is much more similar to a human than a model + dataset is. Of course, simulation is just as hard as AI.</rant>
If you're thinking about GPT-3, it uses the transformer architecture.
But the paper that introduced the transformer architecture trained for a few days on 8 GPUs — surely this was still serious research?
OpenAI sure has a good PR team, but all their spectacular results were built on previous papers (sometimes from their own researchers) that introduced new techniques without spending millions of dollars.
Be wary of survivorship bias when you're using today's landscape to draw historical conclusions. If you're only looking at companies/tools that exist today, taking their founding date, and using that as a sign of what existed circa 2012, then you're going to miss the plethora of companies/tools that failed prior to today. Similarly, much of the recent explosion may only be because they haven't existed long enough to have failed.
There's a rumor that, due to the large number of people taking ML courses, there will be far more people with ML skills than ML jobs.
I hope this is true. There are so many areas where ML skills could be useful. The sad part would be that some industries would be changed forever. The animation industry for example might not even exist the way it is today.
I've even heard people suggest it for web scraping which seems absolutely crazy to me.
I'm always cautious and a bit pessimistic when I hear companies advertising AI skills or functionality built into their product. Most of them are probably driven by hype and by people wanting to be the next Facebook, rather than by real technology.
That same part of me is rejoicing in seeing the industry pay for this hype train.
I personally think the AI revolution will come from a hardware innovation (quantum computing or something like that) rather than building these large datasets to train models. We don't fully understand how our brains work so trying to replicate this with a computer program is a tough ask.
Remember, the key motivation for investing in AI or ML is to become a more competitive business, by reducing (or eliminating) human labor costs. That's it.
Luckily, ML has become deeply incorporated into the modern workforce, but mostly as part of specific tools that eliminate easy and tedious tasks, just as intended.
So yeah, it's sad that the big investor bucks have to come from more "exciting" projects, because there's a vast landscape of problems that can be aided by ML, even though they're not as exciting as self-driving cars, robot doctors, or whatever, and don't require some monstrous deep learning model to solve.
Is this true, or at least the majority view? Maybe it's because the comparison is unbalanced: "great engineer" is on both sides, but the ML part is "pick up ML knowledge" on one side and "ML experts" on the other.
I'd choose engineering because the employment opportunities are so much more favourable, not because it's easy to 'pick up ML knowledge'. I'm in the beginning of many hours of mathematics and statistics study so that I can become competent at ML, and in comparison to say, becoming good at git and python and AWS, this feels harder.
DL, thanks to its "Lego aspect", is getting ML commoditized fast for most applications out there. It hollows out the gap between more or less directly applying existing models and doing pure research. There's a glut of engineers who think ML engineering is playing with TF or PyTorch all day, retraining models on well-defined datasets.
The reality: you want to use computer vision to do something cool server-side? Just taking ResNet or similar and fine-tuning on your data, maybe using it as an embedding space, will get you 90% of the way there.
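Roughly this much code, in fact. A hedged torchvision sketch (num_classes and the stand-in batch are placeholders for your own data):

    import torch
    import torch.nn as nn
    from torchvision import models

    # Pretrained ImageNet backbone; freeze it and swap the head.
    model = models.resnet50(weights="IMAGENET1K_V1")  # older torchvision: pretrained=True
    for param in model.parameters():
        param.requires_grad = False
    num_classes = 3  # hypothetical: e.g. car / van / truck
    model.fc = nn.Linear(model.fc.in_features, num_classes)

    optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    # Stand-in batch; in practice this comes from your DataLoader.
    images = torch.randn(8, 3, 224, 224)
    labels = torch.randint(0, num_classes, (8,))
    loss = loss_fn(model(images), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()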
What's hard? Identifying opportunities, and especially data. I know it's cliché, but identifying what kinds of problems you can solve with the data you have, and being able to convince the business side that what you do is useful, all of that is very hard. Same for the ability to tackle "big problems" while shipping regular deliverables so that you're not seen as a cost center. How do you link model improvements to business impact?
The system side of it is still immature as well.
I say this to all my reports, direct or indirect: hone your SWE skills and make sure you understand how you bring value to your company. Unless you're in the top 1%, you're not going to survive on ML skills alone.
I think it just boils down to knowing the fundamentals or not. Arguably, a person who only studied ML will lack a wide range of tools to tackle the tangential problems an engineer could handle (whereas the engineer already has the tools to build up their ML toolkit). In my anecdotal experience with teaching, knowing how to learn is heavily correlated with how general (as opposed to focused) a person's field is.
But in the end, I built my own project structure as a simple-enough Docker monolith that includes MLflow as well. That's the only tool I use (along with sklearn) for anything beyond training and serving my model.
I think it's nice to do all of this yourself at first, and later use some of the tools listed there to enhance your workflow. It was really hard for me to get started with Kedro when my project wasn't as simple as MNIST or CIFAR.
Lots of companies just want the shortest path from here to profitability. If the cost/benefit equations justify it, then eventually they can justify the outlay of building their own tooling to replace outsourced pieces a bit at a time.
I'm thinking of things like AWS Lambda. Once, I built a slim version that was enough to support my own company's use case. I was told "that's good. save the git repo. one day it may be worth it to add the redundancy and scaling we already get from aws. until then, there's more profitable things for you to be working on."
Sorry, but this gave me a slight chuckle. I think I can empathize I guess, but found your wording hilariously blunt.
But recently I've been looking into OpenAI's work, and I can say that all my assumptions were outright wrong. The rate of growth in AI is definitely increasing. Just consider the voice tech landscape: with the new generative models around speech, I can foresee any number of industries being disrupted from the ground up.
Excellent read. Bookmarked.
This is true, but a fine-tuned DistilBERT model will give close to that accuracy at a ~260MB file size.
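For anyone curious, loading it with the transformers library takes a few lines (the example sentence and num_labels here are arbitrary, and the classification head is randomly initialized until you fine-tune it):

    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    # The distilbert-base checkpoint is roughly the ~260MB mentioned above.
    tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained(
        "distilbert-base-uncased", num_labels=2)

    inputs = tokenizer("this sensor trace looks anomalous", return_tensors="pt")
    logits = model(**inputs).logits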
The first that comes to mind is Dataiku, which is already a pretty big startup (~$150M raised), so I'm surprised the author missed it.
Regardless of my interest in it