I think this same thing is happening with ML. It was a hiring bonanza. Every big corp wanted to get an ML/AI strategy in place. They were forcing ML into places it didn't (and may never) belong. This "recession" is mostly COVID related I think - but companies will discover that ML is (for the vast majority) a shiny object with no discernible ROI. Like Big Data, I think we'll see a few companies execute well and actually get some value, while most will just jump to the next shiny thing in a year or two.
Here's another aspect - in many places nobody listens to the actual people doing the work. In my last job I was hired to lead a Data Science team and to help the company get value out of Stats/ML/AI/DL/Buzzword. And I (and my team) were promptly overridden on every decision about which projects and expectations were realistic and which were not. I left, as did everybody else who reported to me, and we were replaced by people who would make really good BS slides that showed what upper management wanted to see. A year after that, the whole initiative was cancelled.
Back in 2000 I was in a similar position with a small company jumping on the internet as their next business model. Lots of nonsense and one horrible web based business later, the company failed.
It's the same story over and over again. Some winners, lot of losers, many by self-inflicted wounds.
So essentially, you have a system where people spend other people's resources for a living, and their success is judged by making the next link up the chain happy. In especially large companies it's easy to have a disconnect from the product, because people at the top specialise in topics that have nothing to do with the product. If the people at the top want this shiny new thing that the press and everyone else say is the next big thing, you'd better give them the shiny new thing if you want a smooth career. In publicly traded companies this is even more prevalent, because the people who buy and sell the stock are even more disconnected from the product and tied to the buzzwords.
The more technically minded people, who have a hunch about the tech, miss this point about the organisation they're in and get very frustrated. It's probably why startups can be much more fulfilling for deeply technical people.
This is one of the reasons I roll my eyes whenever I read something like "McKinsey says 75% of Big Data/AI/Buzzword projects do not deliver any value." What's the baseline? How many of those projects failed or delivered zero value simply because they were destined to fail from the start?
The whole point is, from their point of view those decisions are rational. It's much more lucrative from the managers' personal point of view to develop a smoke-and-mirrors, looks-good-on-PPT AI project. To be safe from risk, don't give the AI people too much responsibility; let them "do stuff", who cares - the point is we can now say we are an AI-driven company in the brochures, and we have something to report up to upper management. When they ask "are we also doing this deep learning thing? It's important nowadays!" we say "Of course, we have a team working on it, here's a PPT!" An actual AI project would carry much bigger risks and uncertainty. As a manager, I may be blamed for messing up real company processes if we actually rely on the AI. If it's just there but doesn't actually do anything, it's a net win for me.
Note how this is not how things run when there are real goals that can be immediately improved through ML/AI and it shows up immediately on the bottom line, like ad and recommendation optimizations in Youtube or Netflix or core product value like at Tesla etc.
The bullshit-powerpoint AI with frustrated and confused engineers happens in companies where the connection is less direct and everyone has only a nebulous idea of what they would even want out of the AI system ("extract valuable business knowledge!").
The useful AI/ML isn't glamorous, it's quite boring and ugly. Things like spam detection, image labeling, event parsing, text classification.
It's hard to get a big, shiny model into direct user facing systems.
Either way I don't think it matters too much because people can't really tell simple from shiny as long as the buzzword bullet points are there.
The point is rather that the job of the data science team is to deliver prestige to the manager, not to deliver data science solutions to actual practical problems. It's enough if they work on toy data and show "promising results" and can have percentages, impressive serious charts and numbers on the powerpoint slides.
I've heard from many data scientists in such situations that they get no input on what they should actually do, so they make up their own questions and tasks to model, which often have nothing to do with actual business value; they toy around with their models, produce accuracy percentages, and that's enough.
Looks like all job postings "collapsed during the pandemic"
It's important to understand why stuff fails. That's the only way to stop things from failing in the future and make sure people are on the right path to not failing. If a large number of project failures are management failures, it's useful to know that. Otherwise you try to fix everything except the management structure.
"Personalization" is in a similar place for digital publishing; everyone wants it, products and services carry big price tags, and few organizations want to invest in foundational work or simple, iterative improvements. So they swing for the stars with unrealistic goals like "micro-targeted messaging perfectly tailored to every visitor, no matter where they are in the customer journey" and the results are predictable…
I take the increasingly grim accounts of project failure rates from analyst firms as a good sign — they can be used to sober up executives with unrealistic dreams.
These claims are usually high level and based on surveys or whatever. Failing usually means leadership gave up. As far as high level awareness of project success rates, it's probably accurate enough to justify the point: companies are generally bad at doing X. This tends to be true for many different kinds of X, because business is hard.
I generally don't agree that people make up destined to fail projects for selfish gains. I'm sure it happens, but that seems bottom of the barrel in terms of problems to fix. With DS specifically, leaders just don't know what to do. So they hire data scientists, and the data scientists don't know anything about the business, so they make some dashboards or whatever and nobody uses them. It's really not easy. Business is hard.
Similarly, you couldn't just fake these types of savings, because they needed to show up in budget requests. If I saved $10M in hardware costs, then that line item in the budget had better reflect it.
AKA the principal-agent problem:
I think the opposite is just as often true: Startups often don't have any real customers, so it's all about buzzwords and whatever razzle-dazzle they can put in a pitch deck to raise the next round.
The place I currently work is data-driven (perhaps to a fault). Every change is wrapped behind an experiment and analyzed. Engineers play a major role in this process (they're responsible for analysis of simple experiments), whereas the data org owns more thorough, long-term analysis. This means there are a significant number of people invested in making numbers go up. It also means we're very good at finding local maxima, but struggle greatly to ship larger changes that land somewhere else on the graph.
Some of the best advice I've heard related to this is for leadership to be honest about the "why". Sometimes we just want to ship a redesign to eventually find a new maximum, even though we know it will hurt metrics for a while.
...probably true to some extent, but not all leaders are self-important ass hats who refuse to acknowledge they are simply "making decisions", not "making good decisions". Most leaders are doing the best they can (often even very well) with the insights available to them.
I don’t think most data teams are really at fault; they’re just doing what they’re told.
The problem imo lies with the analysts who fail to do anything useful with the data they’re given, and demand constant changes from the data team because they want to deliver silver bullet results to the leadership level.
That’s the problem layer; people who want to be important but have nothing to offer, whipping their data team to produce rubbish and then blaming them for either a) not producing anything fast enough or b) not making the numbers big enough.
"Data science"/analytics groups are a cost center telling management things they do not want to hear, and disrupting management narrative (with receipts). There's no point to it; either you deliver "moneyball"-like opportunities that are ignored, or torture the data to fit existing narratives. In both cases, you eventually get stabbed by an experienced bureaucratic knife fighter.
So, I switched from predictive analytics and put on my prescriptive analytics hat. Over the time I was there I created several presentations containing multiple paths forward letting management feel like they were deciding the path forward.
This continued until I was fired. The board didn't like that I wasn't using neural nets to solve the company's problems. Startups often do not have enough labeled data, so DNNs were not considered. Oddly, I didn't get a warning or a request about this before being let go. I suspect management got tired of me managing upward. In response my coworker quit right then and there and took me out to lunch. ^_^
Then the company listens to neither group of people (neither the tier-1 sales/support people nor the ML people) and fires / shuts down the entire division because "upper management didn't find value".
A lot of the time they are right to ignore it. I've seen tables say X, but there was some flaw further up the capture stack. Very few data analysts have the broad-based knowledge and dedication needed to trace the data stack and establish the needed trust with the executive team.
I would have expected the researchy people to be better at it, as often you'll need to collect and analyse your own data during grad programs, and thus have some experience.
If you give executives data and they don't like the results, they will ask you to tweak parameters until the data represents what they want.
And there's the point where - IMHO - a 3% gain may not be profitable enough.
This understates how awful ML is at many of these companies.
I've seen quite a few companies that rushed to hire teams of people with a PhD in anything that barely made it through a DS/ML boot camp.
To prove that they're super smart ML researchers, without fail these hires rush to deploy a 3+ layer MLP to solve a problem that needs at most a simple regression. They have no understanding of how the model works, and have zero engineering sense, so they don't care if it's a nightmare of complexity to maintain. Then, to make sure their work is 'valuable', management tries to get as many teams as possible to make use of the questionable outputs of these models.
The end result is a nightmare of tightly coupled models that nobody can debug, troubleshoot, or understand. And because the people building them don't really understand how they work, the results are always very noisy. So you end up with this mess of expensive-to-build-and-run models feeding noise to each other.
When I saw this I realized data science was doomed in the next recession, since the only solution to this mess is to just remove it all.
There is some really valuable DS work out there, but it requires real understanding of either modeling or statistics. That work will probably stick around, but these giant farms of boot camp grads churning out keras models will disappear soon.
A good data scientist might choose to use machine learning to accomplish their job. Or they might find that classical statistical inference is the better tool for the task at hand. A good data scientist, having built this model, might choose to put it into production. Or they might find that a simple if-statement could do the job almost as effectively but not nearly as expensively. A good data scientist, having decided to productionize a model, will also provide some information about how it might break down - for example, describing shifts in customer behavior, or changes in how some input signal is generated, or feedback effects that might invalidate the model.
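To make the if-statement point concrete, here's a toy sketch (every name, number, and threshold below is invented for illustration, not taken from the comment): a trained classifier next to the one-line rule that captures most of its value.

    # Toy churn example (all names and thresholds invented): a trained model
    # vs. the "simple if-statement" that captures most of its value.
    from sklearn.linear_model import LogisticRegression

    # Tiny made-up training set: [days_since_last_login, support_tickets] -> churned?
    X = [[2, 0], [5, 1], [40, 2], [60, 0], [10, 0], [45, 3]]
    y = [0, 0, 1, 1, 0, 1]
    model = LogisticRegression().fit(X, y)

    def churn_rule(days_since_last_login: int) -> bool:
        # The if-statement: nearly as accurate here, far cheaper to run and maintain.
        return days_since_last_login > 30

    print(model.predict([[50, 1]])[0], churn_rule(50))  # both flag the same customer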
OTOH, if your job has been framed in terms of cutting-edge machine learning, then you may well know - at a gut level, if not consciously - that your job is basically just a pissing match to see who can deploy the most bleeding-edge or expensive technology the fastest. It's like the modern hospital childbirth scene in Monty Python's The Meaning of Life, where the doctor is more interested in showing off the machine that goes, "ping!" in order to impress the other doctors than he is in paying attention to the mother.
About 5 years ago I was coming out of academia with a PhD in chemistry, wanting to get into tech. Someone pointed me toward data science and I was immediately pulled in by the potential for deep insights and working with interesting data sets. After getting into an industry research position, I was quickly disillusioned by the talk from our leadership about how we needed to become an "AI company", and discovered that what I really wanted to be doing was classic algorithm development, not data science.
Now, I've seen so much poorly implemented, unnecessary machine learning applied to problems that didn't need it that I first assume any machine learning project is a bad technical decision until proven otherwise. I've happily moved into an engineering role building interesting pipelines.
For example, I can get pretty preoccupied with multicollinearity or heteroskedasticity when I'm wearing my statistician hat, while they barely qualify as passing diversions when I'm wearing my machine learning engineer hat. If I'm doing ML, I'll happily deliberately bias the model. That would be anathema if I were doing statistical inference.
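For instance, ridge regression is a textbook case of deliberately biasing a model: shrinking coefficients toward zero trades bias for variance, which is fine for prediction but anathema for inference. A minimal sketch with synthetic data (all numbers invented), showing both the multicollinearity and the deliberate bias:

    # Sketch: deliberately biasing a model via L2 regularization (ridge)
    # under multicollinearity, on synthetic data.
    import numpy as np
    from sklearn.linear_model import LinearRegression, Ridge

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=200)  # near-duplicate column: multicollinearity
    y = X @ np.array([2.0, 0.0, 1.0, 0.0, 0.5]) + rng.normal(size=200)

    ols = LinearRegression().fit(X, y)   # unbiased, but unstable under collinearity
    ridge = Ridge(alpha=10.0).fit(X, y)  # biased toward zero, but lower variance

    print(ols.coef_)    # wild, mutually offsetting weights on the collinear pair
    print(ridge.coef_)  # shrunken, stable weights: fine for prediction, not for inference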
To be fair, I started to understand why developers gave out about bootcamp grads lacking a foundation when the bootcamps came for my discipline (data science).
The PhD fetish is pretty mental (even though I have one), as it's really not necessary.
Additionally, everyone thinks they need researchers, when they really, really don't.
Having worked with researchy vs more product/business driven teams, I found that the best results came when a researchy person took the time to understand the product domain, but many of them believe they're too good for business (in which case you should head back to academia).
What you actually need from an ML/Data Science person:
- Experience with data cleaning (this is most of the gig)
- A solid understanding of linear and logistic regression, along with cross-validation (see the sketch after this list)
- Some reasonable coding skills (in both R and Python, with a side of SQL).
That's it. Pretty much everything else can be taught, given the above prerequisites.
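As a rough gauge of that baseline, here's what the regression-plus-cross-validation item looks like in practice (a minimal Python sketch; scikit-learn and its bundled demo dataset are my stand-ins, not the commenter's stack):

    # Sketch: fit a logistic regression and judge it with cross-validation
    # rather than a single train/test split.
    from sklearn.datasets import load_breast_cancer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = load_breast_cancer(return_X_y=True)  # stand-in for real (cleaned!) data
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    print(f"AUC: {scores.mean():.3f} +/- {scores.std():.3f}")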
But it's tricky for hiring managers/companies, as they don't know who to hire, so they end up over-indexing on bullshitters because of their confidence, leading to lots of nonsense.
And finally, deep learning is good in some scenarios and not in others, so anyone who's just a deep learning developer is not going to be useful to most companies.
This isn't to say that PhD knowledge isn't valuable, but if you look at firms in finance that have had success with data, e.g. RenTech, they hire very smart people with PhDs, but it isn't only the PhD. You need someone who has the knowledge AND someone who has common sense / can get results. That is very hard to get right (and yes, some people who come from academia literally do not want anything to do with business... it is like the devs who come from a CS PhD and insist on using complicated algorithms and data structures everywhere, optimising every line, etc.).
It's an abbreviation of the Latin "philosophiae doctor" == "doctor of philosophy".
Getting one is as much about persistence as intelligence, and makes you very knowledgeable about one narrow area, so what you say makes sense. Academic researchers usually branch out into other fields and subfields as well, but straight out of a PhD, narrow and deep is what you tend to get.
I don't think the issue is just that companies hire people who are awful at ML, it's also that people are trying to shoehorn deep learning into everything, even when it currently has nothing to offer and we have better solutions already. IMHO, we're producing too many deep learning PhDs.
I think most people view the hard part as doing the PhD, so lots of people value that experience, and because they have that experience you get this endowment effect: wow, that PhD was hard, I must do very hard and complex things.
To give you an example: Man Group. They are a huge quant hedge fund; in fact they were one of the first big quant funds. They even have their own program at Oxford University that they hire out of... have you heard of them? Most people haven't. Their performance is mostly terrible, and despite being decades ahead of everyone, their returns were never very good (they did well at the start because they had a few exceptional employees, who then went elsewhere... David Harding was one). The issue isn't PhDs, they have many of them; the issue is having that knowledge AND being able to convert it.
I think this is really hard to grasp because most people expect problems to yield instantly to ML. In most cases they don't, and other people have done valuable work with non-ML approaches that should be built on but isn't, because domain knowledge or common sense is often lacking.
A similar thing is people who come out of CS, and don't know how to program. They know a bit but they don't know how to use Git, they don't know how to write code others can read, etc.
And again, the key point was: they have had this institute for how long? A decade plus? Are they a leading quant fund? No. Are they in the top 10? No. Are they doing anything particularly inventive? See their returns. No.
I know that I (as a DS Lead/Manager) would hire someone who uses an appropriate solution to a business problem above someone who has an intricate knowledge of applying PyTorch to inappropriate problems.
But maybe I'm in a minority here.
Of course it sucks in the short term, but there is zero chance the field will be abandoned. It has enough uses already.
So now my advice to others is: "if you can start with some first-principles equation or system of equations... start there and use optimization/regression to fit the model to the data."
AND: "if you don't think such equations exist for your problem... read/research more, because some useful equations probably do exist."
This is usually pretty straightforward for engineering and science applications... equations exist or can be derived for the system under study.
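A toy sketch of that advice (the decay equation and all constants are mine, purely for illustration): start from a known functional form and let regression find the constants.

    # Sketch: fit a first-principles form (here, exponential decay) to noisy
    # data instead of reaching for a generic ML model.
    import numpy as np
    from scipy.optimize import curve_fit

    def decay(t, a, k):
        # Known physical form: y = a * exp(-k * t)
        return a * np.exp(-k * t)

    t = np.linspace(0, 10, 50)
    y = decay(t, 3.0, 0.4) + 0.05 * np.random.default_rng(1).normal(size=t.size)

    (a_hat, k_hat), _ = curve_fit(decay, t, y, p0=(1.0, 0.1))
    print(a_hat, k_hat)  # recovered constants, interpretable by construction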
In my very limited exposure to other areas of machine learning application... I have found quite a bit of mathematical science related to marketing, human behavior, etc.
> Everything is linear if plotted log-log with a fat magic marker
It was difficult to attract top ML talent no matter how much we offered. Everyone wanted to work for one of the big, recognizable names in the industry for the resume name recognition and a chance to pivot their way into a top role at a leading company later.
Meanwhile, we were flooded with applicants who exaggerated their ML knowledge and experience to an extreme, hoping to land high-paying ML jobs through hiring managers who couldn't understand what they were looking for. It was easy to spot most of these candidates after going through some ML courses online and creating a very basic interview problem, but I could see many of these candidates successfully getting ML jobs at companies that didn't know any better. Maybe they were going to fake it until they made it, or maybe they were counting on ML job performance being notoriously difficult to quantify on big data sets.
Dealing with 3rd party vendors and consulting shops wasn’t much better. A lot of the bigger shops were too busy with never ending lucrative contracts to take on new work. A lot of the smaller shops were too new to be able to show us much of a track record. Their proposals often boiled down to just implementing some famous open source solution on our product and letting us handle the training. Thanks, but we can do that ourselves.
I get the impression that it is (or was) more lucrative to start your own ML company and hope for an acquisition than to do the work for other companies. We tried to engage with several small ML vendors in our space and more than half of them came back with suggestions that we simply acquire them for large sums of money. Meanwhile, one of the vendors we engaged with was acquired by someone else and, of course, their support dried up completely.
Ultimately we found a vendor that had prepared a nice solution for our exact problem. The contracts were drawn up in a way that wouldn't be too disastrous if (when?) they were acquired.
I have to wonder if an industry-wide slowdown in the ML frenzy is exactly what we need to give people and companies time to focus on solving real problems instead of just chasing easy money.
It is so frustrating to see the potential in the AI world and realize almost no one is really interested in building it.
I think what we got really good at is "perceptive" ML, like speech and image recognition, and those things do see industry applications, like self-driving cars or voice assistants.
I'd be interested to know where you see unrealized potential.
There are obviously places in my company where ML is making an enormous impact, it's just not something that's fit for every single place where decisions need to be made. Sometimes doing some analysis to inform blunt rules works just as well - without the overhead of ML model management.
It seems that I'm inverted from you. The Machine part of Machine Learning is likely of high business value, but the Learning part is the easier and better solution.
We do a lot of hardware stuff and our customers are, well, let's just say they could use some re-training. Think not putting ink in the printer and then complaining about it. Only much more expensive. Because the details get murky (and legal-y and regulation-y) very quickly, we're forced to do ML on the products to 'assist' our users. But in the end, the easiest solution is to have better users.
Yes, UX, training, education, etc. We've tried, spent a lot of money on it. It doesn't help.
That was one of the better insights from our team: we should measure the value-add of ML against a baseline such as a simple rules engine, not against zero. In some cases that looked appealing ("lots of value by predicting Y better"), it turned out that a simple Excel sort would get us 90-98% of the value starting tomorrow. Investing an ML team for a few weeks/months then only makes sense if the business case for getting from 95% to 98% is big enough in itself. Hint: in many cases it isn't.
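In code, that discipline amounts to always scoring the trivial baseline first. A sketch on synthetic data (the "Excel sort" here is just ranking by one raw column; all numbers invented):

    # Sketch: score the ML model against a trivial baseline, not against zero.
    import numpy as np

    def value_captured(ranking, true_values, top_k=100):
        # Fraction of the best achievable value captured by acting on the top-k items.
        best = np.sort(true_values)[::-1][:top_k].sum()
        return true_values[ranking[:top_k]].sum() / best

    rng = np.random.default_rng(0)
    true_values = rng.lognormal(size=10_000)
    excel_sort = true_values + rng.normal(scale=1.0, size=10_000)  # rank by one raw column
    ml_score = true_values + rng.normal(scale=0.8, size=10_000)    # a somewhat better model

    baseline = value_captured(np.argsort(excel_sort)[::-1], true_values)
    model = value_captured(np.argsort(ml_score)[::-1], true_values)
    print(f"baseline: {baseline:.0%}, model: {model:.0%}")  # the business case lives in the gap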
I don't generally need to develop my own deployment infrastructure for every new project. However, I've yet to see an ML team or company consistently use the same toolchain between two projects. The same pattern repeats across data processing, model development, and inference.
Oddly, adding more scientists appears to cause a super-linear increase in cost - with the net effect being either duplicated effort or an exhaustive search across possible solutions.
The problem is that we don't have enough ML engineers and many who go by this title are not really capable of doing the job. We're just coming into decent tools and hardware, and many applications are still limited by hardware which itself is being reinvented every 2 years.
Take just one subfield - CV (computer vision) - it has applications in manufacturing, health, education, commerce, photography, agriculture, robotics, assisting blind persons... basically everywhere. It empowers new projects and amplifies automation.
With the advent of pre-trained neural nets, every new task can be 10x or 100x easier. We don't need as many labels anymore, and it works much better now.
Many data scientists I knew were either sitting on their hands waiting for data or working on problems that the downstream teams had no intention of implementing (even if they were improvements). I still really believe that ML (be it fancy deep learning or just evidence driven rules-based models) will effectively be table stakes for most industries in the upcoming decade. However, it'll take more leadership than just hiring a bunch of smart folks out of a PhD program.
I worked for a financial services co that saw massive gains from big data/ML/AWS. Granted, we were already using statistical models for everything; we just now could build more powerful features and more complex models, and move many things closer to real time, with more frequent retrains/deploys because of the cloud.
I do agree that companies who don't already recognize the value of their data, and maybe rely on a consultant to tell them what to do, might not be in a position to really capitalize on it and would just be throwing money after the shiny object. It really does take a huge overhaul sometimes. We retooled all of our job families from analysts/statisticians to data engineers and scientists, and hired a ton of new people.
I've worked in customer-facing Data Science roles at two companies, and one anecdotal correlation with success in Stats/ML/AI I've seen is how "data driven" people really are in their daily decision making. The more data driven you are, the more likely you are to identify a problem that can actually be improved by a Stats/ML/AI algorithm. This is because you really understand your data and the value you can get from it.
Everybody has metrics, KPIs, OKRs, etc., but the reality is that there's a spectrum from 100% gut to 100% data driven. And a lot of people are on the gut side of things while thinking (or claiming) they are on the data side.
I'll provide an example. I currently work for a company that sells to (among others) companies working with industrial machinery. If your industrial machine runs in a remote area (e.g. an oil field), then any question about that machine starts with pulling up data. Being data driven is the only way to figure out what's going on. These folks have a good sense for identifying the value they can get from their data, and they usually understand when you say dealing with their data is an engineering task in itself.
The other side of this is a factory filled with people. Since somebody is always operating and watching the machine, the "data driven" part is mainly alarms (e.g. is my temp over 100C) and some external KPI (e.g. a quality measurement). They are much less data driven than they think they are, and a lot of them don't understand what value they could get out of their data beyond some simple stuff you don't really need ML/AI for.
I mention industrial equipment because I think a lot of people (even me) are really surprised when they hear about people working in factories not being super data driven. You think of factories, engineering, and data as being very lumped together. It's amazing how many areas (sales, marketing, HR, are other great examples) exist where people aren't as data driven as they think they are.
In my former space (credit card fraud detection and underwriting), you obviously need a data-driven solution. Without even considering latency requirements, you aren't doing 6-10B manual decisions/year. The ROI for a more complex ML approach is easier to prove, given the need is already there, just served by an inferior technical solution.
It is clear that for most of the companies investing in deep learning, tangible results are always just around the corner, and maybe 1 in 100 will build something worthwhile. But here's the carrot driving them all on; it's like the lottery: you have to be in it to win. The stick is the fear that their competitors will get there first.
This field is more art than science, give talented people incentive to play and don't expect too much for the next decade.
> They were forcing ML in to places it didn't (and may never) belong.
I find that I spend a lot of time as a senior MLE telling people why they don't need ML.
Not because it would be a good use case (although there are some for our product), or because it would be of any practical benefit, but because it makes for good marketing copy. Never mind the fact that nobody on the team has any experience with machine learning (I actually failed the paper at university).
Or, in simple terms: is profitable commercial deep learning just for oligarchies?
2) Machine Learning
3) Robotic Process Automation
They felt RPA would help them stay more competitive, since there are tons of smaller health care companies moving and innovating faster because they're not buying up companies and having to integrate all their technology at a sloth's pace. They thought RPA would be a way to mitigate these issues.
18 months later, the manager, director, and VP in my org have all but said they don't care about RPA; all their money is going into ML and AI. Even though in all the presentations I've seen them put on, it's all blue skies and BS about "IF we had this, we COULD do this." Nothing concrete at all about how they plan to use ML to increase profit margins or reduce overhead.
Right now, our team is basically an afterthought in the company, and I'm already starting to interview elsewhere with the knowledge that at some point they're going to kill my team and cut everybody loose.
I work in 2D animation and we were able to design our current pipeline around adopting ML at specific steps to remove massive amounts of manual labor.
I know this doesn't disprove your anecdote, I just wanted to point out that real businesses are using ML effectively to deliver real value that's not possible without it.
Any technology too complex for the purchasing managers to fully understand can be sold and oversold by marketing people as "the next big thing".
Managers may or may not see through that, but if their superiors want them to pursue it or if they need to pursue something in order to show they're doing something of value, then they're happy to follow where the marketers lead.
Java everywhere, set top TV boxes, IOT devices, transitioning mainframes to minis, you name it... the marketers have made a mint selling it, usually for little benefit to the companies that bought into it.
However someone is trying to make robotic process automation the Next Big Thing - which I think is hysterically funny.
That's because it didn't get a chance to mature and show how it could be powerful. People kept trying to force Hadoop into it and call themselves "big data experts".
We've gotten a bit more clarity in this world with streaming technologies. However, there hasn't been a good and clear voice to say "hey .. this is how it fits in with your web app and this is what you expect of it". (I'm thinking about developing a talk on this.. how it fits in [hint.. your microservice app shouldn't do any heavy lifting of processing data])
In practice most of these technologies and their peers exist to support real applications, and it would be almost immediately recognizable that they are the appropriate choice when working on a similar application. You don't need a streaming engineer/big data consultant to cram them in.
The easiest way to solve many problems is through lexers, regular expressions, and plain-old pattern matching. But that doesn't sell, so they call it AI anyway.
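For example (a toy sketch of the point, with an invented task and pattern): pulling order numbers out of support emails needs one regular expression, not a model.

    # Sketch: "plain-old pattern matching" doing a job that often gets sold as AI.
    import re

    ORDER_RE = re.compile(r"\border[-\s#]*(\d{6,10})\b", re.IGNORECASE)

    def extract_order_numbers(text: str) -> list[str]:
        return ORDER_RE.findall(text)

    print(extract_order_numbers("Hi, my order #12345678 never arrived. Also ORDER 987654."))
    # ['12345678', '987654']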
It's kind of batty actually; people looking for ideas to make money have just been taking old ideas and attaching ML to the side as if that automatically made them better. And then not educating their customers on the limitations of ML, both generally and with respect to their data size.
I personally think the companies that make and sell the software that the police used to make incorrect arrests should be legally liable. Yes, the police shouldn't have blindly trusted the software, but I guaran-fucking-tee you part of why they did is the marketing from the company themselves.
* Revealera.com crawls job openings from over 10,000 company websites and analyzes them for technology trends for hedge funds.
Btw, I don't like Twitter's new feature that prevents everyone from responding to a tweet, which @fchollet used here. It no longer feels like Twitter if you can't engage.
And I shall use this pulpit to demand, in a mixture of derision and righteous anger, that you defend your comme... ah never mind.
This may not be a new thought, but it's eloquently put. Thank you.
If you solve "trending products" by building a SQL statement that e.g. selects items with the largest increase of purchases this month in comparison to the same month a year ago, that's still "AI" to them.
Knowing this can save you a lot of wasted time.
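And that "trending products" query really is about a dozen lines. A sketch with an invented purchases schema, run through sqlite3 in memory so the example is self-contained:

    # Sketch: "trending products" as one SQL query (invented schema and data).
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE purchases (product_id TEXT, month TEXT, qty INTEGER)")
    conn.executemany("INSERT INTO purchases VALUES (?, ?, ?)", [
        ("A", "2019-08", 10), ("A", "2020-08", 50),
        ("B", "2019-08", 40), ("B", "2020-08", 45),
    ])

    TRENDING_SQL = """
        SELECT product_id,
               SUM(CASE WHEN month = '2020-08' THEN qty ELSE 0 END)
             - SUM(CASE WHEN month = '2019-08' THEN qty ELSE 0 END) AS growth
        FROM purchases
        GROUP BY product_id
        ORDER BY growth DESC
        LIMIT 10
    """
    print(conn.execute(TRENDING_SQL).fetchall())  # [('A', 40), ('B', 5)] -- "AI"!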
> The AI effect occurs when onlookers discount the behavior of an artificial intelligence program by arguing that it is not real intelligence.
In the future, I expect ML to also fall out of the "AI" umbrella - it gets used primarily for "smart code we don't know* how to write", so once that understanding comes, it gets a more specific name and is no longer "AI".
*"know" being intentionally vague here, as obviously we can write both query planners and ML engines, but the latter isn't yet commonplace enough to completely fall out of the umbrella.
In my experience, what many organizations lack is simple but high-quality "Business Analytics": Reporting & dashboards are developed that look good but jam too much information together. It is often the wrong information:
Something is requested, and the developer develops exactly what was asked. The problem is that it wasn't what was needed, because the person making the request couldn't articulate the question in terms the developer would understand. The request will say "Give me X & Y" when the real question is "I want to understand the impact of Y on X". The person gets X & Y, looks at it every day in their dashboard, and never sees much that is useful. The initial request should always be the start of a conversation, but that often doesn't happen. A common result is people in departments spending tons of time in Excel sorting, counting, making pivot tables, etc., when all of that could be automated.
This is part of the reason why companies often go looking for some new "silver bullet" to solve their data problems. They don't have the basics down, and don't understand the data problems well enough to seek out a solution.
Without the skill sets to work with and then understand that data, they are forced into a long process of asking for data to be put into reports and dashboards, and then, once they finally get them, either fixating on the limited metrics provided while being oblivious to other context not in front of them, or being forced to start another long iteration to adjust those reports and dashboards.
We've gone almost 30 years believing management was the sole skill required to run teams and companies, but the new era of data is starting to show the limits of that belief.
The most pathetic part is that many major NLP tasks still have their SOTA held by BERT, just because nobody bothered to apply (let alone improve) XLNet to them, which is an absolute shame. On many major tasks we could trivially gain several percentage points of accuracy, but nobody qualified has bothered to do it. Where does the money go, then? To many NIH papers, I guess.
There are also not enough synergies; there are many interesting ideas that just need to be combined, and I think there's not enough funding for that. It's not exciting enough...
I pray for 2021 to be a better year for AI; otherwise it will be evidence of a new AI winter.
Little things compound, such as optimizers (Ranger/AdaHessian), better RNNs (IndRNN, Linear Transformers, Hopfield networks), and techniques (caching everywhere, TorchScript, gradient accumulation training).
What kind of network are you using? I can do near-SoTA multi-task syntax annotation with ~4000 tokens/s (~225 sentences/s) on a CPU with 4 threads using a transformer. Predicting 1000 words/second on a reasonably modern GPU is easy, even with a relatively deep transformer network.
8 tasks, including dependency parsing.
My understanding is that a CMU student interned at Google and developed most of the pieces of TransformerXL, which formed the basis of XLNet. The student and the Google researcher further collaborated with CMU researchers to finalize the work.
(For the record, I think the remainder of your points do not match my understanding of NLP, which I do research in, but I just really wanted to clarify the XLNet story a bit).
But GP is too focused on hyping XLNet for some reason. There are much more elegant attempts at improving the transformer architecture in just the past 8 months: Reformer, Performer, Macaron Net, and my current pet paper, Normalized Attention Pooling (https://arxiv.org/abs/2005.09561).
I've queried most of your examples on the SOTA database that is paperswithcode.com and they have almost zero results.
You illustrate the problem: if researchers like you don't even know the general SOTA, how can it be expected to be beaten?
But beyond researchers' ignorance, there is also the problem of authors not submitting their results to paperswithcode.com, or testing them only on niche benchmarks rather than extensively. This second behavior sentences such potentially promising models to remain unknown and therefore mostly irrelevant.
It's always remarkable how one can be a smart researcher and yet not adjust one's behavior to be rational regarding those two flaws (not seeking out SOTA knowledge, and not promoting it; READ, WRITE).
Which tasks remain to be tried? Most, actually, but an obvious one would be coreference resolution.
It seems like the current approaches will always fall short of our loftier AI aspirations, but we’re reaching a level of mimicry where we can start to ask, “Does it matter for this task?”
That's the point - it does not need to be different. If it demonstrates similar improvement to what we saw with GPT-1 --> GPT-2 --> GPT-3, then it will be enough to actually start using it. It's like the progression MNIST --> CIFAR-10 --> ImageNet --> the point where object recognition is good enough for real world applications.
But in addition to making it bigger, we can also make it better: smarter attention, external data queries, better word encoding, better data quality, more than one data type as input, etc. There's plenty of room for improvement.
To be fair, things like CNNs and BERT were definitely massive improvements, but a lot of modern AI is just throwing compute at problems and seeing what sticks.
As you can imagine, searching Google for "linkedin job posting data" doesn't work so great. The closest supporting data I could find is this July report on the blog of a recruiting firm named Burtch Works. They searched LinkedIn daily for data scientist job postings (so not specifically deep learning) and observed that the number of postings crashed between late March and early May to 40% of its March value, and has held steady up to mid-June, where the report's data period ends.
There's also this Glassdoor Economic Research report, which seems to draw heavily from US Bureau of Labor Statistics data available in interactive charts. The most relevant bit in there is that the "information" sector (which includes their definitions of "tech" and "media") has not yet started an upward recovery in job postings, as of July.
I work at a residential IoT company; there are quite a few really valid use cases for Big Data and even ML (think predictive failure).
We hired more than one expensive data scientist in the past few years, and had big strategies more than once. But at the end of the day it's still "hard" to answer a question such as "given a MAC address, give me the runtime for the last 6 months".
We're trying to shoot for the moon, when all I've ever asked for is an API to show me indoor temperature for a particular device over a long period.
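For contrast, the modest thing actually being asked for might look like this (a sketch; the Flask route, database, table, and column names are all invented):

    # Sketch: the modest API actually being asked for (all names invented).
    import sqlite3
    from flask import Flask, jsonify

    app = Flask(__name__)

    @app.route("/devices/<mac>/runtime")
    def runtime(mac):
        conn = sqlite3.connect("telemetry.db")  # hypothetical per-day telemetry store
        rows = conn.execute(
            "SELECT day, runtime_hours FROM device_daily "
            "WHERE mac = ? AND day >= date('now', '-6 months') ORDER BY day",
            (mac,),
        ).fetchall()
        return jsonify([{"day": d, "runtime_hours": h} for d, h in rows])

    # app.run(port=8080)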
It's not just the data scientists' fault. I once heard our chief data scientist point out that they don't want to hand off a linear regression as a machine learning model - as if a delivered solution to a problem has a minimal complexity. She absolutely had a point.
Clients are paying for a Ph.D. to solve problems in a Ph.D. way. If we delivered the client a simple yet effective solution, there's the risk of blowback from the client for being too rudimentary. I'm certain this attitude extends to in-house data scientists as well. Nobody wants to be the data "scientist" who delivers the work of a data "analyst", even when the best solution is a simple SQL query.
Our company kind of sidesteps this problem by having a tiered approach, where companies are paying for engineering, analysis, visualization, and data science work for all projects. So if a client is at the simple analysis level, we deliver at that level, with the understanding that this is the foundational work for more advanced features. It turns out to be a winning strategy, because while every client wants to land on the moon, most of them figure out that they are perfectly happy with a Cessna once they have one.
Ideally, "in a PhD way" is with careful attention to problem framing, understanding prior art, and well-structured research roadmaps.
I worry about PhD graduates who seemingly never spent much time hanging out with postdocs. Advisors teach a lot, but some approach considerations can be gleaned more easily from postdocs gunning for academic posts.
Also, (for supervised classification problems) labelling is a big problem.
It is almost as if we need a "data janitor" title.
In my company though, we've been applying DL with great success for a few years now, and there are at least five years of work remaining. And that's not spending any time doing research or anything fancy: just picking the low-hanging fruit.
You need not only real use cases, but use cases that happen to fit well with DL's trade-offs and limitations. I think many companies hired with very unrealistic expectations here.
In conjunction with this, it turns out 99% of the problems the customer is facing, despite their belief to the contrary, aren't best solved with ML, but with good old-fashioned engineering.
In cases where the problem can be approached either way, the ML approach typically takes much longer, is much harder to accomplish, has more engineering challenges to get it into production, and the early ramp-up stages around data collecting, cleaning and labeling are often almost impossible to surmount.
All that being said, there are some things that are only really solvable with some ML techniques, and that's where the discipline shines.
One final challenge is that a lot of data scientists and ML people seem to think that if a problem isn't being solved with a standard ML or DL algorithm then it isn't ML, even if it has all the characteristics of an ML problem. The gatekeeping in the field is horrendous, and I suspect it comes from people without strong CS backgrounds wrapping themselves too tightly in their hard-earned knowledge rather than taking an expansive view of what can solve these problems.
Instead of focusing our energies on the infrastructure and quality of data around machine learning, there's eagerness to take bad data to very high-end models. I've seen it again and again at different companies, almost always with disastrous consequences.
A lot of these companies would do better to invest in engineering and domain expertise around the problem than worry about the type of model they're using to solve the problem (which usually comes later, once the other supporting maturity pieces are in place)
There are 5 ML models that we maintain where I work, and none of them are more complicated than linear regression or random forests. Convincing me to use something more complex would take an enormous amount of evidence. Domain knowledge is king.
We did have courses explaining the "around" of the whole process though, but that's not as hyped.
I'll note that in my personal anecdote, the megacorps remain interested in and hiring in ML as much as ever.
The author disables tweet replies so I'm not sure where they get their numbers from.
However, megacorps do not seem to suffer much from such continuous lagging in hiring. I do not know why this is so: is it that they still hire smart engineers who can easily change groups and fields, or that they work on their core technology to help build the next peak? (After the debris is washed away in a fad crash, there is often a technology renaissance.)
Data science jobs are not slowing down, though they're not really increasing either.
In comparison, since 2016, software engineering jobs revolving around building systems for data scientists have increased sixfold, maybe even more since I last looked.
I'm completely unsurprised by this. Regularly, at lunch, I'll ask my coworkers if they know of any DL applications that are making O($billions), and no one knows any outside of FAANG.
FAANG is making an insane amount of money due to DL. Outside of them though, I don't know who's making money here. When I was interviewing for jobs, there were a ton of startups that were trying to do things with DL that would have been better done with a few if statements and a random forest, and that had a total market size in the millions.
I think that, eventually, there'll be a market for this stuff, but I'm not convinced that it's anywhere near being widespread.
I was also a consultant before my current role. The vast majority of non-tech firms don't have their data in well organized + cleaned databases. Just moving from a mess of Excel sheets to Python scripts + SQL databases would have made a HUGE difference to the vast majority of clients I worked with, but even that was too big of a transformation.
Basically, everyone with the sophistication to take advantage of DL/ML already has the in-house expertise to do it. There's almost no one in the intersection of "Could make $$$ doing DL" && "Has the technical infrastructure to integrate DL".
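That Excel-to-database move mentioned above is often only a few lines of code; the hard part is organizational. A sketch (folder, file, and table names invented):

    # Sketch: consolidate a folder of spreadsheets into one queryable SQLite
    # database (all paths and names hypothetical).
    import glob
    import sqlite3
    import pandas as pd

    conn = sqlite3.connect("sales.db")
    for path in glob.glob("exports/*.xlsx"):  # hypothetical pile of Excel exports
        df = pd.read_excel(path)
        df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]
        df.to_sql("sales", conn, if_exists="append", index=False)

    # "What did region X sell last quarter?" is now one SQL query instead of an
    # afternoon of copy-paste.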
Yes, you're not going to achieve state-of-the-art performance with logistic regression. But for one, for most problems the difference between SOTA and even simple models is not nearly as large as you might think. And two, even if you're cargo-culting SOTA techniques, it's probably not going to work unless you're at an org with an 8-digit R&D budget.
Small companies need to frame the problem as:
1) Do we have a problem where the solution is discrete and already solved by an existing ML/DL model/architecture?
2) Can we have one of our existing engineers (or a short-term contractor) do transfer learning to slightly tweak that model to our specific problem/data?
Once that "problem" actually turns into multiple "machine learning problems", or "oh, we just need to do this one novel thing", they should probably bail, because it'll be too hard/expensive and the most likely outcome will be no meaningful progress.
Said another way: can we expect an engineer to get a fastai model up and running very quickly for our problem (sketched below)? If so, great - if not, then bail.
I.e., the solution for most companies will be having one part-time "citizen data scientist" on the engineering team.
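In fastai terms, question 2 above can be as small as this (a sketch assuming an image classification task with images already sorted into per-class folders; fastai v2 API, names as of recent versions):

    # Sketch: transfer learning with fastai, assuming images are laid out as
    # images/<class_name>/*.jpg (a hypothetical dataset).
    from fastai.vision.all import (
        ImageDataLoaders, Resize, accuracy, resnet34, vision_learner,
    )

    dls = ImageDataLoaders.from_folder("images", valid_pct=0.2, item_tfms=Resize(224))
    learn = vision_learner(dls, resnet34, metrics=accuracy)
    learn.fine_tune(3)  # a few epochs on top of pretrained ImageNet weights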
1. There's no clear problem statement. They have never formulated one and now trying to bolt ML on to their decision making.
2. They don't have well catalogued data for engineers/scientists to work with because they never tried to do rigorous analysis of data before ML became a thing.
3. Managers have no idea how to deal with data driven insights. What if the results are completely unintuitive to them? Are they going to change their processes abruptly? What if the results are aligned with what they have been always doing? Is it worth paying for something that they have been doing intuitively for decades?
I'm not a data scientist. But the biggest complaint I hear from my colleagues is that they lack data to train models.
Very few businesses I know actually have a deep learning problem. But they want a deep learning solution, lest they miss the hype train.
Have they? Specifically, have they "collapsed" relative to the average decline in job listings mid-pandemic?
80 or 90% of what companies are doing with machine learning results in systems with high computing costs that are clearly unprofitable if viewed as revenue-impacting units. Many similar things can be achieved with low-level heuristics at a fraction of the computing cost.
But nobody wants to do that anymore. There's nothing "sexy" or "cool" about breaking down your problems and trying to create rule-based systems that address them. Semantic software is not cool anymore; what became cool is this super expensive black box that requires more computing power than regular software. Companies have developed a bias for ML solutions because they seem to have unlimited potential for solving problems, so it seems like a good long-term investment. Everyone wants to take that bus.
Don't get me wrong. I love ML, but people use it for the stupidest things.
It's just that much of what needed to be invented has been invented, and now it's time to apply it everywhere it can be applied, which is a great many places.
It is context.
1. Start and stop a podcast.
2. Play music.
3. Ask for the time.
The phone understands me, but then android breaks the flow, so I have to use my hands.
1. It asks me to unlock the phone first. I have gloves and a mask on; it won't recognize my face, and my gloves don't register touches. Why do I have to unlock the phone to play music in the first place?
2. It gets confused about which app to play the music/podcast on. It wants to open the YouTube app, or Spotify, and so on...
3. Not consistent. I can say the same thing, and sometimes it will do one thing, and something else the next time.
4. If I'm playing a video and I want to show it full screen, I have to maximize and touch the screen. Why can't it play full screen by default?
I can imagine that others would be equally reluctant to hire someone they've only seen through Zoom.
They've long resisted that, of course, but I'm pretty sure half the popularity of deep learning was that it leveled the playing field, making engineers as ignorant of the inner workings of their creations as the middle managers.
May the middle-manager-fication of work, and the acceptance of ignorance that goes with it, fail.
Then again, I do prefer it when those old moronic companies flounder, so maybe it's a bad thing that they're wising up.
It's tough atm.
I have seen so many more projects derailed by a lack of domain knowledge than I have seen for lack of technical understanding in algorithms.
Turtles all the way down, I guess.
Teams adding measurable value for their companies should be fine but others might not be.
He seems pretty oblivious to the fact that simply not mentioning them would make the problem go away, as no one besides him seems to actually care.