Hacker News
Deep learning job postings have collapsed in the past six months (twitter.com/fchollet)
471 points by bpesquet on Aug 31, 2020 | 264 comments



I've worked in lots of big corps as a consultant. Every one raced to harness the power of "big data" ~7 years ago. They couldn't hire or spend money fast enough. And for their investment they (mostly) got nothing. The few that managed to bludgeon their map/reduce clusters into submission and get actionable insights discovered... they paid more to get those insights than they were worth!

I think this same thing is happening with ML. It was a hiring bonanza. Every big corp wanted to get an ML/AI strategy in place. They were forcing ML into places it didn't (and may never) belong. This "recession" is mostly COVID related I think - but companies will discover that ML is (for the vast majority) a shiny object with no discernible ROI. Like Big Data, I think we'll see a few companies execute well and actually get some value, while most will just jump to the next shiny thing in a year or two.


"Like Big Data, I think we'll see a few companies execute well and actually get some value, while most will just jump to the next shiny thing in a year or two."

Here's another aspect - in many places nobody listens to the actual people doing the work. In my last job I was hired to lead a Data Science team and to help the company get value out of Stats/ML/AI/DL/Buzzword. And I (and my team) were promptly overridden on every decision about which projects and expectations were realistic and which were not. I left, as did everybody else that reported to me, and we were replaced by people who would make really good BS slides that showed what upper management wanted to see. A year after that the whole initiative was cancelled.

Back in 2000 I was in a similar position with a small company jumping on the internet as their next business model. Lots of nonsense and one horrible web based business later, the company failed.

It's the same story over and over again. Some winners, lots of losers, many by self-inflicted wounds.


If you think about it, that's the natural outcome. Why? Because people in corporations don't have the incentive to benefit the business but to progress their careers and that's done through meeting the goals for their position and make their upper ups progress with their careers too.

So essentially, you have a system where people spend other people's resources for a living and their success is judged by making the chain link above happy. In especially large companies it's easy to have a disconnect from the product because people at the top specialise in topics that have nothing to do with the product. If the people at the top want to have this shiny new thing that the press and everyone else is saying is the next big thing, you better give them the new shiny thing if you want to have a smooth career. In publicly traded companies, this is even more prevalent because people who buy and sell the stocks would be even more disconnected from the product and tied to the buzzwords.

The more technically minded people, who have a feel for the tech, miss the point of the organisation that they are in and get very frustrated. It's probably the reason why startups can be much more fulfilling for deeply technical people.


>If you think about it, that's the natural outcome. Why? Because people in corporations don't have the incentive to benefit the business but to progress their careers and that's done through meeting the goals for their position and make their upper ups progress with their careers too.

This is one of the reasons I roll my eyes whenever I read something like "McKinsey says 75% of Big Data/AI/Buzzword projects do not deliver any value." What's the baseline for failing and/or delivering zero value because those projects were destined to fail?


> because of silly management decisions?

The whole point is, from their point of view those decisions are rational. It's much more lucrative from their (managers') personal point of view to develop a smoke-and-mirrors looks-good-on-ppt AI project. To be safe from risk, don't give the AI people too much responsibility, let them "do stuff", who cares, the point is we can now say we are an AI-driven company on the brochures, and we have something to report up to upper management. When they ask "are we also doing this deep learning thing? It's important nowadays!" we say "Of course, we have a team working on it, here's a PPT!". An actual AI project would have much bigger risks and uncertainty. I as a manager may be blamed for messing up real company processes if we actually rely on the AI. If it's just there but doesn't actually do anything, it's a net win for me.

Note how this is not how things run when there are real goals that can be immediately improved through ML/AI and it shows up immediately on the bottom line, like ad and recommendation optimizations in Youtube or Netflix or core product value like at Tesla etc.

The bullshit powerpoint AI with frustrated and confused engineers happens in companies where the connection is less direct and everyone only has a nebulous idea of what they would even want out of the AI system (extract valuable business knowledge!).


I think the problem at a lot of places has been wanting "appealing" ML/AI solutions. The kind you write papers about and put on Powerpoints.

The useful AI/ML isn't glamorous, it's quite boring and ugly. Things like spam detection, image labeling, event parsing, text classification.

It's hard to get a big, shiny model into direct user facing systems.


What would you categorize as shiny in this case? "spam detection, image labeling, event parsing, text classification" can be implemented in lots of ways, simple and shiny as well.

Either way I don't think it matters too much because people can't really tell simple from shiny as long as the buzzword bullet points are there.

The point is rather that the job of the data science team is to deliver prestige to the manager, not to deliver data science solutions to actual practical problems. It's enough if they work on toy data and show "promising results" and can have percentages, impressive serious charts and numbers on the powerpoint slides.

I've heard from many data scientists in such situations that they don't get any input on what they should actually do, so they make up their own questions and own tasks to model, which often has nothing to do with actual business value, but they toy around with their models, produce accuracy percentages and that's enough.


OK, so, we're scientists...and we're in the middle of a pandemic...amplifying/arguing over a graph showing a steep decline in job listings...that doesn't control for the pandemic...or even include a line for "overall job loss"...

https://www.burning-glass.com/u-s-job-postings-increase-four...

Looks like all job postings "collapsed during the pandemic"


Yeah, at the very least you'd want lines for "overall CS based jobs" and "overall tech industry" to see whether the same sort of fall-off appears. While that's not all that scientific either, logically, if you see similarities you can cast more doubt on (or lend support to) the hypothesis that ML is special and failing. How is it doing relative to other "hyped" or even just plain technical hiring/firing trends?
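
To make the "control for the baseline" point concrete, here is a minimal sketch of comparing a category's raw drop with the change in its share of all postings. The counts are invented purely for illustration (they are not taken from the linked chart or from Burning Glass), and the function names are hypothetical.

```python
def relative_change(before: float, after: float) -> float:
    """Fractional change from `before` to `after` (-0.4 means a 40% drop)."""
    return (after - before) / before


def change_in_share(cat_before, cat_after, base_before, base_after):
    """Relative change in a category's share of all postings.

    Near 0: the category fell roughly in line with the overall market.
    Clearly negative: it fell faster than the baseline.
    """
    share_before = cat_before / base_before
    share_after = cat_after / base_after
    return relative_change(share_before, share_after)


# Hypothetical posting counts, made up for this example only.
dl_before, dl_after = 1_000, 600          # "deep learning" postings
all_before, all_after = 100_000, 70_000   # all tech postings

raw_drop = relative_change(dl_before, dl_after)
adjusted = change_in_share(dl_before, dl_after, all_before, all_after)

print(f"raw change in DL postings:       {raw_drop:.1%}")
print(f"change in share of all postings: {adjusted:.1%}")
```

With these made-up numbers, a 40% raw drop shrinks to roughly a 14% drop in share once the overall decline is divided out, which is the kind of comparison the chart would need before concluding anything specific about ML.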


Followed immediately by the solution: Hire McKinsey analysts to help you deliver insights -- which may or may not get implemented or deliver the results, but it won't matter because everyone has moved on to the next "project".


Why do you roll your eyes? Isn't it a useful metric to know that most of the projects that are hiring these buzzword technologies are destined to fail (whether that's because the problem space wasn't fit for ML or whether management went on a hiring spree to pump their resume)?


I dislike these types of headlines because I've interacted with a lot of people who see this as evidence that ML/AI is BS and destined to fail. The reality is that a project with unrealistic expectations is going to fail regardless of it being an AI project or somebody baking a loaf of bread at home.

It's important to understand why stuff fails. That's the only way to stop things from failing in the future and make sure people are on the right path to not failing. If a large number of project failures are management failures, it's useful to know that. Otherwise you try to fix everything except the management structure.


Statistically, though, that's what the people in this thread are saying — that the majority of the projects in ML/AI are destined to fail because they're BS with unrealistic goals.

"Personalization" is in a similar place for digital publishing; everyone wants it, products and services carry big price tags, and few organizations want to invest in foundational work or simple, iterative improvements. So they swing for the stars with unrealistic goals like "micro-targeted messaging perfectly tailored to every visitor, no matter where they are in the customer journey" and the results are predictable…

I take the increasingly grim accounts of project failure rates from analyst firms as a good sign — they can be used to sober up executives with unrealistic dreams.


Funny thing is, McKinsey consultants know exactly why it fails, but won't say it because the responsibility falls on the execs that choose them for their reports or consulting. A paying customer should never have reason to regret choosing you as a consultant.


McKinsey DS here. I don't think I've ever heard such a claim about data science whatever, although I would probably believe it. I do hear such claims a lot in the context of big transformations.

These claims are usually high level and based on surveys or whatever. Failing usually means leadership gave up. As far as high level awareness of project success rates, it's probably accurate enough to justify the point: companies are generally bad at doing X. This tends to be true for many different kinds of X, because business is hard.

I generally don't agree that people make up destined to fail projects for selfish gains. I'm sure it happens, but that seems bottom of the barrel in terms of problems to fix. With DS specifically, leaders just don't know what to do. So they hire data scientists, and the data scientists don't know anything about the business, so they make some dashboards or whatever and nobody uses them. It's really not easy. Business is hard.


Others have also pointed out that too many ML engineers and researchers rush into problems and end up with useless results; that also hinges on this. These people have to deliver something because their job depends on it. Everything is move fast even when that doesn't make sense.


At my last employer, you had a hard time moving up the career ladder unless you could point to concrete results with dollar signs attached. And the OOM on those dollars started at 7 figures.

Similarly, you couldn’t just fake these types of savings because they needed to be showing up in budget requests. If I saved $10M in hardware costs, then that line item in the budget better reflect it.


> Because people in corporations don't have the incentive to benefit the business but to progress their careers

AKA the principal-agent problem:

https://en.wikipedia.org/wiki/Principal%E2%80%93agent_proble...


Putt's Law and Pournelle's iron law of bureaucracy also have pertinent (disastrous) effects. In the public sector they are even worse.

https://en.wikipedia.org/wiki/Putt%27s_Law_and_the_Successfu...

https://en.wikipedia.org/wiki/Jerry_Pournelle#Pournelle's_ir...


This is why almost all data scientists and ML engineers that succeed in many corporate structures are essentially "yes men".

Source: https://www.interviewquery.com/blog-do-they-want-a-data-scie...


> It's probably the reason why startups can be much more fulfilling for deeply technical people

I think the opposite is just as often true: Startups often don't have any real customers, so it's all about buzzwords and whatever razzle-dazzle they can put in a pitch deck to raise the next round.


In my experience startups have the delusion and stupidity but they burn thru the 7 stages of death faster so there's the novelty of new beginnings!


In my experience a startup of a certain size is founders and sales doing razzle-dazzle, "deeply technical people" jerking off to the latest fad, and no customers.


Reminds me of the AAA gaming industry. Jim Sterling made some great and insightful videos about it.


Anyone can sell the future.


This exactly describes my experience with self driving cars


Can you share more details?


I've heard this happen in a lot of places — companies want to be "data-driven", but then leadership simply ignores the data. I think being data-driven is something that is built into company culture, or otherwise it's too easy to just ignore the results and ship.

The place I currently work is data-driven (perhaps to a fault). Every change is wrapped behind an experiment and analyzed. Engineers play a major role in this process (responsible for analysis of simple experiments), whereas the data org owns more thorough, long-term analysis. This means there are a significant number of people invested in making numbers go up. It also means we're very good at finding local maxima, but struggle greatly shipping larger changes that land somewhere else on the graph.

Some of the best advice I've heard related to this is for leadership to be honest about the "why". Sometimes we just want to ship a redesign to eventually find a new maximum, even though we know it will hurt metrics for a while.


Imagine what it must be like for the senior leadership of an established company to actually become data-driven. All of a sudden the leadership is going to consent to having all of their strategic and tactical decision-making be questioned by a bunch of relatively new hires from way down the org chart, whose entire basis for questioning all that expertise and business acumen is that they know how to fiddle around with numbers in some program called R? And all the while, they're constantly whining that this same data is junk and unreliable and we need to upend a whole bunch of IT systems just so they can rock the boat even harder? Pffft.


I expect data driven leaders to be good at analyzing data. The rest are bullshitters.


That just reads as leaders who are high on cognitive bias.

...probably true to some extent, but not all leaders are self important ass hats who refuse to acknowledge they are simply “making decisions” not “making good decisions”. Most leaders are doing the best they can (often even very well) with the insights available to them.

I don’t think most data teams are really at fault; they’re just doing what they’re told.

The problem imo lies with the analysts who fail to do anything useful with the data they’re given, and demand constant changes from the data team because they want to deliver silver bullet results to the leadership level.

That’s the problem layer; people who want to be important but have nothing to offer, whipping their data team to produce rubbish and then blaming them for either a) not producing anything fast enough or b) not making the numbers big enough.


I've seen this happen in private industry and government, and it is most of the reason analytics feels like a dead-end career if it is not the primary product of the firm.

"Data science"/analytics groups are a cost center telling management things they do not want to hear, and disrupting management narrative (with receipts). There's no point to it; either you deliver "moneyball"-like opportunities that are ignored, or torture the data to fit existing narratives. In both cases, you eventually get stabbed by an experienced bureaucratic knife fighter.


You give them tons of data and then all you hear is "I'm gonna have to go with mah gut on this one"


Goodhart's law is a thing...


Ditto. The same thing happened to me a few companies back. I led a data science team of two solving difficult problems that would determine the company's success. However, management was the type to be uncomfortable with ignorance, so they had to pretend to know data science and demand tasks be solved a certain way. As anyone with similar experience has already guessed, what they were pushing made no sense.

So, I switched from predictive analytics and put on my prescriptive analytics hat. Over the time I was there I created several presentations containing multiple paths forward letting management feel like they were deciding the path forward.

This continued until I was fired. The board didn't like that I wasn't using neural nets to solve the company's problems. Startups often do not have enough labeled data, so DNNs were not considered. Oddly, I didn't get a warning or a request about this before being let go. I suspect management got tired of me managing upward. In response my coworker quit right then and there and took me out to lunch. ^_^


This can be applied as "nobody listens to the people who actually do the work" as in company hires ML/AI experts to analyze purchase records and service records, and spits back out trends that the service front line workers (tier 1) already knew dead solid.

Then the company doesn't listen to either group of people (neither tier 1 sales/support people, nor the ML people) and then fires / shuts down the entire division because "upper management didn't find value"


Or it could be that a lot of data is wrong. It may be "technically" correct, ie the table in a database produces X. It is no surprise that executives would ignore what the "data" says because they don't trust it.

A lot of the time they are right to ignore it. I've seen tables say X, but there was some flaw up the capture stack. Very few data analysts have the broad based knowledge and dedication needed to trace the data stack to establish the needed trust with the executive team.


Probably the biggest part of a lot of projects is cleaning up and normalizing the data (just like it was with data warehousing last century). A lot of high power data scientists don't want to hear that--especially if they're research-y.


If you don't at least tolerate data cleaning, you should find a different job than data scientist.

I would have expected the researchy people to be better at it, as often you'll need to collect and analyse your own data during grad programs, and thus have some experience.


I kind of like data cleaning, I'm a weirdo like that


Some of the better historic manufacturers that "made it" were known to have good managers go and visit the filthy masses on the factory floor and get a feel for what's going on. It was very valuable for me when I used to help with manufacturing testing. I always spent some time with the techs and the people on the floor assembling stuff. A lot of it was useless but a lot of it was worthwhile and we learned to trust each other better instead of the "eggheads upstairs" and the "jarheads downstairs" that seemed to be most prevalent there.


I think what you have done is similar to what is depicted in "The Goal" by Eliyahu M. Goldratt. The graphic novel depicts it in a more succinct manner.


Contempt for this kind of knowledge is almost a religion in Silicon Valley.


I did data a decade ago at a school district. A more experienced mentor at another school district let me in on a secret that is true across all industries no matter the size or type.

If you give executives data and they don't like the results, they will ask you to tweak parameters until the data represents what they want.


I think if a business is set up to scale by volume they can see gains from it. For example, say a business is already doing well at 100k conversions a day. If they manage to apply "big data/ML" to optimize those conversions and gain a 3% lift, they are now making about 1,095,000 extra conversions a year they would not have otherwise made.


So they need to make $1 profit for each of those conversions just to make it worth it if they hire 1 ML scientist for 95k/year. Or $10 if they hire 10 for 950k/year in total. And so on...

And there's the point where - IMHO - 3% gain may not be profitable enough.


Extra conversions/year, so 1 DS at 95k means 1mm net profit
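
For anyone who wants to check the arithmetic in this sub-thread, a small sketch follows. The 100k conversions/day, 3% lift, 95k salary and $1 profit per conversion come from the comments above; treating salary as the only cost is a simplifying assumption (no overhead, infrastructure or opportunity cost), and the function names are just illustrative.

```python
DAYS_PER_YEAR = 365


def extra_conversions_per_year(daily_conversions: float, lift: float) -> float:
    """Additional conversions per year from a relative lift on a daily volume."""
    return daily_conversions * lift * DAYS_PER_YEAR


def break_even_profit_per_conversion(annual_cost: float, extra: float) -> float:
    """Profit each extra conversion must bring in just to cover the cost."""
    return annual_cost / extra


extra = extra_conversions_per_year(100_000, 0.03)   # figures from the thread
salary = 95_000                                      # one data scientist

break_even = break_even_profit_per_conversion(salary, extra)
net_at_one_dollar = extra * 1.00 - salary            # assuming $1 profit each

print(f"extra conversions per year:       {extra:,.0f}")
print(f"break-even profit per conversion: ${break_even:.3f}")
print(f"net at $1 per conversion:         ${net_at_one_dollar:,.0f}")
```

Under these assumptions the break-even is on the order of 9 cents per extra conversion, and at $1 per conversion one hire nets about $1M, which matches the correction in the reply above. Whether the 3% lift is real and sustained is a separate question the arithmetic does not answer.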


I think the only places where it yields consistent results is organizations that have at least 80% of their staff doing the ML/DS work and less than 20% managing the people doing the work (up and down in the organization.)


Currently experiencing the exact same thing. Roughly 15 machine learning engineers through hiring and acquisitions and all of them have no product or management power. Everyone with any management or product power has zero ML/Data Science experience. I spend half my day explaining to managers what ML is.


> they paid more to get those insights than they were worth!

This understates how awful ML is at many of these companies.

I've seen quite a few companies that rushed to hire teams of people with a PhD in anything who barely made it through a DS/ML boot camp.

To prove that they're super smart ML researchers without fail these hires rush to deploy a 3+ layer MLP to solve a problem that needs at most a simple regression. They have no understanding of how this model works, and have zero engineering sense so they don't care if it's a nightmare of complexity to maintain. Then to make sure their work is 'valuable' management tries to get as many teams as possible to make use of the questionable outputs of these models.

The end is a nightmare of tightly coupled models that nobody can debug, troubleshoot or understand. And because the people building them don't really understand how they work the results are always very noisy. So you end up with this mess of expensive to build and run models talking noise to each other.

When I saw this I realized data science was doomed in the next recession, since the only solution to this mess is to just remove it all.

There is some really valuable DS work out there, but it requires real understanding of either modeling or statistics. That work will probably stick around, but these giant farms of boot camp grads churning out keras models will disappear soon.


My sense is that the original sin here is conflating data science with machine learning.

A good data scientist might choose to use machine learning to accomplish their job. Or they might find that classical statistical inference is the better tool for the task at hand. A good data scientist, having built this model, might choose to put it into production. Or they might find that a simple if-statement could do the job almost as effectively but not nearly as expensively. A good data scientist, having decided to productionize a model, will also provide some information about how it might break down - for example, describing shifts in customer behavior, or changes in how some input signal is generated, or feedback effects that might invalidate the model.

OTOH, if your job has been framed in terms of cutting-edge machine learning, then you may well know - at a gut level, if not consciously - that your job is basically just a pissing match to see who can deploy the most bleeding-edge or expensive technology the fastest. It's like the modern hospital childbirth scene in Monty Python's The Meaning of Life, where the doctor is more interested in showing off the machine that goes, "ping!" in order to impress the other doctors than he is in paying attention to the mother.


This matches my experience exactly.

About 5 years ago I was coming out of academia with a PhD in chemistry wanting to get into tech. Someone pointed me toward data science and I was immediately pulled in by the potential for deep insights and working with interesting data sets. After getting into an industry research position, I was quickly disillusioned by the talk from our leadership of how we needed to become an "AI company" and discovered that what I really wanted to be doing was classic algorithm development, not data science.

Now, I've seen so much poorly implemented, unnecessary machine learning applied to problems that didn't need it that I first assume that any machine learning project is a bad technical decision until proven otherwise. I've happily moved into an engineering role building interesting pipelines.


There’s people who consider classical inference and the like to be machine learning just as much as neural nets are. I like that perspective.


There are some things, like OLS and logistic regression, that are commonly used for both purposes. But there's a sort of moral distinction between machine learning and statistical inference, driven by whether you consider your key deliverable to be y-hat or beta-hat, that ends up having implications.

For example, I can get pretty preoccupied with multicollinearity or heteroskedasticity when I'm wearing my statistician hat, while they barely qualify as passing diversions when I'm wearing my machine learning engineer hat. If I'm doing ML, I'll happily deliberately bias the model. That would be anathema if I were doing statistical inference.
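
A small sketch of why multicollinearity bothers the beta-hat person far more than the y-hat person. The data is hand-built and deliberately contrived (two almost identical features, with the "noise" aligned to the tiny difference between them) so the effect is exact and reproducible; it is an illustration of the distinction, not a claim about any real dataset.

```python
def ols_two_features(x1, x2, y):
    """Least-squares fit of y ~ b1*x1 + b2*x2 (no intercept) via the 2x2 normal equations."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    t1 = sum(a * c for a, c in zip(x1, y))
    t2 = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12   # close to zero when x1 and x2 are nearly collinear
    b1 = (t1 * s22 - t2 * s12) / det
    b2 = (t2 * s11 - t1 * s12) / det
    return b1, b2


# Contrived data: x2 is x1 plus a tiny wiggle, so the features are almost collinear.
pattern = [1.0, -1.0, -1.0, 1.0]
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [a + 0.01 * p for a, p in zip(x1, pattern)]


def make_y(noise_scale):
    """True relationship is y = x1 + x2, plus a small perturbation."""
    return [a + b + noise_scale * p for a, b, p in zip(x1, x2, pattern)]


beta_a = ols_two_features(x1, x2, make_y(+0.05))
beta_b = ols_two_features(x1, x2, make_y(-0.05))

# Predict at a new point where the two features agree, as they (almost) do in training.
new_x1, new_x2 = 5.0, 5.0
pred_a = beta_a[0] * new_x1 + beta_a[1] * new_x2
pred_b = beta_b[0] * new_x1 + beta_b[1] * new_x2

print("beta-hat, sample A:", beta_a)
print("beta-hat, sample B:", beta_b)
print("y-hat at (5, 5):   ", pred_a, pred_b)
```

A perturbation of 0.05 is enough to swing the fitted coefficients from roughly (-4, 6) to (6, -4), neither close to the true (1, 1), yet both fits predict about 10 at the new point. If the deliverable is beta-hat, that instability is fatal; if it is y-hat on data that looks like the training data, it is barely a problem.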


Oh gotcha. That’s an interesting way to draw the line and I appreciate the distinction.


And this is a good thing!

To be fair, I started to understand why developers gave out about bootcamp grads lacking a foundation when the bootcamps came for my discipline (data science).

The PhD fetish is pretty mental (even though I have one), as it's really not necessary.

Additionally, everyone thinks they need researchers, when they really, really don't.

Having worked with researchy vs more product/business driven teams, I found that the best results came when a researchy person took the time to understand the product domain, but many of them believe they're too good for business (in which case you should head back to academia).

What you actually need from an ML/Data Science person:

- Experience with data cleaning (this is most of the gig)

- A solid understanding of linear and logistic regression, along with cross-validation

- Some reasonable coding skills (in both R and Python, with a side of SQL).

That's it. Pretty much everything else can be taught, given the above prerequisites.

But it's tricky for hiring managers/companies as they don't know who to hire, so they end up over-indexing on bullshitters, due to the confidence, leading to lots of nonsense.

And finally, deep learning is good in some scenarios and not in others, so anyone who's just a deep learning developer is not going to be useful to most companies.


Just an anecdote but if you go to most baseball data departments, where there is real competition between teams, you don't just have PHds. You have people with undergrads/domain knowledge, and people with PHds.

This isn't to say that PHd knowledge isn't valuable but if you look at firms in finance that have had success with data i.e. RenTech, they hire very smart people with PHds but it isn't only the PHd. You need someone who has the knowledge AND someone who has common sense/can get results. That is very hard to do correctly (and yes, some people who come from academia literally do not want anything to do with business...it is like the devs who come from a CS PHd and insist on using complicated algo and data structure everywhere, optimising every line, etc.).


*PhD or Ph.D.

It's an abbreviation of the Latin "philosophiae doctor" == "doctor of philosophy".

Getting one is as much about persistence as intelligence, and makes you very knowledgeable about one narrow area, so what you say makes sense. Academic researchers usually branch out into other fields and subfields as well, but straight out of a PhD, narrow and deep is what you tend to get.


I worked in a place full of deep learning PhDs, and you'd have people trying to apply reinforcement learning to problems that had known mathematical solutions, and integer programming problems.

I don't think the issue is just that companies hire people who are awful at ML, it's also that people are trying to shoehorn deep learning into everything, even when it currently has nothing to offer and we have better solutions already. IMHO, we're producing too many deep learning PhDs.


This is just my general sense, as a very non-expert with more experience of doing than theory...but the benefit is someone knowing the theory AND being able to translate that into revenue.

I think most people view the hard part as doing the PHd, and so lots of people value that experience, and because they have that experience you have this endowment effect: wow, that PHd was hard, I must do very hard and complex things.

To give you an example: Man Group. They are a huge quant hedge fund, in fact they were one of the first big quant funds. Now, they even have their own program at Oxford University that they hire out of...have you heard of them? Most people haven't. Their performance is mostly terrible, and despite being decades ahead of everyone their returns were never very good (they did well at the start because they had a few exceptional employees, who then went elsewhere...David Harding was one). The issue isn't PHds, they have many of them, the issue is having that knowledge AND being able to convert it.

I think this is really hard to grasp because most people expect problems to yield instantly to ML but, in most cases, they don't and other people have done valuable work with non-ML stuff that should be built on but isn't because domain knowledge or common sense is often lacking.

A similar thing is people who come out of CS, and don't know how to program. They know a bit but they don't know how to use Git, they don't know how to write code others can read, etc.


The Man Group has had respectable returns, especially during Coronavirus. Nothing amazing, but certainly not terrible. Regardless, there's more to the picture: Sharpe ratio, vol, correlation to the market, etc


That isn't the case. First, I was talking about multi-decade, not how have they done in the last few hours. Second, their long-term returns haven't been good. They lagged the largest funds (largely because their strategy has mostly been naive trend-following). Third, you are correct that their marketing machine has sprung into action recently. But how much do you know about what trades they are making? If you were around pre-08, you may be familiar with the turn they have made recently (i.e. diving head first into liquidity premium trades with poor reasoning, no fundamental knowledge).

And again, the key point was: they have had this institute for how long? Decade plus? Are they a leading quant fund? No. Are they in the top 10? No. Are they doing anything particularly inventive? See returns. No.


How is this any different to developers who insist on using some shiny new web framework, micro service spaghetti and Kubernetes overkill infrastructure for their silly little CRUD app?


I don't think it is any different. Overvaluing the latest hotness is extremely common in the tech industry and is one of my least favorite parts of it.


Unfortunately, this is where the incentives of the company and those of the employee diverge. For the employee, if they choose some simpler, appropriate model or solution to the problem, they will not be able to get that next DL job. Especially early in their career. I cannot bring myself to do resume driven development, but I understand why people do it.


But you probably don't need a DL job. As my dad always said, as long as you make them/save them money, they'll never fire you.

I know that I (as a DS Lead/Manager) would hire someone who uses an appropriate solution to a business problem above someone who has an intricate knowledge of applying PyTorch to inappropriate problems.

But maybe I'm in a minority here.


It sounds so painful as someone all-in on this area. But I have to agree about overdoing tasks with fancy models. Nevertheless, the most common ML algo in industry is still linear regression along with bootstrapping.


It's just a process of exploration, people trying out ideas to see what works. Over time, with sharing of results, we will gradually discover more nuanced approaches, but the exploration phase is necessary in order to map a path forward and train a generation of ML engineers and PM's who don't have seniors to learn from.

Of course it sucks on the short term, but there is zero chance the field will be abandoned. It has enough uses already.


But the exploration should be led by the data scientists, not by the suits who think they need a specific type of model.


I also witnessed this firsthand at a Biotech company I worked at... we were using many variants of machine learning algorithms to develop predictive models of cell culture and separation processes. Problem is... the models have so many parameters in order to get a useful fit that the same model can also fit a carrot or an elephant. We found that dynamic parameter estimation on ODE/DAE/PDE system models, while harder to develop, actually worked much better and gave us real insight into the processes.

So now my advice to others is "if you can start with some first principles equation or system of equations... start there and use optimization/regression to fit the model to the data."

AND: "if you don't think such equations exist for your problem... read/research more, because some useful equations probably do exist."

This is usually pretty straightforward for engineering and science applications... equations exist or can be derived for the system under study.

In my very limited exposure to other areas of machine learning application... I have found quite a bit of mathematical science related to marketing, human behavior, etc.
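
The "start from a first-principles equation, then fit it to the data" advice above can be sketched on a toy case. This is only an illustration under stated assumptions: the first-order decay model dC/dt = -k*C, the function name `fit_first_order_decay`, and the synthetic data are all hypothetical and not taken from the biotech work described. A real ODE/DAE/PDE system would need a numerical solver and a proper optimizer.

```python
import math

def fit_first_order_decay(times, concentrations):
    """Fit C(t) = C0 * exp(-k * t), the solution of dC/dt = -k * C,
    by ordinary least squares on log(C). Returns (C0, k)."""
    n = len(times)
    logs = [math.log(c) for c in concentrations]
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return math.exp(intercept), -slope

# Hypothetical, noise-free measurements generated with C0 = 5.0 and k = 0.3.
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
concentrations = [5.0 * math.exp(-0.3 * t) for t in times]

c0_hat, k_hat = fit_first_order_decay(times, concentrations)
print(f"C0 ~ {c0_hat:.3f}, k ~ {k_hat:.3f}")
```

The point of the sketch is that the model has two parameters with physical meaning (initial concentration and rate constant), so the fit itself tells you something about the process.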


Dyson asked Fermi about his take on his model fitting with four parameters. The reply was: I remember my friend Johnny von Neumann used to say, with four parameters I can fit an elephant, and with five I can make him wiggle his trunk.


Also reminds me of this one:

> Everything is linear if plotted log-log with a fat magic marker


Kind of weird that they would use ML/AI for a separations process. Separations and chemical engineering in general absolutely LOVES parameters and systems of equations. And don't go anywhere near colloids, those have so many empirically sourced parameters it will make your head spin.


You do this when you're a vendor working for big pharma and the people there can't/won't give you/don't understand the relevant quantities and aren't even familiar with standard models of the process despite their being formally trained chemical engineers and the models being 50 years old. Speaking from direct experience with several companies that are trying to bring us covid vaccines


You make it sound like "overfitting" is not a concept machine learning researchers have even heard of or dealt with. If you tell me you have so little data that even the smallest model you can build will still overfit, then that is another matter. But there are well known techniques and analytical processes to overcome overfitting. Also, using situation-specific equations is a common practice in ML and is usually referred to as Feature Engineering.


I completely agree with this sentiment, I've seen a lot of people throw ML at problems because they don't know much mathematics. Especially when you have a lot of data, I can understand the allure of just wiring up the input & output to generate the model.


Feature engineering is a big part of ML. If you know something about the process you should incorporate that..


Ironically, I worked on a product that had a classic use case for machine learning during this time period and still had great difficulty getting results.

It was difficult to attract top ML talent no matter how much we offered. Everyone wanted to work for one of the big, recognizable names in the industry for the resume name recognition and a chance to pivot their way into a top role at a leading company later.

Meanwhile, we were flooded with applicants who exaggerated their ML knowledge and experience to an extreme, hoping to land high paying ML jobs through hiring managers who couldn’t understand what they were looking for. It was easy to spot most of these candidates after going through some ML courses online and creating a very basic interview problem, but I could see many of these candidates successfully getting ML jobs at companies that didn’t know any better. Maybe they were going to fake it until they made it, or maybe they were counting on ML job performance being notoriously difficult to quantify on big data sets.

Dealing with 3rd party vendors and consulting shops wasn’t much better. A lot of the bigger shops were too busy with never ending lucrative contracts to take on new work. A lot of the smaller shops were too new to be able to show us much of a track record. Their proposals often boiled down to just implementing some famous open source solution on our product and letting us handle the training. Thanks, but we can do that ourselves.

I get the impression that it is (or was) more lucrative to start your own ML company and hope for an acquisition than to do the work for other companies. We tried to engage with several small ML vendors in our space and more than half of them came back with suggestions that we simply acquire them for large sums of money. Meanwhile, one of the vendors we engaged with was acquired by someone else and, of course, their support dried up completely.

Ultimately we found a solution from a vendor that had prepared a nice solution for our exact problem. The contracts were drawn up in a way that wouldn’t be too disastrous if (when?) they were acquired.

I have to wonder if an industry-wide slowdown to the ML frenzy is exactly what we need to give people and companies time to focus on solving real problems instead of just chasing easy money.


I find your post kind of interesting. I develop software in a non-AI field and have been following and experimenting with AI on the side for a long time. Academics seem intent on publishing papers, not finding solutions to creating value. Corporate AI seems focused on sizzle not substance.

It is so frustrating to see the potential in the AI world and realize almost no one is really interested in building it.


I agree that it's a shame that many research results do not get to be "industrialized" and actually used, but also I feel like many research results are created in such a sterile way that they wouldn't be applicable to real world scenarios.

I think what we got really good at is "perceptive" ML, like speech and image recognition, and those things do see industry applications, like self-driving cars or voice assistants.

I'd be interested to know where you see unrealized potential.


This is sadly so consistent with what I'm seeing at a big corporation. We are working so hard to make a centralized ML platform, get our data up to par, etc. but so many ML projects either have no chance of succeeding or have so little business value that they're not worth pursuing. Everyone on the development team for the project I'm working on is silently in agreement that our model would be better off being replaced by a well-managed rules engine, but every time we bring up these concerns, they're effectively disregarded.

There are obviously places in my company where ML is making an enormous impact, it's just not something that's fit for every single place where decisions need to be made. Sometimes doing some analysis to inform blunt rules works just as well - without the overhead of ML model management.


> or have so little business value that they're not worth pursuing

It seems that I'm inverted from you. The Machine part of Machine Learning is likely of high business value, but the Learning part is the easier and better solution.

We do a lot of hardware stuff and our customers are, well let's just say they could use some re-training. Think not putting ink in the printer and then complaining about it. Only much more expensive. Because the details get murky (and legal-y and regulation-y) very quickly, we're forced to do ML on the products to 'assist' our users [0]. But in the end, the easiest solution is to have better users.

[0] Yes, UX, training, education, etc. We've tried, spent a lot of money on it. It doesn't help.


> Everyone on the development team for the project I'm working on is silently in agreement that our model would be better off being replaced by a well-managed rules engine

That was one of the better insights with our team. We should measure the value-add of ML against a baseline that is e.g. a simple rules engine, not against 0. In some cases that looked appealing (‘lots of value by predicting Y better’) it turned out that a simple Excel sort would get us 90-98% of the value starting tomorrow. Investing in an ML team for a few weeks/months then only makes sense if the business case for getting from 95% to 98% is big enough in itself. Hint: in many cases it isn’t.
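
The baseline comparison above is simple arithmetic, and a rough sketch makes it concrete. Everything here is an assumption for illustration: the function name `incremental_value`, the total value at stake, and the ML cost are made-up numbers; only the 95% and 98% capture rates come from the comment.

```python
def incremental_value(total_value, baseline_capture, ml_capture, ml_cost):
    """Compare ML against a simple baseline rather than against zero.

    Returns (uplift, net): the extra value ML captures over the baseline,
    and that uplift minus what the ML effort costs.
    """
    baseline_value = total_value * baseline_capture
    ml_value = total_value * ml_capture
    uplift = ml_value - baseline_value
    return uplift, uplift - ml_cost

# Hypothetical figures: 1,000,000 of value at stake, a rules engine that
# captures 95% of it, an ML model that captures 98%, and an ML project
# that costs 100,000 to build and run.
uplift, net = incremental_value(1_000_000, 0.95, 0.98, 100_000)
print(f"uplift over baseline: {uplift:,.0f}, net of ML cost: {net:,.0f}")
```

With these made-up numbers the uplift is about 30,000 and the net is negative, which is the "in many cases it isn't" outcome. Measured against zero, the same model would have looked like it was worth about 980,000.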


I think part of the problem here is that ML development is extraordinarily more expensive than traditional dev.

I don't generally need to develop my own deployment infrastructure for every new project. However I've yet to see an ml team or company consistently use the same toolchain between 2 projects. The same pattern repeats across data processing, model development, and inference.

Oddly, adding more scientists appears to have a super-linear increase in cost - with the net effect being either duplicated effort or exhaustive search across possible solutions.


Being mostly disconnected from the fruits of your labor while being incentivized to turn your resume into buzzword bingo causes bad technology choices that hurt the organization, what a surprise.


I don't agree, most of the low hanging fruit in ML engineering hasn't been picked yet. ML is like electricity 100 years ago, it will only expand and eat the world. And the research is not slowing down, on the contrary, it advances by leaps and bounds.

The problem is that we don't have enough ML engineers and many who go by this title are not really capable of doing the job. We're just coming into decent tools and hardware, and many applications are still limited by hardware which itself is being reinvented every 2 years.

Take just one single subfield - CV - it has applications in manufacturing, health, education, commerce, photography, agriculture, robotics, assisting blind persons, ... basically everywhere. It empowers new projects and amplifies automation.

With the advent of pre-trained neural nets every new task can be 10x or 100x easier. We don't need as many labels anymore, it works much better now.


I've seen similar patterns with clients and companies I've worked at as well. My experience was less that ML wasn't useful, it's just that no organization I worked with could really break down the silos in order for it to work. Especially in ML, the entire process from data collection to the final product and feedback loop needs to be integrated. This is _really_ difficult for most companies.

Many data scientists I knew were either sitting on their hands waiting for data or working on problems that the downstream teams had no intention of implementing (even if they were improvements). I still really believe that ML (be it fancy deep learning or just evidence driven rules-based models) will effectively be table stakes for most industries in the upcoming decade. However, it'll take more leadership than just hiring a bunch of smart folks out of a PhD program.


Curious if there is a correlation with companies that failed to capitalize with the ones who relied on consultants versus really reshaping their own people.

I worked for a financial services co that saw massive gains from big data/ML/AWS. Granted, we were already using statistical models for everything; we just now could build more powerful features, more complex models, and move many things closer to real time, with more frequent retrains/deploys bc of cloud.

I do agree that companies who don't already recognize the value of their data and maybe rely on a consultant to tell them what to do might not be in the position to really capitalize on it and would just be throwing money after the shiny object. It really does take a huge overhaul sometimes. We retooled all of our job families from analysts/statisticians to data engineers and scientists and hired a ton of new people


>Curious if there is a correlation with companies that failed to capitalize with the ones who relied on consultants versus really reshaping their own people.

I've worked in Data Science customer-facing roles for 2 companies, and one anecdotal correlation between success with Stats/ML/AI I've seen is how "Data Driven" people really are for their daily decision making. The more data driven you are, the more likely you are to identify a problem that can actually be improved by a Stats/ML/AI algorithm. This is because you really understand your data and the value you can get from it.

Everybody has metrics, KPIs, OKRs, etc, but the reality is that there's a spectrum from 100% gut to 100% data driven. And a lot of people are on the gut side of things while thinking (or claiming) they are on the data side.

I'll provide an example. I currently work for a company that sells to (among others) companies working with industrial machinery. If your industrial machine runs in a remote area (e.g. an Oil Field), then any question about that machine starts with pulling up data. Being data driven is the only way to figure out what's going on. These folks have a good sense for identifying the value they can get from their data and they usually understand when you say dealing with their data is an engineering task in itself.

The other side of this is a factory filled with people. Since somebody is always operating and watching the machine, the "data driven" part is mainly alarms (e.g. is my temp over 100C) and some external KPI (e.g. a quality measurement). They are much less data driven than they think they are, and a lot of them don't understand what value they could get out of their data beyond some simple stuff you don't really need ML/AI for.

I mention industrial equipment because I think a lot of people (even me) are really surprised when they hear about people working in factories not being super data driven. You think of factories, engineering, and data as being very lumped together. It's amazing how many areas (sales, marketing, and HR are other great examples) exist where people aren't as data driven as they think they are.


Yep, agreed. If decisions can be made by a human often they'll stick to that, often arguing there is no need for data.

In my former space (credit card fraud detection and underwriting), you obviously need a data driven solution. Without even considering latency requirements, you aren't going to do 6-10B manual decisions/year. The ROI of a more complex ML approach is easier to prove there, given the need is already there, just with an inferior technical solution.


Yeah the big data comparison is apt, and a few years ago it was The Block-Chain that got middle managers frothing like Pavlov's dog.

It is clear that for most of the companies investing in deep learning, tangible results are always just around the corner, and maybe 1 in 100 will build something worthwhile. But here is the carrot driving them all on, it's like the lottery: you have to be in to win. The stick is the fear that their competitors will do so.

This field is more art than science, give talented people incentive to play and don't expect too much for the next decade.


The problem I see is that in most non tech businesses they are not at the stage where they need ML, they are simply struggling with the basics: being able to seamlessly query or have consolidated up to date metrics and dashboards of the data scattered in all their databases. Of course the Big Data/AI “we’ll transform your data into insights” appealed to them, but that’s not what they need (also see the comments on the Palantir thread the other day).


> they paid more to get those insights than they were worth!

> They were forcing ML in to places it didn't (and may never) belong.

I find that I spend a lot of time as a senior MLE telling someone why they don’t need ML


That happened/is happening at my job. There's been a push to implement features that utilise AI/ML.

Not because it would be a good use case (although there are some for our product), or because it would be of any practical benefit, but because it makes for good marketing copy. Never mind the fact that nobody on the team has any experience with machine learning (I actually failed the paper at university).


Could it also be because most companies, after a large investment in DS/ML/DL, couldn't create a promising solution because they don't have as much access to the data/hardware/talent as Google/Amazon/MS do? And at the end of the day, using just an API from one of those companies gives better ROI?

(or) In simple terms, is profitable commercial Deep Learning just for oligarchies?


The company I work for is a large health care company. I started in robotic automation about a year ago. The company said its next three huge initiatives would be:

1) AI 2) Machine Learning 3) Robotic Process Automation

They felt RPA would help them stay more competitive since there are tons of smaller health care companies who are moving faster and innovating faster because they're not buying up companies and having to integrate all their technology at a sloth's pace. They thought RPA would be a way to mitigate these issues.

18 months later and the one manager, director and VP in my org have all but said they don't care about RPA, all their money is going into ML and AI. Even though in all the presentations I've seen them put on, it's all blue skies and BS about "IF we had this, we COULD do this." Nothing concrete at all about how they plan to use ML to increase profit margins or reduce overhead.

Right now, our team is basically an afterthought in the company and I'm already starting to interview elsewhere with the knowledge at some point, they're going to kill my team and cut everybody loose.


Without ML, our business today is literally impossible (from a financial perspective).

I work in 2D animation and we were able to design our current pipeline around adopting ML at specific steps to remove massive amounts of manual labor.

I know this doesn't disprove your anecdote, I just wanted to point out that real businesses are using ML effectively to deliver real value that's not possible without it.


This has happened since the dawn of the computer age, and probably before.

Any technology too complex for the managers who purchase it to understand fully can be sold and oversold by marketing people as "the next big thing".

Managers may or may not see through that, but if their superiors want them to pursue it or if they need to pursue something in order to show they're doing something of value, then they're happy to follow where the marketers lead.

Java everywhere, set top TV boxes, IOT devices, transitioning mainframes to minis, you name it... the marketers have made a mint selling it, usually for little benefit to the companies that bought into it.


My employer is big enough that I know we're doing a bunch of ML/AI and probably getting some value out of it somewhere.

However someone is trying to make robotic process automation the Next Big Thing - which I think is hysterically funny.


ML is a shiny object with often no discernible ROI but occasionally very large ROI, and companies are understandably nervous about missing out. Spending a small amount to hedge their bets isn't necessarily irrational.


> big data

That's because it didn't get a chance to mature and to show how it could be powerful. People kept trying to force hadoop into it and call themselves "big data experts"

We've gotten a bit more clarity in this world with streaming technologies. However, there hasn't been a good and clear voice to say "hey .. this is how it fits in with your web app and this is what you expect of it". (I'm thinking about developing a talk on this.. how it fits in [hint.. your microservice app shouldn't do any heavy lifting of processing data])


These days it's people trying to force Kafka into it and call themselves "streaming experts"


Kafka is good.. but it requires a lot of good work to get it working well for a pipeline.


Most buzzwords exist for consultants to sell their services. Successful buzzwords turn engineers into "Big Data Engineers" who only want to work on something called big data, and convince management that they need more "big data expertise".

In practice most of these technologies and their peers exist to support real applications, and it would be almost immediately recognizable that they are the appropriate choice when working on a similar application. You don't need a streaming engineer/big data consultant to cram them in.


People have been trying to use algorithms of various sorts to increase sales (actionable insights) forever. The buzzwords change, but the results are always the same. No permutation of CPU instructions will turn a product people don't want to pay for into a product people want to pay for.


There's also a lot of deception going on.

The easiest way to solve many problems is through lexers, regular expressions and plain-old pattern matching. But that doesn't sell, so, they call it AI anyways.


Or like conversion rate optimization tools.


When I hear ML, deep learning, etc, I consider it a red flag for exactly the reasons you state.

It's kind of batty actually, people looking for ideas to make money have just been taking old ideas and attaching ML to the side of them as if that automatically made them better. And then not educating their customers on the limitations of ML both generally and with respect to their data size.

I personally think the companies that make and sell the software that the police used to make incorrect arrests should be legally liable. Yes, the police shouldn't have blindly trusted the software, but I guaran-fucking-tee you part of why they did is the marketing from the company themselves.


According to data from Revealera.com, if you normalize the data, the % of job openings that mention 'deep learning' has actually remained stable YoY: https://i.imgur.com/sDoKwD0.png

* Revealera.com crawls job openings from over 10,000 company websites and analyzes them for technology trends for hedge funds.
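
The normalization described above, dividing keyword postings by total postings, can be sketched as follows. The counts are invented for illustration and are not Revealera's data; `keyword_share` is a hypothetical helper name.

```python
def keyword_share(keyword_counts, total_counts):
    """Fraction of all job postings in each period that mention the keyword."""
    return {
        period: keyword_counts[period] / total_counts[period]
        for period in keyword_counts
    }

# Hypothetical counts: postings mentioning the keyword, and all postings.
deep_learning_posts = {"2020-02": 200, "2020-08": 100}
all_posts = {"2020-02": 10_000, "2020-08": 5_000}

shares = keyword_share(deep_learning_posts, all_posts)
for period, share in shares.items():
    print(f"{period}: {deep_learning_posts[period]} postings, {share:.1%} of total")
```

In this made-up example the raw count halves while the share of postings stays at 2%, which is the pattern you would expect if the whole job market shrank rather than deep learning specifically.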


Yeah I had a suspicion that the trend shown in the chart in that thread regarding the decline of DL job posts largely resembles the trend of total job posts.


I think the fact the original tweet was not normalized in this incredibly obvious way is at least one valid reason companies could use less deep learning folk


The original tweet is from one of the authors of TensorFlow (specifically Keras, a large section of the API for v2), if it's any indication of the quality of the framework.


I wouldn't expect anything different from François..


That was my suspicion as well.

Btw. I don't like twitter's new feature that prevents everyone from responding to a tweet that was used by @fchollet. It no longer feels like twitter if you can't engage.


Once you reach 100k followers, you only need a 0.1% jerk rate to always have 100 people in your comment section that do nothing but troll, rile you up, or demand you defend your thoughts against their stupid uninformed disagreements. Chollet has 210k followers.


> demand you defend your thoughts against their stupid uninformed disagreements.

And I shall use this pulpit to demand, in a mixture of derision and righteous anger, that you defend your comme... ah never mind.

This may not be a new thought, but it's eloquently put. Thank you.


Disingenuous framing of data or a laughably fundamental misreading of it? This is akin to trying to gain insight from a bunch of data on a map that simply has a strong correlation with population density.


If only we could apply ML/AI to the data on ML/AI job postings.


Something I've learned: when non-engineers ask for an AI or ML implementation, they almost certainly don't understand the difference between that and an "algorithmic" solution.

If you solve "trending products" by building a SQL statement that e.g. selects items with the largest increase of purchases this month in comparison to the same month a year ago, that's still "AI" to them.

Knowing this can save you a lot of wasted time.
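
A minimal sketch of the kind of "trending products" query described above, using Python's built-in `sqlite3` module. The `purchases` table, its columns, and the sample rows are all hypothetical; a real schema would look different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (product TEXT, month TEXT, units INTEGER)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?)",
    [
        ("kettle", "2019-08", 40), ("kettle", "2020-08", 45),
        ("webcam", "2019-08", 30), ("webcam", "2020-08", 150),
        ("suitcase", "2019-08", 90), ("suitcase", "2020-08", 20),
    ],
)

# Rank products by the increase in units sold versus the same month a year ago.
TRENDING_SQL = """
SELECT cur.product, cur.units - prev.units AS increase
FROM purchases AS cur
JOIN purchases AS prev ON prev.product = cur.product
WHERE cur.month = ? AND prev.month = ?
ORDER BY increase DESC
LIMIT ?
"""

def trending_products(conn, this_month, same_month_last_year, limit=2):
    """Return (product, increase) rows for the biggest year-over-year gains."""
    return conn.execute(
        TRENDING_SQL, (this_month, same_month_last_year, limit)
    ).fetchall()

print(trending_products(conn, "2020-08", "2019-08"))
# → [('webcam', 120), ('kettle', 5)]
```

No model, no training, and the result is easy to explain to whoever asked for it.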


Any sufficiently misunderstood algorithm is indistinguishable from AI.


In my AI class in college, we learned about first order logic. To me it didn't seem like we were really learning AI but I couldn't quite put my finger on it. I guess it's because it made too much sense so in my mind it couldn't be AI.


This is basically a form of the AI effect[1]:

> The AI effect occurs when onlookers discount the behavior of an artificial intelligence program by arguing that it is not real intelligence.

[1]: https://en.wikipedia.org/wiki/AI_effect


Ah yes, the AI version of the no true scotsman fallacy


AI is what we call algorithms before we really understand them.


Ha! I'm going to have to borrow this phrase.


I think it's a play on "Any sufficiently advanced technology is indistinguishable from magic".


Some decades ago, that was AI to everyone.

In the future, I expect ML to also fall out of the "AI" umbrella - it gets used primarily for "smart code we don't know• how to write", so once that understanding comes, it gets a more-specific name and is no longer "AI".

•"know" being intentionally vague here, as obviously we can write both query planners and ML engines, but the latter isn't nearly as commonplace yet to completely fall out of the umbrella.


Right, this makes sense, because the "Artificial" part goes away once we have a fully understood algorithm. It's just part of intelligence to use algorithms when they work.


Exactly! I run a data science department at a corporation. I've had exactly one production level project that was sufficiently complicated to require deep learning. I am currently working on the second. I always start with the simplest approach. That's a message that I push hard for every recent grad and intern who comes here.


Engineers tend to overestimate how difficult machine learning is. That is exactly how a good data scientist would solve this problem. If (and only if) this initial solution is not sufficient then you can iterate on it (maybe we should also take into account monthly trends, maybe one category of products is overrepresented, ...).


hah yeah "dynamic programming" has turned out to have a fortunate name


Most data-related problems, or attempts to extract knowledge from data, simply don't benefit from Deep Learning.

In my experience, what many organizations lack is simple but high-quality "Business Analytics": Reporting & dashboards are developed that look good but jam too much information together. It is often the wrong information:

Something is requested, and the developer develops exactly what was asked. The problem is that it wasn't what was needed because the person making the request couldn't articulate the question in the same terms the developer would understand. The request will say "Give me X & Y" when the real question is "I want to understand the impact of Y on X". The person gets X & Y, looks at it every day in their dashboard, and never sees much that is useful. The initial request should always be the start of a conversation, but that often doesn't happen. A common result is people in departments spending tons of time in Excel sorting, counting, making pivot tables, etc., when all of that could be automated.

This is part of the reason why companies often go looking for some new "silver bullet" to solve their data problems. They don't have the basics down, and don't understand the data problems well enough to seek out a solution.


I think we’re starting to see peak managerialism. The latest wave in stats has shown more than anything that a significant shortfall in basic statistics knowledge makes it almost impossible to make good decisions with vast amounts of data.

Without the skillsets to work with and then understand that data, they are forced into this long process of asking for data to be put into reporting and dashboards and then, once they finally get them, either fixating on the limited metrics they provide while being oblivious to other context not in front of them, or instead being forced to start another long iteration to adjust that reporting and those dashboards.

We’ve gone almost 30 years believing management was the sole skill required to manage teams and companies, but dealing with the new era of data is starting to show the limits


I have been following the state of the art on most NLP tasks for many years. In 2018 and 2019 there was huge progress each year on most tasks. In 2020, except for a few tasks, progress has mostly stagnated... NLP accuracy is generally not production ready, but the pace of progress was quick enough to raise huge hopes. The root of the evil is that nobody has built upon the state-of-the-art pre-trained language model, XLNet, while there are hundreds of variants of BERT, just because Google is behind BERT. If XLNet were owned by Google, 2020 would have been different. I also believe that pre-trained language models have reached a plateau and we need new, original ideas, such as bringing variational autoencoders to NLP and using meta-optimizers such as Ranger.

The most pathetic part is that many major NLP tasks have an old BERT-based SOTA just because nobody cared to use (not improve) XLNet on them, which is an absolute shame. I mean, on many major tasks we could trivially gain several percentage points of accuracy, but nobody qualified bothered to do it. Where does the money go, then? To many NIH papers, I guess.

There are also not enough synergies: there are many interesting ideas that just need to be combined, and I think there's not enough funding for that, it's not exciting enough...

I pray for 2021 to be a better year for AI, otherwise it will show evidence for a new AI progress winter


I do not agree with this. I work heavily with NLP models for production in the Legal domain (where my baseline is that an 8GB 1080 must predict more than 1000 words/sec). This year was when our team glued enough pieces of Deep Learning together to outperform our previous statistical/old ML pipeline that had been optimized for years.

Little things compound, such as optimizers (Ranger/Adahessian), better RNNs (IndRNN, Linear Transformers, Hopfield networks) and techniques (cache everywhere, TorchScript, gradient accumulation training).
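
For readers unfamiliar with the last technique mentioned, here is a framework-free sketch of gradient accumulation: gradients from several small micro-batches are averaged before a single weight update, so the update matches what one large batch would give while only a micro-batch needs to fit in memory at once. The comment does not say how the team implements it; the one-parameter model, function names, and data below are assumptions for illustration.

```python
def grad_mse(w, batch):
    """Gradient of mean squared error for the model y ~ w * x on one batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

def accumulated_grad(w, data, micro_batch_size):
    """Average the gradients of equal-sized micro-batches.

    With equal-sized micro-batches this equals the full-batch gradient.
    """
    micro_batches = [
        data[i:i + micro_batch_size]
        for i in range(0, len(data), micro_batch_size)
    ]
    total = 0.0
    for micro_batch in micro_batches:
        # Scale each contribution so the sum is an average over micro-batches.
        total += grad_mse(w, micro_batch) / len(micro_batches)
    return total

# Hypothetical (x, y) pairs, roughly following y = 2 * x.
data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]

w = 0.5
full_grad = grad_mse(w, data)
accum_grad = accumulated_grad(w, data, micro_batch_size=2)

# One update step using the accumulated gradient.
learning_rate = 0.01
w_next = w - learning_rate * accum_grad
print(full_grad, accum_grad, w_next)
```

In a deep learning framework the same idea usually means running several backward passes before calling the optimizer step, which is how a large effective batch size can fit on a small GPU.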


> I do not agree with this. I work heavily with NLP models for production in the Legal domain (where my baseline is where a 8GB 1080 must predict more than 1000 words/sec).

What kind of network are you using? I can do near-SoTA multi-task syntax annotation [1] with ~4000 tokens/s (~225 sentences/s) on a CPU with 4 threads using a transformer. Predicting 1000 words/second on a reasonably modern GPU is easy, even with a relatively deep transformer network.

[1] 8 tasks, including dependency parsing.


Interesting. What's the main goal(s) of your NLP models?


We work on multiple models, all related to legal proceedings and lawsuits, such as:

- Structure Judicial Federal Register texts

- Identify entities in Legal texts (citation to laws, other lawsuits)

- Predict time to completion, risk and amount due of a lawsuit

- Classifying judicial proceedings to non lawyers


How accurate is your prediction of time/risk/amount? How useful is identifying entities or classifying proceedings?


Just to clarify one of your points regarding Google's involvement: XLnet, and the underlying TransformerXL technology, did have Google researchers involved:

* https://ai.googleblog.com/2019/01/transformer-xl-unleashing-...

* https://arxiv.org/pdf/1901.02860.pdf

* https://arxiv.org/pdf/1906.08237.pdf

My understanding is that a CMU student interned at Google and developed most of the pieces of TransformerXL, which formed the basis of XLNet. The student and the Google researcher further collaborated with CMU researchers to finalize the work.

(For the record, I think the remainder of your points do not match my understanding of NLP, which I do research in, but I just really wanted to clarify the XLNet story a bit).


Could you give an example of a major task that you think the state of the art could be trivially improved on with the xlnet approach?


Long (>2048 tokens) sequences.

But GP is too focused on hyping XLNet for some reason. There are much more elegant attempts at improving the transformer architecture in just the past 8 months: Reformer, Performer, Macaron Net, and my current pet paper, Normalized Attention Pooling (https://arxiv.org/abs/2005.09561).


> But GP is too focused on hyping XLNet for some reason.

Yeah, some reasons: it might be because of the alignment of the stars and could vary inversely with the weather. Or it might be because it's the paper with the largest number of first places on SOTA leaderboards? https://paperswithcode.com/paper/xlnet-generalized-autoregre...

I've queried most of your examples on the SOTA database that is paperswithcode.com and they have almost zero results. You illustrate the problem: if researchers like you don't even know the general SOTA, how can it be expected to be beaten? But beyond scientists' ignorance there is also the problem of models not submitting their results to paperswithcode.com, or not being tested extensively but only on niche benchmarks. This second behavior sentences such potentially promising models to remain unknown and therefore mostly irrelevant.

It's always remarkable how one can be a smart researcher and yet not adjust one's behavior to be rational regarding those two flaws (not seeking SOTA knowledge, and not promoting SOTA knowledge; READ, WRITE)


I asked the (at the time) SOTA author to replace his BERT implementation with XLNet. He accepted and gained 0.5% accuracy on constituency parsing https://github.com/sebastianruder/NLP-progress/blob/master/e... which is actually huge: it means 5% fewer errors.
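The arithmetic behind that kind of claim can be sketched as follows. The comment does not state the baseline accuracy, so the 90% and 95% baselines below are assumptions chosen for illustration: a 0.5-point gain works out to 5% fewer errors only if the starting error rate is about 10%.

```python
def relative_error_reduction(acc_before, acc_after):
    """Fraction of the original errors removed when accuracy improves."""
    err_before = 1.0 - acc_before
    err_after = 1.0 - acc_after
    return (err_before - err_after) / err_before


# Hypothetical baselines; only the 0.5-point gain comes from the comment.
for baseline in (0.90, 0.95):
    reduction = relative_error_reduction(baseline, baseline + 0.005)
    print(f"baseline {baseline:.1%}: {reduction:.1%} fewer errors")
```

The higher the baseline accuracy, the larger the share of remaining errors that a fixed absolute gain removes.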

Which tasks remain to be tried? Most actually but an obvious one would be coreference resolution https://github.com/sebastianruder/NLP-progress/blob/master/e...


Thanks for the information. Do you know how the pandemic affected research output for 2020?


It'd be ironic if your comment was generated by GPT-3. But forget GPT-3. In 10 years, looking back at AI history, the year 2020 will probably be viewed as the point separating pre GPT-4 and post GPT-4 epochs. GPT-4 is the model I expect to make things interesting again, not just in NLP, but in AI.


Are any of the recent NLP advancements due to improvements beyond throwing more data and horsepower at “dumb” models? Will GPT-4 be any different?

It seems like the current approaches will always fall short of our loftier AI aspirations, but we’re reaching a level of mimicry where we can start to ask, “Does it matter for this task?”


> Will GPT-4 be any different?

That's the point - it does not need to be different. If it demonstrates similar improvement to what we saw with GPT-1 --> GPT-2 --> GPT-3, then it will be enough to actually start using it. It's like the progression MNIST --> CIFAR-10 --> ImageNet --> the point where object recognition is good enough for real world applications.

But in addition to making it bigger, we can also make it better: smarter attention, external data queries, better word encoding, better data quality, more than one data type as input, etc. There's plenty of room for improvement.


No, almost all the progress is driven by bigger GPUs and datasets.

To be fair, things like CNNs and BERT were definitely massive improvements, but a lot of modern AI is just throwing compute at problems and seeing what sticks.


It's not because something seems smart that it actually is. GPT-3 is ridiculous in that it doesn't have semantic understanding. Text generation is the wrong task; it's cool to watch and actually fuels hype on the ML train, but what you're really looking for is semantic parsing, which GPT or OpenAI has nothing to do with and which is mostly underfunded.


While this sounds plausible and has a lot of "prior" credibility coming from someone as central to deep learning as François Chollet, I'd love to see corroborating signal in actual job-posting data, from LinkedIn, Indeed, GlassDoor, etc. Backing up this kind of claim with data is especially important given the fact that the pandemic is disrupting all job sectors to varying degrees.

As you can imagine, searching Google for "linkedin job posting data" doesn't work so great. The closest supporting data I could find is this July report on the blog of a recruiting firm named Burtch Works [1]. They searched LinkedIn daily for data scientist job postings (so not specifically deep learning) and observed that the number of postings crashed between late March and early May to 40% of their March value, and have held steady up to mid-June, where the report data period ends.

There's also this Glassdoor Economic Research report [2], which seems to draw heavily from US Bureau of Labor Statistics data available in interactive charts [3]. The most relevant bit in there is that the "information" sector (which includes their definitions of "tech" and "media") has not yet started an upward recovery in job postings, as of July.

[1] https://www.burtchworks.com/2020/06/16/linkedin-data-scienti...

[2] https://www.glassdoor.com/research/july-2020-bls-jobs-report...

[3] https://www.bls.gov/charts/employment-situation/employment-l...



I actually found this, but decided not to post it because it only captures the first few weeks of post-crisis patterns, and doesn't contextualize any of the deep-learning-specific job losses against the broader job market, which as we all know was doing the same thing, directionally. It would be really cool to get an updated report of that level of detail from the author, who seems active on Twitter (https://twitter.com/neutronsneurons), but not Medium: that April job report is his latest article.


I feel like it was also a classic case of running before we could crawl. Jumping from A to Z before we could go from 0 to 1.

I work at a residential IoT company; there are quite a few really valid use cases for Big Data and even ML. (Think about predictive failure.)

We hired more than one expensive data scientist in the past few years, and had big strategies more than once. But at the end of the day it's still "hard" to ask a question such as "if I give you a MAC Address give me the runtime for the last 6 months".

We're trying to shoot for the moon, when all I've ever asked for is an API to show me the indoor temp for a particular device over a long period.
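For a sense of how small that ask is once the data sits in a queryable store, here is a minimal sketch using Python's built-in sqlite3. The table name, columns and sample rows are all hypothetical; a real deployment would have its own schema and storage engine.

```python
import sqlite3
from datetime import datetime, timedelta

# Hypothetical schema: one row per device per day with seconds of runtime.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE device_runtime (mac TEXT, day TEXT, runtime_seconds INTEGER)"
)
conn.executemany(
    "INSERT INTO device_runtime VALUES (?, ?, ?)",
    [
        ("AA:BB:CC:00:11:22", "2020-08-01", 7200),
        ("AA:BB:CC:00:11:22", "2020-07-15", 3600),
        ("AA:BB:CC:00:11:22", "2019-12-01", 9999),  # outside the window
        ("11:22:33:44:55:66", "2020-08-01", 1800),  # another device
    ],
)


def runtime_hours(conn, mac, since):
    """Total runtime in hours for one device from `since` onwards."""
    row = conn.execute(
        "SELECT COALESCE(SUM(runtime_seconds), 0) FROM device_runtime "
        "WHERE mac = ? AND day >= ?",
        (mac, since.strftime("%Y-%m-%d")),  # ISO dates compare correctly as text
    ).fetchone()
    return row[0] / 3600.0


# Fixed "now" so the example is reproducible; roughly six months back.
now = datetime(2020, 8, 31)
six_months_ago = now - timedelta(days=182)
print(runtime_hours(conn, "AA:BB:CC:00:11:22", six_months_ago))
```

The hard part in practice is usually getting the data into a shape like this, not the query itself.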


This is absolutely right. And when you think about it, the reason behind it has been staring us in the face: people who want to do machine learning approach everything as a machine learning problem. It's really common to see people handwave away the "easy stuff" because they want to get credit for doing the "hard stuff."

It's not just the data scientists' fault. I once heard our chief data scientist point out that they don't want to hand off a linear regression as a machine learning model -- as if a delivered solution to a problem must have some minimum complexity. She absolutely had a point.

Clients are paying for a Ph.D. to solve problems in a Ph.D way. If we delivered the client a simple, yet effective solution, there's the risk of blow-back from the client for being too rudimentary. I'm certain this attitude extends to in-house data scientists as well. Nobody wants to be the data "scientist" who delivers the work of a data "analyst." Even when the best solution is a simple SQL query.

Our company kind of sidesteps this problem by having a tiered approach, where companies are paying for engineering, analysis, visualization, and data science work for all projects. So if a client is at the simple analysis level, we deliver at that level, with the understanding that this is the foundational work for more advanced features. It turns out to be a winning strategy, because while every client wants to land on the moon, most of them figure out that they are perfectly happy with a Cessna once they have one.


> Clients are paying for a Ph.D. to solve problems in a Ph.D way.

Ideally, "in a PhD way" is with careful attention to problem framing, understanding prior art, and well-structured research roadmaps.

I worry about PhD graduates who seemingly never spent much time hanging out with postdocs. Advisors teach a lot, but some approach considerations can be gleaned more easily from postdocs gunning for academic posts.


How good are data scientists at building reliable, scalable systems? My anecdotal experience has been that many don’t bother or care to learn good software development practices, so the systems they build almost always work well for specific use cases but are hard to productionize.


Everyone wants to fire up Tensorflow, Keras and PyTorch these days. Fewer people want to work in Airflow and SSIS, spend days tuning ETL, etc. This is the domain of data engineering, which bridges software engineering and data science with a dash of devops. I’ve been working in this field for a couple of years and it’s clear to me that data engineering is a necessary foundation and impact multiplier for data science.


Don't forget data cleaning. A huge issue I've seen is just getting sufficient data of a high enough quality.

Also, (for supervised classification problems) labelling is a big problem.

It is almost as if we need a "data janitor" title.


Phht you don't want to call it data janitor; no-one good will want that title. At least call it Data Integrity Engineer or something reasonably high-status.


At my company we have just created a position called Data Steward.


Data Sanitation Engineer :)


Machine Learning Data Integrity Engineer


This is a hugely important point. Having the right data is most of the battle in the typical project. I'm fortunate to run the data science department at a company with an amazing and supportive team of devs. We work together to make sure that all data is recorded in the most useful way and we talk regularly. Without their support, my models wouldn't be very useful.


Data cleaning always sounded suspicious to me.


As a data engineer I've often billed myself as the Charlie Kelly of data


I'm finding myself really enjoying this type of work and I think I would like to specialize in it. Any good learning resources you used on your path to where you are now?


Any formalized paths I could take to enter this field?


I got into data engineering by starting in ETL (DataStage) and learning about cloud services (AWS) on my own and getting my next job in a cloud-based SaaS startup.


How might you recommend moving into this field more?


My impression too. I earn my money turning your mess into a data "landscape" - I saw people wanting to jump on the ML wagon who had not even heard of version control for code before. Not a winter, no, but a long bumpy road ahead.


This is my job, too. And it involves so much politics, since the first thing companies do is hire data scientists, and only afterwards do they figure out they need clean data, but by then they already have all these "products" that data scientists built that are a mess built on top of a mess...


Meh, only for people who bought into the hype without real use cases. Which I agree may be numerous.

In my company though, we've been applying DL with great success for a few years now, and there are at least five years of work remaining. And that's not spending any time doing research or anything fancy: just picking the low-hanging fruit.


I think many companies have real problems, but find that DL ends up being a poor solution in practice for various reasons.

You need not only real use cases, but use cases that happen to fit well with DL’s trade-offs and limitations. I think many companies hired with very unrealistic expectations here.


Nice! Which company?


I'm managing some teams right now that do a mix of high-end ML stuff with more prosaic solutions. The ML team is smart, and pretty fast with what they do, but they tend to (as many comments here have mentioned) focus on delivering only PhD level work. This translates into taking simple problems and trying to deorbit the ISS through a wormhole on it rather than just getting something in place that answers the problem.

In conjunction with this, it turns out 99% of the problems the customer is facing, despite their belief to the contrary, aren't solved best with ML, but with good old fashioned engineering.

In cases where the problem can be approached either way, the ML approach typically takes much longer, is much harder to accomplish, has more engineering challenges to get it into production, and the early ramp-up stages around data collecting, cleaning and labeling are often almost impossible to surmount.

All that being said, there are some things that are only really solvable with some ML techniques, and that's where the discipline shines.

One final challenge is that a lot of data scientists and ML people seem to think that if it's not being solved using a standard ML or DL algorithm then it isn't ML, even if it has all of the characteristics of being one. The gatekeeping in the field is horrendous and I suspect it comes from people who don't have strong CS backgrounds wrapping themselves too tightly against their hard-earned knowledge rather than having an expansive view of what can solve these problems.


Get your math and your domain knowledge straight and you can do a lot with little. Lots of programmers want to be ML engineers because the prestige is higher because you normally take in PhDs. The big problem is hype: people are throwing AI at everything as...garbage marketing. It’s at the point where if you say you use AI in your software title, I know you suck, because you aren’t focusing on solving a problem; you are focusing on being cool, which will never end well.


There's a lot of what I call "model fetishism" in machine learning.

Instead of focusing our energies on the infrastructure and quality of data around machine learning, there's eagerness to take bad data to very high-end models. I've seen it again and again at different companies, almost always with disastrous consequences.

A lot of these companies would do better to invest in engineering and domain expertise around the problem than worry about the type of model they're using to solve the problem (which usually comes later, once the other supporting maturity pieces are in place)


This is why my interview question centers on applying linear regression to a complex domain. It weeds out an enormous number of candidates.

There are 5 ML models that we maintain where I work, and none of them are more complicated than linear regression or random forests. Convincing me to use something more complex would take an enormous amount of evidence. Domain knowledge is king.


Yes! I feel this quite a lot, I've just finished my degree. I remember reading quite a few papers for my thesis where there is little discussion of the actual data that is used, what might be graspable from the data with basic DS techniques such as PCA, clustering and such. Instead, it goes right to the model and default evaluation methods, just a table of numbers.

We did have courses explaining the "around" of the whole process though, but that's not as hyped.


This is an anecdote with no data. And the entire global economy is in a recession, so the fact that deep learning might have fewer job postings isn't particularly notable.

I'll note that in my personal anecdote, the megacorps remain interested in and hiring in ML as much as ever.


He's now posted a follow-up analysis of LinkedIn Job postings: https://twitter.com/fchollet/status/1300417952211034112?s=20


Would be interesting to see this dip relative to other tech subfields like javascript/react or even data science and other such keywords. Does anyone know of a public LinkedIn dataset?

The author disables tweet replies so I'm not sure where they get their numbers from.


This agrees with what I see, but megacorps and in general many large organizations are often slow to move both in and out. They can take years to stop building up experience in areas that changed from being a new promising technology to mature fields to oversold fads. They also have a lot of money to help weather many overpriced hires. So I am not sure that megacorps hiring is a very strong counter-argument. Just my 2c.

However, megacorps do not seem to suffer much from such continuous lagging in hiring. I do not know why this is so: is it that they still hire smart engineers who can easily change groups and fields, or do they work on their core technology to help build the next peak (after the debris is washed away in a fad crash there is often a technology renaissance)?


Missing in the original chart/data: have ML/DL job postings decreased more or less than other comparable job categories (programming, business analyst, etc.)?


Great point. Not as good point: is looking for pytorch and tf the right measure?


In his tweet I thought he made it clear he wasn't predicting an AI specific slowdown but a universal recession due to Covid?


He's wrong and using not the best data for such an assertion.

Data science jobs are not slowing down, though they're not really increasing either.

In comparison since 2016 software engineering jobs revolving around building up systems for data scientists have increased 6 fold, maybe even more since I last looked.


I am a DL researcher at a top industry lab.

I'm completely unsurprised by this. Regularly, at lunch, I'll ask my coworkers if they know of any DL applications that are making O($billions), and no one knows any outside of FAANG.

FAANG is making an insane amount of money due to DL. Outside of them though, I don't know who's making money here. When I was interviewing for jobs, there were a ton of startups that were trying to do things with DL that would have been better done with a few if statements and a random forest, and that had a total market size in the millions.

I think that, eventually, there'll be a market for this stuff, but I'm not convinced that it's anywhere near being widespread.

I was also a consultant before my current role. The vast majority of non-tech firms don't have their data in well organized + cleaned databases. Just moving from a mess of Excel sheets to Python scripts + SQL databases would have made a HUGE difference to the vast majority of clients I worked with, but even that was too big of a transformation.

Basically, everyone with the sophistication to take advantage of DL/ML already has the in-house expertise to do it. There's almost no one in the intersection of "Could make $$$ doing DL" && "Has the technical infrastructure to integrate DL".


99% of the time you don't need a deep recurrent neural network with an attention based transformer. Most times, you just need a bare-bones logistic regression with some carefully cleansed data and thoughtful, domain-aware feature engineering.

Yes, you're not going to achieve state-of-the-art performance with logistic regression. But, first, for most problems the difference between SOTA and even simple models is not nearly as large as you might think. And second, even if you're cargo-culting SOTA techniques, it's probably not going to work unless you're at an org with an 8-digit R&D budget.
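To make "bare-bones logistic regression plus a domain-aware feature" concrete, here is a minimal sketch in plain Python. The data, the field names (amount, income) and the engineered ratio feature are hypothetical, and the hand-rolled gradient descent stands in for whatever library one would actually use.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(rows, labels, lr=1.0, epochs=5000):
    """Batch gradient descent on the mean log-loss; returns (weights, bias)."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def predict(weights, bias, x):
    score = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
    return 1 if score >= 0.5 else 0


def engineer(amount, income):
    """Hypothetical domain-aware feature: amount relative to income."""
    return [amount / income]


# Hypothetical raw records (amount, income) with a binary outcome.
raw = [(10, 100), (20, 100), (30, 50), (90, 100), (5, 50), (40, 50)]
labels = [0, 0, 1, 1, 0, 1]

rows = [engineer(a, i) for a, i in raw]
weights, bias = train_logistic_regression(rows, labels)
print([predict(weights, bias, r) for r in rows])
```

The single ratio feature does the real work here; the model itself is a few lines.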


I know very little about the DL/ML space, but as a full-stack engineer it feels like most companies have tried to replicate what FAANG companies do (heavy investment in data/ml) when the cost/benefit simply isn't there.

Small companies need to frame the problem as:

1) Do we have a problem where the solution is discrete and already solved by an existing ML/DL model/architecture?

2) Can we have one of our existing engineers (or a short-term contractor) do transfer learning to slightly tweak that model to our specific problem/data?

Once that "problem" actually turns into multiple "machine learning problems" or "oh, we just need to do this one novel thing", they will probably need to bail because it'll be too hard/expensive and the most likely outcome will be no meaningful progress.

Said in another way: can we expect an engineer to get a fastai model up and running very quickly for our problem? If so, great - if not, then bail.

ie: the solution for most companies will be having 1 part-time "citizen data scientist" [1] on your engineering team.

[1]: https://www.datarobot.com/wiki/citizen-data-scientist/


The way I see it, only those companies that had already been using a data oriented approach to business can really reap the benefits of ML. From a company's point of view, ML/AI should be a natural evolution of an existing tool set to better solve problems they have been trying to solve in the past using deterministic methods and then statistical methods, etc. Any other project that is diving right into ML is likely to fail because:

1. There's no clear problem statement. They have never formulated one and are now trying to bolt ML on to their decision making.

2. They don't have well catalogued data for engineers/scientists to work with because they never tried to do rigorous analysis of data before ML became a thing.

3. Managers have no idea how to deal with data driven insights. What if the results are completely unintuitive to them? Are they going to change their processes abruptly? What if the results are aligned with what they have been always doing? Is it worth paying for something that they have been doing intuitively for decades?

I'm not a data scientist. But the biggest complaint I hear from my colleagues is that they lack data to train models.


Yeah, you really shouldn't conform data to the problem. It's more an emergent silver gun than a constructed silver bullet.


Isn't it the same pattern every 10 years or so for "AI" related tech? Some people hype tech X as being a game changer - tech X is way less amazing than advertised - investors bail out - tech X dies - rinse and repeat.

https://en.wikipedia.org/wiki/AI_winter


This is more akin to the Internet bubble than the previous AI winter. The technology is valuable for business, but the hype is huge and companies aren't ready for it yet.


AI has a business problem.

Very few businesses I know actually have a deep learning problem. But they want a deep learning solution. Lest they get left out of the hype train.


Blockbuster didn't have an Internet problem.


Dentistry didn't have a sledgehammer problem and, after all these years, it still doesn't.


"This is evident in particular in deep learning job postings, which collapsed in the past 6 months."

Have they? Specifically, have they "collapsed" relative to the average decline in job listings mid-pandemic?
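One way to frame that question is to normalize the category's change by the market-wide change. The sketch below uses made-up posting counts purely for illustration; none of these numbers come from the tweet or from any job board.

```python
def retention(before, after):
    """Share of postings that remain (1.0 means no change)."""
    return after / before


def relative_to_market(cat_before, cat_after, mkt_before, mkt_after):
    """Category retention divided by market retention.

    A value of 1.0 means the category moved in line with the market;
    below 1.0 means it fell further than the market did.
    """
    return retention(cat_before, cat_after) / retention(mkt_before, mkt_after)


# Hypothetical counts, not real data.
dl_before, dl_after = 1000, 500          # deep learning postings
all_before, all_after = 100000, 70000    # all postings

ratio = relative_to_market(dl_before, dl_after, all_before, all_after)
print(f"DL retention: {retention(dl_before, dl_after):.0%}")
print(f"Market retention: {retention(all_before, all_after):.0%}")
print(f"Relative to market: {ratio:.2f}")
```

Only a ratio well below 1.0 would support a claim of a collapse specific to the category.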


Companies are trying to add machine learning to everything they do, as if that's going to solve all their problems or unlock new revenue streams.

80 or 90% of what companies are doing with machine learning results in systems with a high computing cost that are clearly unprofitable if seen as revenue impacting units. Many similar things can be achieved with low-level heuristics that result in way smaller computing costs.

But nobody wants to do that anymore. There's nothing "sexy" or "cool" about breaking down your problems and trying to create rule-based systems that address the problem. Semantic software is not cool anymore, and what became cool is this super expensive blackbox that requires more computer power than regular software. Companies have developed this bias for ML solutions because they seem to have this unlimited potential for solving problems, so it seems like a good long term investment. Everyone wants to take that bus.

Don't get me wrong. I love ML, but people use it for the stupidest things.


That may be true in the research arena (where Mr Chollet works), but I don't think that's the case in terms of where deep learning is actually applied in industry, nor will it be the case for years to come IMO.

It's just that much of what needed to be invented has been invented, and now it's time to apply it everywhere it can be applied, which is a great many places.


Some context, for those unfamiliar: https://en.wikipedia.org/wiki/AI_winter


The poster explicitly states he does not think this is indicative of AI winter.


Mentioning context does not mean the OP assumes equivalence.

It is context.


I've been using voice commands on my android phone, in situations where I can't use my hands. Most often all I want to do is:

1. Start and stop a podcast.

2. Play music

3. Ask for the time

The phone understands me, but then android breaks the flow, so I have to use my hands.

1. It will ask me to unlock the phone first. I have gloves and a mask on. It won't recognize my face, and my gloves don't register touches. Why do I have to unlock the phone to play music in the first place?

2. It gets confused on which app to play the music/podcast on. Wants to open youtube app, or spotify, and so on ...

3. Not consistent. I can say the same thing, and sometimes it will do one thing, and another the next time.

4. If I'm playing a video and I want to show it full screen, I have to maximize and touch the screen. Why can't it play full screen by default?


I have similar with my Google Home; it can play Netflix but can't work out BBC iPlayer most of the time. And many times if I ask it to play music, it'll give an error saying it can't play on YouTube Music because I don't have a subscription, even though my default music player is Spotify in my account.


Clearly whoever wrote this android integration didn't hire enough high-quality ML PhDs to reach the necessary benchmarks for full-screen defaults.


My favorite joke on this is "The answer is deep learning, now what's the problem?"


labeled data


I'm currently a masters student and I'm rather glad I opted not to take a specialized degree such as Machine Learning, taking on computer science instead. All this discussion about DS, ML, AI (and even CS) becoming over-saturated has made me rather wary and I worry that I'm choosing the wrong 'tracks' to study (currently doing ML and Cybersecurity as I genuinely am interested in those fields). I won't be graduating until next year but I'm forcing myself to be optimistic that the tech job market will be in a better place by then.


It's not exactly a great year for extrapolating trends about what people are doing with their time. I wonder how much of this is 2020-specific and not just due to the natural cycle of AI winters.


at least some is pure 2020. we want to hire, we can’t right now.


Why not? I would have thought it was a buyer’s market now with all the layoffs.


There's tons of layoffs because businesses are doing really badly. Current cashflow may not support another developer. Future cashflow doesn't look that great in any B2C market, either, and the B2B markets will start to look slim pickings not too far after that.


I’m not sure I follow. My question was directed toward OP, whose company is hiring, which presumably means they are doing well. Can you please clarify?


If it weren't urgent (i.e., lost a job) I'd be a little reluctant to join a company/team that I'd never met in person.

I can imagine that others would be equally reluctant to hire someone they've only seen through Zoom.


Also if you didn't lose a job, you might not want to change right now if you're in a stable position, even if it's not your dream job.


Finally! Big companies need to realize they must understand what they are doing with technology to get any value out of it.

They've long resisted that, of course, but I'm pretty sure half the popularity of deep learning was that it leveled the playing field, making engineers as ignorant of the inner workings of their creations as the middle managers.

May the middle-manager-fication of work, and acceptance of ignorance that goes with, fail.

-----

Then again, I do prefer it when many of those old moronic companies flounder, so maybe this is a bad thing that they're wising up.


Deep Learning has become mainstream. The place I work at actually uses 2 unrelated products based on NN.


I imagine this correlates to the "blockchain" postings.


I would have expected a comparison to job postings in general: how do deep learning job postings compare to job postings for any kind of technical position?


Meanwhile, the academic job market, certainly in my area, ie linguistics/computational linguistics, has collapsed, too. A colleague did a similar and equally nice analysis here: https://twitter.com/ruipchaves/status/1279075251025043457

It's tough atm.


Data science and ML in big companies are pulling resources away from the real value-add activities like proper data integrity, blending sources, and improving speed and performance. Yes, Business Intelligence is not cool anymore. Yes, I also call my team “data analytics”. But let’s not forget the simple fact that “data driven” means we give people insights when and where they need them. Insights could be coming from an SQL GROUP BY, ML, AI, or watching the flight of birds, but they are still simply a data point for some human to make a decision. That means we need to produce the insight, be able to communicate it to people, and have the credibility for said people to actually listen to what we are saying. Focusing on how we put that data point together is irrelevant. Focusing on hiring PhDs to do ML is most likely going to end in failure, because PhDs are not predictive of great analytical skills; experience and things like SQL are much better predictors.


On the plus side, ML systems have become commoditized to the point that any reasonably skilled software engineer can do the integration. From there, it really comes down to understanding the product domain inside and out.

I have seen so many more projects derailed by a lack of domain knowledge than I have seen for lack of technical understanding in algorithms.


There will always be Snake Oil salesmen and hence Snake Oil..


Out of curiosity: are there job postings that did not "collapse" over the past six months?


Every company is of course very different, but I have seen companies come to understand that for Deep Learning you need a PyTorch or TF expert (or maybe an expert in some other framework), and most of these experts already work at Google/Facebook or other advanced companies (NVIDIA, Microsoft, Cruise, etc.), so hiring is very difficult and the cost is high. In that case, you can start with regular SQL and/or AutoML to get some insights. For a large number of companies that's enough. When there is so much complexity, as in DL modeling, there's little transparency, and management wants to understand things. After COVID, time will tell, but my take is that only a few companies need DL.


In general, does anyone know if it's a good time to look for a new dev job? I was really going to move this year, but it seems sensible to wait. Just sucks to see friends with RSUs going up in value so quickly.


No harm in having a recruiter or two feed you opportunities on a regular basis to interview at (just be up front with them that you're holding out for a solid fit for your criteria). Better to have a job while interviewing than be under pressure to accept the first half decent thing that comes along.


For most companies ML is just part of the long-term strategy. With COVID, priorities have shifted from long-term R&D to short-term survival, so I don't see anything out of the ordinary here.


A lot of the C-suite folks aren't tech folks or even math folks. They want to try to use deep learning to do prediction or get some insight when something as simple as regression would have worked.


What's particularly surprised me is how effective gradient boosting is in practice. I've seen so many real-world applications where just using CatBoost or whatever worked ~95% as well as, or even just as well as, some super complicated deep learning approach, at a tenth of the cost.


To be fair, if you're willing to write code to perform feature engineering for you, you can often replace the complicated boosting approach with a much simpler regression model.

Turtles all the way down, I guess.
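
To make the feature engineering point concrete, here is a minimal sketch (plain Python, no libraries). It assumes a synthetic target y = x^2 that I made up for illustration; the helper names fit_line and mse are mine, not from any library. It only shows the "engineered feature rescues a simple model" half of the argument, not a comparison against boosting.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(xs, ys, slope, intercept):
    """Mean squared error of the fitted line on the given points."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Synthetic data (an assumption for illustration): the target is exactly x squared.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]

# Linear model on the raw feature: it cannot capture the curve.
raw_slope, raw_intercept = fit_line(xs, ys)
raw_error = mse(xs, ys, raw_slope, raw_intercept)

# Same linear model on a hand-engineered feature (x squared).
squared = [x * x for x in xs]
eng_slope, eng_intercept = fit_line(squared, ys)
eng_error = mse(squared, ys, eng_slope, eng_intercept)

print(f"raw feature MSE:        {raw_error:.3f}")
print(f"engineered feature MSE: {eng_error:.3f}")
```

On real data you rarely know the right transform in advance, which is roughly where the boosting model earns its keep.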


Deep Learning has been so commoditized and compartmentalized over the past 5 years that I think an average SDE with some basic understanding of it can now do a reasonable job applying it.


I don't think anyone should freak out when they see a tweet like this: deep learning is just one particularly trendy part of ML, which is just one piece of data science, which is just one job title in the "working with data" career space. I think that most people with backgrounds or interests in DL are very well equipped to participate in the (ever more important) data science world.


Booms imply crashes. Anyone who is surprised at this couldn't be smart enough to be a good machine learning engineer.


No question ML is powerful and can do great things. Also no question a lot of companies were just throwing money at stuff for fear of being seen as behind in this space. When the going gets tough, such vanity efforts are the first things to go.

Teams adding measurable value for their companies should be fine but others might not be.


In my industry (research), we still have a strong line of business. Some commercial clients have killed their contracts with us to save money during the COVID era, but government contracts are still going strong. In areas where there's a clear use case I think there is still work to go around.


My belief in an AI breakthrough is so strong that I would invite another AI winter to try to play catch up


What is your belief based on?


The graph means very little without a comparison line of “all programming jobs” and/or “all jobs.”
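
A rough sketch of the comparison I mean, in plain Python. The counts below are hypothetical numbers invented for illustration, not LinkedIn data, and the helper names are mine. The idea is to index both series to the same starting point and to look at DL postings as a share of all postings.

```python
# Hypothetical monthly posting counts, invented purely for illustration.
dl_postings = [1000, 950, 600, 500]
all_postings = [100000, 96000, 62000, 50000]


def index_to_first(series):
    """Rescale a series so that its first value equals 100."""
    base = series[0]
    return [100 * value / base for value in series]


def share_of_total(subset, total):
    """Return each period's subset count as a fraction of the total count."""
    return [s / t for s, t in zip(subset, total)]


dl_index = index_to_first(dl_postings)
all_index = index_to_first(all_postings)
dl_share = share_of_total(dl_postings, all_postings)

for month, (d, a, s) in enumerate(zip(dl_index, all_index, dl_share)):
    print(f"month {month}: DL index {d:.1f}, all-jobs index {a:.1f}, DL share {s:.2%}")
```

In this made-up case the DL index halves, but so does the all-jobs index, and the DL share ends where it started. The absolute drop alone would not tell you that.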


Good riddance. The majority of it is snake oil, relabeling and "smoke and mirrors". A lot of smart or lucky people made a lot of money, a lot of dumb people with power over money lost... probably insignificant amounts of it.


ML/DL is in the exploratory phase for most companies, so I'm not surprised to see this post. Nevertheless, I have no doubt this also opens new opportunities in other domains and new kinds of business based on data.


Is there a LinkedIn tool that allows you to make similar trend plots as shown in the Twitter thread, or has the author been archiving the data over time?


Unless you're doing ML/DL/etc. research, what you're really doing is engineering, like always.


The fact that he doesn't allow people to reply to his tweets while making data-less claims like this is really a problem.


He labels anyone who criticizes him as a troll. Unfortunately he is a public figure in the ML space and does have his share of trolls, but doesn't take too well to even well thought out replies.


That, and he makes these tweets about threats and insults from "people using PyTorch" and the TensorFlow/Keras vs PyTorch "debate" without taking a screenshot or actually showing any kind of proof.

He seems pretty oblivious to the fact that simply not mentioning them would make the problem go away, as no one besides him seems to actually care.


He's so French, in the worst way possible. I say that as a French person myself.


Also, his analysis is shoddy. He shows an absolute decrease in DL job postings since COVID hit and claims that DL is in decline, irrespective of whether other fields like SWE are in a similar decline. I'm utterly surprised by the analysis given the data.

