From the RAND report "First, industry stakeholders often misunderstand — or miscommunicate — what problem needs to be solved using AI."
From personal experience this seems like it holds for most data-products, and doubly so for basically any statistical model. As a data scientist, it seems like my domain partners' vision for my contribution very often goes something like:
0. It would be great if we were omniscient
1. Here's some data we have related to a problem we'd like to be omniscient about
2. Please fit a model to it
3. ????
4. Profit
Data scientists and ML engineers need to be aggressive at early planning stages to actually determine what impact the requested model/data product will have. They need to be ready for the model to be wrong, and need to deeply internalize the concept of error bars, and how errors relate to their use-case. But so often 'the business stuff' gets left to the domain people due to organizational politics and people not wanting to get fired. I think the most successful AI orgs will be the ones that can most effectively close the gap between people who can build/manage models, and the people who understand the problem space. Treating AI/ML tools as simple plug and play solutions I think will lead to lots of expensive failures.
Except there is no winning move as an IC in pushing back against a half baked product definition that lacks business rigor. I pushed back in my org against features whose unit economics didn't add up, and I was labeled not a team player, leading to negative professional development. One year later, the entire ML org was laid off because investors lost confidence in our ability to produce a sustainable business model. There is no fix as an IC for unsophisticated product and business leadership.
Right, I’m saying that if you’re looking at a roadmap and it’s vague then you need to give feedback and walk. Businesses are failing because of what you’re describing, hopefully the survivors have figured out more effective management strategies.
Completely agree. The vast majority of the failed projects fail at the planning stage. To put it simply, if the cost of a misprediction is high then it is usually not going to work because all models make mispredictions. This is something that can be identified in the planning stage i.e. it's usually completely avoidable. Also, of the 20% of projects that succeed, I wonder how many of them actually needed ML.
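To make the misprediction-cost point concrete, here is a minimal back-of-the-envelope sketch. The accuracy, benefit and cost figures are made-up assumptions for illustration, not numbers from the report, and the function names are hypothetical.

```python
def expected_value_per_prediction(accuracy, benefit_correct, cost_wrong):
    """Expected net value of acting on a single model prediction."""
    return accuracy * benefit_correct - (1.0 - accuracy) * cost_wrong


def break_even_accuracy(benefit_correct, cost_wrong):
    """Accuracy at which the expected value per prediction is exactly zero."""
    return cost_wrong / (benefit_correct + cost_wrong)


# Assumed numbers: a correct prediction is worth 1 unit in both scenarios.
cheap = expected_value_per_prediction(0.95, benefit_correct=1.0, cost_wrong=2.0)
costly = expected_value_per_prediction(0.95, benefit_correct=1.0, cost_wrong=100.0)

print(f"cheap mistakes:  {cheap:+.2f} per prediction")
print(f"costly mistakes: {costly:+.2f} per prediction")
print(f"break-even accuracy when a mistake costs 100: {break_even_accuracy(1.0, 100.0):.4f}")
```

With these assumed numbers the same 95% model is comfortably positive when mistakes are cheap and clearly negative when they are expensive, which is the kind of check that can be done on paper at the planning stage.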
> The vast majority of the failed projects fail at the planning stage.
This is my experience in 20 years of SWE as well.
The biggest project failure debacles were obvious from the get-go to all the senior ICs on the team. Generally managers were pushing them for "reasons", and in some cases even wink-wink about the fact they too didn't believe in the project but.. "reasons".
This is my experience too. I find that most "regular" people think 95% is basically 100%, and not "it fails 1 out of 20 times consistently". Even 99% isn't good enough in many cases and should not be considered infallible.
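A rough sketch of why 95% is not "basically 100%": the chance of hitting at least one failure grows quickly with the number of uses. This assumes each use fails independently with the same probability, which is a simplification, and the function name is just for illustration.

```python
def p_at_least_one_failure(accuracy, n_uses):
    """Probability of one or more failures in n_uses independent uses."""
    return 1.0 - accuracy ** n_uses


for accuracy in (0.95, 0.99):
    for n_uses in (1, 20, 100):
        p = p_at_least_one_failure(accuracy, n_uses)
        print(f"accuracy={accuracy:.2f}, uses={n_uses:>3}: P(at least one failure)={p:.3f}")
```

Under that independence assumption, a 95% accurate system used 20 times fails at least once in roughly two out of three runs.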
It also applies when a misprediction is cheap but still makes the whole system unreliable. E.g. my company tried (too late) to ride the hype and created a half-baked tool for internal use to classify test results. Since the accuracy of neural networks is never 100%, the tool simply doesn't save any time: all test results need to be verified manually anyway. But several people are busy full time working on it, reporting some results, some amazing performance metrics and so on. And they are very visibly upset when we push back or plainly say that the tool is worthless. :)
The problem is not if the 80% of them fail, but if of the remaining 20% you get several black swans that make profit skyrocket for the whole investment set.
The problem is if none of them, even the surviving ones, are worth much anyway. In that case those billions would have been wasted. But if you invested everything in just one player, and that player failed, then your whole bet failed.
Sufficiently large companies should have a healthy “internal portfolio” of projects as well. Some safe bets/easy wins, some “this probably won’t work but if it does it’s game changing” projects as well. If they want to continue to exist in a competitive market they need to strike gold occasionally (which means hitting rock a lot of the time as well).
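The portfolio argument can be sketched with two small formulas. This is a toy model under loud assumptions: every project succeeds independently with the same probability, every project costs the same, and a success pays back a fixed multiple of its cost. The 20% figure is the success rate discussed in the thread; everything else is hypothetical.

```python
def break_even_multiple(p_success):
    """Payoff multiple a successful project needs so the portfolio breaks even on average."""
    return 1.0 / p_success


def p_total_loss(p_success, n_projects):
    """Probability that every one of n independent projects fails."""
    return (1.0 - p_success) ** n_projects


p = 0.2  # assumed per-project success rate
print(f"winners must return at least {break_even_multiple(p):.1f}x their cost to break even")
for n in (1, 5, 10):
    print(f"{n:>2} project(s): P(all fail) = {p_total_loss(p, n):.3f}")
```

In this toy model a single bet is wiped out 80% of the time, while ten independent bets all fail only about 11% of the time, but the portfolio is still only worth it if the winners pay back several times their cost.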
I would guess that the failure rate is worse for AI startups in general because of lower capital and experience. These failing experiments can only shell out billions for Nvidia’s shovels for so long before they have to start selling their shirts for lunch.
I guess that perspective depends if you are a VC or a company trying to apply AI.
The VC expects to lose most of the time and hit it out of the park once in a while, for a large net gain.
For companies trying to automate or increase productivity with AI, there are unlikely to be any massively profitable winners that will make up for the failures, so too many failures is going to hurt.
I'm not sure the second statement is true - there are plenty of sectors I can think of which could be massive winners that would make up for the failures.
Obvious examples are things like AI-assisted language translation for books if you are a publisher, implementation of cancer screening technologies if you are a healthcare provider, AI-assisted drug discovery if you are a pharma firm, AI-assisted discovery if you are a legal provider, AI trading algo's if you are a hedge fund...
All of these could result in some pretty profitable winners. It's also the sort of thing that can result in 'losers' if you call it wrong - e.g. if there is a 20% chance of success and I don't invest and my competitors do, what happens if I am wrong? You don't want to be the Kodak of your industry. As an example - if I am the world's biggest provider of call centres, do I really want to bet big that my industry will never be automated, or do I want to start investing now to prepare for that as a possibility?
The 20% chance of success isn't typically 20% chance of a moonshot paying off, but rather of a project being successful and achieving its cost saving/productivity and user acceptance goals.
I don't think your Kodak example works very well here - they were a classic business school case of not realizing what business they were really in, but most uses of AI (whether one means LLMs, or something else like most of your examples) are going to be automation/productivity enhancement, not "pivotal" changes.
Pretty sure this is about AI projects, i.e. potentially career-ending failures, not failed AI companies which in a VC context have a high expected rate of failure.
Anecdotally, I've never seen a project be a career ender. Most individuals have many projects under their belt, it's rare that an individual has bet the farm on an effort in such a way that others would not employ them.
When a project fails, the lessons are often valuable for the next project - when a project succeeds it can often just be due to market position.
That's bad management if it's seen as career ending failure. Proper management just does the ROI on the successful ones versus overall cost. And doesn't risk everything on them. Large tech companies test hundreds of model changes and rollouts per year in AB tests for that reason.
It all depends if the rate of black swan is too low to bother with. There is probably a point where spending vc money on lottery tickets starts looking like the more pragmatic investment.
Actually an additional point: if you read the report, the 80% failure rate claim does not come directly from this report, but links to the following article as its source:
https://fortune.com/2022/07/26/a-i-success-business-sense-ai...
(https://web.archive.org/web/20240105144440/https://fortune.c...)
Which actually further quotes ostensible sources without linking:
"That’s borne out in a slew of recent surveys, where business leaders have put the failure rate of A.I. projects at between 83% and 92%."
Unless someone is able to track down the original source of this information I would treat it with great scepticism.
> First, industry stakeholders often misunderstand — or miscommunicate — what problem needs to be solved using AI.
So the #1 problem every startup faces.
> Second, many AI projects fail because the organization lacks the necessary data to adequately train an effective AI model.
This is interesting, and reinforces the trend towards hoarding data assets.
> Third, in some cases, AI projects fail because the organization focuses more on using the latest and greatest technology than on solving real problems for their intended users.
Makes sense, tough to pick an architecture or model to stick with when better options release weekly.
> Fourth, organizations might not have adequate infrastructure to manage their data and deploy completed AI models, which increases the likelihood of project failure.
Sounds like lack of capital.
> Finally, in some cases, AI projects fail because the technology is applied to problems that are too difficult for AI to solve.
So only a minority of cases? All in all this report seems to be saying "AI is promising but startups are still hard".
>> Second, many AI projects fail because the organization lacks the necessary data to adequately train an effective AI model.
> This is interesting, and reinforces the trend towards hoarding data assets.
Companies completely misunderstand where the data is supposed to come from and attempt to hoard user data. The issue with LLMs is that the problems they are currently best suited for require data generated by the company: manuals, decision trees, guides, tutorials, expert knowledge in general. Companies aren't producing that material, because it's expensive. Also if that data existed, then maybe they wouldn't need an LLM.
Tons of LLM implementations are poor attempts to cover up issues with internal processes, lack of tooling and lack of documentation (without which the LLM can't function).
> > Fourth, organizations might not have adequate infrastructure to manage their data and deploy completed AI models, which increases the likelihood of project failure.
> Sounds like lack of capital.
I actually think this is also an engineering problem, or at least a 'human capital' issue. The skillset for developing an AI model and the skillset for deploying a massive data-based product are highly different, but people who are good at the former often get press-ganged into doing the latter. This is kind of a capital problem (more money means maybe they can hire a second person to manage the operations), but I think it's also just a general lack of awareness that MLOps is really its own thing. Especially when you're moving fast, tech-debt with these systems builds up really quickly (shockingly quickly). More money lets you hide these problems better, but IMO the solution is only going to come with time as people develop better and better best-practices for this type of project.
edit: There's a section in the full report called 'Too Few Data Engineers' that does a better job making this point. Everybody wants to make fancy AI models, nobody wants to be responsible for the 10K lines of uncommented Python and SQL you're using to build your test/train sets
>Everybody wants to make fancy AI models, nobody wants to be responsible for the 10K lines of uncommented Python and SQL you're using to build your test/train sets
I'm unfortunately the guy that gets dumped on, and it's the most hated part of my job. I've tried talking to the people who authored such atrocities, but they refuse to acknowledge that it's bad code, have huge egos about it, and see even slam-dunk tools like a linter as an impediment to their work.
Is anyone else finding their company is asking teams to “insert ai everywhere any way you can”?
That’s a sign of a problem imho. The hype is so high the directives are to use ai everywhere regardless of fit. I’m a believer of ai but shoehorning it into everything as that currently boosts stock prices seems insane.
* blocking every known LLM url due to fear of leaking information to it
* not wanting to hire expensive data scientists for any in house development
I even asked an Engineering Manager at Meta how much their own team use Llama day to day to multiply their productivity. Their answer was they don't use it at all, and they weren't aware of any internal tooling to utilize it for work
This kind of fits the narrative of some of the Mag7 earnings calls where they more or less say "we aren't sure where the revenue will ever come from.. but it's a game theory style arms race where we can't afford to NOT be there if someone figures out how to make revenue in the space".
So the big guys are buying GPUs, building out datacenter, developing & training models, etc.. just in case.
Maybe LLMs will change some niche dramatically, maybe it will reshape society, or maybe nothing.
More prior revolutionary developments end up like crypto, voice assistants, IoT, smart homes than the number that end up like smartphones, web, or the PC.
I think the only case I can think of where AI will revolutionize positively is self driving cars. Revolutionizing transport will have huge implications. The next thing would be robots, but that's just making people lazy in the household and replacing jobs. Crypto, voice assistants, IoT, smart homes are bad examples, as these still have a great chance to grow more. They will probably replace smartphones as smartphones did with PCs.
I semi-purposely left out self driving cars, and the revolution it has/is/will provide again.. remains to be seen. Waymo is nearly a 20 year old project at this point and is seemingly quite great in 2 cities, serving ~1% of the US population. These cities also happen to be in warm climates so there's a whole slew of environments / "edge cases" they just don't have to deal with. Maybe 20 more years?
So it's both outperformed what pessimists might have said (never work) and vastly underperformed what the median enthusiast projected (it's always just a few years away). I'd wager we are still teaching teens how to drive even 20 years from now.
LLMs are ~7 years old, so maybe another decade to being useful if we go by self driving cars learning rate?
Meanwhile at Google, 50% of code characters are from LLM autocomplete: https://research.google/blog/ai-in-software-engineering-at-g... Which is a little disconcerting. Maybe need to up my code review game. Also I don't personally use them at all - am I really missing out? Sometimes I wonder.
Yes, I was on an internal project recently that wanted to use LLMs in a way that was appropriate to evaluate if changes between two versions of a text were semantically meaningful, and limited to that scope, it would've been a really valuable tool.
We had a directive from management to, for political reasons, use AI in the tool as much as possible to show how innovative and forward-thinking the company is. This led to a bunch of poorly-thought-out choices and while the project is in production and has internal users... I don't think it was particularly successful.
Not all of that is due to the "use AI" directive; there were also poor technology and deployment stack choices that made things overly complicated and cost us a bunch of time.
Common issue when new tech comes out. The people who know the tech, but not their company's business, focus on the tech. Many of them will get promotions, and make their way up in the company. The company will likely lose millions, and the guy will leave to damage another company. If the company is lucky, another person who takes the time to understand the company's customers will come in... throw away the stuff their predecessor built, and solve some real problems. If the company is unlucky, they will double down on the over-complicated solutions, and lose to a startup that ignored the sexy stuff and focused on the customer problem.
I work at a financial services company that is quite behind their peers tech-wise and watching the internal politics of AI has been fascinating.
Management seems to see this as their opportunity to catch up on the cheap.
Rather than having modern tech systems and properly staffed engineering department, let's just uh.. have non-technical people do AI hackathons! Also instead of automating excel jockey jobs with server side data pipelines, what if we.. you guessed it.. gave the excel jockeys AI!
I can imagine this playing out in a lot of industries where the underdogs think it's a shortcut and yet..
This is happening at my company. Thankfully I'm senior enough to push back on most of the requests to add AI as I still haven't found a good use case for it in our product.
"There's a new technology out there that makes new things possible, let's explore whether it makes sense to integrate it into what we're doing?" - not only is that the correct attitude, in the long run it's the only attitude that keeps companies alive assuming they have exposure to tech.
See the internet/web revolution, the mobile revolution, etc.
I recently had the opposite, where the CEO of the startup was killing almost all ideas of adding AI to their products. Perhaps AI is just polarizing. Some companies are jumping at the opportunity. But older startups like the one I was working for are being ultra-conservative with spend and are maintaining a wait-and-see attitude.
that's expected behavior on any new wave right? I've seen the same with microservices, ORMs, SPAs, etc - "use this hammer any way you can!" - and with product trends (crypto, mobile apps, SoLoMo, etc). It's normal. Companies live and die on how well they surf hype cycles.
Yes. No one learned any lessons the last time around with "put everything on the blockchain".
Or maybe they did learn you can make a profit off of hype alone, but it's not making the end user or anyone else's life better as a result. Who cares - line goes up, people get promotions.
Isn't that better than normal? It used to be 90% of startups going belly up within 3 years, and out of remaining 10%, 9% becoming zombies and 1% having a proper exit? This looks more like Pareto 80/20 which is way better.
It's about in-house development on large companies...
So, that would make it about 2x worse than normal. Which, IMO, sounds way too good to be true.
(But then, I've seen AI projects being determined complete successes by having the same kind of result that would be considered failures on a normal product: being complete, but nobody using them.)
A few percentage points? Should mean nothing for vcs as they look to 10x profit. This is the time to go heavy with investing. Turn the stones others are afraid to turn because of spooky single digit percent interest rates. More likely to find your 10x now than when money was cheaper.
“DART achieved logistical solutions that surprised many military planners. Introduced in 1991, DART had by 1995 offset the monetary equivalent of all funds DARPA had channeled into AI research for the previous 30 years combined.”
I work on a team doing some shitty AI feature, and as far as I can tell the only reason it's still alive is because our C-level has overdosed on the kool-aid and are adamant that they can squeeze blood out of the AI stone. Pretty much everyone in engineering is telling them it's a monumental waste of time, effort & money (especially money, our AI tooling/provider bills are astronomical compared to everything else we pay for), but to them the word "AI" holds so much power that they just can't resist sinking further and further resources into it.
It's really reinforced in me the knowledge that most execs are completely clueless and only chase trends that other execs in their circles chase without ever reflecting on it on their own.
Currently hugged so I can't read the article, but I can only wonder how this compares to the batting average of any given R&D effort. 20% of projects succeeding on a cutting edge technology might be pretty good, no?
And in a hype cycle many many more projects get off the ground that normally, outside of a hype cycle, wouldn't have ever received the requisite funding.
But the man in black leather said that people don’t need to learn to code because AI will now do all the coding. Who should we believe?
Also, it is funny seeing how all the AI true believers in this thread are coping. I am going to go short Nvidia after its earnings, whatever the results. It is such an obvious trade.
It is impossible to overstate how risky options are in this situation.
The implied volatility for NVDA is astronomical. Put differently, the OP isn't the first one with the idea to short them, so this incredible demand for options drives the premiums up substantially.
That stack of (temporary) paper you are paying for is likely way more expensive than you think it is.
Shorts I think are only useful if there is an imminent and concrete devaluation event such as defaulting on credit etc. A general predicted market downturn is hard to tie to stock behavior, especially within a concrete time frame.
Coding is one area where AI has been successful. This is about other areas where AI has been unnecessarily inserted. You're misdirecting your Schadenfreude.
> Coding is one area where AI has been successful.
I have my doubts about this. Do you have actual good data for this? Most devs including streamers like primeagen and others seem to think Devin is a joke.
Sure, but if AI was actually useless for coding I doubt we would see numbers like this in the Stack Overflow Developer Survey: https://survey.stackoverflow.co/2024/ai/
These data points are not a good indicator of AI being useful because we don’t have a baseline to compare it to.
The baseline is the same IDE tooling support we see with copilot but with more traditional tech like search engines and API docs search. Dev productivity without relying on AI is a market few cared about until AI money came along looking for problems to solve.
I can only speak for myself but AI has had a profound effect on my coding. I really can't imagine going back and search engines and API docs are not even in the same ballpark. I wrote a bit on why here: https://news.ycombinator.com/item?id=41350824
I wasn't sure if I would but I agree with the points you made. I'm generally not for using AI but there are places it works great and as you said, stuff I don't want to do because it's tedious and mindless, well the AI is great at that...
One use I had today... Make all the fields on this record nullable and add these attributes to them. Done in seconds what would really have been at least 30 minutes of work. It's mindless and tedious but I didn't need to do it...
I don't think I would trust the AI to catch edge cases I didn't think about, but I get the feeling for you it's more that dealing with them would be super tedious and they are truly rare. Like an http server unable to create a socket... It's an edge case but realistically let it crash, well the AI can write some more graceful exit with logging I guess...
NVIDIA may be overvalued, but they can probably "grow into" this valuation with ongoing industry adoption and inference volume, even if LLMs don't get a whole bunch more capable than they currently are. Google now selling corporate Gemini annual licenses for $200/pop to help write e-mails and marketing drivel, etc.
How many signups is Google getting for that? Will it sustain their capital expenditure and cover the depreciation? All server equipment comes with an expiration date. Based on my experience, all the LLMs have an initial cool factor that wears off pretty quickly. The productivity boost is questionable at best and downright negative in many cases. This is true in many AI projects. Just look at the Computer Vision geniuses that gave us Amazon Go Indian Mechanical Turks.
> All server equipment comes with an expiration date
Corporate mindset. When I worked at a major cloud provider the platforms group evaluated the ongoing economic cost/benefit of existing machines and every year the decision to retire ancient machines was "not yet". A suit-wearing corporate IT guy gets new hardware every 3 years. GCP will still rent you a Sandy Bridge machine from 2011. EC2 still has Haswell CPUs in its mainstream offering and if you really want them they have Harpertown Xeon from 2008 available.
I also question the actual productivity benefit, but I'm not sure most large corporations are so rational in their decision making. Once they've bought Gemini and CoPilot licenses for whole swathes of the business (something which it seems we're still at the beginning stages of), then how likely are they to reevaluate and go back to the old "manual" way of doing things? It's a bit like the "no-brainer" decision to outsource developer jobs to India, based on lower salaries, and then subsequent failure to measure productivity to see if it is really paying off.
I work with Indian developers. They are usually less motivated and laidback compared to Americans. The famous Indian polychronic time culture is real. Bad developers regardless of geographic location can create garbage and the more motivated they are the more garbage they generate. So, good developers is best, followed by unmotivated bad developers under the supervision of good developers.
From the report 80% is "twice the rate of failure for information technology projects that do not involve AI" (https://www.rand.org/pubs/research_reports/RRA2680-1.html) so seems that a 20% hit-rate isn't actually that great. Possibly it's a quirk of how they're normalizing the success/failure count?
Most of those non-AI projects are likely using well-established practices and technologies. From that perspective a 20% success rate seems pretty good to me.
Sure that makes sense, overall "success of a technology product" seems like a very fuzzy thing to try and measure so I imagine one could spin the numbers basically however you want
I think that is the question, how much legitimate R&D is really going on here vs trying to shove an LLM into some random hardware and ship it?
Or shove an LLM in some app trying to solve a problem that it isn't capable of.
No doubt creating these models is hard, and having the data is hard. But how many of the AI startups are actually doing that vs just shoving the OpenAI API into something?
It's not a hit rate I think, it's a not-yet-failed rate. A lot of these were started just recently. And a lot are deep in the red but supported by the VC IV line. And by supported I mean totally. If a Word competitor needs to cut costs it theoretically can scale down and survive. If a neural network startup needs to cut costs it has to shut down operations, due to the incredible price of training models and running inference.
I think we’d need to dig into the 80% that failed and what kind of “AI project” they were. Is this really R&D? Or were you trying to insert AI into something that didn’t need it, and failed because no one is using your expensive and annoying “chat with us” popup that your VP insisted would keep your company competitive?
There's so much LLM shovelware getting spammed here daily that I have a hard time believing 20% of all projects are succeeding. Are we even far enough into the LLM era though for bad projects to have run out of their borrowed time?
This seems like a very strong selling point for B2B AI providers versus in-house enterprise builds of AI.
> First, industry stakeholders often misunderstand — or miscommunicate — what problem needs to be solved using AI.
The provider at least partially validates that this is a problem space that AI can improve which lowers the risk for the enterprise client.
> Second, many AI projects fail because the organization lacks the necessary data to adequately train an effective AI model.
The provider leverages its own proprietary data and/or pre-trained models which lowers the risk for the enterprise client. They also have the cross-client knowledge to best leverage and verify client data.
> Third, in some cases, AI projects fail because the organization focuses more on using the latest and greatest technology than on solving real problems for their intended users.
Provider, especially startups, will lie about using the latest tech while doing something boring under the hood. This, amusingly, mitigates this risk.
> Fourth, organizations might not have adequate infrastructure to manage their data and deploy completed AI models, which increases the likelihood of project failure.
The provider manages this unless it's on-prem although in the latter it can provide support on deployments.
> Finally, in some cases, AI projects fail because the technology is applied to problems that are too difficult for AI to solve.
Still a risk, but a VC or big tech budget covers that, so another win.
Timing will play a role in how that number shakes out. AI projects often stacked huge investments and may not yet have had enough time to burn through all the cash or have it clawed back by investors.
> 1 For this project, we focused on the machine learning (ML) branch of AI because that is the technology underpinning most business applications of AI today. This includes AI models trained using supervised learning, unsupervised learning, or reinforcement learning approaches and large language models (LLMs). Projects that simply used pretrained LLMs (sometimes known as prompt engineering) but did not attempt to train or customize their own were not included in the scope of this work.
buried in a footnote. i wasn't sure what "ai project" actually meant
I also wonder what the failure rate would be if it actually included "things that use an LLM as an API".
In my experience, hiring managers worry way too much about finding research engineers with deep math skills when at the end of the day they need software folks to operationalize simple maybe slightly fine-tuned foundation models.
When it comes to truly novel things, 20% of even modest success is a very high number. I worked in research heavy places (industry labs) over the last decade and if 90% of things you try do not fail, your work is not ambitious enough. That is a very hard thing for a SWE to live with, but such is the price of progress. The remaining 10% tend to make it worthwhile. 20% is twice that. It needs to go lower still - you’re not going to succeed by just finetuning yet another llama variant.
>> By some estimates, more than 80 percent of AI projects fail — twice the rate of failure for information technology projects that do not involve AI.
So 40% of projects with more proven/experienced technologies fail? That's super high. Replace "AI" with any other project "type" in the root causes and sounds about right. So this feels more of a commentary on corporate "waste" in general than AI.
80% seems far too optimistic. From what I know of projects and development I would think upwards of 90% of all software projects are never shipped. Maybe 95%. Even higher would not surprise me. Maybe this is considered pre-crash or pre-burn by them.
Maybe "80% of projects that get publicly acknowledged and are expected to be successful" crash and burn. It must be so much higher.
Most of these projects are too similar in nature to succeed to begin with... Everyone is out to create yet another text chat bot or an image maker and then slap Google Authentication on it and tricks to get people to enroll into a monthly $ubscription... Few are out to be visionary and make products that can be sold to companies that will integrate the tools into their apps
This movie is familiar…most of this summary in the Rand report applies/applied to any overhyped new technology. E.g., try substituting “NoSQL” for “AI” and see how well most of it reads.
It says the problem is that management's understanding is failing, but it's more that they do not listen to their technical staff. Instead they read the equivalent of Cool Stuff Magazine, know their investors read it too, and know that the next board meeting is going to revolve around asking them what their grand strategy for AI is.
It doesn't matter that they don't have one. Frankly, most of their data projects fail anyway, and you just need one article published about your new vision to sell it to your investor class for another six months.
I bet a large part of that was projects decided by management, with unrealistic goals and some external interested party stating that they were possible...
If you only have to explore five time-bound AI* projects to discover one that eradicates recurring costs of toil indefinitely, arguably you should be doing all of them you can.
* Nota bene: I'm not using AI as a buzzword for ML, which the article might be doing. In my book, a failed ML project is just a failed big data / big stats project. I'm using AI as a placeholder for when a machine can take over a thing they needed a person for.
This is answered. They only looked at projects that actually implement machine learning etc., and they did not look at projects that use ready-to-use models (the so-called prompt engineering projects).
Site has a database error, so can't read the report. But here's what it probably says. "80% of AI projects don't solve a critical customer need, and find themselves with low usage/sales, and eventually run out of runway"
It's the same reason most businesses fail. Sell something people want, and people will buy it. Sell something people don't care about, even if it's powered by cool tech, people still won't buy it.
It probably also says something about the high cost of AI... but frankly if you're providing enough value to the customer, you can up your prices to compensate. If your value is too low (ie: not selling something people want) people won't pay it.
Yeah, as predicted. As a film guy I told my totally hyped colleagues a few years ago that 3D films are not going to stick in the way they expected. When Bitcoin and cryptocurrencies started to become the next big thing, I was the only person in my circles who had actually tried purchasing something with it in a real-world setting, years prior. When LLMs became The Shit, I warned against overblown expectations, as I had some intuition about their limitations stemming from my own machine learning experiences.
And the only reason I was right all these times was because I looked at the technology and the technology did not remotely convince me.
Don't get me wrong, stereoscopic films (or 3D, as they called it) are impressive in terms of technology. But the effect within movies doesn't add much. The little distance that remains when people look at a screen instead of being in a world is something many people need. 3D changes that distance, which is not something everybody enjoys.
To be honest, I am really frustrated with what is happening: the hype train killing something which in principle could be a good thing, as usual. In the second half of the 2010s, it was blockchain: Payments - blockchain is the solution. Logistics - blockchain. World hunger - blockchain. Cure for cancer - blockchain. 75% of all job offers from startups were blockchain-related, and admittedly, I worked at such a startup, which, from what I'm able to gather, is a few months away from total collapse.
With vision models in the late 2010s, I was seeing AI winter 2.0 just around the corner - it felt like this was the best we could come up with. GANs were, to a very large degree, a party trick (and frankly, they still are).
LLMs changed that. And now everyone is shoving AI assistants down our throats, and people are trying to solve the exact same problems they were before, except now it's not blockchain but AI. To be clear: I was never on board with blockchain. AI - I can get behind it in some scenarios, and frankly, I use it every now and then. Startups and founders are very well aware that most startups and founders fail. But most commonly, they fail to acknowledge that the likelihood of them being part of the failing chunk is astronomically high.
Check this: a year and a half after ChatGPT came about and a number of very good open-source LLMs emerged, everyone and their dog has come up with some AI product (90% of the time it's an assistant). An assistant which, at large, is not very good. In addition, most of those are just frontends to ChatGPT. How do I know? Glad you asked - I've also been very critical of the modern-day web since people have been doing everything they can to outsource everything to the client. The number of times I've seen "id": "gpt-3.5-turbo" in the developer tools is astronomical.
Here's the simple truth: writing the code to train an AI model is not wildly difficult with all the documentation and resources you can get for free. The problems are:
Finding a shit load of data (and good data), which is becoming increasingly difficult and borderline impossible - everyone is fencing their sites, services, and APIs - APIs which were completely free 2 years ago will set you back tens of thousands for even basic data.
As I said, the code you need to write is not something out of reach. Training it, on the other hand, is borderline impossible. Simply because it costs A LOT. Take Phi-3, which is a model you can easily run on a decent consumer-grade GPU. And even if you are aiming a bit higher, you can get something like a V100 on eBay for very little. But if you open up the documentation, you will see that in order to train it, Microsoft used 512x H100s. Even renting them will set you back millions, and you can't be too sure how well you would be able to pull it off.
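For a rough sense of scale, here is a back-of-the-envelope sketch of the rental arithmetic. Only the 512 GPU count comes from the comment above; the run length, the hourly rate, and the number of runs are hypothetical placeholders, and the function name is made up for illustration. The total swings a lot with how many attempts you need before one works.

```python
def rental_cost_usd(gpu_count: int, days: float,
                    usd_per_gpu_hour: float, runs: int = 1) -> float:
    """Rental cost for `runs` training runs of `days` each (hypothetical helper)."""
    gpu_hours = gpu_count * days * 24 * runs
    return gpu_hours * usd_per_gpu_hour


if __name__ == "__main__":
    # 512 is the GPU count quoted above; every other number is an assumption.
    for days in (7, 30):
        for rate in (2.0, 4.0):
            cost = rental_cost_usd(512, days, rate)
            print(f"{days} days at ${rate:.2f}/GPU-hour: ${cost:,.0f}")
```

A single run is only part of the bill: failed attempts, hyperparameter sweeps, and data preparation all multiply it, which is what the `runs` parameter is meant to stand in for.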
So in the grand scheme of things, what is happening now is the corporate equivalent of pump-and-dump. It's not even fake it till you make it. The big question on my mind is what will happen to the thousands of companies that have received substantial investments and have delivered a product, only for it to crash the second OpenAI stops working. And not even so much the companies, but the people behind these companies. As a friend once said, "If you owe 1M to the bank, you have a problem. If you owe 1B to the bank, the bank has a problem." In the context of startup investments, you are probably closer to 1M than 1B. Then again, investors commonly put their eggs in different baskets, but in the current situation all baskets are pretty risky, and the safe baskets are pretty full.
We are already seeing tons of failed products that have burned through astronomical amounts of cash. I am a believer in AI as an enhancement tool (not for productivity, not for solving problems, but just as an enhancement to your stack of tools). What I do fear is that sooner or later, people will start getting disappointed and frustrated with the lack of results, and before you know it, just the acronym "AI" will make everyone roll their eyes when they hear it. Examples: "www", "SEO", "online ads", "apps", "cloud", "blockchain".