Nvidia's CUDA Monopoly (matt-rickard.com)
105 points by ingve on Aug 7, 2023 | 120 comments



Nvidia's CUDA monopoly will end itself.

Nvidia is pushing upmarket, focusing on data center products it can charge huge amounts for.

As Nvidia pushes upmarket, the traditional computing market forces will come into play. These forces have played out for 50 years.

Users will seek cheap and available GPUs and they’ll find a way to get the job done with them.

At the moment, people say they can’t use retail GPUs because of RAM constraints.

This will change, however. Through necessity, software will be developed that gets the job done on consumer-grade GPUs. It might be open source. It might be AMD or Intel software.

This has always been the way with computers… innovation at the low end will foil the plans of IBM to control the entire market. Oops, did I say IBM? I meant Nvidia.

Nvidia has the chance to own everything long term, but monopolists can't help but become greedy and become their own worst enemy. Nvidia is milking its customers for huge, huge profits. The customers will find another way, and thus Nvidia will have indirectly created its true competition.

If Intel and AMD want to defeat Nvidia they need to not play Nvidia's game of going for the high end and turning its nose up at the low end. AMD and Intel need to produce the lowest cost, most powerful GPUs they can. They also need to attack Nvidia where it hurts. Nvidia has artificial constraints on its GPUs. Through drivers it prevents certain uses so that customers are forced to use high end data center GPUs. Wherever Nvidia has artificial constraints, AMD and Intel need to NOT have those constraints.

Where Nvidia is closed source, Intel and AMD need to be open source.

Nvidia's dominance won't end, but its monopoly will, and viable competition will form simply because Nvidia is so anti-consumer, and in the computing game consumers are very resourceful.

What’s needed is a bunch of smart programmers who know how to squeeze usable AI out of a 24 gigabyte AMD 7900 XTX.


Given the current situation, this is wishful thinking/a long way off. Nvidia is considerably better at letting people get the job done on consumer grade GPUs than AMD.


Given the amounts of money and attention being brought to bear? It'll be 12-24 months or less.

Tonight I'm going to play around with LLMs. I'm probably going to go to bed early because my AMD graphics card running ROCm (on an unsupported system and a probably unsupported graphics card) will eventually cause a kernel panic.

In some sense that is unforgivable. In other senses, they are only 1-2 bugs away from being perfectly good enough and competitive with Nvidia for me. That isn't much of a moat. It usually takes an hour or so for the drivers to collapse and for those hours everything works great.


> It'll be 12-24 months or less.

AMD also said this 24 months ago.


I think I've been hearing that ATI/AMD drivers are about to catch up with nVidia for 20 years.


You, for many definitions of "you", will never work on a supercomputer, despite the fact that today's smartphones are faster than a Cray-1 of yesteryear. Similarly, ATI/AMD doesn't need to take the absolute performance crown from Nvidia, the way AMD did several times against Intel, for you to use AMD solutions.

We'll see how smart the market and the monopolist are this time around, yes, in the next 12-24 months... and maybe longer.


Well, most of the new DoE machines are MI300 based so AMD seems to be doing pretty well even at the supercomputer level :)


Good point. I forgot about the broken record that was gaming drivers.

I was specifically referring to AMD's investments in a unified computing stack (their equivalent of CUDA), which were supposed to bring them to computational parity in AI/ML.


“AI Developers Are Buying Gaming GPUs in Bulk”

https://www.extremetech.com/computing/here-we-go-again-ai-de...

Retail gaming GPUs are available and they’re MUCH cheaper than cloud GPU.

It’s happening now.

AI software needs to make a choice too: does it only run on Nvidia, or does it make itself compatible with the mass market?

For 50 years software developers have chosen to make their software work on cheap mass market devices.


But retail gaming GPUs are overwhelmingly Nvidia? https://store.steampowered.com/hwsurvey/videocard/


Those overwhelmingly cheap Nvidia gaming GPUs got purchased for gaming use - and they usually have low tensor compute capabilities and low VRAM amounts, so they're less desirable even for post-training quantization of generative large language models.


Nvidia imposes constraints on its drivers so you cannot really make full use of them and must go for high end data centre devices.


It's true that a lot of ML development has taken place on gaming GPUs, over the past decade and a half. And you can certainly get a long way with gaming GPUs in a lot of areas of ML - 8GB is enough to run things like stable diffusion, just about.

However, some ML models (such as LLMs) demand a lot of vram to train. Simultaneously, a lot of gamers have been unimpressed by the latest generations of gaming GPUs due to the limited vram.
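
For a rough sense of scale: the commonly cited rule of thumb for Adam-based mixed-precision training is on the order of 16 bytes of GPU memory per parameter (fp16 weights and gradients plus fp32 optimizer state), before activations. A back-of-envelope sketch - the 16 bytes/parameter figure is an assumed rule of thumb, not an exact number for any particular framework:

    # Rough VRAM needed just for model state during Adam mixed-precision
    # training: ~16 bytes/param (fp16 weights + fp16 grads + fp32 Adam
    # moments + fp32 master weights). Activations come on top of this.
    def training_vram_gb(n_params, bytes_per_param=16):
        return n_params * bytes_per_param / 1e9

    for billions in (1, 7, 13):
        print(f"{billions}B params: ~{training_vram_gb(billions * 1e9):.0f} GB")
    # 1B: ~16 GB, 7B: ~112 GB, 13B: ~208 GB -- far beyond an 8 GB gaming
    # card even before activations, which is why training lands on data
    # centre parts (or many of them).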

nvidia's GTX-1070 released in 2016 with 8 GB of vram and an MSRP of $379. Then the RTX-2060 Super, with 8 GB of vram and an MSRP of $399. Then the RTX-3060 Ti, with 8 GB of vram and an MSRP of $399. Then the RTX-4060 Ti with - you guessed it - 8 GB of vram and an MSRP of $399.

Some people think nvidia is deliberately being miserly with vram on gaming cards, to try and force ML users onto the $$$$$ data centre GPUs.

I suspect this is what andrewstuart means by "Nvidia is pushing upmarket" - that they're deliberately letting their gaming products languish, in pursuit of the data centre market.


> 8GB is enough to run things like stable diffusion, just about.

4GB (from personal experience, 2GB from what I’ve heard) is enough to run stable diffusion 1.x/2.x with the major current consumer UIs and the optimizations they make under the hood. Not sure about SDXL. The original first-party inferencing code takes more.
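
For anyone curious what those under-the-hood optimizations look like, here's a minimal sketch using the Hugging Face diffusers library. The model id and the exact savings are assumptions for illustration; the popular UIs use their own pipelines but apply roughly the same ideas (fp16, attention slicing, CPU offload):

    import torch
    from diffusers import StableDiffusionPipeline

    # Half precision roughly halves the memory footprint of the weights.
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    )
    pipe.enable_attention_slicing()       # compute attention in slices -> lower peak VRAM
    pipe.enable_sequential_cpu_offload()  # stream submodules to the GPU only when needed

    image = pipe("a watercolor fox in a forest", num_inference_steps=25).images[0]
    image.save("fox.png")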

> Some people think nvidia is deliberately being miserly with vram on gaming cards, to try and force ML users onto the $$$$$ data centre GPUs.

And, sure, as long as there is no real competition, that kind of segmentation makes sense. OTOH, if there is competition and desktop ML demand, it will make progressively less sense. So, the question seems to be, will there be competition that matters?


> AI software needs to make a choice too: does it only run on Nvidia, or does it make itself compatible with the mass market?

Who's making that choice? It sounds like a lot of people want to make this stuff run well on iPhone and Android and AMD and Mac, but they just don't have the API control or insider help. This isn't a thing where Open Source developers step up to the task and fix everything with a magic wand. This is a situation where the entire industry converges on a compute standard, or Nvidia continues to dominate. They are betting on Apple, AMD and Intel being stuck in an eternal grudge-match, and their residuals won't stop rolling in.

The real question is much less hopeful; can the industry put aside its differences to make a net positive experience for consumers?

...probably not. None of the successful companies do, anyways.


But surely it is in the best interest of the industry as a whole to ensure that there is parity across the competitors so they aren't milked forever? i.e. they could invest time and effort into making the AMD compatibility better, and the Intel Arc equivalent.

I say this as a person who's diving right into the NVidia moat. "Well, everyone gets NVidia GPUs so that's what I should get".


Sure,

this isn't in the hands of the industry as a whole, though. The question is: does AMD think they can earn a better profit down the road than if they invest in/focus on something else (e.g. maintaining their lead over Intel on x86, continuing to be dominant in the console gaming GPU sector, etc.)?


> But surely it is in the best interest of the industry as a whole to ensure that there is parity across the competitors so they aren't milked forever?

So what? “The industry as a whole” isn’t an actor that makes decisions.


> What’s needed is a bunch of smart programmers who know how to squeeze usable AI out of a 24 gigabyte AMD 7900 XTX.

The first thing that's needed is for AMD to wake up and invest into its tooling stack.


That's the article we need

What do AMD's tools look like, and what would it take to catch up with CUDA?

and I guess it doesn't depend on AMD alone - the cloud providers trying to get alternatives to Nvidia could be investing in this too


There have been tons and tons of articles about AMD needing to wake the fuck up, but they seem to focus all their efforts on their console business - not without reason, though: both major latest-gen consoles use AMD SoCs, and that's a fairly stable market compared to the boom/bust that hit NVIDIA hard with crypto.


The reason (IMO) that AMD will never make ROCm actually comparable to CUDA is so that they don't have to worry about their gaming GPUs cannibalizing their data center GPUs. nVidia is stuck putting tiny VRAM pools onto gaming GPUs so they are less attractive as compute, and AMD doesn't want to put themselves in the same position.

Maybe at one point they wanted to beat CUDA, but they are pretty happy with the feature segmentation now.


> The reason (IMO) that AMD will never make ROCm actually comparable to CUDA is so that they don't have to worry about their gaming GPUs cannibalizing their data center GPUs.

At least by revenue, consoles are now the leader in gaming anyway [1], and the PC market is shrinking (the old dudes crowd is aging out of hi-perf gaming due to work and having kids, and the young crowd is either on hassle-free consoles or mobile). And a lot of the "gaming" GPU marketshare of the last few years went to coin miners.

If AMD wants to be known for something else than tying their future to two large companies who can always hop back to NVIDIA, they absolutely have to step up their compute game to make sure they at least have the ability to pivot, should either of their deals with Microsoft and Sony fall apart. NVIDIA is the best example, it's a miracle Soldergate didn't sink them entirely a decade ago.

[1] https://www.gamesindustry.biz/report-pc-and-console-global-g...


The Switch uses NVidia.


First Intel and AMD need to provide stable drivers; a polyglot stack to program GPUs across C, C++, Fortran, Julia, ... anything that targets PTX today; IDE and graphical debugger tooling for all GPGPU layers; and a powerful set of GPGPU libraries.

Then, maybe, most researchers will choose something other than CUDA.


I think geohot is working on that with tinygrad. Activity on the ROCm repo seems to have increased a lot recently:

https://github.com/RadeonOpenCompute/ROCm/graphs/code-freque...


Last I heard he's abandoned working with AMD products.

https://github.com/RadeonOpenCompute/ROCm/issues/2198#issuec...



He had a phone call with the CEO of AMD, and she has been redirecting teams to start fixing issues. They seem to be motivated to fix their problems.


>What’s needed is a bunch of smart programmers who know how to squeeze usable AI out of a 24 gigabyte AMD 7900 XTX.

What would be the motivation of said programmers? Who will pay them?


Presumably anyone who wants to do AI work but doesn't want to get caught up in the expense and queues associated with the Nvidia data center parts and is willing to take a risk to get it done.


Presumably the hope is, similar to Valve and Wine, there is a company that spends vast amounts of $$$'s on GPUs and so the cost of hiring a couple of engineers to fix the AMD situation isn't a large percentage, and they are able to predict massive amounts of savings.

Whether the Valve situation strikes twice though isn't guaranteed. We already see all the other PC software stores continue to crowd around Windows providing apologies for its actions. If an AI firm can see past the next quarter, it could have huge ramifications.


Heaven forbid a human do something not for monetary gain


You do know that the CEOs of Nvidia and AMD are family, right?


While I agree about the VRAM constraints, I don't think low-cost hardware is the real barrier, although it's certainly part of the equation. There needs to be a true competitor to Nvidia's AI stack, CUDA. ROCm is not supported across all of AMD's consumer-range GPUs, nor does it have feature parity with CUDA. The biggest issue is that nobody's matched the developer experience of CUDA.

I'm not sure about the state of Intel's software stack or hardware. AMD hasn't reached hardware parity either with some of the specialized execution units that are available across almost all of Nvidia's products. Perhaps that's low-hanging fruit for AMD?

I'm looking forward to the day that somebody fully upturns this market with products that are available to end consumers.


People harp on the "NVIDIA pushing upmarket" but the reality is that NVIDIA is running their lowest operating margins since ~2011 right now, and that's with a secular shift towards high-margin enterprise products being rolled into those topline numbers.

(cool bug facts: for the Q1 numbers, AMD's gaming division actually had a higher operating margin than NVIDIA as a whole including enterprise sales/etc! 17.9% vs 15.7%)

https://www.macrotrends.net/stocks/charts/NVDA/nvidia/operat...

https://ir.amd.com/news-events/press-releases/detail/1146/am...

The new generations of goods simply are that expensive to produce and support - TSMC N4 is something like 3-4x the cost per wafer of Samsung 8nm; even with the smaller die sizes and smaller memory buses, these products are actually something on the order of twice as expensive as Ampere to produce. Tapeout/validation costs have continued to soar at an equal rate, and this is simply a matter of physics, not something TSMC controls, therefore not something that can be changed by pushing profits around on a sheet.

https://www.tomshardware.com/news/tsmc-will-charge-20000-per...

https://semiengineering.com/how-much-will-that-chip-cost/

MCM doesn't really affect this either - this is about how many mm2 of silicon you use, not how well it yields. Even if you yield at 100%, 4x250mm2 chiplets is still 1000mm2 of silicon. And the cost of that silicon is increasing 50-75% for every node family you shrink. Memory and PCIe PHYs do not shrink, and cache has only shrunk modestly (~30%) at 5nm and will not shrink at 3nm at all. So at the low end you are getting an additional crunch where the design area tends to be dominated by these large fixed, unshrinkable areas.
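
To put rough numbers on that, here's a back-of-envelope sketch using an assumed ~$17k leading-edge wafer price (in the ballpark of the Tom's Hardware figure linked above) and a standard first-order dies-per-wafer approximation, at 100% yield with scribe lines and defects ignored:

    import math

    def dies_per_wafer(die_area_mm2, wafer_diameter_mm=300):
        # First-order approximation; ignores defects, scribe lines and yield.
        r = wafer_diameter_mm / 2
        return int(math.pi * r**2 / die_area_mm2
                   - math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2))

    def silicon_cost(die_area_mm2, wafer_price_usd, n_dies=1):
        return n_dies * wafer_price_usd / dies_per_wafer(die_area_mm2)

    wafer = 17_000  # assumed leading-edge wafer price in USD, for illustration
    print(f"1 x 1000mm2 monolithic: ~${silicon_cost(1000, wafer):.0f}")
    print(f"4 x  250mm2 chiplets:   ~${silicon_cost(250, wafer, n_dies=4):.0f}")
    # Both land in the same ~$300 ballpark at 100% yield; chiplets mostly
    # help with yield and reuse, not with the raw cost of the silicon area.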

This is the fundamental reason AMD has followed NVIDIA on pricing for years now with products like the 5700XT, 6800XT, Vega, Fury X, etc. TSMC N7 was already around twice as expensive per wafer as Samsung 8nm or TSMC 16FF/14FF. NVIDIA was using cheap nodes and AMD was being forced to use expensive leading-edge nodes just to remain competitive (and despite that, failing to take a commanding lead). They didn't follow out of collusion; their cost-of-goods was genuinely higher - which is also the reason AMD cut memory bus width in RDNA2, along with PCIe width and things like video encoders on certain products. They were on the leading nodes, so they took the hit to area overhead sooner, and started looking for solutions sooner. It's not oligopoly; consumers just haven't adapted to the reality of cost-of-goods going up.

AMD being willing to cut deeply on inventory at the end of the generation is one thing, but you also have to remember that the Gaming Division (RTG+consoles) is barely in the green right now and that's even with AMD signing a big deal for next-gen consoles that is more or less straight profit for them. They're losing money on RDNA2 hand-over-fist right now, this is a clearance sale at the end of the generation, not a sustainable business model.

When they can undercut they have undercut - like 7900XT/XTX. The 4080 is fundamentally unsaleable at its current price, and the 4070 Ti isn't exactly great either, and when AMD saw the sales numbers they cut the prices. That's the counterexample to oligopoly - they are willing to do it when they can. When they match pricing, or do a token undercut, it's usually because they can't afford to charge drastically less than NVIDIA themselves, because they're affected by the same costs. For example Navi 32 (7700XT/7800XT) is going to be pretty unexciting because MCM imposes a big area/performance overhead and they simply can't go way cheaper than NVIDIA, despite the 4060 Ti also being an awful price.

As such, the premise of your post is fundamentally false. AMD generally costs about the same as NVIDIA, and that's part of the reason they've failed to take marketshare over the last 10 years. It's generally a less attractive overall package with less features, other than sometimes having more VRAM (but not always, Vega had less than 1080 Ti, Fury X had less than 980 Ti, 5700XT had the same as 2070/2070S and less than 1080 Ti, etc). And this is because AMD is fundamentally affected by the same economics in the market as NVIDIA. And Intel will be too, when they stop running loss-leader products to build marketshare.

Consumers don't like it but this is the reality of what these products cost to design and manufacture today. And you don't have to buy it, and if those product segments don't turn a profit they will be discontinued, like the $100 and $150 segments before this (where are the Radeon HD 7750s and 7850s of yesteryear?). That's how capitalism works, the operating margins have already been reduced for both companies, they're not going to operate product segments at a loss just so gamers can have cheap toys. Even if they do it for a while, they're not going to produce followups to those money-losing products.

Nor is NVIDIA leaving the gaming market either. They'll continue to produce for the segments that it makes sense to produce for. The midrange (which is now $600+) and high-end ($1000+) will both continue to see good gains, because they're less affected by the factors killing the low end. And MCM will actually be incredibly great in the high end - imagine two or four AD102-sized chiplets working together. But it's going to cost a ton - probably the high-end will range up to $4k-8k within a decade or two.

The low end will have to live with what it's got. AMD holding the 7600/7500 back on N6 (7nm family) is the wave of the future, and it seems like NVIDIA probably should have done the same thing with the 4060. A 3060 Ti 16GB for $349 or $329 would probably have been a more attractive product than a 4060 8GB at $299 on 4nm. Maybe give it the updated OFA block so it can use the new features, and call it a day.

If $300 is your budget for a GPU, buy a console - the APU design is simply a more efficient way to go. A $300 dGPU is about 90% of the hardware that needs to go into a console, and if you just bolt on some CPU cores and an SSD you're basically there. The manufacturing/testing/shipping costs don't make sense for a low-end product (and $300 is now entry-level) to be modularized like this anymore, integration brings down costs. The ATX design is a wildly inefficient way to build a PC, and consoles eschew it for a reason. A Steam Console could bring costs down a lot, but it still will involve soldered memory and other compromises the PC market doesn't like (but will be necessary for GDDR6/7 signal integrity etc). Sockets suck, they are awful electrically and ruin signal integrity. The future is either console-style GDDR or Apple-style stacked LPDDR.


Exactly. I have written something similar in the past on HN. But 99.999% of HN are complaining about Nvidia's pricing without actually understanding the basics of semiconductor and fab pricing. Even when that information is fed to them, they simply refuse to read or believe it.

All they want is Big equals Evil. And the classic underdog story.

But it is nice to see there is at least one more person on HN understanding hardware.


According to the article the _alleged_ wafer price is rising dramatically, but they point out they have no way of confirming it. It could just as well be a plan to hike prices together by announcing that your buddies are now paying more, even though their individual contracts may not reflect it.

I wouldn't go as far as saying Nvidia is making less just because their net income is less; couldn't they still be huge but consumed by even greater investment into the future?


> The low end will have to live with what it's got.

A CPU-integrated GPU is what's killing the low end. You don't need any more than that for desktop use or light gaming.


I'm interested to see how the iGPU scene will pan out with Intel slated to integrate their ARC GPUs into their CPUs from the next generation.

If iGPUs can get to punching in the same weight class as the RTX 3060s, that will be a paradigm shift for gamers and most household graphics users.


> If iGPUs can get to punching in the same weight class as the RTX 3060s

This will never happen; a discrete card will always be faster. The interesting question rather is: will or when will iGPUs become fast enough for most purposes that currently still require a discrete GPU?


> a discrete card will always be faster.

And a Ferrari will always be faster than a Ford. But, for most people, the Ford is good enough to do what they want. The largest gaming segment (consoles) is already powered by APUs. Discrete GPU sales have been in a steady decline since 2009 [1].

[1] https://siliconangle.com/2022/12/29/sales-computer-graphics-...


I doubt an iGPU will ever be viable for gaming simply because it has to use DDR memory instead of GDDR


Throw some extra 3D cache and go to quad-channel memory and I think that would be enough for a lot of people.


Is there any reason a new CPU (and iGPU) generation couldn't use GDDR?


Because some people will screech that the RAM is soldered and can't be upgraded; the laws of physics dictate that DIMMs degrade the signals too much for the speeds desired.

Of course, if Intel puts more emphasis on ARC iGPUs I wouldn't be entirely surprised to see mobos come out with GDDR VRAM soldered on. Some people will definitely still screech, but they aren't important.


Basically the number of wires going to the memory.

On discrete GPUs memory is placed in a circle around the GPU chip, each memory module connected with individual wires.


So, my impression is you'd lose replaceable memory, and the total practical memory would be lower than the maximum practical RAM + VRAM in a setup with a CPU on (LP)DDR and a dGPU on GDDR, but intuitively it seems like you'd get better performance than with an ordinary iGPU and better cost/performance (though with a lower performance ceiling) than a CPU + dGPU setup for lots of use cases.


I am not sure if operating margin is indicative of anything but the strength of the accounting department.

It is in the interest of big corporations to have profits as low as possible, so that they don't pay excessive amounts of corporation tax.


If this is true, why did they even bother to create Lovelace and RDNA 3? Just keep selling the previous generation.


Geohot is working on AMD AI with tinygrad.


The main reasons for the "monopoly" are long-term vision and the enormous effort spent on software development. AMD is lagging behind because they have not yet seriously invested in their own software stack.


monopoly isn't a bad word here and doesn't need quotes. In America and major markets, a monopoly is fine if that's what the market chooses; anti-competitive practices and abusing market position are what get put under scrutiny. They may be limited in purchasing additional companies to expand or maintain the market position, but there is no need to shift perception about what it is, because that's not regulated or under threat of regulatory scrutiny simply from providing a better service.


I disagree. A monopoly stifles innovation and consumer adoption. It's not "fine".


okay, then it didn't need to be in quotes if it's true here as well

that was my only point: they put monopoly in quotes to attempt to defend Nvidia from regulatory scrutiny

they haven't done the thing that causes regulatory scrutiny, while the word is still accurate about what they have achieved


Are there any examples of monopolies that haven’t engaged in anti-competitive practices?


at local levels it’s more obvious

at larger levels there are typically business units within a larger organization that might have a monopoly in that sector

if everyone chooses your phone brand out of social ramifications and you don't do anything to accelerate competitors' exiting the market, it is what it is


Monopolies are fine if they are natural monopolies - e.g. utilities, which need to be regulated. Otherwise monopolies are suboptimal, as prices won't come down due to lack of competition.


So you are fine paying 10x what it's worth because of the monopoly?


No one buys any product that costs more than it is worth to them. If value is lower than cost sales tend to go to zero. The continued market for a product even in the midst of shortage indicates that someone somewhere values the product more than its market price. Otherwise the price would drop until demand converged with supply.


I think this is true when there is competition and/or the good/service is not a necessity. If my electric company decided to add a mandatory $100 fee then I'd be forced to pay it. It's also why the emergency room can charge so much- are you going to shop around if you are in the middle of having a heart attack?


Yeah, but we (consumers) don’t usually see full exploitation of that principle as a good thing. Like, when pharmaceutical companies charge exorbitant prices for lifesaving drugs just because they can, that’s bad.


Martin Shkreli wasn't convicted for manipulating medicine prices, just for defrauding his shareholders.

We, the consumers, don't get a damn word about what we want. Unless the free market says it's a bad thing, you're stuck enjoying whatever the corps decide the fight-of-the-week is.


What are you talking about?

The reason they still do so is because the majority of voters aren’t too bothered by it.


That's … reductionist. Any voter that attempts to find a representative for a non-party-line cause gets gaslit by rapid partisans about hating women or a marginalized group, because it's not their top cause guiding all voting decisions.

voters don't have control of their representatives either way


No it's not reductionist, it's the truth. Reductionist would be different, for example if it implied that change is impossible, or that voters will always stay the same, or that there is a guaranteed inevitability, etc...

Maybe you're reading your own thoughts into someone else?


okay, voters don’t have control of their representatives and nobody has floated a constitutional amendment


If you genuinely believe voters have literally no control over their representatives in the US, I highly doubt you would spend time writing comments on HN about pharmaceutical pricing and not on more important matters.


rabid*


Sure, but often the costs of high prices of inelastic goods are externalised. People excluded from housing, banking, energy, education, and health markets necessarily turn to various antisocial activities which create costs for society… and those who are making real profits rarely end up picking up the bill for these.

Though I don’t believe this line of reasoning applies to CUDA/GPUs (yet).


Other than government mandates, I don’t pay for things that cost more than they’re worth to me. I just don’t buy them and I think most people have a similar approach.


Until there are core libraries that support AMD and other GPU platforms, NVidia will continue to dominate.

I was really encouraged to see this the other day: https://news.ycombinator.com/item?id=36968273

Because the AMD GPU FFT library isn't feature complete compared to cuFFT or even FFTW.


I have a copy of Byte magazine with the front cover that says “Is Unix dead?”.

Back then, everyone thought Windows NT had “won”, and it was game over for every other operating system.

Same thing here. Nvidia might look strong now but all it takes is an open source project that competes effectively to change everything.


>I have a copy of Byte magazine with the front cover that says “Is Unix dead?”.

For practical purposes it is. But it's not Windows NT that killed it, it's Linux.


You miss his point. NT was thought to have "won" and their position seemed unassailable, but the unexpected happened and their reign in the server world was short-lived.


It certainly won the workstation market, more so with the ongoing upgradability issues in the Mac Pro.


Aren’t macOS and iOS Unix-based? Unix is one of the most widely used operating systems in the world right now.


macOS yes, iOS not really, you won't go far using only UNIX for iOS apps.

The actual reality is that POSIX-flavoured OSes are the most widely used operating systems in the world, for various levels of what it means to be POSIX, and to what level they expose it.

"Transcending POSIX: The End of an Era?"

https://www.usenix.org/publications/loginonline/transcending...


But Darwin, the iOS kernel, is Unix?


The iOS kernel isn't exactly Darwin.

Additionally, Darwin evolved from a mix of BSD and Mach.

NeXTSTEP drivers were written in Objective-C, and a C++ subset on Darwin.

In any case you won't be running CLI applications on iOS, rather applications that depend on a mix of Objective-C and nowadays also Swift, hardly UNIX.

Using BSD sockets on iOS doesn't support all the iOS networking capabilities, for example.

NeXT's approach, later adopted by Apple, was similar in spirit to Windows NT's POSIX support.

The UNIX compatibility is there to help bring applications onto the platform, and to take part in US government contracts, not to make it easier to port applications elsewhere.

There are even some recordings from meetings at NeXT, with Steve Jobs discussing how NeXT should position itself on the workstation market.


All it takes is a different better AI model that doesn't require the type of computing capabilities Nvidia offers, or for the LLM hype to die.

Nvidia made their fortune riding 2 consecutive hype waves - crypto and LLMs.


> All it takes is a different better AI model that doesn't require the type of computing capabilities Nvidia offers

That's a pretty reductive way of looking at things. Nvidia got to where they are today because they made the "different better AI model" by spending billions of dollars in AI research when other companies were fighting over scraps. They were the ones investing in a high-level GPGPU programming model when Apple and AMD were busy stabbing each other in the back over OpenCL. Just look at their list of papers and compare it to the contributions of your favorite FAANG member: https://research.nvidia.com/research-area/machine-learning-a...

Nvidia owns this market because Apple, AMD and Intel have a grudge match that is apparently more important than cross-platform harmony.


Add Khronos to the party: the blindness from all parties in pushing C for OpenCL, until it was too late for anyone else to care.

I attended one Khronos webinar where the panel was surprised that anyone would ask about Fortran support, and showed a total lack of knowledge about PGI compilers.

Additionally, the tooling was very crude.


I exclude Khronos because at least they're able to put aside industry politics to deploy an open API. The problem is the rest of the industry - even if there was a good alternative, nobody would use it. Microsoft has DirectML™, Apple has Accelerate™, and Google has a few thousand tensor accelerators they don't know what to do with.

Everyone is out for blood in this market. Going after Khronos for not addressing Fortran FFI is a cart before the horse argument, from where I'm standing.


Khronos is all about politics and design by committee APIs.


I do not see how that contradicts my overall point.


They don't put aside politics; the politics are right there in the myriad of extensions and clunky designs of their APIs.


I remember how it was in 2008 on Linux desktops/laptops: AMD laughed straight in the face of their Linux customers. NVidia did support them, although on its own terms (closed source). When the need to run scientific computation over GPUs arose, NVidia was there, ready. AMD got what it deserved for ignoring a market of scientists and researchers.


Does anyone have more inside knowledge from OpenAI or AMD on AMDGPU support for Triton?

I see this:

https://github.com/openai/triton/issues/1073

But it's not clear to me if we will see AMD GPUs as first-class citizens for PyTorch in the future?
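
For context on why Triton matters here: kernels are written in Python and compiled per backend, so in principle the same source can target NVIDIA or AMD once the AMD backend is solid. A minimal sketch of what that code looks like (this is the standard vector-add example; whether it actually runs on a given ROCm setup is exactly the open question in the linked issue):

    import torch
    import triton
    import triton.language as tl

    @triton.jit
    def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
        pid = tl.program_id(axis=0)
        offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
        mask = offsets < n_elements            # guard the tail of the array
        x = tl.load(x_ptr + offsets, mask=mask)
        y = tl.load(y_ptr + offsets, mask=mask)
        tl.store(out_ptr + offsets, x + y, mask=mask)

    def add(x, y):
        out = torch.empty_like(x)
        n = out.numel()
        grid = lambda meta: (triton.cdiv(n, meta["BLOCK_SIZE"]),)
        add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
        return out

    x = torch.rand(4096, device="cuda")  # "cuda" is also the device name on ROCm builds
    y = torch.rand(4096, device="cuda")
    assert torch.allclose(add(x, y), x + y)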


I don't see it mentioned in the article or thread - anyone else remember when, in ~2012, PGI (Portland Group, who at the time were a subsidiary of STMicroelectronics), a long-time builder of high-performance compiler tooling, was showing a new round of platform-independent parallel accelerator tooling, including teasing a credible independent CUDA implementation that could target non-Nvidia platforms? Then Nvidia bought them in 2013 and it disappeared from the face of the earth in favor of "we're excited to be working on CUDA support for FORTRAN".

...Yeah.

Nvidia seems well aware that the "There is existing code many users want to run written in CUDA, and you can only run CUDA code on an Nvidia part" situation is their competitive advantage.

ATI/AMD's failure to settle on a stable GPGPU toolchain (CTM/THIN/Brook+, Stream, ROCm with OpenCL, HIP, and perpetually broken CUDA compat...) and OpenCL's ugly boilerplate gave them an opportunity to get that core set of lock-in software, and they're not giving it up without a fight.


GPUs are reliving the bad old days of proprietary tools and ISAs.

The general-purpose computing world broke out of that with the advent of multi-vendor ISA compatibility in commodity CPUs (x86, later others) and open source compilers (GCC, later others). Before that, the software stacks were buggy, vendor-locked, expensive, and gatekept programming-language development - very much like GPUs now.


That never changed in the GPU world, because contrary to urban myths, game consoles never really supported GL, only inspired variants of it.

Not to mention extension spaghetti with proprietary features anyway.


Rather poor piece. Not even mentioning the competition (that's trying to catch up, yes).

Sometimes it sounds more like gushing than anything else, always the best, consistently sold out, etc etc.


>software companies don’t have the hardware capabilities, and vice versa

Nobody stops a hardware company from hiring software developers. The other way around would be harder.


Hardware companies often find it very hard to competently build software.

They tend to lack appreciation/understanding of software in the first place (hardware-first thinking) leading to underinvestments.

It's also hard for them to identify great software people - at all levels, starting from the CEO/board.


Software developers, especially for certain types of components, e.g. compilers, are NOT easy to find.


There's no such thing as hard to find. Only "I don't want to pay what they ask". Which is weird for a hardware company since you don't need that many and you only write the driver once, but you sell it with millions of devices.


They can start on GCC and LLVM mailing lists.


Writing VHDL/Verilog is much like writing software, however.


I can't understand why AMD leaves so much on the table. Of course their ROCm efforts have intensified, but there's still a long way to go.


Because AMD is tiny in comparison and they are fighting titans on the CPU and GPU fronts, meaning they have to pick their battles.

They decided to fight Nvidia in the gaming segment and let them win productivity. It seemed like a good strategy a few years ago when GPUs were unobtainium, not so much today.


Also AMD pretty much won the full console market while it's absolutely not clear that they'd be in the same position if they wanted to take on nVidia for GPU compute.


Only if we ignore the Switch, and its sales.


Yes, the cheapest and most low margin of the consoles.


Nintendo and NVidia are quite happy with it.

It is only one of the biggest selling consoles in the history of game consoles, breaking the record of GameBoy and PS 4, not a big deal.


The switch is low margin? When has that ever been Nintendo's strategy? They have used older and cheaper hardware with higher margins for generations.

Sony/Microsoft literally lose money on each console sold, hoping to recoup it in subscription and store revenue.


> The switch is low margin?

The margins in question here are the margins for Nvidia/AMD, not Sony/Microsoft/Nintendo. It would not surprise me at all if AMD makes better margins on the Playstation than Nvidia does on the Switch.


Ahh my mistake, thank you for clarifying


Additionally, it's not enough to just invest the time and money that Nvidia did. Now that Nvidia are embedded and have a head-start, AMD would need to spend much more to undo the advantage.


> I can't understand why AMD leaves so much on the table.

You're asking why a smaller, less-funded team doesn't throw massive resources into following an incumbent with a multi-year head start and questionable ability to extract a big profit... vs. spending resources on entering disjoint markets where they can win (e.g. being the vendor of choice of almost every single console on the market - PS5, Xbox, Deck, etc.).


I wonder if ROCm is 100% CUDA compatible.


It's not 100%. CUDA for example has math functions with special rounding modes, but HIP does not.


No. They want you to target HIP, which is compatible with both CUDA and ROCm.

https://rocm.docs.amd.com/projects/HIP/en/latest/index.html


And with HIP now supporting Windows[0] it could be a game changer. I would love to see it working with llama.cpp[1]. I tried HIP via Blender, but it did not work for my 6600XT, it still used my CPU...

[0] https://github.com/RadeonOpenCompute/ROCm/issues/666#issueco...

[1] https://github.com/ggerganov/llama.cpp/discussions/915#discu...


Not at all, as it doesn't cover all CUDA use cases.


it's all about investment, nvidia invested hard and got rewarded. Similar to how the iPhone disrupted the market. I don't see an open source solution winning unless a company can invest in the development


the software is important, but the hardware is much more so. If AMD could make performant GPUs with solid drivers, the world would rally and port all the major ML libraries to them overnight. CUDA is not the moat you think it is. Nvidia is going to shoot itself in the foot with how they have been handicapping their latest GPUs. Unfortunately, it doesn't look like AMD is primed to compete.


I'm just amazed AMD has produced some sort of working shim for their stuff that enables PyTorch use. Even if it's slow and rough.

ROCm seems perpetually imminent…
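
The shim works because ROCm builds of PyTorch reuse the torch.cuda API surface via HIP, so unmodified "cuda" code runs on supported AMD cards. A quick sanity-check sketch (assumes a ROCm build of PyTorch and a supported GPU):

    import torch

    # On ROCm wheels, torch.version.hip is set and torch.version.cuda is None;
    # the familiar torch.cuda.* calls are routed through HIP.
    print("HIP:", torch.version.hip, "| CUDA:", torch.version.cuda)
    print("GPU available:", torch.cuda.is_available())

    device = "cuda" if torch.cuda.is_available() else "cpu"
    a = torch.randn(1024, 1024, device=device)
    b = torch.randn(1024, 1024, device=device)
    c = a @ b  # dispatched to rocBLAS on ROCm, cuBLAS on CUDA
    print(c.device, c.shape)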


No mention of OpenCL / VK compute?


When they are able to handle C++, Fortran, Julia, Python JIT, ..., and provide a graphical debugging experience for GPGPU code, then maybe it will matter.



