In other words, Nvidia has earned the right to be hated by taking the exact sort of risks in the past it is embarking on now. Suppose, for example, the expectation for all games in the future is not just ray tracing but full-on simulation of all particles: Nvidia’s investment in hardware will mean it dominates the era just as it did the rasterized one. Similarly, if AI applications become democratized and accessible to all enterprises, not just the hyperscalers, then it is Nvidia who will be positioned to pick up the entirety of the long tail. And, if we get to a world of metaverses, then Nvidia’s head start on not just infrastructure but on the essential library of objects necessary to make that world real (objects that will be lit by ray-tracing in AI-generated spaces, of course), will make it the most essential infrastructure in the space.
These bets may not all pay off; I do, though, appreciate the audacity of the vision, and won’t begrudge the future margins that may result in the Celestial City if Nvidia makes it through the valley.
It’s a very good summary. Nvidia to me gets a lot of flak (sometimes rightfully so) in large part due to being the big risk-taker in the GPU/graphics/AI space that pushes forward industry trends. AMD, meanwhile, is largely a follower in the GPU/graphics space with few contributions to rendering research. Even worse, they use their marketing to angle themselves as consumer-friendly when in reality they’re just playing catch-up. For instance, they started off badmouthing ray tracing as not being ready back when Turing launched (when is any new paradigm ever “ready” when it requires developers to adapt?) before adopting it with much worse performance the following generation.
Well, if that "consumer friendliness" is usable Linux drivers, then I don't give a shit about NVidia pushing the envelope, even if it means the graphics holy grail of real time raytracing.
And is NVidia really the primary pusher of graphics algorithms and AI? Yeah, I doubt it. Sure, they might contribute some things, but modern fabs have so much available transistor real estate that NVidia is really just trying to maintain market control by defining the computing interfaces rather than delivering some fundamental achievement in computing theory.
Raytracing algorithms aren't new, and they are "embarrassingly parallel" like most graphics algorithms, very well suited for modern chip fab design. If NVidia went away, Sony/Microsoft/Nintendo would just find another graphics vendor for real time raytracing. There were "processor per pixel" raytracer experiments in the mid 1990s when 66 MHz was "fast". If anything, real time raytracing likely has been suppressed by the massive investments in rasterization toolchains.
I mean, one of the main pain points is that they treat other companies so poorly that many in the industry just kind of hate their guts: not just AIB partners, as shown by EVGA's messy divorce, but also others such as Apple or the Linux Foundation. Meanwhile, their consumers are ambivalent towards them because all their "envelope pushing" is mostly proprietary tech that drags down the performance of competitors' cards, and because their high-end cards are price gouged akin to Intel CPUs prior to 2017 due to a lack of meaningful competition. All that, plus the recent negative PR from price gouging during the mining boom, the "4080" 12GB, and Jensen being publicly out of touch about the price of computer hardware "unfortunately never going down" in this context, has dragged their goodwill down a bit lately.
The author is not doing a good job of differentiating ordinary ray tracing from path tracing. It is path tracing that actually gives all of those wonderful render effects "for free." Ray tracing by itself isn't that impressive. But we are a long time away from path tracing being a widespread and practical rendering method.
The big worry here is that nVidia is pursuing the equivalent of internet streaming video in the 1990s. Sure, eventually the technology will get there, but that could be far enough in the future that nVidia will be disrupted or displaced long before it is a reality.
It also accuses rasterization of being "a hack", as if 1 spp with the denoiser coming to the rescue were not the same kind of thing. Also, there's no need to label things as "hacks"; they are all approximations, and good ones at that.
It also then argues that developers get RT "for free", which is nothing of the sort, since there are many details involved in getting it to work decently.
Current RTX hardware is rather lame to only get to 1spp isn’t it?
It’s not even clear that it’s the right approach.
A graphics engineer I know has this to say about it. “Telling me you made hardware accelerated BVH is like telling me you implemented hardware accelerated bubble sort.”
I don't think real-time ray tracing is lame. Diffuse is low-frequency and redundant, and denoisers work magically well. It's 1 spp per frame, but the denoiser essentially allows the application to distribute the workload across frames. For perfect specular a single ray is sufficient, and glossy is somewhere in between the two. The limitation has less to do with firing more rays than with handling more complex materials; what I just described decouples diffuse/specular and is how games are doing it these days. But if the current materials are plausible, then what's wrong with this approach?
I do think it is the/a right approach... when used judiciously and when an alternative rasterization approach doesn't already exist or has obvious flaws. For example, some games use a hybrid SSR-RT reflection; do the SSR first, then fire rays where the SSR misses (insufficient information in screen-space.) I think the UE5 Matrix demo does this. You kind of get the best of both approaches.
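A rough sketch of that hybrid control flow, purely illustrative: the function names below are hypothetical placeholders, not any engine's real API. The only point is the ordering: try the cheap screen-space trace first, and fire a hardware ray only where screen space has no information.

```python
# Illustrative sketch of the hybrid SSR + ray-traced reflection fallback
# described above. All functions are stubs, not a real engine API.

def trace_screen_space(pixel, gbuffer):
    """Ray-march the depth buffer; return a hit color, or None on a miss."""
    ...  # placeholder

def trace_ray_hw(pixel, scene_bvh):
    """Fire one hardware ray into the full scene (the expensive path)."""
    ...  # placeholder

def reflection_color(pixel, gbuffer, scene_bvh):
    hit = trace_screen_space(pixel, gbuffer)   # fast, but limited to what's on screen
    if hit is not None:
        return hit
    return trace_ray_hw(pixel, scene_bvh)      # fallback only where SSR misses
```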
The article in this sense proposes a false dichotomy; to choose between the complex "hacks" of rasterization, or the easiness of RT. Except that rasterization has decades of research and solutions are well-known and already implemented, and RT is not 'do the dumb thing and let the hardware do the work'.
I am no fan of nVidia, but yes, rasterization is a hack. I like to call raster GPUs overglorified sprite rotoscalers. Every little thing that light does in reality, except perspective, needs to be painstakingly implemented in dedicated code. Path tracing is much closer to just modeling what light does, and all the optical phenomena fall out of it naturally.
> I [nvidia CEO] don’t think we could have seen it [massive effect of Ethereum merge on bottom line]. I don’t think I would’ve done anything different, but what I did learn from previous examples is that when it finally happens to you, just take the hard medicine and get it behind you…We’ve had two bad quarters and two bad quarters in the context of a company, it’s frustrating for all the investors, it’s difficult on all the employees.
This is not confidence inspiring. It was obvious that the Ethereum merge would affect the bottom line in a big way. Why this professed ignorance? Does it have to do with the fact that admitting it was visible a mile away would have been to admit the deep reliance the company had come to have on the short-term Ethereum mining boom?
A couple of comments on this point suggest that it couldn't have been predicted, since the official timing of the merge was only announced in 2022, and the silicon supply chain requires planning well in advance of that.
But that point is ignorant of this truth: proof-of-stake has been on the roadmap since ~2017 if not earlier. Edit: 2016 - thanks, friend! :)
I think the reality is that the impact of Ethereum on Nvidia's business was not fully appreciated, and that 'veil of ignorance' may well have been intentional. They never truly served the crypto market directly (e.g., there wasn't really a "miner" line of cards), and as a result didn't do the due diligence to understand how those customers played into their business performance and strategy. Or they did, and just really underestimated the Ethereum devs on ever making the merge happen. But I lean towards the first.
Either way, I think that with crypto in the rearview, I'm actually more confident in their leadership team. They seem better suited to gaming and AI.
> But that point is ignorant of this truth - Proof-of-stake has been on the roadmap since 2017 if not earlier.
It's been on the roadmap since 2016. That's actually still a problem, though: a perpetually rolling deadline is effectively worse than not having a deadline at all.
Was NVIDIA just supposed to cut production for the last 6 years in anticipation of something that was continuously pushed back 6 months every 6 months? That's not a reasonable expectation.
As an observer, I never got the sense that there were strong commitments being made on timelines until A) beacon chain was live (running in parallel), and B) testnets started getting merged successfully.
The genesis of the beacon chain started a clock that Nvidia should have been paying attention to, and I think it would have given them plenty of time to foresee the present situation.
Great question, and perhaps at the point where decisions were being made, it'd be hard to argue a different path internally (hindsight being 20/20).
However, it seems clear that the business built both insane prices and the crypto lockup of devices (whether explicitly, or implicitly) into their forecasts for the business. They didn't have a good pulse on the actual demand/usage of their product, and when that usage pattern would shift.
The path they're taking right now, specifically regarding pricing towards & serving higher-end enthusiasts with newer products, makes sense while the used inventory gets cycled around the lower end of the market.
From a product perspective, I don't have any useful opinions to share because I'm not in hardware, and I don't have the information set they're operating from internally. But, they should have hoovered up as much cheap capital as they could while their stock price was high and the going was good to make the next period of heavy investments (to be fair, shares outstanding did grow, just not by a ton, %-wise, and they have a fair bit of cash on the balance sheet)
> The path they're taking right now, specifically regarding pricing towards & serving higher-end enthusiasts with newer products, makes sense while the used inventory gets cycled around the lower end of the market.
This approach might take a beating if AMD is willing to start a price war. And AMD can, because their chiplet approach is cheaper.
I will buy nVidia purely because of dabbling interests in ML images... so a price war might not be so effective thanks to a technical moat.
People absolutely SCREAMED a year ago when there was a rumor going around that NVIDIA was pulling back on new chip starts; word went around that it was a plan to "spike prices during the holidays".
In the end 2021Q4 shipments were actually up according to JPR, of course. But people were mad, and I still see that MLID article brought up as proof that NVIDIA was deliberately trying to "worsen the shortage" and "spike prices during the holidays".
Now, what MLID may not really know, is that wafer starts typically take about 6 months, so if he's hearing about reduced starts in October, it's probably more like NVIDIA is pulling back on expected Q1/Q2 production... which indeed did come down a bit.
But as to the public reaction... people were fucking mad about any sign of pulling back on production. People are just unreasonably mad about anything involving NVIDIA in general, every single little news item is instantly spun into its worst possible case and contextualized as a moustache-twirling plan to screw everyone over.
Like, would it have really been a bad thing to pull back on chip starts a year ago? That actually looks pretty sensible to me, and gamers will generally also suffer from the delay of next-gen products while the stockpile burns through anyway.
It's nowhere near the "sure miners may be annoying, but deal with it for 6 months and then we all get cheap GPUs and everyone holds hands and sings" that LTT and some other techtubers presented it as. Like, yeah, if you want a cheap 30-series card at the end of its generation/lifecycle great, but, you'll be waiting for 4050/4060/4070 for a while. Even AMD pushed back their midrange chips and is launching high-end-only to allow the miner inventory to sell through.
And people hate that now that they've realized the consequence, but they were cheering a year ago and demanding the removal of the miner lock / etc. More cards for the miners! Wait, no, not like that!
It's just so tiresome on any article involving NVIDIA, even here you've got the "haha linus said FUCK NVIDIA, that makes me Laugh Out Loud right guys!?" and the same tired "turn everything into a conspiracy" bullshit, constantly.
And that's just the hardware drama. The software hate against Nvidia is partially unwarranted too - Nvidia's Wayland issues mostly boil down to GNOME's refusal to embrace EGLStreams, which got whipped up into a narrative that Nvidia was actively working to sabotage the Linux community. The reality is that desktop Linux isn't a market (I say this as an Nvidia/Linux user), and they have no obligation to cater to the <.5% of the desktop community begging for changes. Honestly, they'd get more respect for adding a kernel-mode driver to modern MacOS.
In the end, Nvidia is still a business. Money put towards supporting desktop Linux isn't going to have a meaningful effect on their overall sales. We're just lucky that they patch in DLSS/ray tracing support for Linux games and software like Blender.
Nvidia doesn't need to offer full Linux support, only the bare minimum to make life a bit easier for the people doing that work.
Also, let's not kid ourselves, Nvidia has a lot to gain in the datacenter/AI business with proper linux support.
Let's also bear in mind that Linus Torvalds said this: "they are the exception, not the rule". If that doesn't speak volumes, I don't know what does.
> Nvidia has a lot to gain in the datacenter/AI business with proper linux support
They do, which is why they support CUDA and Docker. They have nothing to gain from supporting Wayland besides appeasing a bruised-and-battered bunch of people that don't really care anyways. Nvidia seems to be admitting as much by open-sourcing their drivers and letting us play with their Legos, since we're so smart.
I have no particular animosity towards Nvidia, and while I'm critical of the ignorance/mistakes made regarding crypto, I think they're doing incredible work and running an insanely difficult business as best they can.
It's easy to forget that the world isn't as simple as we imagine it to be, and that gnostic conspiracies are attractively intuitive.
These also had models that you can't find anywhere on Nvidia's website, like the 170HX, which is basically an A100 with less memory.
A lot of miners preferred consumer cards, though, as those can be sold to gamers once the bust comes again (and with crypto it always does, every few years).
NVidia (and AMD for their own line) did not talk about these much in public, as they were really bad PR during the worst moments of the GPU shortages.
To the point that they never even updated the site for the later/better models (like the already mentioned 170HX). The only mentions of it you can find are from the sellers you have to go to to get one (NVidia did not sell these directly to anyone, as far as I know).
AMD developed a line of mining cards as well, and there were also some PS5-based mining rigs from ASRock (a PS5 APU reject on a motherboard with VRAM...) that utilized AMD BC-250 mining processors.
> They never truly served the crypto market directly (e.g., there wasn't really a "miner" line of cards)
There were deliberate, and miserably failed, attempts to make lines of cards that could not be used for mining, while in parallel keeping their non-limited lines in production, making those the de facto miners' lines.
So yes they did know about it and tried to address it by catering to both markets, but were unable to do it correctly.
The point you responded to is also fair; the fact they tried to lock miners out was a de facto acknowledgment that crypto had an impact on their sales, and is more critical evidence they handled it poorly.
I think Huang does not want to draw investors’ attention to crypto because he doesn’t want people to equate Nvidia’s performance as a company with crypto performance. He doesn’t want Nvidia to just be a crypto company.
At the same time, he also definitely wants to cash in on any future crypto booms, because they are lucrative.
It is best for him to take a position that mostly ignores crypto. I think he legitimately doesn’t want crypto to be the future of Nvidia and doesn’t want to build for that use case, nor does he want to be financially reliant on it, but there is also no point in him talking shit or spreading doom about crypto when he can just shut up and still sell gpus.
I think we forget that Silicon manufacturing is planned a lot farther out than Silicon shopping.
They were likely trying to make TSMC purchase orders in the start of the pandemic, before a crypto boom. They also tried to handicap their GPUs wrt crypto. They likely didn’t expect the absolute shit show of a chip shortage (because who predicted or understood the pandemic early).
The rest of the market was desperate, and they probably expected it to be more robust than it ended up being. The merge would have been so far away at the time that they couldn't have predicted whether it would happen at all, never mind when.
Many folks in crypto expected those Ethereum-mining GPU farms to just switch to some other GPU-minable cryptocurrencies. It wasn't a certainty all those farms would just close up and dump their GPUs on the market en masse. But Fed interest rate policy hitting at the same time, driving down the crypto market across the board (and all other risk assets), may have unexpectedly changed the ROI calculation there and resulted in the dump.
Many poorly educated folk maybe. All other chains were already less profitable and had a small fraction of Ethereum's hashrate. Even 5% of ETH's hashrate moving to them is plenty to make them unprofitable for most. This was never a likely outcome.
> Many folks in crypto expected those Ethereum-mining GPU farms to just switch to some other GPU-minable cryptocurrencies.
This was a pretty common take but if you did the math Ethereum had about 90% of the GPU-mining market (by hashrate) so it was obvious the profitability was going to tank on those other currencies as soon as Ethereum switched.
In the long run yes, there will probably be another big spike in another cryptocurrency that starts another GPU boom. But it's not magic where one instantly springs up to absorb all the ethereum hardware at equivalent profitability.
A GPU crash was inevitable regardless of the interest rate hikes hitting at the same time.
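Back-of-the-envelope version of that dilution argument. The ~90% hashrate share is the estimate from the comment above; the reward figures and the 20% migration rate are arbitrary round numbers chosen purely to show the effect:

```python
# Why profitability tanks: block rewards on the other GPU-minable chains are
# roughly fixed, so revenue per unit of hashrate is (total reward) / (total hashrate).

eth_share = 0.90                      # estimated share of GPU hashrate that was on Ethereum
other_share = 1.0 - eth_share         # everything else combined

other_reward = 100.0                  # arbitrary units of daily reward on the other chains

revenue_per_hash_before = other_reward / other_share                 # only 10% of GPUs competing
# Suppose just 20% of the former ETH hashrate migrates over:
revenue_per_hash_after = other_reward / (other_share + 0.20 * eth_share)

print(revenue_per_hash_before)   # 1000.0
print(revenue_per_hash_after)    # ~357.1 -> roughly a 64% drop in revenue per GPU
```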
I hoped there would be a rise in proof-of-work-like chains, wherein the work was something useful like training an AI or brute-forcing a hard but useful problem. Like SETI@Home, but paying crypto for successful solutions as opposed to relying on altruism.
It's hard to pull this off, if not impossible. A key attribute of proof of work systems is that the difficulty should be dynamically adjustable and that everyone has perfect consensus on what "work" is. Doing meaningful work, while admirable, puts the owners of those projects in control of defining "work" and adjusting difficulty, i.e., people in the loop. That's not trustworthy from a currency POV, no matter who the people are.
The problem is that in order for work to be useful to secure a blockchain, the work needs to have special properties. Specifically...
- It must be difficult to compute. It should be possible to get exponential increases in difficulty out of problem size increases.
- It must be easy to verify. It needs to be verifiable solely through data that can be copied on-chain, with no reference to external sources[0], and an easily-understood computer program needs to exist to verify that data.
- There must be no ambiguity in interpretation or valuing of the work. Every node should be able to look at the data and agree that work has been done just by having a connection to other nodes and valid consensus algorithms.
Most PoW is hashing-based; there are a few weirdos like Primecoin that use prime searches as PoW, but these are both forms of numerical work. Number problems are easy: they have neat exponential-blowup properties, we can validate that the work has been done far quicker than it takes to produce, and we only have one axis of value: how hard is it to produce that work? ML training and SETI@Home don't work that way. Verifying that someone did training requires testing the entire training set, and verifying that a Fourier transform worked involves just running the inverse, which doesn't have that same performance gulf between verification and work.
Furthermore, useful work has a value ambiguity problem. What ML training sets are considered valid proof of work, and how do we put them in a nice total ordering so that we can resolve chain splits? Who gets to inject signal data from SETI@Home into SETICoin? The nature of the work is not value neutral; if we just allow people to say "here, I trained an AI on a training set and it got better", then that allows deliberate construction of "easier" problems that look hard. Imagine being able to unilaterally change Bitcoin from an ASIC-friendly hash function into one that is more akin to, say, Burstcoin or Chia. That would be a major trust problem for the chain.
[0] This is why most smart contract proposals fail. If we want to have the network, say, track container ships, we can't have every node operator actually board every ship to punch in data and make sure it was done correctly. There has to be trust, which disqualifies us from using an unpermissioned blockchain.
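A minimal toy sketch of why hash-based PoW satisfies those properties: finding a nonce is a blind, expensive search, verifying one is a single hash, and "difficulty" is just a numeric target every node can check identically. This is only an illustration, not any real chain's consensus code:

```python
import hashlib

def h(block: bytes, nonce: int) -> int:
    """Hash the block header plus nonce and interpret it as a big integer."""
    return int.from_bytes(hashlib.sha256(block + nonce.to_bytes(8, "big")).digest(), "big")

def mine(block: bytes, target: int) -> int:
    """The hard part: blind search for a nonce whose hash falls below the target."""
    nonce = 0
    while h(block, nonce) >= target:
        nonce += 1
    return nonce

def verify(block: bytes, nonce: int, target: int) -> bool:
    """The easy part: one hash, no trust in whoever did the work."""
    return h(block, nonce) < target

block = b"example header"
target = 2 ** 240          # very low difficulty so this toy example finishes instantly
nonce = mine(block, target)
assert verify(block, nonce, target)
```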
I think it's more like there are so many games at play as CEO in that position that anything but vague denial would be far more trouble than it's worth. Anything you say is going to attract a lot of criticism so the only thing you can say is the least damaging one.
In other words, most public statements are mostly nonsense engineered for response and have only a casual association with the truth.
They have consistently underestimated the effects of crypto, it's been screwing up their demand forecasts for a long time. I think what happened was they had all these efforts to prevent miners from buying cards so gamers could buy them instead, and they thought they were successful. So they attributed strong demand to gaming, but they were actually failing and miners were still buying all the cards. I don't know why they thought they were successful...
I'm guessing it is less about proof of stake and more about the fact that all of crypto is in the gutter and it's not as economical to mine anymore. I would guess that is the big core issue that is enhanced by things like proof of stake. Kind of a perfect storm when paired with inventory issues and crazy inflation. I think that storm is what he is saying they couldn't have predicted.
It was sudden and couldn't have been predicted. Ethereum ended PoW just this month, but the GPU crash was 7 months ago. In reality the PoW transition had nothing to do with the GPU crash; it was the end of WFH and the crypto decline caused by the Russian invasion that resulted in the GPU crash.
What? One of the biggest recent declines in crypto happened when the LUNA foundation dumped multiple billions in BTC in order to keep Terra stable (didn't work out). The other dump happened because borrowing money for leverage won't be as cheap as it was for at least the next 4 years (taking Bloomberg projections of Fed rates here).
These are all caused by general asset declines due to the Russian invasion, inflation spiking, and fears of the Fed's reaction. General asset declines put pressure on crypto, whose more dramatic moments really trace back to the Russian invasion.
Once I got my hands on stable diffusion, I bought a 3090 because I was afraid that they would go up in price, but my power supply wasn't big enough and I returned it for a 2060. But now I see that 3090s are dropping like 40% in price since then. This article does a great job explaining these dynamics.
Metaverse and crypto will be a bust, but democratized AI is going to explode, the big risk for NVDA is that models and tools get small and efficient enough that they don't need $1,000+ hardware.
> the big risk for NVDA is that models and tools get small and efficient enough that they don't need $1,000+ hardware.
You will always be able to do more and better with the premium hardware. Maybe for video games the marginal differences aren't enough to sway most consumers towards the high end cards, but I expect there will be a lot of people willing to pay top dollar to be able to run the best version of X AI tool.
With games, the best card might make sense cuz it could be the difference between 60+ fps and 40.
For workloads though, does it really matter if it completes in 20 seconds instead of 15?
Unless there's some feature difference in the higher end cards (like AI-specific subchips), not just more of the same & faster, then a lower card of the same generation shouldn't impact your ability to run something or not.
How many times do you have to complete that 20/15-second task in a year? Multiply that by 5 seconds to get the "wasted time". Factor in the employee's salary for that wasted time: how much are you paying them to wait on the slower hardware? Is it more money than the price difference between the better GPU and the cheaper GPU?
Or maybe it's an automated thing that runs constantly? The difference between 20 and 15 seconds is 480 additional iterations per 8-hours. How much is that worth to you?
There's also the unmeasurable benefits. What if you're losing potential good employees applying because word on the net is your company uses last-gen hardware for everything, and your competitors use the newest.
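Spelling that arithmetic out. Only the 20s-vs-15s figures come from the comment above; the run counts and hourly rate below are made-up placeholder assumptions:

```python
# Worked version of the 20s vs 15s comparison.

slow, fast = 20, 15                    # seconds per task, from the thread
runs_per_day = 8 * 3600 / slow         # 1440 runs in an 8-hour day at 20s each
extra_runs = 8 * 3600 / fast - runs_per_day
print(extra_runs)                      # 480.0 additional iterations per 8 hours

# Interactive case: a developer waiting on each run (assumed numbers).
runs_per_year = 2000                   # assumption
hourly_rate = 75.0                     # assumed fully loaded salary cost, $/hour
wasted_cost = runs_per_year * (slow - fast) / 3600 * hourly_rate
print(round(wasted_cost, 2))           # ~208.33 -> compare against the GPU price delta
```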
I don't think it's that binary a thing. It's not like either you have a supercomputer or you have last decade's junk. There are so many different models and price points, each company/team/dev can find their own sweet spot and upgrade cadence. It's not necessarily the case that 20% better performance is 20% more money, and 5 seconds times like 100 times per week times a year is still just a single workday. I doubt any employee works effectively 365 days a year... and the cost difference of the highest end card vs a medium one is definitely more than I get paid in a day, at least.
I mean, it's no different than CPUs and RAM and such. A middle-of-the-road laptop from the last year or two is good enough for most users... no company I've ever worked for gets the top-of-the-line model every single refresh for every single dev. That'd be an absurd waste of money, IMO.
Personally, I wouldn't want to work for a company that upgrades hardware all the time for single-digit performance increases. That money is better used elsewhere, whether employee salaries/benefits or donating it to the community or just saving customers some money or paying dividends to shareholders.
Even if you're running massive continuous workloads, it is probably more cost-efficient to just parallelize a bunch of midrange systems (optimize for perf/dollars and/or perf/watt) than to just chase the highest-end chips for no reason.
It is probably the default safe position. I think there would be a larger need to explain how they won't be a "bust" than to say that they will. The internet needed explaining to many people before they could see its real advantages and before it took off.
People have viewpoints. They have enough of them that explaining them in detail every time they come up would be tedious.
You and I both could write up plausible explanations of what he might mean by that, as both those topics are already discussed to death here. There's no particular reason to think he's being dogmatic, just pithy.
I'll also note that it would be easy to make the same accusation about your own posts. E.g.: "The contemporary democratic nation state is a facade over a distributed and decentralized oligarchy." https://news.ycombinator.com/item?id=32837598
But presumably you see yourself as holding that position for non-dogmatic reasons, so I'd encourage you to do the same for others.
I actually don't think Metaverse will be a bust, but I think the current corporate implementation of it will be. Companies are trying to cash in too early and it's not really working out for them, particularly Zuckerberg's Meta.
A spiritual successor to Second Life with modern graphics and VR support would likely be very successful. Dumbed-down versions like Fortnite Creative and Roblox have been very popular. Zuckerberg won't be the one to build a successful metaverse game though, not because Meta doesn't employ skilled game developers, but because culturally the leadership is completely out of touch. Second Life thrived because of its uncontrolled player creation environment, which Meta would never allow. It let you 3D model and script anything you could think of using in-game (and external) tools and had a free market economy where the developer only took a cut when converting to/from real world currency. The whole debacle with the "virtual 'groping'" and the personal space bubble has demonstrated that even if Facebook's Metaverse did have good creative tools, all it would take is a single article written by a journalist with an axe to grind for them to cripple it. They've already removed legs from avatars in Horizon Worlds for "safety", which is ridiculous.
Zuck's Metaverse (tm) is very forward looking in some ways, and very backward looking in others.
Although his particular vision is very weird and ugly, Zuck is on the right track: in the future, people will spend real time (not async time, a-la tiktok/instagram) with each other in virtual spaces. Facebook is already a text-based version of the metaverse, and it minted a pretty penny.
The metaverse, as Zuck imagines it, is already here though: video games. People who are hardcore about video games wake up, go to work, get home, and then plug in. Their friends and social circle exist on the internet, in the game. This is the metaverse. Video-game communities; Fortnite [1], WoW, League, CS:GO, etc; these are the most complete so far existing metaverse(s). One exists for every major flavor of gameplay we can think of. Look on Steam for people who have 10,000 hours in a game. These people live as Zuck wants us all to: in the metaverse.
Zuck's vision is to expand these virtual communities beyond the hardcore gamer group. You go to work (in a metaverse) then you come home and spend time with your friends (in a different metaverse), then you play a game (in yet another metaverse), then you sleep and do it all again the next day. It's actually a very forward-looking vision, and one that probably will come to pass over the next few decades. A metric for this is "time spent in front of a screen." It inexorably climbs year after year, generation after generation.
But the key thing is that the experience has to be good. The metaverse of Fortnite or CS:GO is worth spending time in because those games are fun and engaging in their own right; for some, more fun and engaging than reality itself. These games print money, because a small group of people legitimately spend every leisure moment in them. Zuck's vision is to expand this beyond the market for hardcore gamers and into work, socialization -- life itself.
I suspect that the market doesn't quite fully understand this part of Zuck's metaverse dreams. In part because of Meta's spectacular failure at making the metaverse look fun and compelling.
Personally, my money's on Epic to actually bring the Metaverse to fruition. They already own a Metaverse, Fortnite, and they also own Unreal, which will probably be the software foundation upon which the normie metaverse gets built.
> A metric for this is "time spent in front of a screen." It inexorably climbs year after year, generation after generation.
Inexorably? That's not a measure that even can grow infinitely. There are only so many hours in a day. Nor should it grow infinitely. Many people find that taking up hobbies away from the screen improves their lives significantly.
It doesn't; you can run this on $150 GTX 1060s if you're willing to wait a few moments. The interesting angle here is that there is a consumer segment of hardware (gaming GPUs) that can be used to drive cool artistic experiments. I think for most people, even non-gamers, getting dedicated acceleration hardware is not any more expensive than an entry-level analog synthesizer.
This article lays out a solid overview as to how Nvidia got to where they are now. I'm curious (as someone without a deep knowledge of the technology or the industry) as to why AMD can't create a similar dedicated ray-tracing functionality in their chips? It seems to be where the industry is going, and the article later goes on to point out how Meta and Google are doing this themselves.
Why has AMD ceded this market? Is it a patent issue? Capability issue? Something else?
I'm theorizing here, but I suspect it's because AMD feels like building out a dedicated or proprietary capability could really hurt them if it didn't take off. By that logic, NVidia's risk here could hurt them as well (and that they are taking a risk is the point the article is trying to make).
AMD has for a long time favored an inclusive, compatible approach. First (that I can recall) with x86-64, more recently with AdaptiveSync over G-Sync, and now with their ray tracing approach. Each time they chose a more efficient path that was open to the industry as a whole.
This seems to have had some pros and cons. On the one hand, they've been able to keep up with the market with a solution that is the best value for them. They've never been a large company against the likes of Intel and NVidia, so I suspect there's less appetite for risk.
On the other hand, by always going that route, they cede the leadership role to others, or if they do have leadership, it's not in a way they can really leverage. It becomes commoditized. Note how when the industry was moving to 64bit, AMD ended up setting the direction over IA-64 with their more inclusive approach. But it didn't turn into any significant leverage for them. They set a standard, but it was one that everyone else then adopted, including Intel.
So I feel like while AMD's approach keeps them alive and always in the running, it's an approach that will never put them on top. Whether or not this is a bad thing really depends on what the goals of the company are, and if the goal is to remain steadily in the race, then they're doing great.
But arguably, NVidia pulls the industry in directions by way of its choices. They're risky and sometimes irritating. It's also put them in front.
So in my opinion, AMD hasn't ceded the market, but they have ceded leadership in many instances by their safe approach. It's still profitable and safe for them. But they'll always remain second place as a result.
The argument is made in the article. AMD is cutting all the things that are expensive but with limited markets (no special hardware on chip for ray tracing or AI, as well as using a slightly older fab). They'll focus on being much cheaper with nearly the same performance and lower energy requirements than NV.
It makes sense -- as the underdog they need to erode entrenched advantages, starting up a standards-based compatible approach is a cheap way of doing so (on top of being totally laudable and good for the community).
I wonder if we could ever see Radeon Rays on Intel's GPUs, or even their iGPUs. Raytracing in every low-resource-requirement MOBA, I say!
AMD already has dedicated raytracing hardware on their GPUs, but are behind Nvidia.
PC games that opted for Nvidia raytracing earlier on run poorly with an AMD GPU with raytracing turned on. Cyberpunk 2077 is an example of this, runs beautifully on Nvidia gpus with rt on, framerate falls through the floor on an AMD card.
Raytracing on the "next gen" consoles PS5 and XBox Series X is done using AMD hardware and runs really well.
Indeed. The platform the devs target is very relevant. Often one can get caught up just comparing RT across vendors, but the reality is that the implementations are different and probably make different trade-offs. It's therefore no wonder that console games often outperform on AMD, and NVIDIA-signed games likewise on NVIDIA cards.
IMO, a major ingredient of raytracing running well enough in a game that also looks as good as any other current AAA title in raster-only mode has been DLSS. Raytracing is (or at least, at time of 2xxx series GPUs, was) still quite expensive to run at full res at resolutions (1440p, 4K) and framerates (60,120,144) that PC gamers demand. However, rendering raytraced games at a much lower resolution is just within reach for dedicated CUDA hardware. So DLSS makes up the difference with very sophisticated upscaling. Without DLSS, I think the raytracing in titles like Cyberpunk 2077 might not be performant enough.
In light of this, you might go back and see that AI and RT for NVidia have gone hand in hand, because one enables the other to be performant enough for AAA titles. Opinions may vary greatly on this, but personally, I don't think AMD's FSR upscaler is capable of matching what DLSS can do in this regard. (Intel's upscaling does seem to be capable of doing it, but very high performance parts are still some ways away from release).
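The rough arithmetic behind "render low, upscale with DLSS": primary rays and shading scale with the internal resolution, so rendering at 1080p or 1440p and upscaling to 4K cuts that budget by the pixel-count ratio. These are just standard resolution figures, nothing from the article:

```python
# Pixel-count ratios that motivate internal-resolution rendering + upscaling.

native_4k      = 3840 * 2160
internal_1440p = 2560 * 1440
internal_1080p = 1920 * 1080

print(native_4k / internal_1440p)   # 2.25x fewer pixels to shade/trace per frame
print(native_4k / internal_1080p)   # 4.0x fewer
```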
AMD has "tensor cores" (called something different but they're very similar matrix accelerator units) in CDNA. RNDA3 is supposed to have "something", it has the same instruction as CDNA, it's supposed to be less than a full unit but presumably there wouldn't be an instruction without some level of hardware acceleration either.
The bigger problem is that AMD doesn't want to pay to keep up on the software side... at the end of the day when you're coming from behind you just have to pay someone to port key pieces of software to your platform. AMD has really coasted for a long time on letting the open-source community do their work for them, but that's not going to fly with things like PyTorch or other key pieces of software... if AMD wants the sales and the adoption of their hardware, it's just going to have to pay someone to write the software, so that people who want to do research and not PyTorch maintenance can justify buying the hardware.
I am not particularly interested in the perceived historical justifications for the current situation, it doesn't matter to the businesses who might be AMD's customers. And actually in many ways they've gotten even shakier recently, what with dropping RDNA support from their NN/ML package. As a cold statement of reality, this is table stakes going forward and if AMD doesn't want to do it they won't get the sales.
It's not even just PyTorch either, it's... everything. AMD is just coming from a million miles behind on the software, and "welp, just write it yourself if you want to use our hardware" is not an attitude that is conducive to selling hardware.
That seems very ill-advised. Nvidia's libraries being usable by a broad range of developers on a wide range of hardware is critical to their wide adoption. AMD cannot expect to have real adoption if only their fancy enterprise-grade unobtanium cards support ML systems. AMD needs a wide and engaged community trying to use their stuff to figure out what software/drivers they simply have to build.
Talking specifically about ROCm here, their ML package.
God this is such a tough paragraph to write accurately. AMD themselves have conflicting information all over their docs and repos and half of it is not even marked as "outdated"...
The official supported platforms at this point are RDNA2 (pro), GFX9 (pro), and CDNA. Consumer versions of these (RDNA2, Radeon VII, and Vega 56/64) probably work, although Vega 56/64 are an older version with much less hardware support as well. RDNA2 support is also "partial" and ymmv, things are often broken even on supported cards.
If RDNA2 works, then RDNA1 may work, but again, mega ymmv, things may not even be great with RDNA2 yet.
The "hardware guide" link talks about supporting Vega 10 (that's V56/64) and says GFX8 (Polaris) and GFX7 (Hawaii) are supported... but that doc is tagged 5.0 and 5.1 was the release that dropped the other stuff. So I'd say Vega 64/56 chips are probably broken at this point, on the current builds.
Up until earlier this year though, it was unsupported on any consumer card except Radeon VII. They dropped Hawaii/Polaris/Vega support about 6 months before they started adding partial RDNA2 support back.
And in contrast... NVIDIA's shit runs on everything, going back 10 years or more. At least to Kepler, if not the Fermi or Tesla uarch (GeForce 8800 series). It may not run great, but CUDA is CUDA, feature support has been a nearly forward-only ratchet, and they've had PTX to provide a Rosetta stone in between the various architectural changes.
I mean... at the end of the day the hardware hasn't changed that much, surely you can provide a shader fallback (which is required anyway since RDNA2 doesn't have tensor acceleration). I don't get what the deal is tbh.
Yeah there was unbelievably terrible lag getting Polaris/RDNA2 generation going. It was jaw droppingly slow to happen! But it did finally happen.
I was afraid all RDNA[n] were going fully unsupported, which felt like a sure invitation of death.
Sounds like there's still a lot of uncertainty, but it also doesn't sound as bad as I'd first feared; it seems like RDNA2+ could probably hopefully possibly work decently well. As opposed to: you have to buy unobtanium, hard-to-find, stupidly expensive cards. Seems like it's still playing out, & we don't know what RDNA2+ is good for yet, but it doesn't sound like the walk towards certain death this originally sounded like.
Thanks for the intense, hard-to-develop reply. A lot of in-flight status updates to gather. Appreciated. It really should be better established what AMD is shooting for & what we can expect.
They did try. A few years ago AMD and Apple collaborated to make OpenCL, which was a pretty half-hearted attempt at building a platform-agnostic GPGPU library. Their heart was in the right place, but that was part of the problem. Nvidia's vertical control over their hardware and software stack gave them insane leveraging power for lots of dedicated use cases (video editing, gaming, 3D rendering, machine learning, etc.)
In the end, even after years of development, OpenCL was just really slow. There wasn't a whole lot of adoption in the backend market, and Apple was getting ready to kick them to the curb anyways. It's a little bit of a shame that AMD got their teeth kicked in for playing Mr. Nice Guy, but they should have known that Nvidia and Apple race for pinks.
Intel and AMD never bothered to move OpenCL beyond bare-bones C source, while NVidia not only moved CUDA into a polyglot ecosystem early on, they doubled down on IDE tooling for GPGPU computing and first-class libraries.
Google never bothered with OpenCL on Android, pushing their C99 Renderscript dialect instead.
Apple repented of offering OpenCL to Khronos when its direction didn't go the way they wanted.
Those that blame NVidia for their "practices" should rather look into how bad the competition has been from day one.
OpenCL had a bit of a "second-mover curse" where instead of trying to solve one problem (GPGPU acceleration) it tried to solve everything (a generalized framework for heterogeneous dispatch) and it just kinda sucks to actually use. It's not that it's slower or faster, in principle it should be the same speed when dispatched to the hardware (+/- any C/C++ optimization gotchas of course), but it just requires an obscene amount of boilerplate to "draw the first triangle" (or, launch the first kernel), much like Vulkan, and their own solution is still a pretty clos
HIP was supposed to rectify this, but now you're buying into AMD's custom language and its limitations... and there are limitations, things that CUDA can do that HIP can't (texture unit access was an early one - and texture units aren't just for texturing, they're for coalescing all kinds of 2d/3d/higher-dimensional memory access). And AMD has a history of abandoning these projects after a couple years and leaving them behind and unsupported... like their Thrust framework counterpart, Bolt, which hasn't been updated in 8 years now.
The old bit about "Vendor B" leaving behind a "trail of projects designed to pad resumes and show progress to middle managers" still reigns absolutely true with AMD. AMD has a big uphill climb in general to shake this reputation about being completely unserious with their software... and I'm not even talking about drivers here. This is even more the widespread community perception with their GPGPU/ML efforts than with their drivers.
AMD doesn't have a library of warp-level/kernel-level/global "software primitives" like Cuda Unbound or Thrust either. So instead of writing your application, you are writing the primitives library, or writing your own poor implementation of them.
It's just a fractal problem of "the software doesn't exist and AMD would really rather you write it for them" all the way down and nobody wants to do that instead of doing their own work. AMD is the one who benefits from the rewrite, for everyone else it's a "best case scenario it works the same as what we've already got", so if AMD isn't gonna do it then pretty much nobody else is gonna leap on it. And then AMD has poor adoption and no software and the cycle continues.
AMD really really just needs to get serious and hire a half dozen engineers to sit there and write this software, cause it's just not going to happen otherwise. It's a drop in the bucket vs the sales to be realized here even in the medium term, like one big ML sale would probably more than pay those salaries. They're not doing it because they're cheap or they're doing it because they're not really serious, take your pick, but, AMD is no longer that broke, they can afford it and it makes financial sense.
Again, not a "nice" thing to say but it's the cold truth here. I feel like I've made some variation on this post about every 6 months for like 5 years now but it's still relevant. If you as a vendor don't care about writing good code for key features/libraries for your product, nobody else is either, and you'll never get uptake. It's the same thing with AMD/ATI not embedding developers with studios to get those optimizations for their architectures. Console lock-in will only get you so far. If you don't care about the product as a vendor, nobody else will either.
It's remarkable how much flak Jensen got for "NVIDIA is a software company now" back in 2009, and how people still don't get it, AMD is not a software company and that's why they keep failing. Writing the framework that turns into StableDiffusion and sells a billion dollars of GPUs is the NVIDIA business model, AMD keeps trying to jump straight to the "sell a billion dollars of GPUs" part and keeps failing.
The ROCm software primitives library is rocPRIM [1] and the ROCm equivalent of Thrust is rocThrust [2]. Though, if you're a user of CUB, it might be easier to use hipCUB [3] to support both platforms.
That's HIP, and you're putting your faith in ongoing support, but, it's cool that those libraries got ported. I bet my old thesis would probably HIPify pretty well.
> as to why AMD can't create a similar dedicated ray-tracing functionality in their chips?
They do and it works well.
> out how Meta and Google are doing this themselves.
Meta and Google develop products everywhere all the time.
The author doesn't play or develop games, so it's okay that he doesn't really know anything or can meaningfully comment on it. He just took what Huang said and Ctrl+C Ctrl+V'd it.
WHY DO PEOPLE CARE ABOUT RAYTRACING?
Photoreal gets financing. For your game, for your crypto thing, whatever. Raytracing makes photoreal demos at pre-financing costs.
Also essential reason why Unreal Engine is appealing. Unity is for people who make games, Unreal is for people who finance games.
WHY DO PEOPLE CARE ABOUT AI-GENERATED CONTENT?
The Darwinian force here is the same: people believe you can make a game (or whatever) out of DALL-E or whatever dogshite. They don't believe you when you say you would hire an artist in Indonesia for the same cost (or whatever). So you'll get financed by saying one and not the other.
The reasons why don't really matter. It's idiosyncratic. You're going to spend the money however it is. Just like every startup.
Also the AI generation thing attracts people who think they're the smartest, hottest shit on Earth. That attitude gets financing. Doesn't mean it reflects reality.
DO THESE TECHNOLOGIES MATTER?
I don't know. What does Ben Thompson know about making fun video games? It's so complicated. I doubt Big Corporate or VC capital is going to have financed the innovative video game that uses raytracing or AI generated content. It's going to be some indie game developer.
> I doubt Big Corporate or VC capital is going to have financed the innovative video game that uses raytracing
Sorry, you don't believe that a AAA game developer is going to take full advantage of the latest high-end GPU capabilities? That's the one thing they have reliably done for the past 25+ years.
Uh, AAA game developers not using NVIDIA-developed tech? Yes. Tessellation was a huge fail that didn't deliver on performance. Geometry shaders were another huge fail. Mesh shaders are shaping up to be pretty unimpressive. Pretty much all of NVIDIA's middleware (Ansel, Flow, HairWorks, WaveWorks, VXGI) haven't really caught on fire; usually they're placed into the game through NVIDIA paying developers large sums of money, and usually ripped out upon the next game release.
What instead happens is that game developers are developing large bunches of tech in-house that exploit new features like compute shaders, something NVIDIA has struggled to keep up with (lagging behind AMD in async compute, switching compute/graphics pipelines has a non-zero cost, lack of FB compression from compute write).
I say all of this as a graphics developer working inside the games industry.
Ben makes a bunch of historical errors, and some pretty critical technical errors; I consider it to be almost a puff piece for NVIDIA and Jensen.
(NVIDIA definitely did not invent shaders, programmability is a blurry line progressing from increasingly-flexible texture combiners, pioneered by ArtX/ATI, NVIDIA looooved fixed-function back then. And raytracing really does not function like however Ben thinks it does...)
The keyword there was innovative, I'm sorry. Like imagine Portal, an innovative rendering game - now, is there something innovative that is going to be done with a raytracing feature like accelerated screen space reflections? No, probably not, people could render mirrors for ages. It's going to be like, one guy who creates cool gameplay with the hardware raytracing APIs, just like it was a small team that created Portal.
They literally did. They found that Lumen looked better and cost less performance than hardware ray tracing. At least that's what they said in a developer interview a few months ago.
> Have you?
No, I haven't. My day job isn't even in the game industry.
I've only dabbled a little after I bought my first VR headset.
The ire was because you were dissing the article author by saying they weren't a game dev, with the argument that UE is only used because it looks shiny to management, which is pure nonsense.
I cannot find the interview I remembered so you're both almost certainly right wrt hardware ray tracing.
I still think that their original premise about unreal engine is nonsense however.
Calling it a "valley" almost seems silly. They're returning to normal after a boom perpetuated on smoke.
Nvidia will be fine. Investors don't like to see it because they somehow couldn't comprehend that the growth they saw was completely artificial (how is beyond me), but the company will be fine.
Their latest decision with the 4000 series was smart though. They realized suckers will pay insane amounts for the cards, even disregarding prior crypto. So, make 4000 series insanely expensive. That will drive sales of 3000 series to empty the over-supply and make their relatively lower prices look like a steal.
In the end, they get people to way over-pay on both the 3000 and 4000 series. Double dipping!
> Investors don't like to see it because they somehow couldn't comprehend that the growth they saw was completely artificial (how is beyond me)
This was absolutely fascinating to watch over the past 18 months.
Anyone looking for GPUs starting from February 2021 knew exactly what was going on. Cards were being gobbled up by miners. They never made it to any shelf for a consumer to buy; webshops were botted, and very few humans had a chance to buy.
Regular consumers only got them on eBay or similar. And it was blindingly obvious that consumers weren't paying four figures markup for certain cards. When ETH skyrocketed to almost $5000, a friend of mine was reselling 3060 Tis he bought for €419 from Nvidia's webshop, using a bot he wrote, and resold them for €1250. His regular buyer took all cards he could get his hands on (dozens per month), and resold them to central European countries (popular among miners) for even more markup.
Again, this was blindingly obvious. Availability of cards followed the ETH price; when ETH dipped in summer 2021, cards became available again. When ETH went up towards the end of the year, my friend was selling used 3080s for €1800 again. Then ETH started to crash again, and suddenly Nvidia was facing a massive oversupply.
The fact that Nvidia to this day refuses to acknowledge the role that miners played in artificially inflating growth is weaselly, to say the least.
I think the investors are just hopping stocks. Nvidia was generating extreme profits, now that that's over they'll jump to another hypey thing. Energy perhaps. They don't care about the companies they back. Just about money.
It's causing good companies with a long-term vision to suffer (note I'm not considering Nvidia one of these) and promoting the hollow-shell money grabbers and pyramid schemes.
I don't know how it can be solved though.. We've made this situation ourselves by speeding business up to "the speed of light". Perhaps a bigger role for governments in investment but I know that's cursing in the American church of the free market :)
But in the long term we really have to find some stability back in the system IMO.
IMO the instability was always "really" there, but reduced information flow hid those fluctuations. Maybe we just need to get used to the instability - business is risky and will have ups and downs, responding quickly ultimately makes the system more efficient in the long run.
In the price-sensitive consumer space, price/performance matters a lot. But everywhere else, libraries/SDKs/interoperability matter as much or more. Most of the Stable Diffusion stuff that is appearing is heavily powered by Nvidia cards, with AMD support being spotty at best. Same goes for many other areas in AI/ML.
AMD compatibility (via ROCM) for Tensorflow and PyTorch are perfectly workable (albeit a little finicky on old AMD cards). Stories of AMD's ML demise are greatly exaggerated.
Sure, for Tensorflow and PyTorch you can use shims. But there are countless other examples. I'm not saying it's impossible, but even you have to agree that the tooling is not as mature and extensive as with Nvidia?
AMD upstream their changes: no need for shims for either library. From my point as a dabbler in the arts - most implementations [1] of ML papers use pytorch, tensorflow or other high-level libraries. I'm yet to see one implemented in CUDA directly.
1. e.g. Huggingface transformers or stable diffusion
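For what it's worth, the "no shims" point shows up in practice: the ROCm builds of PyTorch reuse the torch.cuda namespace, so the same script runs unmodified on either vendor's card. A minimal sketch, assuming a PyTorch install with GPU support for whichever vendor you have:

```python
# Same code path on an Nvidia (CUDA) build and an AMD (ROCm) build of PyTorch;
# the ROCm build maps torch.cuda calls to HIP under the hood.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
print(device, torch.version.cuda, getattr(torch.version, "hip", None))

x = torch.randn(1024, 1024, device=device)
y = x @ x.T                      # runs on the GPU on either vendor (or CPU as fallback)
print(y.shape, y.device)
```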
What makes you think AMD's chips are that much less complex? They hold up well in benchmarks.
And add price to the comparison (since we are commenting on price/performance), and AMD already comes out ahead of Nvidia. Here's an article [1] that basically reproduces AMD's own PR on this, but other sites corroborate it.
Navi 33 (their midrange chip, for like 7600X) will be very cheap indeed - it is both 6nm GCD and 6nm MCD io-dies, with a 128b bus. While the cache hasn't grown this gen - it is much faster apparently (in bandwidth) so it's a throughput vs hitrate tradeoff.
Navi 31, the 7900X or whatever - it kinda depends where the performance lands, I think. If they match or beat NVIDIA they can pick their price... I think they'd probably pick $1,499 or similar to at least come in under some perceived high prices. If the cutdown comes in at $999 with a slightly more equitable configuration, that'd do fine, but I'd expect a pretty decent step between the AMD offerings as well. For $1,199 or so, eh, maybe; they certainly didn't hold back from pricing RDNA2 high against Ampere, so it's possible, but it wouldn't satisfy any of the critics lol. People want cheap cards, and that's not going to happen until at least miner sellthrough starts to slow down - I figure six months before even AMD really starts to crank Navi 33 out.
(Navi 33 may launch at CES and have availability in like, late feb or something, with production ramping, but I bet it's probably March before we see even AMD willing to go... the miner dump is still just getting started, people seem to have been caught completely unawares... somehow. lmao)
Navi 33 was originally planned to go first; pushing it back tells me they're skittish about miner inventory too, imo. So I don't think they're going to launch some super-cheap, 6800-non-XT-like SKU... with one chip I dunno if AMD will go much below $1k either - maaaybe $900. Remember, they have to do that with the big Navi 31 die, and Jensen ain't wrong about node costs either; people are gonna have to get used to it. 5nm R&D and wafer costs are skyrocketing even compared to 7nm, and they're gonna increase hugely again at 3nm. 6nm is gonna be real, real popular for a long, long time, because it's the only decently-priced node TSMC makes right now.
You're not wrong, Walter, you're just an asshole. Or at least not wrong in general - the Ada price structure is much more about keeping price competition away from Ampere inventory. They can't launch anything too cheap or it'll push Ampere and miner cards down even further. Even the "cheap" stuff like the 4080 12GB is really a 4070 or 4070 Ti, and would probably be an eye-watering $700 or $800 in a somewhat more normal market. He's not wrong that it would have been more expensive than people wanted no matter what: node and R&D costs are getting more and more bullshit, the gains are smaller and come with more and more caveats, and the old "just make it 2x faster per gen!" is no longer reasonable in a post-Moore's-Law, post-Dennard-scaling world.
(Calling it a "4060" is bullshit though, because that ignores the contribution of the cache... it's like calling a 6900 XT an "RX 480 successor" because it only has a 256-bit bus. The cache changes things, and anyone doing the "4060" stuff can be pretty safely discounted as hysterical.)
The power density we've seen on 5nm so far with Zen 4 and Ada probably won't spare AMD either - I bet they're at 400W or higher, as rumored a while back, if they land around the same performance as a 4090. 5nm is just super dense, and clocking it high takes a lot of juice.
Same thing happened during the last crypto boom. It was impossible to find 1000 series cards, and Nvidia saw how much people were willing to pay, so they priced the 2000 series high, just as (former) crypto miners were selling 1000 series cards.
Belittling the community, calling them teenagers, and telling them to "get over it" doesn't actually materialize the $1,600 needed to buy the card, though - especially right in the middle of a period of inflation and economic downturn. And then there's the competition Nvidia is in with itself given the glut of used 3090 cards on the market, and the 3090 was only priced so high because of crypto mining in the first place, and that's gone now.
Who knows, maybe you're right or maybe Nvidia's in for a valley, or Nvidia will end up dropping their prices.
Yeah, I remember a similar fit being thrown around the 20xx series release not that long ago. I had a 1080 Ti at the time, so I didn't care to upgrade. Come the release of the 30xx series, suddenly almost everyone I knew and everyone on Reddit was upgrading from their 20xx cards (or complaining how overpriced the 30xx series was and how they weren't going to upgrade to it from 20xx).
People say they're pissed, but open their wallets all the same. I wouldn't correlate internet outrage to anything real. Same as every time people say they're going to boycott anything online.
The pricing of their cards is probably fine once you consider how stupidly expensive fab costs are going to be, but both their PR communicating that and their weird-ass 4080 naming nonsense are still hot garbage.
I do agree though that their long-term fundamentals are fine. They're still reactive enough to be competitive with AMD (another company with strong fundamentals), avoided amateur mistakes like going full cryptobro, and they just generally positioned themselves well in the global market.
Fab expenses haven't risen that much, meanwhile the margin for AIBs has gone from 30% to 5%. Nvidia is making huge margins on their chips, probably the most of any company selling to consumers.
The estimate I've seen is that AD102 costs 2.2x as much as GA102, and I personally think that might be an undercount. Samsung 8nm was cheap as fuck; NVIDIA is now using a customized TSMC N5P with an optical shrink, though ironically yields might go up - there were always rumors about Samsung 8nm sucking ass.
Anyway, it only gets worse: 5nm design costs were $542M, and comparable 3nm design costs are projected at $1.5B for a "complex NVIDIA GPU" (a Hopper successor, I guess). Node costs are probably gonna roughly double again.
6nm is gonna be very popular because it's the only node that isn't zooming upwards in cost faster than the benefits can be justified to the consumer.
Personally I'm very eager to see AMD's upcoming RX 7000-series from a pricing as well as performance standpoint compared to Nvidia's new 4000 series.
As noted in the article, "Nvidia likely has a worse cost structure in traditional rasterization gaming performance for the first time in nearly a decade" due to the much lower die costs that AMD is targeting. I predict Nvidia's share price will continue its downward trajectory - both on the weakness of 4000 series (given high price and surplus 3000 series inventory) and on AMD pricing pressure.
What's more interesting is Nvidia's long term focus on ray tracing and continuing to dedicate more and more precious die space to it. AMD, on the other hand, seems to be more focused on the short term: rasterization performance. It's a bit reminiscent of Apple vs. Samsung, where AMD is Apple as it waits on technologies to become mainstream before full adoption and Nvidia is Samsung as it bets on new technologies (e.g. foldables).
The article kind of led with Crypto, but could have easily started and ended with Crypto. Full Stop.
Overlay Ethereum-USD and Nvidia charts for the past 5 years and what does that tell you? My take is both Nvidia's stock price and Ethereum-USD price were products of helicopter money. Sure some gamers bought cards, but not on the scale to which miners acquired them.
Analysis of mining hash rate post-merge bears this out[0], with some estimates putting $5 to $20 billion (USD) of newer GPUs at no longer being "profitable."[1] The 'chip' shortage meant Nvidia could not satisfy demand for their products; I posit no amount of chips would have satisfied demand. They, along with many other companies, made record profits during the pandemic. There was just soooo much 'extra' money flowing around with stimulus, overly generous unemployment, PPP, and more.
I can't tell you how many people were YOLO'ing every cent of their stimulus into crypto and WSB stonks. It was an enormous cohort, judging by the explosion of Robinhood and Coinbase use mentioned on the WSB subreddit. Mooning and diamond hands FTW, eh?
Mining crypto with gainz from a roulette spin on Robinhood options? Yeah, and while we're at it, let's buy 4 mining rigs, because mining Ethereum is very, very profitable right now. The thinking was that you could literally turn electricity into heat while making money (even while you sleep, no less) and trade options while awake. It may look crazy now, but the subreddits were non-stop talk like this.
There are mining rigs all over my Facebook Marketplace and local Craigslist now. Crypto winter is here (for who knows how long), and Nvidia is not going to make those kinds of returns for the foreseeable future. Reality check 101.
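If anyone wants to actually eyeball that overlay, here's a minimal sketch - assuming the free yfinance package and Yahoo Finance's "NVDA" and "ETH-USD" tickers (any other price source works the same way):

    import yfinance as yf
    import matplotlib.pyplot as plt

    # Daily closes for both series over the past five years.
    closes = yf.download(["NVDA", "ETH-USD"], period="5y")["Close"]

    # ETH trades seven days a week and NVDA doesn't, so keep only shared dates,
    # then rebase both series to 1.0 at the start so they overlay cleanly.
    closes = closes.dropna()
    rebased = closes / closes.iloc[0]

    rebased.plot(title="NVDA vs ETH-USD, rebased to 1.0")
    plt.ylabel("growth multiple")
    plt.show()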
> Overlay Ethereum-USD and Nvidia charts for the past 5 years and what does that tell you? My take is both Nvidia's stock price and Ethereum-USD price were products of helicopter money. Sure some gamers bought cards, but not on the scale to which miners acquired them.
While I do get your point, and largely agree, I don't think the stock price is a good comparison. You could also overlay Meta or Amazon stock and reach the same conclusion. The last few years were an across-the-board tech bubble.
Agreed, it has been very bubbly across the board. However, I think Nvidia (AMD as well) and Ethereum are especially correlated, as the first- and second-order effects seem proportional to the incentive feedback into their respective ecosystems. Ethereum has been hounded by the 'wastefulness' of mining, which is to say 'proof of stake' wasn't born in a vacuum. Likewise, Nvidia GPUs were being bought at many multiples of MSRP in BULK! Call it a perfect storm, call it what you want... but Nvidia has a tough road ahead.
How many people do you think are 'into' ML enough to spend $1k+ on a SOTA GPU and associated hardware? I am slowly getting there, having moved from Colab to my own setup...
However... Ethereum enabled/lured many everyday Joes into buying 4+ $1k cards due to the incentive structure of 'buy mining rig, then buy a Lambo.' Setting up GPT-NeoX is orders of magnitude more difficult than pointing GPU compute at NiceHash. I really have a hard time thinking ML will have any meaningful uptake in that regard, because the incentive structure isn't the same and it's much, much harder.
Big cloud seems to be going its own way with regard to compute. GPUs are great for ML, but that doesn't mean they will always hold the crown. TPUs, NVMe storage, and chiplets may find a better path with newer software.
I just don't see how Nvidia really thrives without drastically reducing price (less margin). I don't think they are dead, but they are in big trouble as are many companies.
Correct, MBPs can run Stable Diffusion and other ML workloads on non-Nvidia hardware, and I clearly see this becoming a trend. GPT-J, GPT-Neo and NeoX run really well on Colab TPUs; again, these are not made by Nvidia.
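As a minimal sketch of what that looks like in PyTorch (assuming a reasonably recent build - the Apple Silicon "mps" backend shipped around PyTorch 1.12), the same code picks up whichever accelerator happens to be present:

    import torch

    def pick_device() -> torch.device:
        if torch.cuda.is_available():            # NVIDIA (or ROCm) GPU
            return torch.device("cuda")
        mps = getattr(torch.backends, "mps", None)
        if mps is not None and mps.is_available():
            return torch.device("mps")           # Apple Silicon GPU in an MBP
        return torch.device("cpu")

    device = pick_device()
    x = torch.randn(512, 512, device=device)
    print(f"matmul ran on {device}: {(x @ x).shape}")

Libraries like Hugging Face diffusers accept a device picked this way via .to(device), which is roughly how Stable Diffusion ends up running on an M-series GPU.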
Training is dominated by Nvidia; I will not question that, as most papers I have seen say something similar. I will say that I do not believe training will always be dominated by Nvidia's datacenter options. Two things will hasten the withdrawal from Nvidia: CUDA and hardware advances around the motherboard (ASICs, RAM proximity, PCIe lanes, data transfer planes, etc.).
Think about this... what if a company released an ML training/inference ASIC that used regular DDR4/NVMe, performed like 4x A100s, and cost $8,000? Would you be interested? I would! I don't think this is too far off; there has to be someone working on this outside of Google, Apple and Meta.
Lots of people are talking about the economics, but my takeaway from the article was that we're on the cusp of something really cool. The combination of ray tracing, physics via ray tracing, and GPT-like content generation are all the ingredients needed to make huge immersive worlds. Imagine GTA with every house, office, and skyscraper full of unique, realistic-looking people doing realistic things in a fully destructible environment.
Offtopic: Stratechery has a podcast at [1] including the translated interview with Nvidia CEO [2]. The weird thing from the marketing/conversion perspective is that the interview translation says "To listen to this interview as a podcast, click the link at the top of this email to add Stratechery to your podcast player." instead of having a direct link in the web page.
If chiplets are the silver bullet for die costs, I don't get why Nvidia won't go for it. After all, Nvidia has smart people too. There must be some design tradeoffs that made Nvidia's engineers believe their approach is still the best in terms of perf and area. As for 4090 and 4080 pricing, I think Nvidia's strategy is: those are the higher tier, with the 3000 series as the lower-tier, more affordable option. They feel they can succeed with this strategy in the current market (a year or so) given the competitive landscape, and then use the 4090 and 4080s to drive up ASPs and margins. Their biggest motivation is sustaining the stock price.
Even AMD is only splitting off IO dies this generation. That's an advantage for sure, because you can push 20% or so of your die off to separate chiplets, which means the compute die can be roughly 25% larger (1/0.8 = 1.25x) than it otherwise could be with a purely monolithic design.
(and in particular you can also push them off to N6 or some other super-cheap node and save your super-expensive N5 allocation for the GCD... at the cost of some power for infinity fabric to send a TB/s of data around between chiplets)
But so far nobody seems to have progressed on splitting the GCD itself... which has always been the actual goal here. Multiple chiplets computing together like one, transparently without software needing to be catered to like SLI.
AMD's general approach seems to be "big cache-coherent interconnect between GCD chiplets", which is unsurprising on multiple levels (it's how Infinity Fabric already works in their CPUs, and it's how SMP works almost everywhere today), but there still seems to be some barrier to making it work with graphics. There are of course a lot of gotchas - like temporal data, or needing access to other parts of the frame that may live on another chiplet - but cache-coherent interconnects generally solve this.
But yeah, NVIDIA isn't avoiding some silver bullet here; they've actually been at the forefront of this with NVLink since Pascal and NVSwitch since Volta - the first serious "multi-GPU that acts like one" attempts. I don't know why they're not doing it, or at least the "split the IO dies off" thing.
Edit: thinking about this a bit more, one reason they may not be doing the MCD/IO-die idea is that they're doing L2 instead of L3. L3 can be pushed off to a separate die... L2 is really part of the SM, it's a lot "closer". Again, they may have engineering reasons to favor L2 over L3, or it may be a patent thing.
I hadn't heard the term chiplet before, thank you! When I got out of school in the late 90s, system-on-chip (SoC) was a pipe dream. It has issues because memory wants a different process than CPU logic, so it's hard to put all of the components on a single die.
I always thought the situation was frankly pretty silly, because I didn't want one big die anyway. I wanted a scaleable grid of chips like the 3D processor from the Terminator 2 movie.
It took over 20 years, but maybe we'll have a chance to get something like that with an open source RISC-V. Maybe an array of matchhead-size CPUs at 1 GHz costing $1 each that could be scaled to hundreds or thousands of cores with content-addressable memory for automatic caching and data locality. Then it could be treated as a single processor with access to a one large shared memory space and we could drop the (contrived) distinction between CPU/GPU/TPU/etc and get back to simple desktop computing rather than dealing with the friction of vertex buffers.
My guess is that Nvidia doesn't want any of that to happen, so works to maintain its dominance in single-die solutions.
"match-head sized chiplets" sort of falls into the chiplet mythos I think. Chiplets aren't magic, they actually increase power usage vs an equivalent monolithic chip, data movement is expensive and the more data you move the more expensive it is. People just think chiplets are efficient because AMD made a huge node leap (GF 12nm to TSMC 7nm is like, more than a full node, probably at least 1.5 if not 2) at the same time, but chiplets have their own costs.
The smaller you split the chiplets, the more data is moving around. And the more power you'll burn. It's not desirable to go super small, you want some reasonably-sized chiplet to minimize data movement.
Even if you keep the chiplets "medium-sized" and just use a lot of them... there is still some new asymptotic efficiency limit where data movement power starts to overwhelm your savings from clocking the chips lower/etc. And there's copper-copper bonding to try and fix that, but that makes thermal density even worse (and boy is Zen4 hot already... 95C under any load). Like everything else, it's just kicking the can down the road, it doesn't solve all the problems forever.
It isn't the case that NVIDIA won't go for a chiplet design, they're working on it but it isn't ready yet. Current expectations for NVIDIA's chiplets to be ready seem to be for either a refresh of the 50xx series or the 60xx series.
They will clearly suffer for a while, a bad year or maybe two, not enough to sink the company or to kill the golden goose.
They are spreading themselves a bit thin with all their projects, but I can see them succeeding with enough of them to be in a better relative position 2-3 years from now.
GPUs are becoming a general-purpose data-plane computing platform; this isn't simply about gaming, crypto or AI training, but everything.
There are a number of seriously conflicting stories. Some of them say "the cards were individually tuned and undervolted to run at maximum efficiency to make the most money so they'll be fine" and some say "these cards were overclocked and left to run in a boiling hot shipping container then they washed them off with a water hose".
But I used my Radeon RX 6700 XT for mining nearly 24/7 for about 10 months (between purchase and when it paid itself off), while using it for gaming in between (I'd obviously stop mining). It ran around 65°C during that time. Very low core clocks, but memory was run at close to the maximum recommended speed by AMD's Adrenalin software. At least so far no signs of any problems.
tldr: Cards (like any other piece of electronics) do have a lifespan, but mining doesn't affect that. Cards that are kept clean and in better working conditions will run faster.
NV saw the profits it made from the shortages of the last mining boom and determined that it was better to make a much higher profit per unit, even at the cost of shipping fewer units. They went full bore on the 'costs and TDP don't matter' approach. They saw that people were willing to pay any price for cards that were 5% faster at a higher power draw.
Only, the mining boom is over, there's a mountain of 3000-series cards floating around out there, and NV is sweating. The flood of used mining cards is going to make this the first good time to buy a GPU in years, and no amount of shenanigans with the 4000 series is going to change that math.
This sounds like a win for nVidia in any case, since it'll entrench their RTX, CUDA, NVENC and DLSS moats, making them ubiquitous targets for games and driving purchasing decisions in the future.
But they'll need to wait out a generation for that - which probably isn't the first time.
>Similarly, if AI applications become democratized and accessible to all enterprises, not just the hyperscalers, then it is Nvidia who will be positioned to pick up the entirety of the long tail.
How does this square with cloud providers themselves being hyperscalers? What's to stop Google's/AWS's/Microsoft's hardware divisions from outpacing Nvidia, since those businesses need the hardware themselves and can then provide it to the end users of their cloud platforms?
> I don’t think we could have seen it. I don’t think I would’ve done anything different,
This is a CEO admitting they essentially learned nothing, will not admit they made a wrong choice in light of data, and typically means they will do it again.
They could have seen the writing on the wall about how people were angry about crypto emissions and that proof of stake was being bandied about as a solution (and that Cardano already had it, so it's not just a theory).
I've worked AI into my day-to-day workflow to an extent where I'm simply not worried about NVidia. It's like the early days of the internet, or of computing. I can do things my friends can only dream about.
The key risk to NVidia is if someone catches up to them. Intel. China. AMD. Whoever. However, if they maintain their lead, the demand will come.
I've been hearing about this for a while but has any mainstream game actually demonstrated meaningful realtime graphics streaming? I suspect that kind of like NFTs or Metaverse, most developers see that it's a pretty bad idea in a practical sense, but talk about it because it's easy PR.
People just barely tolerate games which require an internet connection to run for DRM, I can't imagine them appreciating the requirement for the connection to also be fast and stable enough for streaming, especially with handheld PCs starting to catch on. Especially since they'll also want to take down the expensive servers quickly after sales fall off, most likely making the games unplayable.
> People just barely tolerate games which require an internet connection to run for DRM
While some people are generally very loud about this on Reddit and gaming forums, the average person doesn't seem to care. Even people who complain loudly about it online will quietly buy it if it's the only choice. It reminds me of that old screenshot of a Steam group dedicated to "Boycotting Modern Warfare 2" in 2009. The day after release? Most of the group was playing it.
There's a difference between being angry at DRM in general and not being able to play a game because your cheapskate ISP gave you a shitty overbooked connection.
Stadia hit that issue quite a bit. And I haven't really seen cloud titles on Switch (a famously offline console bought by people who really want to play away from stable networks) have significant sales success.
I played with Microsoft Flight Simulator 2020 via Game Pass streaming. It was fine - games like that don't require split-second response times - and it looked like I was watching an interactive, highly compressed YouTube video of the game set to medium, but the whole time I was thinking, "Why don't I just play the prettier version on my PC?"
Is there any kind of precedent where bad news for the semiconductor manufacturers means good news for specific downstream industries? Examples? I.e. the oversupply of chip X allowed industry Y to buy more chips resulting in innovation Z...
What's stopping them from just selling the excess chips to hyperscalers? They already force them to buy "datacenter" chips (pure price discrimination). Why not cut them a license to sell the gaming chips in a cloud environment?
There are obviously shortages of GPUs for AI training on all the public clouds, and the price (due to the above price discrimination) is very high.