At first I thought this was the 48GB card that had been rumored for a while, but it turns out it's just a sanctions-compliant RTX 6000 Ada for export to China.
> "That's not productive," Raimondo said. "I am telling you if you redesign a chip around a particular cutline that enables them to do AI, I am going to control it the very next day."
So basically she's saying "if you do exactly as much as the rules allow, I'll change the rules to not allow that". That sounds like the sort of thing that tends to end with whichever government agency did it on the losing end of a lawsuit?
Well, what she is saying is that if you intentionally attempt to circumvent the intent of the regulation, they will adjust the regulation to continue to bring about the intent. An awful lot of law and regulation isn't cut and dried like a computer program; it is intent-driven. Sounds like Nvidia understood the intent but is trying to knowingly circumvent the intent by being minimally compliant. They can sue but it won't go their way, and the government could move against them hard for willful non-compliance.
I thought the performance thresholds they put in the law were supposed to express just that intent (i.e., sell only second-rate GPUs to China). If the real intent was to stop all GPU sales to China, why doesn't the law say so?
> knowingly circumvent the intent by being minimally compliant. They can sue but it won't go their way, and the government could move against them hard for willful non-compliance.
In the eyes of the court, there's no "minimally compliant". If they are compliant, they are compliant.
If the agency keeps moving the goalposts to "clarify" intent because it was not sufficiently clear initially, that's on them, but it's also not a failing of Nvidia.
> In the eyes of the court, there's no "minimally compliant". If they are compliant, they are compliant.
Right, so if there is a new export ban then Nvidia will be in violation, and the agency just clarified that it will create such a ban if Nvidia makes it necessary.
The cost of playing cat and mouse would fall on Nvidia, which loses lots of R&D time and money, not on the agency, which only has to copy-paste the product details into what is probably a preexisting form.
I'd also note that a lot of regulation is not enforced in court, but by the regulator. They get to decide. You can appeal to a court, but courts also grant regulators pretty wide latitude to interpret behavior as willful non-compliance even when it is literally compliant, to the extent it enables and facilitates the specific outcome the regulation is intended to prevent. Technologists always have a hard time accepting the idea that laws and regulations aren't absolute or formulaic. Judgment and common sense can also be used, and often are, in situations where there is intentional evasiveness.
"We put a speed bump at the dangerous crossing to prevent accidents, but they made a car that has more suspension so that they can still cross it at almost the same speed."
They tried the speed bump, they did the suspension thing, now it’s time to reduce the speed limit and install a traffic camera.
Regulation is always like this. I work in a lot of highly regulated spaces and my work deals with it an awful lot, as it's sensitive stuff. This is how it works. They always try the less restrictive language first, and once some jackass starts trying literalist stuff to circumvent the intent, the thumbscrews come out.
Why do we end up with absurd overly restrictive regulations? Jackasses who knowingly try to circumvent the intent by cutlining the edges of the language. Most regulators try to make the rules loose enough that things they didn't intend to squash, things in line with the intent, don't get squashed.
In this case the intent was obvious. They had no intention of punishing gamers in China or whatever other edge cases. Right or wrong, they didn't want high-end AI-enabling GPUs sold to China, so they proposed something that captured a looser definition, held it up, and said "ok folks, this isn't overly restrictive, just remember why we did this and don't be a jackass."
Then, of course, they went and were a flagrant jackass thumbing their nose at their regulator. First, that's just a stupid idea. Second, it's why we can't have nice stuff.
> Why do we end up with absurd overly restrictive regulations?
Because instead of making the law "go 3 mph, because I'm fearful of what will happen," fearful people install a speed bump so they don't get called totalitarian. Then they get mad when people don't go 3 mph, even though that isn't the law, because, you know, they are totalitarian.
We aren't trying to prevent China from having <x> CUDA cores; we are trying to prevent an outcome, with a fuzzy guess at which hardware specifications would prevent that outcome.
Continuing the traffic analogy: The goal is to prevent accidents. To do this you enact speed limits and then someone causes an accident while obeying the speed limit.
The goal is to not strengthen China's military. To do this you enact limits on GPU tensor cores and then China uses these to improve their military.
I think the solution might be to err on the side of extreme caution. I'd export only what is necessary for inference but retain the hardware to train models while also releasing free "Made in America" models to the world.
They are so dangerous they can't have video cards; we know because the same people who encouraged western companies to go offshore to China for decades told us so.
The problem the USG is dancing around is that this policy is probably not legal if you laid it out in plain text, regardless of whether it's strategically desirable. If tomorrow the USG came out and said "the goal is to deny China access to these tools, prevent their ability to train their own models, etc.", there would be a WTO case for an undeclared trade war, China would probably win it, and retaliatory sanctions from other WTO members against the US would follow.
AI as a technology isn't inherently military regardless of whether it has military applications - just like the computer isn't inherently military. At most it is an intensely dual-use technology, and honestly just like the computer that is completely underplaying just how radical a shift it's going to bring. It is a neutral tech that has broad applications across all areas of computing, the framing of it as being a "military technology" is kinda inherently and deliberately misleading, other than it being a box that the USG is notionally allowed to regulate.
And you can't just arbitrarily decide who "gets access to the computer" (or some other foundationally-disruptive technology) and who doesn't, in a global interconnected economy. This isn't the iron curtain where you have spheres of influence with no interdependence and you can bar the gates, that ship sailed with globalization and the 90s neoliberal "end of history" consensus. China has pushed through to multipatterned 7nm DUV nodes already, which is quite sufficient for solving the military side of the problem, if the US insists on making it a military problem.
Again though the real problem is this is an undeclared trade war, and the USG could not legally do the things it's trying to do if it just came out and admitted the de-facto regulatory standard it was actually applying, or the overall motives. Putting one over on a trade partner is not a valid reason for denying access to a foundationally-changing technology, according to the treaties the US has signed.
On the other hand... try and stop us. But the rest of the world is not deceived either, just like they're not deceived about any of the US's other realpolitik moves. We're doing it because we can, and because we think it's strategically desirable to stomp down at this moment of foundationally disruptive change.
If you take "banana republic" to mean literally every government ever throughout all of history, then you're right. Regulations are imperfect, laws are interpreted, and we don't live in a technopurist nirvana where everything is a literally interpreted smart contract.
What I'm describing is how well-functioning regulation occurs in practice. It's messy, and it depends on people listening to the intent, conforming to the specifics, and not trying to get away with the opposite of the intent by minimally conforming. That it isn't some idealized system escapes no one, but in the 10,000 years we've been working on laws, regulation, and governance, this is as good as it's gotten.
The federal government's foreign policy is giving Fulbright scholarships to the best of the best Chinese students so that they get fully paid PhDs at the best US tech universities, and then kicking them out for losing a lottery three times and sending them back to China, so that they have no choice but to work on products and at companies that directly compete with the US.
Selling or not selling tech to China is the least of the US problems.
It's weird because business incentives already streamline products for the market rather than against some hard limit on resources, and regulations try to do the same thing, just with different inputs and the same kind of artificial boundaries.
> She said traditionally Commerce drew a "cutline" and companies like Nvidia would create a new chip "just below" that line ... "That's not productive," Raimondo said
That seems ridiculous at first glance? I'm assuming a "cutline" here is the maximum performance a chip is allowed to have, and asking companies to leave a fuzzy "extra" margin below it, instead of building right up to the line, is bizarre?
They want them to self-enforce the administration's decision not to enable Chinese AI work. I don't think they want them to be taking the Chinese professional market into account at all, hence the snarky statement.
This kind of lawmaking seems obviously insane to logical programmer types, but it happens regularly in government. They don't have to implement anything in code, right?
Another example is anti-money laundering laws. They drew a cutline at $10k cash per deposit, so of course criminals started depositing $9,999 at a time. The legislative answer was to say that avoiding the $10k limit is a new crime called "structuring". So what's the actual limit then? There isn't one, you just have to not seem too suspicious. Attempting to follow the law whilst also using a lot of cash is itself a crime.
The other fun one is the suspicious activity reports banks are required to file if a customer does something, er, suspicious. What that means isn't defined anywhere. But eventually the regulators advised that knowing SARs exist is itself suspicious, so merely asking if the bank has filed a SAR on you can trigger the filing of a SAR on you.
This is already inevitable. China is on a mad rush to gain technological independence. They've already succeeded in most industries; semiconductors and aviation are two of the few that remain temporarily out of China's grasp.
Nothing that senator does will change the trendline of the inevitable.
I'll believe it when I see it. Lots of companies are in the field now, and if I had to take a bet I would say Intel has way better chances to succeed with Gaudi than small vendor X.
Because it's not just a chip, it's everything else too. There were two startups in Boston, maybe 500 meters apart, doing optical computing. Four years on and there is still no product. "Everything else" is a lot. It's hardware and software. A brilliant idea alone is not enough.
That's why I provided the demo link; see the tokens/second. They are running LLaMA 70B at about 260 T/s without quantization. This is the fastest LLaMA 2 model.
They are already succeeding, it's a matter of time.
They can already build functional smartphones without relying on Taiwan at all. It's not a huge technology leap from a smartphone CPU to a consumer GPU. A design and engineering leap, sure, but nothing a dozen PhDs can't solve in a few years.
Centrally planned economies don't suffer too badly in Strategic Industries, especially not when trying to catch up to the free countries, as long as there's at least some level of capitalism involved. The goals are very clear and they can steal tech quite easily, so the lack of incentives to take risks and innovate doesn't matter. The USSR went from agricultural backwater to nuclear power with its own space station quite fast, largely by stealing tech using ideological true believers.
But of course things started to fall apart from there. As the true nature of the USSR became clear to westerners, ideologically motivated spies started to dry up and the KGB was forced to pay large bribes to get information. They focused on military info, and so their economy started to fall behind in other areas. By the end of the 1980s the gap in quality of life became so huge that Yeltsin famously put his head in his hands and cried after visiting an American supermarket.
China is somewhat centrally planned but more capitalist than the USSR was, and doesn't suffer so badly from the underproduction of consumer goods. Some people claim China is really just a capitalist dictatorship that cosplays as communist by this point, which isn't quite right; it's somewhere in the middle.
But the idea that they won't be able to catch up in advanced tech is just wrong. There are tons of patriotic Chinese people with access to high tech firms in the USA and these days you don't even need them, it's sufficient to just hack networks and steal documents in bulk. Sooner or later they will be the only country in the world that's technologically fully self sufficient. It will just have involved making enemies of everyone else.
There are two areas where they could really start to suffer after that:
1. Erratic decision-making from the top that screws things up. Already a big problem for them.
2. If the west is able to properly tighten up corporate opsec and the flow of stealable ideas dries up.
Of course, China has lots of smart people who could come up with great ideas and firms if allowed, the prevalence of hard working Chinese in western companies shows that. But that involves taking risks in the expectation of reward. Look at what happened to Jack Ma. China doesn't reward success, it punishes it. The Great Firewall also makes everything way harder for them.
>U.S. Commerce Secretary Gina Raimondo, speaking in an interview with Reuters on Monday, said Nvidia "can, will and should sell AI chips to China because most AI chips will be for commercial applications."
>"What we cannot allow them to ship is the most sophisticated, highest-processing power AI chips, which would enable China to train their frontier models," she added.
>Raimondo said she spoke a week ago to Nvidia CEO Jensen Huang and he was "crystal clear. We don't want to break the rules. Tell us the rules, we'll work with you."
In plain English, that means Nvidia can, will, and should sell products to China below the arbitrary line the US and China have set for the kabuki theater.
Anyone who expects these sanctions to actually damage Chinese ambitions or progress, or expects the US to win this cold war, is either ignorant or delusional. China won this conflict before it even began.
Sad China-mythos propaganda. They haven't won much, and are quickly following the path of past economies like Japan. Converting from pure industrialization to an advanced economy is hard and the slowdown is practically inevitable, especially once China started pushing anti-American geopolitical aspirations against its neighbors in the South China Sea.
The only winner here is Mexico, which looks to be the onshoring destination of choice for supply chains leaving China.
I find it fascinating that the US government is being openly hostile like that. Surely they realize that this will be held against the US for a long time?
> Surely they realize that this will be held against the US for a long time?
Who can? There are plenty of things people around the world hold against the US. It's not like anyone can actually do something about it and that's all that really matters.
> It's not like anyone can actually do something about it and that's all that really matters.
In the long term they can, by using the US dollar less and less. Which is already happening. One day the US government will wake up unable to print money to pay for things from other countries, and the economy will collapse.
Many of these countries, though, finance their debts and pay them back in USD, simply because the USD is still the de facto currency in global financial markets.
I believe that's an entirely different issue, tbh. There is no grand strategy or conspiracy behind dedollarization, just a path of least resistance for free and open trade.
I don't like assuming that someone is an idiot, but besides "shitting where you eat," the only explanation I can think of is that someone is betting on WWIII and that it will end as profitably as WWI and WWII.
People who live in the US are usually less aware of how much of the international community perceives them as a bully. His view is very common, and he finds it completely foreign that the US is even perceived this way.
A lot of people in the US view the US as the heroic peacekeeper, and see all actions the US takes against China as being for "world security" rather than an attempt to keep the top spot as the #1 economic and military superpower.
It's weird because while there is freedom of the press in the US, many US citizens are still strangely misinformed and irrationally patriotic.
I'm not even clear about the mechanism at work, but there is a strange form of information control at play here. You can sort of sense it in US news. It's genius, really, how the media can be controlled despite amendments in place ensuring freedom of the press.
For some reason I had a feeling they were referring to something different.
Anyway, I am aware of the whole bubble thing, although I don't think it's unique to the US, at least not entirely.
It's quite similar to countries with their own sovereign media. That's why people often think that those other guys hand out talking points to their journalists, when it's actually all about hiring people with good political sense, who feel "the flow" and know what is politically correct.
The difference is that people with three-digit IQs typically know how to dive into foreign media (or are even passively exposed to it), while people in the US-dominated part of the world just kinda don't even care.
I wonder if it was different when the Fairness Doctrine was in place - probably not, cross-cultural fairness seems like something too complicated to scale well.
News media are private businesses. Control the flow of money or buy the business and you control the narrative.
It's well known that advertising dollars influence decisions, and it's also well known that western news media is owned by only a few huge companies, which often also run the advertising subsidiaries on the same platforms.
Perhaps when China stops its massive human rights abuses and begins to act like a mature world leader, but until then, it's turning out to be another two-bit authoritarian regime that cannot hold a candle to what the U.S.A. achieves on the world stage.
> Perhaps when China stops their massive human rights abuses
How do you figure it's not empty propaganda and that it's actually more massive compared to the US "well that's just life" incidents?
Because it's definitely propaganda (even if not empty propaganda). It's no coincidence that it ramped up when China became an economic competitor to the US. Even the "Uyghur camps" narrative surged in the media right alongside the ICE detention camp scandal. And given how Muslims were treated by the US after 9/11, there is definitely zero moral high ground.
Cracking down on dissidents? The Capitol riots were a test that was failed miserably, and they were hardly handled any differently than what gets so criticized in "authoritarian" countries.
> cannot hold a candle to what the U.S.A. achieves on the world stage.
Jokes about participating in wars all over the planet aside, what did the US actually achieve on the WORLD stage? Not its influence on the European neighborhood and a small corner of the North Atlantic, but the WORLD. Cuz IMO, the world is totally burning right now.
It's really hard to figure out what you're implying specifically when you talk in slogans and manifestos instead of like a normal, down-to-earth person. But I can't shake the feeling that you treat the rest of the world as the ochlos to the American demos, with an inherent supremacism similar to how 20th-century commies went about the global revolution that no one had asked them for.
American exceptionalism is nothing new, and it isn't uncommon. However the approach that you are using to attempt to break through it is not going to be effective in the slightest.
For one, you can call literal propaganda for what it is, but propaganda isn't necessarily wrong on the facts. Pointing out that 'you didn't care about issue X until you did' is not going to get the self-reflection you hope for, because at the end of the day, the propaganda is pointing out a message that cannot be countered since what it points out is true.
Second, anything to do with "one of your parties did a bad thing" is counterproductive, and contrasting the legal ramifications of the riots is not even close to being on par: the USA has a functioning (if sometimes heavy-handed and racist) justice system, and the fact that the offending parties were prosecuted publicly and not taken to a secret camp and shot shows that even if the perps didn't get their just deserts, it was all above board and it was effective.
'The world' is, again, a dead end. You absolutely cannot win an argument that the USA was somehow not the stabilizing military factor in the west and didn't provide a large portion of the major world stage level achievements since the end of WWII. If it weren't for the $$$trillions/yr on patrolling oceans in carriers, flying sorties, launching sats, and just generally being the hall monitor and the bully, then a whole lot of things would be incredibly different. You can argue for better or worse, but hardly any American would argue that it is for worse, if they honestly accept that they like their modern lifestyle.
That's about how far high beams shine, so it's totally acceptable for cops to give people speeding tickets for doing 55 in a 55 at night with high beams on?
• Use fog lights if you have them.
• Never use your high-beam lights. Using high beam lights causes glare, making it more difficult for you to see what's ahead of you on the road.”
If you create a rule to accomplish a broader goal, then this is a terrible argument, because you end up getting rules-lawyered, and then you have to add sub-rules and sub-sub-rules and then exceptions to those, etc. Eventually you end up enforcing things that are nonsensical just because someone used a loophole once and you had to plug it. No one thinks this is a good outcome.
In your metaphor the town wants safe roads; they don't care if someone is going 54 or 56, and the enforcement should reflect that. If you really think that anyone going over 55.1 should be pulled over no matter what, and that people driving with a blindfold on going 54.9 should not, then I don't know what to say.
Statistics show the fewest accidents occur at up to 15% over the speed limit, because that is the speed the good drivers go.
I would fully support a speed limit system based on driving skill. However, since we have a purely speed based system those driving the limit should not get tickets.
First of all, I was disagreeing with the metaphor on premise, and thus that it is a 'purely speed based system', because it is not. Emergency vehicles can drive faster, and you can argue with a cop or in court that you had a good reason to drive faster and be relieved of the offense.
If you really believe in only 'the letter of the law' then why do we have courts? It isn't just to determine guilt or innocence, since only some of the courts do that and only some of the time.
Crossing into personal attack will get you banned here, no matter how wrong someone is or you feel they are. We've had to warn you about this before, so please don't do it again.
Also, please don't post flamewar comments. You went way over the line here, even apart from the personal attack. You're as welcome here as anyone else is, but we need you to follow the rules and post in the intended spirit. I realize that's harder when representing a minority view, but there's not much we can do about that—we have to apply the rules more or less evenly.
What we don’t understand is why someone would spend so much time on HN complaining about “westerners” and lobbing ad hominems at anyone critical of China.
Please don't perpetuate flamewars on HN, no matter how wrong someone is or you feel they are. It's not what this site is for, and destroys what it is for.
Intel is well behind AMD and Nvidia in terms of GPU processing power and it might take them a while to catch up.
BUT - is there anything preventing Intel simply adding a ton of RAM to their cards? Why not make a 64GB or 128GB card - even with Intel's underpowered GPU, won't the AI folks all be loving it?
Surely that's not that hard to do and yet would make Intel GPUs appealing, would it not?
Please correct me, as I have no technical expertise on this; maybe there is something about GPUs that means it's not possible to add lots of RAM?
Nvidia and AMD are notoriously stingy with GPU RAM, and even if there's lots of RAM there's a ridiculously high price.
Is this an angle for Intel to get an advantage? And while they are at it, drop 8 or 16 video encoder cores on there too.
The 3060/4060 16GB is Nvidia's answer to "I'm a hobbyist/tiny business and I need an okay AI rig at a low price." If Intel could jam 32GB onto a card with comparable performance, and still compete on price, they might have a winner. They'd have my money, that's for sure.
I want to run Stable Diffusion at stupid resolutions, and I want it cheap.
I'm curious to know what prevents Intel putting 64GB or 128GB on a GPU - is there a technical limitation that makes it impossible to go to 128GB?
There would certainly be an awkward silence in response from Nvidia.
Given that Intel needs some angle to get ahead in the GPU wars, I would have thought adding RAM would be an easy way to do that, especially given Nvidia and AMD seem to dish out RAM like it's the most precious substance in the universe.
>especially given Nvidia and AMD seem to dish out RAM like it's the most precious substance in the universe.
Well, it's only really nVidia. And the reason it's only nVidia doing this is because they don't want to self-cannibalize; if the 4080 had 24GB literally no prosumer or small organization wanting to play with the tech would even be considering the 4090 (and to a limited extent, the same is true of the 3080 vs. the 3090, but everyone was too busy mining Ethereum to notice that).
As far as the other manufacturers are concerned, there's no objective need for anything over 16GB on midrange cards (2070/3060/4060 equivalents) because that's the kind of GPU in the consoles and nobody releases exclusively for PC any more; you'll have to upgrade your card to have parity when the next generation comes out anyway so why provide more/drive costs up? It's not like AMD or Intel GPUs are in any way able to use that RAM for anything else, and why offer it now and kneecap themselves should they somehow get into a position where they actually can offer a good AI suite and want to sell 24GB cards for 2000+ dollars too?
And nVidia has taken full advantage of that fact, and that's why the 4080 only has 16GB even though the 7900XTX is performance-competitive with it and has the expected memory complement. The 4080, including the Super, is a sub-par product because it kind of has to be.
Doesn't stable diffusion just start tiling things when you use large resolution sizes?
Not exactly tiled mind you, just "clearly repeating the same thing, with some variation".
Asking because when I was playing with it some months ago, that's what it did. People explained (at the time) that it was because Stable Diffusion was trained on, um... (from rough memory) 512x512 images.
It does do that but you can generate an image at the normal resolution and then scale it up to run more high-frequency sampling steps to add detail without losing the composition.
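A minimal sketch of that two-pass approach (generate at the native resolution, then upscale and run a low-strength img2img pass), assuming the Hugging Face diffusers pipelines; the model name, resolutions, and strength value are just illustrative:

    import torch
    from diffusers import StableDiffusionPipeline, StableDiffusionImg2ImgPipeline

    model_id = "runwayml/stable-diffusion-v1-5"  # placeholder SD 1.x checkpoint
    txt2img = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")

    prompt = "a misty mountain valley at sunrise, highly detailed"

    # Pass 1: stay near the training resolution so the composition doesn't tile/repeat.
    base = txt2img(prompt, height=512, width=512).images[0]

    # Pass 2: naive upscale, then a low-strength img2img pass that adds
    # high-frequency detail instead of re-inventing the composition.
    img2img = StableDiffusionImg2ImgPipeline(**txt2img.components)
    large = base.resize((1024, 1024))
    final = img2img(prompt=prompt, image=large, strength=0.35, guidance_scale=7.5).images[0]
    final.save("upscaled.png")

The lower the strength, the more the original composition is preserved; push it too high and you are back to the model hallucinating a new image at the large size.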
Not sure what you mean. I can tell you it pitches a fit if I tell it to go above ~1800x1800 on my 8GB card. I think automatic1111 has a flag to let it page stuff out to main RAM, but then everything is hideously slow.
There I read that Intel Gaudi is behind all NVIDIA products in all metrics except performance per dollar.
Since one can only access these in the Intel cloud, where Intel has essentially made the few of these that exist freely available, isn’t perf per dollar kind of a joke metric? I mean, if they are free to use, they have infinite perf per dollar. But if Intel were to sell them, you were to buy them, and then you were to pay for their power, then wouldn’t perf per dollar look completely different ?
The article seems to try really hard to avoid saying how much perf per watt these have and how expensive they are. It reads a bit like a marketing piece as part of some Intel/Databricks collab (all companies and vendors do these).
> wouldn’t perf per dollar look completely different
Isn't the cost of actually running the card dwarfed by the cost of the card itself? E.g., even if we look at an older card like the A100: at 100% utilization 24/7 that's about 3,500 kWh per year, which is what, $300-$500 depending on the location?
If Intel could come out with a card that could do anything the A100 does at half the price but 4x the power usage, it could still be somewhat competitive.
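Back-of-envelope for that, as a sketch assuming roughly 400 W board power for an A100 and $0.10-$0.15/kWh electricity (both numbers are assumptions, not measured figures):

    watts = 400                              # assumed A100 board power at full load
    kwh_per_year = watts / 1000 * 24 * 365   # ~3,504 kWh
    for price_per_kwh in (0.10, 0.15):
        print(f"${kwh_per_year * price_per_kwh:,.0f}/year at ${price_per_kwh:.2f}/kWh")
    # prints roughly $350 and $526 per year, small next to a five-figure card price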
That’d depend on how many of these cards you need to get the same perf as an A100. Most of the results being shown are batch 1 (use the cards to serve 1 user). But in practice you’d use a single A100 to serve thousands of users concurrently (you can’t do that with the Gaudi 2 though).
The article measures that Gaudi 2 is competitive at “latency for batch size 1”, but that isn’t really a metric anyone cares about.
So Intel would need to sell these for much less than half, and then perf per watt would need to be much better (the article measures that it is currently worse than A100).
Comparing these things is hard, “if Intel were to sell these” is already speculation, since they aren’t on sale.
The article is right that perf per dollar is better, but that’s only because Intel is not making money out of these. As a user that’s a red flag, because if that continues these will be discontinued, and then any investment I do right now in supporting these in my SW stack goes to waste.
No, because that's not the only part of the system that matters. This indeed would be useful, and Nvidia is certainly pushing for more memory (note that they are also continuing development on GPUDirect), but there are limitations. Strapping an SSD onto the side isn't really going to make things any better.
Nvidia's secret sauce is CUDA. AMD has been doing great, but they still have a long way to go here. They caught up on hardware, but that's maybe half the battle. If hardware were the whole battle, you'd see ML people flocking to AMD for the lower prices and simply due to availability.
It's not that these companies are stingy with RAM, it is that GPU RAM isn't the same RAM you throw into your motherboard. They do have differences.
If it were easy, trust me, Nvidia and AMD would both pile on RAM and charge out the wazoo for it. I mean... look at the price difference between an A100 40GB and A100 80GB. They're the same chipset and there's a 2x price difference. Though I should mention the 80GB version is faster memory too.
> look at the price difference between an A100 40GB and A100 80GB
Nvidia is effectively a monopoly. They can charge whatever they want regardless of cost (which is what they are doing according to their reported margins)
> If it were easy, trust me, Nvidia and AMD would both pile on RAM and charge out the wazoo for it. I mean... look at the price difference between an A100 40GB and A100 80GB. They're the same chipset and there's a 2x price difference. Though I should mention the 80GB version is faster memory too
No one nowadays sets a price based on cost plus a margin for R&D and a little extra to stay afloat. Those times seem to have ended somewhere in the early 2000s. I wouldn't be surprised at all if that extra 40GB of faster RAM is in the 30% cost-increase range.
> BUT - is there anything preventing Intel simply adding a ton of RAM to their cards?
The market isn't demanding it.
For gaming customers, modern games don't need more VRAM, 2k-4k textures can fit into 12-16GB of VRAM and 8k textures mostly don't improve the look at a 4k monitor resolution, unless these textures are stretched over large objects. With upscaling technology, Nvidia customers aren't running at 4k native resolution, so 2k textures do fine for 1080p upscaled to 1440p or 1440p upscaled to 4k. Gaming customers want raster performance primarily, assuming VRAM is above 10-12GB.
We will see retail demand for more VRAM when monitor resolutions exceed 4k, and this won't happen on a 1-3 year timeframe.
For AI customers, there's already specialized cards like H100.
> For AI customers, there's already specialized cards like H100
I don't know anything about machine learning but I think that is the question.
Isn't nVidia professional tier like the H100 rare and expensive?
I remember someone here on HN started a sharing club where they said they'd pool resources together to buy these things and allow people to basically timeshare or rent them?
Is this like the situation where Nvidia is basically writing your code for you by providing abstractions, like it does with Nvidia HairWorks and things like that on the gaming side? Would Intel have to basically develop its own answer to CUDA, or could it gang up with AMD on this?
Yes H100s are very expensive (think ~$40k per unit), and very hard to obtain. For this reason they're basically all sent to cloud operators who can sign large bulk deals up front and then rent them out. At that point rubber really hits the road because it's where the hardware is basically auctioned off on an hour by hour basis, so you can easily pay $40k per month for access to the cards because there's so much money flooding into AI right now.
This is why OpenAI seem to have burned through 10s of billions of dollars of Azure compute credits in, like, a few years.
Yes CUDA is basically a giant developer platform made by NVIDIA for their cards. It's libraries, tools, languages, tutorials, and other ecosystem stuff like the fact that lots of people use it. Intel and AMD might or might not need their own answer to it. They need at least parts of an answer but CUDA is a lot more than AI, so if that's all you care about, you can skip parts of it.
Now AMD has been trying to develop a CUDA competitor for a long time, but it's clear either their heart isn't in it, or they just really struggle to attract skilled dev platform people. ROCm seems to suck so hard that people would rather pay NVIDIA's monopoly pricing than deal with it, which is impressively bad.
Of course, the reason NVIDIA is in this position is that they supported all kinds of R&D with their CUDA effort, and it turned out to be AI that hit big. If you just implement the bits of CUDA that seem most useful, you'll always be behind.
> For AI customers, there's already specialized cards like H100.
Which is extremely expensive, because Nvidia can charge whatever they want since they are effectively a monopoly. Since power and per-card performance don't really matter that much yet, Intel could in theory compete on price alone as long as they manage to cobble together something at least semi-decent.
Yeah, if you tape out a new chip; but that's a pretty vacuous statement. The capacities of GDDR6 modules available do impose a meaningful constraint on the total RAM that any given GPU can be equipped with, and while Intel is still struggling to get any foothold in the GPU market at all, it would be a big risk for them to invest in bringing to market a chip with eg. 4x the memory controller to compute ratio typically used by GPUs just so they could claim to have the cheapest 128GB GPU.
At some point you run out of pins, and the routing gets harder and harder. GDDR6 has very hard constraints on signal path length/quality. It gets expensive very quickly.
It's always amusing how all these LLM or diffusion models work on some beefy Nvidia GPUs or MacBook Air M1s with 16GB of RAM, because all that RAM can be utilised by the GPU or the Neural Engine.
Having a ton of VRAM doesn't really have much value without the speed to use it. After a point your calculation becomes heavy enough that you can get by with smaller VRAM and just accessing system memory, paying the data transfer or streaming cost.
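To illustrate that trade-off, here's a rough sketch of spilling a model that doesn't fit in VRAM out to system RAM using the Hugging Face transformers/accelerate offload options; the model name and memory budgets below are placeholders, and every layer that lands on "cpu" pays a PCIe transfer cost on each forward pass:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "meta-llama/Llama-2-13b-hf"   # placeholder model
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,
        device_map="auto",                        # let accelerate split layers across devices
        max_memory={0: "10GiB", "cpu": "48GiB"},  # small VRAM budget, large system-RAM budget
    )
    # Generation still works with the spilled layers, just much slower than all-in-VRAM.
    inputs = tok("Hello", return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=20)
    print(tok.decode(out[0], skip_special_tokens=True))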
> With 48GB GDDR6 memory, RTX 5880 gives data scientists, engineers, and creative professionals the large memory needed to work with large datasets and workloads like rendering, data science, and simulation.
There's some work being done in GPU-accelerated fluid/smoke simulations, like Axiom for Houdini[1] - but to be honest, it seems like the AI use case is very heavily implied. Most simulations are still being done via CPU/RAM.
This is a cut-down RTX A6000 (i.e. same generation) made to fit within the limits imposed by US sanctions against China. I think you're misinterpreting the performance numbers.
Doesn't seem to have pricing info anywhere obvious online, but yeah it probably is because "professional market" plus "its the only high end part they're allowed to get in China now". ;)
Shockingly, it was as far back as 2008 that AMD released the Radeon HD 4800 series that first broke the tflop barrier for GPUs.
As far as our individual perceptions of time though, I agree, it still doesn't feel like that long; but most of the advances in compute performance since then largely feel wasted so that probably has a lot to do with it IMO.
Unnecessary embellishments don't sell cards to businesses as well as they do to young adults?
Edit to add: also consider that other manufacturers besides nVidia need some way to differentiate their products when they're basically all selling the same thing.
Honestly, I'd much prefer a bland card that has actually good (and quiet) cooling to a "winter-edition" white MSI card with blue LEDs that sounds like a helicopter is taking off when you boot up a game.
It's shocking how bad fans on consumer GPUs are. I had to replace the fans on my 3090 with a custom 3D-printed shroud and Noctua fans because the high-pitched whine was just unbearable.
Yes, they phased it out 2018-2020 and have now released three generations of pro GPUs without it. They also dropped the Tesla branding for compute-only counterparts to the pro graphics Quadro parts.
A strange feeling of déjà vu. Once upon a time Nvidia made a card called the Riva TNT, which was a great, polished product for the needs of the time, and really pushed their competitor of the day, 3dfx. Now today's card looks like nothing revolutionary, yet suspiciously close to what the market is seemingly asking for...
https://www.tomshardware.com/pc-components/gpus/nvidia-launc...