[flagged] An answer to: why is Nvidia making GPUs bigger when games don't need them?
21 points by stephanst on Oct 15, 2022 | 63 comments
I've had this discussion a few times recently, including with people who are pretty up-to-date engineers working in IT. It's true that the latest offerings from Nvidia, in both the gaming and pro ranges, are pretty mind-boggling in terms of power consumption, size, cost, total VRAM, etc... but there is a very real need for this sort of thing driven by the ML community, maybe more so than most people expect.

Here is an example from a project I'm working on today. This is the console output of a model I'm training at the moment:

-----------------------------

I1015 20:04:51.426224 139830830814976 supervisor.py:1050] Recording summary at step 107041. INFO:tensorflow:global step 107050: loss = 1.0418 (0.453 sec/step)

I1015 20:04:55.421283 139841985250112 learning.py:506] global step 107050: loss = 1.0418 (0.453 sec/step) INFO:tensorflow:global step 107060: loss = 0.9265 (0.461 sec/step)

I1015 20:04:59.865883 139841985250112 learning.py:506] global step 107060: loss = 0.9265 (0.461 sec/step) INFO:tensorflow:global step 107070: loss = 0.7003 (0.446 sec/step)

I1015 20:05:04.328712 139841985250112 learning.py:506] global step 107070: loss = 0.7003 (0.446 sec/step) INFO:tensorflow:global step 107080: loss = 0.9612 (0.434 sec/step)

I1015 20:05:08.808678 139841985250112 learning.py:506] global step 107080: loss = 0.9612 (0.434 sec/step) INFO:tensorflow:global step 107090: loss = 1.7290 (0.444 sec/step)

I1015 20:05:13.288547 139841985250112 learning.py:506] global step 107090: loss = 1.7290 (0.444 sec/step)

-----------------------------

Check out that last line: with a batch size of 42 images (the maximum I can fit on my GPU with 24 GB of memory), I randomly get the occasional batch where the total loss is more than double the moving average over the last 100 batches!

There's nothing fundamentally wrong with this, but it will throw a wrench into the model's convergence for a number of iterations, and it's probably not going to help the model reach its ideal final state within the number of iterations I have planned.

This is partly because I have a fundamentally imbalanced dataset and need to apply some pretty large per-class weight rebalancing in the loss function to account for it... but this is the best representation of reality in the case I'm working on!
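For anyone wondering what that rebalancing looks like in practice, here is a minimal sketch of a class-weighted cross-entropy loss in TensorFlow (not my actual training code, and the weight values are made up purely for illustration):

    import tensorflow as tf

    # Hypothetical per-class weights: rare classes get larger weights so they
    # still contribute meaningfully to the loss despite the imbalanced dataset.
    class_weights = tf.constant([0.2, 1.0, 5.0, 12.0])

    def weighted_xent(labels, logits):
        # Per-example cross-entropy; labels are integer class ids.
        per_example = tf.nn.sparse_softmax_cross_entropy_with_logits(
            labels=labels, logits=logits)
        # Scale each example by the weight of its true class, then average.
        return tf.reduce_mean(per_example * tf.gather(class_weights, labels))

That weighting is also exactly why a small batch can blow up: one unlucky batch with a few hard, heavily weighted rare-class examples and the total jumps, like the 1.7290 spike in the log above.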

--> The ideal solution RIGHT NOW would be to use a larger batch size, in order to minimise the chance of getting these large outlier batches during training.

To get the best results in the short term I want to train this system (using this small backbone) with batches of 420 images instead of 42, which would require 240 GB of memory... so 3x Nvidia A100 GPUs, for example!
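(Rough arithmetic: 24 GB for 42 images works out to roughly 0.57 GB per image at this input size, so 420 images lands at around 240 GB, i.e. three of the 80 GB A100s.)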

--> Ultimately the next step is to make a version with a backbone that has 5x more parameters, on the same dataset but scaled to 2x the linear resolution... requiring probably around 500 GB of VRAM to run batches large enough to achieve good convergence on all classes! And bear in mind this is a relatively small model, which can be deployed on a stamp-sized Intel Movidius VPU and run in real time directly inside a tiny sensor. For non-real-time inference there are people out there working on models with 1000x more parameters!

So if anyone is wondering why NVIDIA keeps making more and more powerful GPUs, and who could possibly need that much power... this is your answer: the design and development of these GPUs is now being pulled forward perhaps 90% by people who need these kinds of solutions for ML, and the gaming market is the other 10% of the real-world "need" for this type of power, where 10 years ago that ratio would have been reversed.




Your premise is not true. Gaming at high resolutions with high frame rates and full settings requires powerful GPUs. VR especially demands high frame rates to remain immersive while also rendering the scene twice (each eye has a different perspective).

We actually have a long way to go before GPUs can really support fully immersive VR. The new 4090 is just barely enough to handle some racing sims like ACC in VR at high resolution and settings.

Even 3000 series cards couldn’t deliver playable frame rates at 4K with all of the graphical eye candy turned on (ray tracing, etc.) in games like Cyberpunk 2077. As with all games, you can simply play with lower settings and lower resolutions, but the full experience at 4K really does require something like a 4080 or 4090.

And of course, next generation games are being built to take full advantage of this new hardware. It doesn’t make any sense to suggest that GPU vendors should just stop making new progress and expect games to stay at current levels of advancement.


>in games like Cyberpunk 2077

There’s only the same handful of graphics-intensive games that get brought up in every online argument and by tech review outlets like Digital Foundry. I can understand it when viewing games from a purely technological standpoint. But viewed as actual games, you play through demanding single-player titles like Metro, Cyberpunk or Control once, and then move on. The vast majority of titles play perfectly fine, with barely any noticeable difference between the latest and greatest €2000 GPU and something from a few years back like a GTX 1080, or a PlayStation 5.

I don’t think there are enough demanding games coming out to warrant the purchase, especially if you count only the ones that are actually fun to play as games, rather than just vehicles to admire ray-traced lighting or reflections.


Agreed, my 3080 Ti struggles to sustain 144 fps on my 144 Hz 4K monitor even in a game like Apex Legends with all settings at low. We definitely need more GPU evolution still!


Forgive my ignorance, but I genuinely don’t understand why you want a monitor of this quality at this frame rate. My understanding is that most humans only perceive around 30-60 fps, so I’ve only ever optimized my equipment for 60 Hz without stressing about higher graphics quality. Is my understanding wrong?


Everyone is different, but it’s generally accepted in the gaming community that the jump from 60 to 144 Hz is hugely noticeable (and I agree; I wouldn't go back to 60 Hz, even though you do adjust to lower refresh rates), but that the benefit trails off after that except for competitive/pro scenarios. Beyond gaming, things like animations in Windows are noticeably smoother as well.


First of all, humans can see flickering at 30 Hz and even at 60 Hz (there are different zones in the retina with different "specs": some see color, some see motion). But even if it weren't visible, there is still a huge benefit to higher frame rates: the latency between input and output. When you press a button, the time for the screen to reflect your press is 1.5x the frame time on average at best (you wait for the current frame to end, which takes 0.5 frames on average, then render the next frame with the input applied). In practice the output is buffered in many places and the latency can be as long as 10 frames. At 60 Hz that gives you ~166 ms of latency, which anyone can feel.
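(By the same arithmetic, a 144 Hz display cuts that 10-frame worst case to roughly 69 ms, and the best-case 1.5-frame delay from ~25 ms to ~10 ms.)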


Humans can perceive frame rates far above 30-60.


Can you tell me more? Can you point me to sources? (I’m not asking to challenge you; I genuinely want to educate myself on this. Like I said, I was going off a casual Google search that said humans don’t generally notice anything beyond 60 fps, so there’s no point in much faster monitors. If that’s wrong I’d like sources, because that’s where the real info will be!)


I blind-tested my ex on my 144 Hz displays, setting varying combinations of refresh rates between two monitors, where the options were 60, 120 and 144.

She could always tell the difference between 60 and 120/144; the way she spotted the difference between 120 and 144 was by comparing the pointer trail. She didn't notice that animations were smoother, though, so that's definitely a trained perception.

I hate myself, so I run Linux on all my client machines, and if my display drops to 60 I notice it by the first animation on my desktop (which is essentially only the start menu).

If you're curious, it's easy to set different refresh rates on your monitors and try it out for yourself; the 60 Hz thing is bogus.

I also notice an awful lot of judder when movies have panning shots; they're shot at under 30 fps. Most modern TVs have motion-interpolation workarounds that compute extra frames to help with this.


Here is a video from Linus that poses that question. https://www.youtube.com/watch?v=OX31kZbAXsA


It's wild to me that 144 fps at 4K is the bar now. A decade ago people hotly debated whether 60 fps was a significant step up from 30 (which it certainly is), but I kinda feel like beyond 60 fps at QHD the perceptible difference just isn't there, or certainly isn't worth the procurement and operational cost.


The difference between 60 Hz and 120 Hz is massive in gaming. I know this because I play an FPS that usually runs at 120 Hz, and I really notice when it drops to 60 as the action gets intense.

The difference between gaming at 4K and 1440p, or even 1080p, however, is not that significant imo. Competitive (and many casual) gamers prefer a high frame rate over high resolution.

But as display technology improves, people will just come to expect their games to run at native 4K resolution.


Apex is a horribly unoptimized resource hog; I don't think it's a good baseline for this sort of thing, but I get your more general point.


That's missing the point of what he said a bit.

You are correct about the hardware requirements - but his point was that 90% of the applications of this high-end hardware end up being in machine learning (and/or crypto mining), not the tiny ultra-high-end gaming market that can afford to pay several thousand for such cards. Most gaming "rigs" are nowhere near those specs.

If this non-gaming market didn't exist, nobody would manufacture such cards in quantity beyond a few "showpieces" for the very rich enthusiasts who can afford them; it simply wouldn't be economical.


The high-end market has always been like that, even before ML or mining were a thing on GPUs. Yes, the majority of the use nowadays is in those areas, but they clearly sell enough of these cards to turn a profit even without their being the majority of sales (otherwise they would never have made them before GPU compute went mainstream), especially considering that the cost of making the high-end cards isn't that crazy when the lower-end cards are often just the same chip as the high-end model, cut down. It's also worth remembering that the same chips go into the workstation cards.

On top of all that, the halo product effect also plays a role, where it's easier to sell even the lower end models when you can claim to have the fastest model of the generation at the high end.

Furthermore, the current generation's high end tends to become the next generation's mid/low end, so it's really just buying into a product that could last a gamer one extra generation, which offsets the costs (ignoring that $1600 every ~2-3 years isn't really that crazy for someone with stable employment).


Thanks, this is exactly what I meant, in a much better formulation :)


And your numbers for these assumptions are based on what?


A few months ago, I thought... ok, this is amazing, an i7 with 8 cores and 32 GB of RAM and an SSD... I'm set for the next decade!

Now I'm just trying to run Stable Diffusion to generate things and it's 30 seconds per iteration at only 512x512 pixels. It looks like I'll have a GPU soon enough if I want to do any development with neural networks.

What an amazing ride... heck, I still remember thinking back in 1980 that the 10 Megabyte Corvus hard drive my friend was installing for a business would NEVER be filled... do you have any idea how much typing that is? ;-)


When I bought my 386 SX/16, they were all out of 10 MB drives, so I bought a 20 MB one, and my dad thought I was crazy, because that was more space than anyone could ever need. :)


> 10 Megabyte Corvus hard drive my friend was installing for a business would NEVER be filled... do you have any idea how much typing that is?

I’ve seen a single PDF, the internal developer guide to a certain ARM-based processor from years ago, that was well over 1,000x the size of that hard drive. Yes, there were some pictures and charts, but most of it was just text: thousands upon thousands of pages of detailed specs.


No one is saying games don't need bigger GPUs. Is this post some sort of bait?

Also, Nvidia will probably lean further into custom accelerators (tensor cores), as the whole industry is moving towards bigger VRAM and more TF32/BF16 units for AI training rather than the standard FP32 gaming workload, and that isn't the focus of the RTX class of gaming cards.

And the 4090/3090/2080 Ti is the old Titan-class card for prosumers, mostly aimed at 3D rendering, video editing, etc., plus gaming; for AI pros they are milking sweet money out of the Quadro lineup.

If you are really limited by VRAM you should just go fully TPU rather than wait for some magic consumer VRAM increase, as it probably won't happen for a few years (most games at 4K can't utilise 16 GB).


>No one is saying games don't need bigger GPUs. Is this post some sort of bait?

Not at the rate ML does, and people still do fine playing even the latest games with smaller GPUs. That's why the vast majority of buyers of these beefed-up cards are ML people, not gamers.


> and people still do fine playing even latest games with smaller GPUs

At 1080p, which is what I was using 20 years ago.


If it ain't broken, don't fix it.

I'd rather have 640x480 with better gameplay than yet another piece of 4K shit.


Why would increasing the monitor resolution make the gameplay worse? It only should if your GPU isn’t up to it.

I haven’t seen a relatively modern 1080p title that didn’t look better at 4K.


I have a 4K120 display on my computer. My 3070 can't run games from the past 7 years at that resolution and refresh rate. I'd be happier with 4K240.

Games absolutely need more vector horsepower.


Yeah, agree with 4K240. That would be awesome. But NVIDIA doesn't seem to think so, considering they skimped on DP 2.0 on a GPU that actually has the horsepower to do 4K240 (especially with DLSS 3).

120 Hz feels so choppy when you're used to 240 Hz.


> 120 Hz feels so choppy when you're used to 240 Hz.

Every time I read comments like this I feel really good about being perfectly happy with my crappy 24" 1080p60 monitor. I swear, every time I see a 4K display or, god forbid, a 120 or 240 Hz display in the wild, I just close my eyes immediately and dread the thought that I could no longer be happy with my lesser experience.

This is one expensive treadmill to walk, more so with GPU manufacturers gouging consumers for the last few years.


I don't think it's such an expensive treadmill. For 400 dollars[1] you can already get quite a decent 1440p, IPS, 240 Hz display, which isn't much compared with what the rest of a PC costs.

[1] https://www.amazon.com/GIGABYTE-M27Q-Monitor-Display-Respons...


Nvidia doesn't think so simply because you are a tiny niche that has such "requirements" - and also the money to burn on them.

Don't get me wrong, there is nothing wrong with that - it's just that you are simply not the market that makes them money anymore.


It's a $1600 GPU. It's already aimed at that niche.


Exactly.


This has also always been a chicken-and-egg scenario: even if you accept the premise that current graphics cards are adequate for current games (which many here do not), the next round of games will target more powerful graphics cards when they are available.

If you build it, they will come.


> the next round of games will target more powerful graphics cards when they are available.

I'm hoping the Steam Deck will save us from that, by presenting a fixed target that games have a good reason to optimize for.


The new RTX 4xxx generation still has the same amount of VRAM, so I don't know how it would help you that much.


Unfortunately yes... it looks like Nvidia have caught on to the fact that the consumer market doesn't need more VRAM ;)


If you need larger batch sizes but don't have the VRAM for it, have a look at gradient accumulation (https://kozodoi.me/python/deep%20learning/pytorch/tutorial/2...).

You can accumulate the gradients of multiple batches before doing the weight update step. This effectively lets you run much larger batch sizes than your GPU could fit in one go.
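Here is a rough sketch of the pattern in PyTorch (assuming a `model`, `optimizer` and `loader` already exist; `accum_steps = 10` would turn 42-image batches into an effective batch of ~420):

    import torch.nn.functional as F

    accum_steps = 10            # small batches per weight update
    optimizer.zero_grad()
    for step, (images, labels) in enumerate(loader):
        loss = F.cross_entropy(model(images), labels)
        # Scale so the accumulated gradient matches a big-batch average.
        (loss / accum_steps).backward()
        if (step + 1) % accum_steps == 0:
            optimizer.step()    # update only every accum_steps batches
            optimizer.zero_grad()

The main caveat is that batch norm statistics still only see the small per-step batch, so it's not a perfect substitute for real large-batch training.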


Yep, this is a very valid point and I need to look more into this... which means rebuilding a lot of my toolchain but I think it would ultimately be worth the time investment!


I don't think anyone was wondering, and I don't think games are leaving GPU resources unused, but yes. Never mind the datasets - language models alone barely fit in memory on A100s.


Did you try playing at 8K with a top-of-the-line graphics card? Did you ever see 60 FPS at the highest level of detail? Did you ever want 240 FPS to avoid artifacts in a fast action game? Did you ever wonder how many graphics cards are needed for a three-monitor setup? Did you ever want a ray-traced action game?

All of these call for much more powerful GPUs than the current top of the line. And game makers know that better graphics often sell the game.


Sorry, but how many people actually have 8K displays to begin with? Or a monitor that is actually capable of those 240 Hz refresh rates?

Better graphics sell games, no doubt. But there is also a tradeoff in investing huge amounts of resources into developing and optimizing for what is a tiny ultra-high-end niche that can afford the bleeding-edge hardware.

The OP is not disputing this - all he is saying is that this stuff hasn't been what drives the GPU market for years now.


>Did you try playing at 8k with top-of-the-line graphics card?

No, the same way he didn't try 200K / 1000 fps. It's not really needed.

Even if some people are extravagant enough to go for it, there aren't enough of them to drive sales of such cards at the moment; in fact, few even have an 8K display or TV, so the combined expense of card and monitor makes it a non-starter anyway.

Which is why the vast majority of those cards go to ML people.


None of this has ever been needed.

However, it’s fun as hell


Does the fun come from high resolution and fancy shaders though?

Or is this more like hi-fi zealotry?


120 Hz is definitely more fun for anything reaction-based.


What do you mean "games don't need them"?


He means that the average gamer, who isn't gaming in 4K at 120 Hz and/or going crazy with a high-end VR rig (these are still comparatively a tiny minority), will not fully utilize even the current hardware's capabilities.

Whereas the huge machine learning (and also cryptomining) markets can't get enough hardware resources. And those companies are buying these GPUs by the truckload, every month.

Gamers are a very vocal but ultimately completely irrelevant market for these manufacturers. They account for only a tiny part of their income.


Yeah, the average gamer by definition doesn't buy the most expensive hardware configuration available on the market. Just like the average JavaScript dev doesn't buy the $8000 Mac Studio configuration.

But, by definition, those cards aren't meant for average gamers; they're for those who do play at 4K@120Hz, for whom they're hugely useful.

I can't quite fathom the baseline misunderstanding of the market here - isn't it obvious that the most expensive piece of hardware isn't meant for the average customer?


He means "even though one can make a game that must have 8k/120fps and all the latest shaders and GPU tech" the average gamer doesn't run that, can't afford it at the prices of those Nvidia cards, and is not their target market.


It seems like I've touched a nerve with a lot of people on here by saying "games don't need more powerful GPUs"... let me explain.

My contention is not that it's impossible to use more power in games, or that developers aren't working on games that will be able to soak up all that compute... but that higher-quality graphics are no longer pulling the gaming industry forward the way they were 20 years ago, and that improvements in graphics are more and more about AI enhancements to image quality and less and less about having the power to push more vertices around in real time.

The market for video games doesn't really care about the top 0.5% of PC gamers who can afford to buy the latest GPU from Nvidia every 2 years. The PS4 is still outselling the PS5; if graphics power were the main criterion for the actual video game market, that could not happen...


Most new games will not hit the max resolution and refresh rate of modern screens, so there is quite a long way to go before GPUs catch up to peak monitor quality.

Also, game developers build games around the average configuration, not the other way around.


> 42 images (the maximum I can fit on my GPU with 24 GB of memory)

Probably I'm completely wrong, but surely scaling down those images, at least for the initial iterations, would reduce noise and give you vastly better batches and faster iteration, leading to better training -- then, as it converges, you would crank up the resolution.

Note that the scaling doesn't have to be bicubic; there are content-aware algorithms (e.g. seam carving) that remove the less 'meaningful' pixels, giving you smaller images that keep the important detail.
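If it helps, here's a rough TensorFlow-flavoured sketch of the progressive-resolution idea (decode_example, the file list and the 256 -> 512 schedule are all made up for illustration):

    import tensorflow as tf

    def make_dataset(tfrecord_files, input_size, batch_size):
        # decode_example is a hypothetical parser returning (image, label).
        ds = tf.data.TFRecordDataset(tfrecord_files).map(decode_example)
        # Bilinear resize to the current training resolution.
        ds = ds.map(lambda img, label:
                    (tf.image.resize(img, [input_size, input_size]), label))
        return ds.batch(batch_size)

    # e.g. early epochs at 256x256 with bigger batches, then 512x512 to finish
    warmup_ds = make_dataset(files, 256, 84)
    full_ds = make_dataset(files, 512, 42)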


The 42-image batches are already scaled to the input size of my DNN, but otherwise yes :)


Can someone ELI5 what the main difference is between the Nvidia 3000 series, 3060, 3070, 3080, 3080 Ti, 4000 series, etc.? I am just very confused about it and honestly I don't want to spend hours understanding something that should be immediately clear.

You can downvote me for being lazy, yes. But my point is that there's a ton of BS marketing going on in the IT industry, and Nvidia is no exception. Look at the "3 nm" process, which isn't about gate dimensions anymore. Etc.


The first one or two digits are the Nvidia GPU generation. The latest four generations are 10xx, 20xx, 30xx, and 40xx. The last two digits differentiate between GPUs within a generation: the higher the number, the better the GPU; xx70, xx80, and xx90 are common. So the newest GPUs Nvidia is putting out are the 4080 and the 4090, the top two performing cards of their newest generation. Mid-generation they usually release a Ti variant of a card (e.g. the 3080 Ti), which is a moderately better version but not a new generation.


The 4xxx series is the newest generation. With each new generation they make it more powerful and add new technology to do so.

xxXx = the big X is "power". So 9 is the highest, then 8, 7, 6, 5 (3090/3080/3070), etc.

Bigger means more compute power, usually more RAM, and higher power requirements.

That's pretty much it. Figure out what you want to do, then find a card that does it. Want ML? Get the most RAM you can. Want high-resolution gaming? High RAM and high compute power.


On a simple level, the more expensive cards have more transistors and more VRAM, and hence are faster and can run bigger ML models if your models have high VRAM requirements.

The newer generations also bring a spate of new gaming, video, and ML technologies that make some aspects of all those things faster. TL;DR: more expensive = better.


Where did you get your 90%-for-ML stat from? I suspect you just made it up because it happens to fit your personal experience.

I work with software for virtual studios. It's very similar to gaming in a lot of ways (it even uses Unreal Engine) and we really push GPUs to the limits. I'd still count this in the "for gaming" category, as we're still rendering a 3D scene within a strict time limit.


The 90% / 10% split was not a real-world stat, it's supposed to be an illustration... but not well worded, I'll grant you that.

What you are doing is the same type of work that I would put in the "pro user" bucket alongside ML. Actually, I do the same thing with the same cards to render synthetic datasets for my ML pipeline, including using UE :)


>when games don't need them

Sorry, but you are totally wrong. 4K + 144 Hz absolutely needs this kind of performance. Same with VR.


"4k +144hz absolutley needs this kinds of performance"

Yes, but the average gamer doesn't do "4K + 144 Hz", and even if they wanted to, they couldn't afford those Nvidia GPUs at these prices (and often not the required monitor either). So this doesn't explain the niche those cards are actually aimed at (and sales demographics confirm this).

It's a general statement, not an absolute one about what any conceivable game or obsessive gamer might want or use.


Where does this idea that average gamers buy the most expensive model of the card on the market come from?

The average Mac user also isn't buying the $8000 Mac Studio with XDR display configuration. That isn't meant for "average users".


Which is also why those cards aren't aimed at the gamer market. Not even the "not average / must have the best" gamer market, because that's not big enough to sustain them: they're sold to ML and crypto.



