It’s been interesting seeing the difference in architecture play out in benchmarks.
For context, there was a lot of hullabaloo a while ago when the Adreno 730 was posting super impressive benchmarks, outpacing Apple’s GPU and putting up a good fight against AMD and NVIDIA’s lower/mid range cards.
Since then, with the Snapdragon X, there’s been a bit of a deflation which has shown the lead flip dramatically when targeting more modern graphics loads. The Adreno now ranks behind the others when it comes to benchmarks that reflect desktop gaming, including being behind Apple’s GPU.
It’ll be interesting to see how Qualcomm moves forward with newer GPU architectures. Whether they’ll sacrifice their mobile lead in the pursuit of gaining ground for higher end gaming.
I'm not surprised the Adreno numbers didn't hold up as well as the rest of the Snapdragon benchmarks. Back in 2013 the Dolphin team blogged about their terrible experiences with the Adreno drivers and vendor support[1]. Ten years later in 2023, the same team blogged about how those same continuing issues led them to completely replace the official Adreno driver with a userland alternative[2].
As it stands today, the only credible names in ARM SoC GPUs seem to be Apple (M chips) & Nvidia (Tegra chips).
Kudos to the Dolphin website developers for keeping 10+ years of blogs & hyperlinks fully functional and properly tagged. They always produce great reading material!
I don't have any data. I'm speaking strictly from the knowledge that the Tegra X1 powers the Nintendo Switch and that the Nintendo Switch has a broad base of engine support. Normally, if it were a bad platform to work on, I'd expect we'd have heard about it by now from third-party developers (e.g., the Cell architecture).
> Whether they’ll sacrifice their mobile lead in the pursuit of gaining ground for higher end gaming.
It's hard to imagine why they'd distract themselves with that, except perhaps with a token small-run demo for pure brand marketing purposes.
Because of Apple's strict vertical integration, there's so much market for them as the de facto manufacturer delivering parts to pretty much every competitor building products that need a high performance-to-power ratio.
Well, it depends which space they want to be taken seriously in. Currently the 741 compares very poorly to any dGPU or to Apple; it only compares favourably to iGPUs.
I believe they have three options:
1. Say it’s meant to be like an iGPU, and work on supporting dGPUs to complement it.
2. Say that they want to compete with dGPUs/Apple and risk losing their mobile crown. Which is what Apple did in exchange for one design across all products.
3. Say they want to have a split product portfolio: a more desktop-focused GPU for Snapdragon X and a more mobile-centric one for the 8xx series.
I think it's going to be 3, but a split between mobile and laptop/desktop without any concern for competing with dGPUs. It makes no sense at all for them to.
If they can deliver performance on par with or better than current iGPUs, at lower power usage and potentially even fanless, they're going to sell a billion of them. They'll be in every Chromebook in the world.
They aren't gunning for Chromebook deployments... these are currently in business laptop models, and AMD may have already beaten them on all fronts per dollar on some of these, other than ultralight weight and video playback duration. Lenovo has an AMD model that can do 16 hours and light gaming; more importantly, it runs x86 apps at full speed.
I agree these will likely take over the sub-$1000 market if given the chance, but they are shooting at $1500-2000.
Presumably the margins on Chromebook are terrible compared to those of mid to high end laptops. I don't blame them for wanting to start with the higher margin market and eventually work down.
They aren’t fighting for Chromebook territory with this though.
All their comparisons are to the MacBook Air and mid-high end windows laptops because that’s the market they’re gunning for. These are meant to be $1k-2k devices.
I grew up in San Diego and at the time being involved with technology meant living in Qualcomm’s shadow in one way or another.
So I tend to agree that being the reference mobile SoC vendor outside of Cupertino is pretty on brand for their absolute top priority. At Qualcomm if it doesn’t make dollars it doesn’t make sense as we used to say.
And good for them! After a brief flirtation with the idea of becoming a pure play CDMA patent troll they seem to have gotten back to their roots and started doing engineering again. It’s a cool company.
I think there is a market for a dual-world architecture: loads of dark silicon in a low-battery mode (where reduced performance is acceptable) and a high-power mode that can compete with regular desktop GPU architectures.
To me it seems as if the selling points of these latest Snapdragon chips are high efficiency/battery life and competitive performance, so given the efficiency angle it makes less sense to try to make gaming machines out of them right now. Maybe in the future there will be a gaming-oriented Snapdragon less concerned about battery life.
HALF of the X1's compute is FP16-only, which is absolutely wasted silicon for most desktop games.
Their entire tiling setup is great for simple mobile games, but (as shown in the article) is also an inefficient approach for desktop games.
64-wide SIMD works well in simple games and offers a FAR better theoretical compute per area, but when things get more complex, it's hard to keep everything filled. This is why Intel is 8-wide, Nvidia is 32-wide, and AMD is 32/32x2/64-wide (and is one reason why the second SIMD didn't improve performance like the theoretical flops said it should).
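To make the "hard to keep everything filled" point concrete, here's a minimal sketch of the divergence problem, written in CUDA purely for illustration (the kernel name and constants are made up, and none of this is tied to Adreno or any specific part): when lanes in the same wave take different sides of a data-dependent branch, the whole wave executes both paths, so a 64-wide design idles roughly twice as many lanes per divergent wave as a 32-wide one.

    // Illustrative only: a data-dependent branch where half the lanes take an
    // expensive path. The whole wave is serialized through both paths, so the
    // wider the wave, the more silicon sits idle on complex, branchy shaders.
    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void divergent_shade(const float* in, float* out, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= n) return;
        float v = in[i];
        if (v > 0.5f) {
            // "Expensive" path: only some lanes need it, but every lane in the
            // wave waits while it runs.
            for (int k = 0; k < 256; ++k) v = sinf(v) * 1.0001f;
        } else {
            v = v * 2.0f;  // cheap path
        }
        out[i] = v;
    }

    int main() {
        const int n = 1 << 20;
        float *in, *out;
        cudaMallocManaged(&in, n * sizeof(float));
        cudaMallocManaged(&out, n * sizeof(float));
        // Worst case: adjacent lanes alternate between the two paths.
        for (int i = 0; i < n; ++i) in[i] = (i % 2) ? 0.9f : 0.1f;
        divergent_shade<<<(n + 255) / 256, 256>>>(in, out, n);
        cudaDeviceSynchronize();
        printf("out[0]=%f out[1]=%f\n", out[0], out[1]);
        cudaFree(in); cudaFree(out);
        return 0;
    }

That's the rough intuition for why the extra theoretical compute per area of a 64-wide design stops paying off once shaders get branchy.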
With the release of the M-series chips, Apple's GPUs stopped ramping up performance as quickly on the simple mobile benchmarks. This is very clear with A17 in Aztec not only falling behind the SD8gen3, but the SD8gen2 too. At the same time, GPU perf/watt has also lagged behind. However, when you switch to something like the somewhat more complex Solar Bay, the Apple GPU pulls ahead.
This is similar to the AMD/Nvidia swap from gaming to hard compute then slowly back to gaming after they split into server and gaming designs.
Assuming die size and process node are similar between different GPUs, any feature that requires silicon comes at the expense of other features. It’s always a balancing act.
That’s effectively the big rub between NVIDIA and AMD today with raytracing + tensor support vs pure raster+compute throughput.
Apple just went through a major GPU architecture change [1]. They focused a ton on maximizing for AAA game usage and followed the NVIDIA route to bias towards where they think things are going. At least according to the simplified architecture diagrams for both Apple graphics and Adreno, Apple has more raytracing silicon than Adreno.
It also supports features that don’t require dedicated silicon but do affect GPU design, like mesh shading, or their new dynamic caching that improves occupancy for high-draw-count games with large uber shaders.
Adreno, by comparison, focused more on raw triangle throughput, which doesn’t scale as well with complexity. It performs much better on mobile benchmarks that fit that usage pattern, but falls behind on desktop benchmarks that follow Apple’s priorities.
I always thought the road to ray tracing was wrong on mobile, at least in its current form or iteration. But then Apple decided to go with it; I would have expected them to have something new, but it turns out they don't.
Is there any reason why these ARM iGPUs are so much worse than iGPUs from Intel and AMD? My 11th gen Intel CPU's Xe graphics completely outpaces my M1 Mac's and something like a Ryzen 5 5600G destroys both.
> Overall, the M1’s GPU starts off very strong here. At both Normal and High settings it’s well ahead of any other integrated GPU, and even a discrete Radeon RX 560X. Only once we get to NVIDIA’s GTX 1650 and better does the M1 finally run out of gas.
No benchmarks, just based on personal usage. I think I found the issue after posting that comment though, which is macOS's unhelpful deprecation of OpenGL support. The games that I play on macOS use OpenGL and will probably never implement Metal, which is a shame. They were Apple Silicon native though, no translation. The games in question are Factorio and RuneScape.
Ah yeah it’s possible individual games do perform poorly.
But in a general sense the integrated GPU in the M series processors is closer in competition to a low/mid discrete GPU than the integrated GPUs in other brands.
“In Adreno tradition, Adreno X1’s first level cache is a dedicated texture cache. Compute accesses bypass the L1 and go to the next level in the cache hierarchy. It’s quite different from current AMD, Nvidia, and Intel GPU architectures, which have a general purpose first level cache with significant capacity. On prior Adreno generations, the GPU-wide L2 cache would have to absorb all compute accesses. Adreno X1 takes some pressure off the L2 by adding 128 KB cluster caches.”
People have been tinkering with L1 cache conditionality since the L1i/L1d split in 1976, but the Qualcomm people are going hard on this and the jury seems to be out on how it’s going to play out.
The line between the L1 and the register file has been getting blurrier every year for over a decade, and I increasingly rely on a heuristic of paying the most attention to L2 behavior until the profiles are in, but I’m admittedly engaging in alchemy.
Can any serious chip people, as opposed to an enthusiastic novice like myself, weigh in on how the thinking is shaping up WRT this?
In practice, what gets labelled as the L1 cache in a GPU marketing diagram or 3rd party analysis might well not be that first level of a strict cache hierarchy. That means it’s hard to do any kind of cross-vendor or cross-architecture comparison about what they are or how they work. They’re highly implementation dependent.
In the GPUs I work on, there’s not really a blurred line between the actual L1 and the register file. There’s not even just one register file. Sometimes you also get an L3!
These kinds of implementation specific details are where GPUs find a lot of their PPA today, but they’re (arguably sadly) usually quite opaque to the programmer or enthusiastic architecture analyst.
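For anyone curious how those opaque cache levels get characterized from the outside, the usual tool in deep dives like this is a pointer-chasing latency microbenchmark. Below is a rough sketch of the idea in CUDA; the kernel name, sizes, and strides are illustrative (the article's own testing is OpenCL/Vulkan on Adreno, as noted elsewhere in the thread), but the technique is the same: sweep the working set and watch nanoseconds-per-hop step up at each capacity boundary, whatever the vendor calls those levels.

    // Illustrative pointer-chase latency sketch (not the article's code).
    // Each load depends on the previous one, defeating prefetch, so time per
    // hop roughly tracks the latency of whichever cache level the working set
    // fits in. Sweeping the array size exposes the capacity steps.
    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void chase(const unsigned* next, unsigned start, int hops, unsigned* sink) {
        unsigned idx = start;
        for (int i = 0; i < hops; ++i) {
            idx = next[idx];      // serial dependent loads
        }
        *sink = idx;              // keep the loop from being optimized away
    }

    int main() {
        const int stride = 16;    // 16 x 4 bytes = one 64-byte line per hop
        const int hops = 1 << 20;
        for (int kb = 4; kb <= 8192; kb *= 2) {
            int elems = kb * 1024 / sizeof(unsigned);
            unsigned *next, *sink;
            cudaMallocManaged(&next, elems * sizeof(unsigned));
            cudaMallocManaged(&sink, sizeof(unsigned));
            for (int i = 0; i < elems; ++i) next[i] = (i + stride) % elems;

            cudaEvent_t t0, t1;
            cudaEventCreate(&t0); cudaEventCreate(&t1);
            chase<<<1, 1>>>(next, 0, hops, sink);   // warm up
            cudaEventRecord(t0);
            chase<<<1, 1>>>(next, 0, hops, sink);
            cudaEventRecord(t1);
            cudaEventSynchronize(t1);
            float ms = 0.f;
            cudaEventElapsedTime(&ms, t0, t1);
            printf("%5d KiB working set: %.2f ns/hop\n", kb, ms * 1e6f / hops);
            cudaFree(next); cudaFree(sink);
            cudaEventDestroy(t0); cudaEventDestroy(t1);
        }
        return 0;
    }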
Most games still don't use DX12 Ultimate features. Some use some ray tracing, but as the article says, this is expensive and should be left off on laptop devices anyway. As for mesh shaders, there is currently one (1) game I know of that uses them: Alan Wake 2. I think the other features like sampler feedback are also not really used in practice.
If it supports Vulkan 1.2, then it basically supports most of DX12 as well. Very famously Intel's ARC GPUs had terrible DirectX drivers, but good enough Vulkan support that DXVK simply ran better: https://youtu.be/wktbj1dBPFY
As time goes on it feels like native and up-to-date DirectX drivers aren't necessary, even on Windows itself. The era of kowtowing to a d3d9.dll is over; the SPIR-V recompilation era has begun.
Depends on what you want to do. This GPU is impressive for a thin and light laptop with long battery life. It obviously doesn't compare well to large power hungry dedicated GPUs.
DirectX 12 (not Ultimate) still covers most (every?) game out there. As for "GPU for office work", that's a question left up to specific in-game benchmarks.
Re: the manual driver updates. Recently I put a clean Win11 install on an ASUS Meteor Lake laptop for someone, and Windows downloaded and installed all the latest drivers automatically (along with a bunch of fresh bloatware, natch). Maybe Qualcomm is working with Microsoft so their drivers will get updated the same way?
Yes - and it is certainly possible to export the "final", up-to-date set of drivers via DISM, then build an orthogonal set that you can recursively install via a single one-click pnputil batch file in Audit Mode (Ctrl-Shift-F3 at the top of OOBE).
This is the easiest way to validate benchmarks across neutral, bloatware-free OS versions (at least the ones supported by that SoC, anyway).
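For anyone wanting to reproduce that workflow, the rough shape is just two built-in commands (the destination path here is a placeholder, and the second one is what you'd drop into that Audit Mode batch file):

    REM Export every third-party driver from the fully updated install
    DISM /Online /Export-Driver /Destination:D:\ExportedDrivers

    REM Later, from Audit Mode on the clean image, install the whole set back
    REM (recurses into subfolders and installs every .inf it finds)
    pnputil /add-driver D:\ExportedDrivers\*.inf /subdirs /install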
I wonder if there's performance being left on the table because of the way programs and games are designed. It's no secret Qualcomm's mobile chips will run like shit when you try to use desktop code on them, because they're designed differently. I wonder if we're seeing aspects of that here. It would explain why Qualcomm convinced their press team of impressive numbers that nobody in the real world has been able to replicate.
There was a whole comic about design differences when porting desktop-style games and shaders to mobile (I can't find it for the life of me) which was a pretty good beginner's guide to porting that stuck with me.
With my own use case I've noticed very poor compute shader performance on the Snapdragon GPUs. Even worse, the drivers are completely unpredictable: the same shader will sometimes run 2x slower for seemingly no good reason at all. I didn't realise games these days relied so much on compute shaders. It's no surprise it doesn't perform as well as it should.
Because you cannot compare Apple's iGPU and this chip while using the same software stack, since you cannot buy a laptop with this chip and run macOS on it.
If they compared it with an Apple iGPU, they'd be comparing two things, the hardware AND the OS, which makes it less clear what is contributing to the benchmark results.
> Because you cannot compare between an Apple's iGPU and this chip, while using the same software stack.
Apple Silicon hardware can run Linux (with unofficial GPU support as of late, although still lacking support for the NPU), and official support for Linux on Snapdragon laptop platforms is supposedly in the works. So we should be able to do a proper comparison as soon as official support is added for both platforms as part of a single mainline kernel release.
Generally this is a correct argument: to compare hardware, one needs to use the same OS/software stack. But the argument works the other way around too: if no identical software stack is possible, does it really matter how the raw hardware compares? The end user running a game or an application experiences hardware+OS rather than just hardware.
A lot of their testing is running custom OpenCL and Vulkan code, both of which are essentially unsupported on macOS (moltenvk exists, but kinda sucks and adds overhead that would make the comparisons invalid anyways).
This is a hardware deep dive by a couple of uni students and enthusiasts... Some people are interested in things that aren't as shallow as fluctuating performance leads!
Apple builds GPUs for their own hardware and nothing else. They could even do without names; it's just another inevitable component inside the box.
ARM remains a shitty backwater of unsupportable crap ass nonsense being thrown over the wall.
Qualcomm bought Imageon from AMD in 2009. Sure, they've done some work and made some things somewhat better. But hearing that the graphics architecture is woefully out of date, with terrible compute performance, is ghastly but unsurprising. Trying to watch this thing run games is going to be a sad, sad story, and only 50% of that is the translation layers (which would be amazing if this were Linux and not a Windows or Android device).